When testing web applications, understanding and using encoding schemes is a fundamental skill. Base64, URL encoding, and HTML encoding in particular appear in many applications, both as part of the application’s normal functionality and offensively, in the hands of pentesters or attackers.
Encoding and decoding schemes
Encoding is a process that transforms data from one form to another according to a specified scheme, whereas decoding reverses this process, returning the data to its original form. These schemes are designed to ensure that data can be safely transmitted, stored, and interpreted by different systems.
Base64 is a binary-to-text encoding scheme that represents binary data as an ASCII string by translating it into a radix-64 representation. It’s frequently used to transmit binary data over media that are designed to handle text.
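As a rough illustration of the round trip, here is a minimal Python sketch using the standard library's base64 module. The sample value is hypothetical and chosen only for demonstration.

```python
import base64

# Hypothetical sample value, used only for illustration.
raw = b"admin:admin"

# Encode the bytes to Base64, then decode them back.
encoded = base64.b64encode(raw)
decoded = base64.b64decode(encoded)

print(encoded.decode("ascii"))  # YWRtaW46YWRtaW4=
print(decoded == raw)           # True
```

Note that no key is involved at any point: anyone who sees the encoded string can recover the original value.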
Also known as percent-encoding, URL encoding is used to represent characters that are reserved or not permitted in a URL. It replaces each such character with a “%” followed by two hexadecimal digits that give the value of the byte. Unreserved characters (letters, digits, and the symbols -, ., _, and ~) are left as they are.
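The mechanism can be sketched in Python with urllib.parse from the standard library. The sample value is an assumption made for illustration, and safe="" is passed so that "/" is encoded too, which quote() would otherwise leave alone.

```python
from urllib.parse import quote, unquote

# Hypothetical parameter value containing spaces and reserved characters.
value = "Tom & Jerry?"

# safe="" makes quote() encode every character outside the unreserved set.
encoded = quote(value, safe="")
print(encoded)                    # Tom%20%26%20Jerry%3F
print(unquote(encoded) == value)  # True

# The same rule applied by hand: "%" plus two hex digits per byte.
manual = "".join(f"%{byte:02X}" for byte in "&".encode("utf-8"))
print(manual)                     # %26
```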
HTML encoding converts characters that have a special meaning in HTML, such as <, >, &, and quotation marks, into entities (for example, &lt; for <) that the browser renders as text instead of interpreting as markup. The primary use of HTML encoding is to protect the webpage from certain types of attacks, such as Cross-site Scripting (XSS).
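A minimal sketch of this in Python, using html.escape and html.unescape from the standard library. The input string is a sample chosen for illustration. Other languages and frameworks may emit different but equivalent entities, for instance for the single quote.

```python
import html

# Sample markup that a browser would otherwise interpret as a script tag.
markup = "<script>alert('XSS')</script>"

# escape() replaces &, <, > and (by default) both kinds of quotes.
encoded = html.escape(markup)
print(encoded)  # &lt;script&gt;alert(&#x27;XSS&#x27;)&lt;/script&gt;

# unescape() turns the entities back into the original characters.
print(html.unescape(encoded) == markup)  # True
```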
Uses during a pentest
Base64 can be used for obfuscating payloads and bypassing input validation filters. For instance, instead of directly using <script>prompt('XSS')</script> in an XSS attack, you can encode it using Base64 so that filters looking for known patterns are less likely to match it. An encoded payload only runs if something decodes it first, so on the client side it is typically passed through a decoding function such as JavaScript's atob() before being executed.
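To see why a pattern-based filter can miss an encoded payload, consider the following Python sketch. The naive_filter_blocks function is a hypothetical stand-in for a blocklist filter, not a model of any real product.

```python
import base64

payload = "<script>prompt('XSS')</script>"
encoded = base64.b64encode(payload.encode("utf-8")).decode("ascii")

def naive_filter_blocks(value: str) -> bool:
    """Hypothetical blocklist filter: rejects anything containing '<script'."""
    return "<script" in value.lower()

print(naive_filter_blocks(payload))  # True: the raw payload is caught
print(naive_filter_blocks(encoded))  # False: Base64 output contains no '<'

# Whatever decodes the value later recovers the original payload.
print(base64.b64decode(encoded).decode("utf-8") == payload)  # True
```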
URL encoding is often employed when testing for injection flaws, including SQL Injection and Command Injection. Special characters in a malicious payload can be URL encoded to evade security measures that inspect the raw request. For instance, the payload '; DROP TABLE users; -- could be URL encoded to %27%3B%20DROP%20TABLE%20users%3B%20-- to bypass security filters.
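The URL-encoded form of the SQL injection payload can be reproduced with a short Python sketch. It assumes the target decodes the parameter once before using it, which is what unquote() simulates here.

```python
from urllib.parse import quote, unquote

payload = "'; DROP TABLE users; --"

# Encode everything except unreserved characters (letters, digits, - . _ ~).
encoded = quote(payload, safe="")
print(encoded)  # %27%3B%20DROP%20TABLE%20users%3B%20--

# A server that URL-decodes the parameter sees the original payload again.
print(unquote(encoded) == payload)  # True
```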
HTML encoding can be used in scenarios where HTML tags might be filtered out to prevent XSS attacks. For example, if you’re testing for an XSS vulnerability and find that the < and > characters are being filtered, you could use HTML encoding to encode these characters as &lt; and &gt;, respectively. So, an XSS payload like <script>alert('XSS')</script> becomes &lt;script&gt;alert('XSS')&lt;/script&gt;. This only helps where the application decodes the entities again before the value reaches the page.
Unearthing sensitive data
Applications sometimes encode sensitive data rather than protect it properly. Such data might include personal user information, passwords, API keys, or proprietary application details. While encoding does obfuscate this data, it doesn’t provide robust protection: no key is involved, so anyone who recognizes the scheme can reverse it.
There are many places we can look for this kind of information:
- HTTP headers
- URL parameters
- Source code
- JS files
- API responses
- Hidden form fields
- Server logs
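When reviewing any of these sources, a small script can help flag candidate values. The following Python sketch looks for Base64-looking tokens in a piece of text and keeps those that decode to readable UTF-8. The 16-character minimum, the function name, and the sample response are all assumptions made for illustration, and a real scan will produce false positives that need manual review.

```python
import base64
import binascii
import re

# Heuristic: runs of 16+ Base64 alphabet characters with optional padding.
B64_TOKEN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")

def find_base64_strings(text: str) -> list[tuple[str, str]]:
    """Return (token, decoded_text) pairs for tokens that decode cleanly."""
    findings = []
    for match in B64_TOKEN.finditer(text):
        token = match.group(0)
        if len(token) % 4 != 0:  # valid padded Base64 is a multiple of 4 long
            continue
        try:
            decoded = base64.b64decode(token, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not Base64, or not readable text
        findings.append((token, decoded))
    return findings

# Hypothetical API response with an encoded value in one field.
token = base64.b64encode(b"api_key=secret-12345").decode("ascii")
sample = f'{{"user": "alice", "session": "{token}"}}'

for found, decoded in find_base64_strings(sample):
    print(found, "->", decoded)
```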
Understanding encoding and decoding schemes and their applications is critical. They can often be the key to unlocking vulnerabilities that might otherwise remain hidden. Aside from this, we should always be on the lookout for the misuse of encoding schemes as a way of obfuscating sensitive information.