When testing web applications, understanding and using encoding schemes is a fundamental skill. Base64, URL encoding, and HTML encoding in particular appear across many applications, both as part of the application’s normal functionality and offensively in the hands of pentesters or attackers.

Encoding and decoding schemes

Encoding is a process that transforms data from one form to another according to a specified scheme, whereas decoding reverts this process, returning the data to its original form. These schemes are designed to ensure that data can be safely transmitted, stored, and interpreted by different systems.

Base64 encoding:

This binary-to-text encoding scheme represents binary data in an ASCII string format by translating it into a radix-64 representation. It’s frequently used to transmit data over media that are designed to handle text.

cheesecake → Y2hlZXNlY2FrZQ==
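
As a quick illustration, the same round trip can be reproduced with Python's standard base64 module. This is only a sketch; the variable names are arbitrary.

```python
import base64

plain = "cheesecake"

# Encode: text -> UTF-8 bytes -> Base64 bytes -> ASCII text
encoded = base64.b64encode(plain.encode("utf-8")).decode("ascii")
print(encoded)  # Y2hlZXNlY2FrZQ==

# Decode: reverse the steps to recover the original text
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded)  # cheesecake
```

The trailing == is padding: ten input bytes do not divide evenly into three-byte groups, so the final group is padded out to four characters.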

URL encoding:

Also known as percent-encoding, URL encoding is used to represent characters that are reserved or not permitted in a URL. It replaces each such character with a “%” followed by two hexadecimal digits giving the byte value. Any character can be written this way, including ordinary letters, which is why a fully encoded string like the one below is still valid.

tiramisu → %74%69%72%61%6d%69%73%75
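
A minimal Python sketch of both behaviours follows. percent_encode_all is a hypothetical helper written for this example (it is not a library function); quote and unquote come from the standard urllib.parse module.

```python
from urllib.parse import quote, unquote


def percent_encode_all(text: str) -> str:
    """Hypothetical helper: percent-encode every UTF-8 byte, letters included."""
    return "".join(f"%{byte:02x}" for byte in text.encode("utf-8"))


print(percent_encode_all("tiramisu"))  # %74%69%72%61%6d%69%73%75

# quote() leaves letters and digits alone and encodes only what needs it
print(quote("a b&c=d", safe=""))  # a%20b%26c%3Dd

# Decoding reverses either form
print(unquote("%74%69%72%61%6d%69%73%75"))  # tiramisu
```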

HTML encoding:

HTML encoding converts characters that have special meaning in HTML, such as <, >, & and quotes, into entities that the browser renders as text rather than interpreting as markup. Its primary use is to protect the webpage from certain types of attacks, such as Cross-site Scripting (XSS).

macaroon → &#109;&#97;&#99;&#97;&#114;&#111;&#111;&#110;
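
To make this concrete, here is a small Python sketch using the standard html module. to_numeric_entities is a hypothetical helper for this example that writes every character as a decimal numeric character reference; html.escape shows the more common defensive case, where only the special characters are replaced.

```python
import html


def to_numeric_entities(text: str) -> str:
    """Hypothetical helper: write each character as a decimal entity."""
    return "".join(f"&#{ord(ch)};" for ch in text)


encoded = to_numeric_entities("macaroon")
print(encoded)                 # &#109;&#97;&#99;&#97;&#114;&#111;&#111;&#110;
print(html.unescape(encoded))  # macaroon

# Defensive use: only characters with special meaning are replaced
print(html.escape("<b>Tom & Jerry</b>"))  # &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
```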

Uses during a pentest

Base64 encoding

Base64 can be used to obfuscate payloads and bypass input validation filters. For instance, instead of sending <script>prompt('XSS')</script> directly in an XSS attack, you can Base64-encode the JavaScript so that keywords a filter looks for never appear in the request. To execute a payload like this, we decode it in the browser with a function such as atob() and pass the result to eval().

<img src="x" onerror="eval(atob('cHJvbXB0KCdwYW5jYWtlcycp'))">
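
The Base64 string in that example decodes to prompt('pancakes'). A short Python sketch for generating such a string and the matching handler expression is shown below; the variable names are illustrative only.

```python
import base64

# JavaScript we want the browser to run; kept ASCII so atob() returns it unchanged
js = "prompt('pancakes')"

b64 = base64.b64encode(js.encode("ascii")).decode("ascii")
print(b64)  # cHJvbXB0KCdwYW5jYWtlcycp

# Expression to place inside an event handler attribute
handler = f"eval(atob('{b64}'))"
print(handler)  # eval(atob('cHJvbXB0KCdwYW5jYWtlcycp'))

# Sanity check: decoding gives back the original JavaScript
assert base64.b64decode(b64).decode("ascii") == js
```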

URL encoding

This is often employed when testing for injection flaws, including SQL Injection and Command Injection. Special characters in a malicious payload can be URL encoded to evade security measures that inspect the raw request. For instance, the payload '; DROP TABLE users; -- could be URL encoded to %27%3B%20DROP%20TABLE%20users%3B%20-- to bypass security filters.
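
As a sketch, this encoding can be produced with Python's urllib.parse. Passing safe="" asks quote() to encode every reserved character; hyphens are unreserved, so the trailing -- is left as it is.

```python
from urllib.parse import quote, quote_plus, unquote

payload = "'; DROP TABLE users; --"

# Percent-encode everything except unreserved characters
encoded = quote(payload, safe="")
print(encoded)  # %27%3B%20DROP%20TABLE%20users%3B%20--

# Form-style variant: spaces become "+" instead of %20
print(quote_plus(payload))  # %27%3B+DROP+TABLE+users%3B+--

# The server-side decode step recovers the original payload
assert unquote(encoded) == payload
```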

HTML encoding

HTML encoding can be used in scenarios where HTML tags might be filtered out to prevent XSS attacks. For example, if you’re testing for an XSS vulnerability and find that the < and > characters are being filtered, you could encode them as &lt; and &gt;, respectively. So, an XSS payload like <script>alert('XSS')</script> becomes &lt;script&gt;alert('XSS')&lt;/script&gt;. Note that this only leads to execution if the application decodes the entities again before writing the value into the page; otherwise the browser simply displays the payload as text.
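
The sketch below illustrates the idea in Python. naive_filter_blocks is a hypothetical stand-in for a filter that rejects raw angle brackets; real filters vary widely, so treat this purely as a model of the behaviour described above.

```python
import html

payload = "<script>alert('XSS')</script>"

# quote=False leaves the single quotes untouched, matching the example above
encoded = html.escape(payload, quote=False)
print(encoded)  # &lt;script&gt;alert('XSS')&lt;/script&gt;


def naive_filter_blocks(value: str) -> bool:
    """Hypothetical filter: reject any input containing raw angle brackets."""
    return "<" in value or ">" in value


print(naive_filter_blocks(payload))  # True: the raw payload is caught
print(naive_filter_blocks(encoded))  # False: the encoded payload passes

# If the application later decodes entities, the original payload reappears
print(html.unescape(encoded) == payload)  # True
```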

Unearthing sensitive data

Sensitive data might include personal user information, passwords, API keys, or proprietary application details. Encoding may hide this data at a glance, but it provides no real protection: no key is involved, so anyone who recognises the scheme can reverse it.

There are many places we can look for this kind of information:

  • HTTP headers
  • URL parameters
  • Source code
  • JS files
  • API responses
  • Cookies
  • Hidden form fields
  • Server logs
  • Metadata
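
One way to hunt through these locations is to search captured text for strings that look like Base64 and try decoding them. The following Python sketch assumes a made-up cookie value and an arbitrary minimum length of 16 characters; find_base64_strings is a hypothetical helper, and a real search would need tuning to avoid false positives.

```python
import base64
import binascii
import re

# Runs of 16 or more Base64 alphabet characters, with optional padding
CANDIDATE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def find_base64_strings(text: str) -> list[tuple[str, str]]:
    """Return (token, decoded) pairs for tokens that decode to printable text."""
    findings = []
    for match in CANDIDATE.finditer(text):
        token = match.group(0)
        if len(token) % 4 != 0:
            continue  # valid padded Base64 is always a multiple of 4 long
        try:
            decoded = base64.b64decode(token, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not Base64, or not text once decoded
        if decoded.isprintable():
            findings.append((token, decoded))
    return findings


# Made-up example: a session cookie hiding role information
secret = "user=admin;role=admin"
token = base64.b64encode(secret.encode("utf-8")).decode("ascii")
header = f"Set-Cookie: session={token}; Path=/"

for found, decoded in find_base64_strings(header):
    print(found, "->", decoded)
```

The same function can be pointed at saved responses, JS files, or log excerpts; anything it reports still needs a manual look to decide whether it is actually sensitive.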


Wrapping up

Understanding encoding and decoding schemes and their applications is critical. They can often be the key to unlocking vulnerabilities that might otherwise remain hidden. Aside from this, we should always be on the lookout for the misuse of encoding schemes as a way of obfuscating sensitive information.