One of the most overlooked yet effective techniques in our toolkit when pentesting web applications is code review. Unlike automated scanning and some black-box testing, code review digs into an application’s logic, uncovers subtle or blind vulnerabilities, and reveals the nuances of the application’s behavior. Today, we will explore what code review is and how you can get started with a basic methodology. We’ll also walk through a few practical examples that demonstrate issues commonly found during a review.
What is Code Review?
Code review is the systematic examination of source code with the aim of identifying security vulnerabilities and deviations from best practices. In the context of pentesting, it’s about understanding how the code works and pinpointing where assumptions may lead to exploitable weaknesses, rather than relying on dynamic testing alone. Reviewing the code helps you see potential problems before they are deployed in a production environment.
For example, during a code review, you might discover that a function processes user input without proper validation or that sensitive data is being logged insecurely. Automated scanners often miss these kinds of issues, making manual review an invaluable step.
Code Review: Basic Methodology
Let’s break down a simple methodology you can follow to get started with code review. Whilst every project is a little different in terms of its objectives, the size of the code base, the time available to carry out testing, etc., these steps can guide you through a thorough analysis:
Note: I’ve provided a checklist for you to follow for each section. This is meant as a starting point and not a comprehensive checklist for every assessment that is conducted.
1. Understanding the Target
We start by getting an overview of the codebase, the tech stack, and the general purpose of the application. We want to understand the main components, libraries, and flow of data. If we’re working on smaller pieces of code, such as commits, then we should also be looking at things like commit messages; otherwise, we’ll read any documentation that has been made available.
- Identify the tech stack and programming languages used.
- Review any available documentation, architecture diagrams, and commit messages.
- Understand the high-level flow of data and major application components.
- Note any known critical functionalities or security-sensitive areas.
- Confirm the boundaries between different modules or services.
2. Understanding Sources and Sinks
We now want to identify where user input enters the application and trace its journey through the system. This will show us where data is manipulated, sanitized, and eventually processed. Before we go deeper, we need to take a minute to understand sources and sinks.
Sources: Where Untrusted Data Enters
Sources are the points in the code where data comes into your system. This is usually data that originates from an external or untrusted input. Common examples include:
- User Inputs: Data from web forms, query parameters ($_GET in PHP, req.query in NodeJS), or request bodies ($_POST in PHP, req.body in NodeJS).
- External APIs: Data received from third-party services or APIs.
- Cookies and Headers: Information coming in via HTTP headers or cookies.
When reviewing the code, you want to spot these sources first because they’re the starting points of any data that might eventually be manipulated or, worse, exploited if mishandled.
Sinks: Where Data Gets Used in a Critical Way
Sinks are the spots in the code where the data is used in ways that might have security implications if that data hasn’t been properly checked or sanitized. Think of sinks as the “danger zones” where untrusted data could lead to vulnerabilities. Some common sinks include:
- Database Queries: Using user input in SQL queries (e.g., in PHP’s PDO or NodeJS’s query methods) without proper parameterization.
- System Commands: Passing data into system-level calls like PHP’s exec() or NodeJS’s child_process.exec().
- File Operations: Writing data to files or using it in file path operations where path traversal could be an issue.
- Dynamic Code Execution: Functions like eval() in JavaScript or PHP can execute code directly, so any data reaching these functions without being properly vetted is a major red flag.
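To make the file-operation sink more concrete, here is a minimal sketch of a path traversal check using Node's built-in path module. The upload directory and the function names (UPLOAD_ROOT, resolveUnsafe, resolveSafe) are hypothetical and only serve the illustration; a real application may need extra handling, for example for symlinks.

```javascript
const path = require('path');

// Hypothetical base directory that user-supplied filenames should stay inside.
const UPLOAD_ROOT = path.resolve('/var/app/uploads');

// Sink without a check: "../" segments in the filename can climb out of UPLOAD_ROOT.
function resolveUnsafe(filename) {
  return path.join(UPLOAD_ROOT, filename);
}

// Resolve the final path first, then confirm it is still inside UPLOAD_ROOT.
function resolveSafe(filename) {
  const resolved = path.resolve(UPLOAD_ROOT, filename);
  if (!resolved.startsWith(UPLOAD_ROOT + path.sep)) {
    throw new Error('Path escapes the upload directory');
  }
  return resolved;
}
```

During a review, spotting the first pattern next to a value that comes from a source such as req.query is a strong hint that the data flow deserves a closer look.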
When we conduct code review, we might take a few different approaches. For example, we could first identify a dangerous sink (e.g. an eval() function) and then work backward to understand the source and how we can get our input into that function to gain arbitrary code execution. We may also go the other way: follow sources into the application, understand where and how that data is handled, and look for issues along the way. In both cases, identifying sources and sinks is key to understanding how data flows through an application.
- List all potential sources of user input.
- Identify and map the data flows from these sources through the code.
- Mark all dangerous sinks (e.g. eval, exec, direct SQL queries).
- Verify that data is validated or sanitized before reaching any sink.
- Document any discrepancies or potential bypasses in data handling.
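The checklist above can be illustrated with a small source-to-sink sketch for a system command. It assumes a hypothetical handler that reads a host value from req.query.host and pings it; the helper names are made up for this example, and the ping flags assume a POSIX-style ping.

```javascript
const { execFile } = require('child_process');

// `host` stands in for a value read from a source such as req.query.host.
// The returned string is the kind of value that ends up in child_process.exec(),
// which runs it through a shell, so "; id" would become a second command.
function buildPingCommandUnsafe(host) {
  return `ping -c 1 ${host}`;
}

// Validation step between source and sink: allow only hostname-like values.
function isValidHostname(host) {
  return typeof host === 'string' && /^[a-zA-Z0-9][a-zA-Z0-9.-]{0,252}$/.test(host);
}

// Safer sink: execFile() passes arguments as an array and does not spawn a shell.
function pingSafe(host, callback) {
  if (!isValidHostname(host)) {
    return callback(new Error('Invalid host'));
  }
  execFile('ping', ['-c', '1', host], callback);
}
```

When tracing a flow like this, the questions to answer are whether any validation sits between the source and the sink, and whether that validation can be bypassed.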
3. Searching for Common Patterns
Certain code patterns help us uncover weaknesses that may be exploitable on their own or combined with other weaknesses to create an exploit chain. Examples include a function that creates session tokens in a predictable way (something that is hard to identify without the source code), dangerous sinks (as discussed in the previous section), and common weaknesses such as improper error handling.
- Identify dangerous functions (like eval, exec, unserialize).
- Look for repeated patterns that indicate weak error handling or logging practices.
- Check for insecure session or token generation.
- Identify any custom utility functions that process user input without sufficient validation.
- Look for legacy code patterns or deprecated functions that might pose risks.
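A quick way to get started with pattern searching is a small script that flags lines containing calls worth a closer look. The sketch below is a rough starting point, not a replacement for a real static analysis tool: the pattern list and the findDangerousCalls helper are hypothetical, and simple regular expressions like these will produce false positives (for example, a regular expression's own exec() method).

```javascript
// Hypothetical pattern list; extend it for the language and framework under review.
const DANGEROUS_PATTERNS = [
  { name: 'eval', regex: /\beval\s*\(/ },
  { name: 'exec', regex: /\bexec(?:Sync)?\s*\(/ },
  { name: 'unserialize', regex: /\bunserialize\s*\(/ },
  // Math.random() is not cryptographically secure, so it is suspicious near token generation.
  { name: 'Math.random', regex: /\bMath\.random\s*\(/ },
];

// Returns one finding per matching pattern per line, with 1-based line numbers.
function findDangerousCalls(source) {
  const findings = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const { name, regex } of DANGEROUS_PATTERNS) {
      if (regex.test(text)) {
        findings.push({ line: index + 1, name, text: text.trim() });
      }
    }
  });
  return findings;
}
```

Each finding is only a lead: the next step is to trace whether untrusted input can actually reach the flagged call.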
4. Dependencies
Next, we need to verify that third-party libraries and modules are up to date and don’t have known vulnerabilities. Sometimes, the code itself is secure, but an outdated dependency can open the door to exploitation. This usually looks straightforward (e.g. we can simply run npm audit), but on larger applications with many dependencies, the results can be overwhelming. In this case, we need to better understand how the dependencies are used and whether upgrading them introduces breaking changes. If you’re conducting a pentest, it’s worth focusing on the results that come back as critical and checking whether they are actually exploitable, rather than just adding hundreds of results, many of which likely have no impact, to your report.
- Run dependency audit tools (e.g. npm audit, composer audit, etc.).
- Check for outdated or deprecated packages.
- Review vulnerability databases for known issues with used dependencies.
- Evaluate whether flagged vulnerabilities are exploitable in the given context.
- Document any dependency issues that could impact security.
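When an audit returns a long list, a short script can help narrow it down to the findings worth triaging first. The sketch below assumes the JSON report shape produced by npm audit --json in npm 7 and later (a top-level vulnerabilities object keyed by package name, each entry with a severity field); older npm versions use a different layout. The selectSevereFindings helper and the package names in the usage comment are hypothetical.

```javascript
// Usage idea (not executed here):
//   npm audit --json > audit.json
//   const report = JSON.parse(require('fs').readFileSync('audit.json', 'utf8'));
//   console.log(selectSevereFindings(report));

// Keep only the entries whose severity is in the requested list.
function selectSevereFindings(report, severities = ['critical']) {
  const vulnerabilities = report.vulnerabilities || {};
  return Object.entries(vulnerabilities)
    .filter(([, details]) => severities.includes(details.severity))
    .map(([name, details]) => ({
      name,
      severity: details.severity,
      isDirect: Boolean(details.isDirect),
      // fixAvailable may be a boolean or an object describing the fix.
      fixAvailable: Boolean(details.fixAvailable),
    }));
}
```

The filtered list is still only a set of candidates; each one needs a check of how the application actually uses the affected package.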
5. Local Testing
When conducting code review, it’s important to test the ideas and theories you come up with. Sometimes, it’s possible to spin up a local instance of the application or have access to an environment where you can test payloads; however, you may also have to simulate attacks locally by running snippets of code and proving they behave in the way you expect. During this step, it’s important to consider the context of your attack: whether there is a compensating control elsewhere that isn’t present locally, or whether the versions in the production system differ from your local environment. For example, when testing PHP for type juggling issues, the version of PHP plays a big role in determining whether the code is vulnerable.
- Set up a local environment that closely matches production.
- Develop and run proof-of-concept payloads for suspected vulnerabilities.
- Verify the application’s behavior under different configurations or versions.
- Check for any compensating controls that may not be present in the local setup.
- Document test results and compare them with your expectations from the code review.
- Revisit and refine your theories based on test outcomes.
Code Review Challenge
Next is a beginner-friendly example to get started with. I’ve provided the code, and the solutions are documented below. If you’d like a curated list of challenges, then check out Florian Walter’s repo here: https://github.com/dub-flow/secure-code-review-challenges
const express = require('express');
const app = express();
const { MongoClient } = require('mongodb');

app.use(express.json());

app.post('/login', async (req, res) => {
  const { username, password } = req.body;
  // connect to db
  const client = await MongoClient.connect('mongodb://localhost:27017', { useUnifiedTopology: true });
  const db = client.db('myapp');
  const user = await db.collection('users').findOne({ username: username, password: password });
  if (user) {
    res.send("Login successful");
  } else {
    res.send("Login failed");
  }
  client.close();
});

app.listen(3000, () => console.log('Server running on port 3000'));
An issue that can occur in applications using MongoDB is the direct insertion of untrusted user input into query objects. Even though MongoDB isn’t SQL, it’s still vulnerable to injection attacks if you’re not careful with how you handle user-supplied data.
In this application, the user’s username and password are passed directly into the MongoDB query. If we submit the following JSON payload, the query will match any document where both fields are greater than an empty string, bypassing the intended authentication. This is a classic NoSQL injection scenario.
{
"username": {"$gt": ""},
"password": {"$gt": ""}
}
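One possible first line of defense against this payload is to reject any credential that is not a plain string before it reaches the query. The helper below is a minimal sketch with a hypothetical name (extractCredentials); it deals only with the operator-injection issue and none of the other weaknesses in the example.

```javascript
// Returns the credentials only when both are plain strings, otherwise null.
// An object such as { "$gt": "" } fails the typeof check and never reaches findOne().
function extractCredentials(body) {
  const { username, password } = body || {};
  if (typeof username !== 'string' || typeof password !== 'string') {
    return null;
  }
  return { username, password };
}
```

In the login handler, a null result would translate into a 400 response before any database call is made.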
Part of the code review process is to also find weaknesses and areas where best practices are not followed. Even though this is a simple example, after looking at this code, we should also consider:
- The password is being stored in plaintext
- There is no session management
- The DB connection doesn’t use authentication
- Lack of error handling
- Lack of rate limiting
- Etc
Following these ideas, I’ve written a “secure” implementation. It can still be improved: it leaves room for a mistake at deployment time that would make the application vulnerable to attack. I describe that mistake after the code, so feel free to review it first and see if you come to the same conclusion. Of course, there might be other things that I’ve overlooked or not written securely as well.
Note: There is currently no logout or other endpoints – this is intentional and out of scope ;) for this challenge.
const express = require('express');
const bcrypt = require('bcrypt');
const { MongoClient } = require('mongodb');
const validator = require('validator');
const rateLimit = require('express-rate-limit');
const helmet = require('helmet');
const jwt = require('jsonwebtoken');
const cors = require('cors');
const cookieParser = require('cookie-parser');
const csrf = require('csurf');
const winston = require('winston');
require('dotenv').config();

// configure logging
const logger = winston.createLogger({
  level: 'info',
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.json()
  ),
  transports: [
    new winston.transports.File({ filename: 'error.log', level: 'error' }),
    new winston.transports.File({ filename: 'combined.log' })
  ]
});

const app = express();

// db connection pool
let dbClient;
const initDbConnection = async () => {
  try {
    dbClient = await MongoClient.connect(process.env.MONGODB_URI, {
      useUnifiedTopology: true,
      useNewUrlParser: true,
      maxPoolSize: 50
    });
    logger.info('Database connection established');
  } catch (error) {
    logger.error('Database connection error:', error);
    process.exit(1);
  }
};

// middleware
app.use(helmet());
app.use(express.json({ limit: '10kb' }));
app.use(cookieParser());
app.use(cors({
  origin: process.env.ALLOWED_ORIGINS.split(','),
  credentials: true
}));
app.use(csrf({ cookie: true }));

// rate limiting
const loginLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 mins
  max: 5,
  message: 'Too many login attempts, please try again later'
});

const isPasswordValid = (password) => {
  const minLength = 8;
  const hasUpperCase = /[A-Z]/.test(password);
  const hasLowerCase = /[a-z]/.test(password);
  const hasNumbers = /\d/.test(password);
  const hasSpecialChar = /[!@#$%^&*(),.?":{}|<>]/.test(password);
  return password.length >= minLength &&
    hasUpperCase &&
    hasLowerCase &&
    hasNumbers &&
    hasSpecialChar;
};

// JWT
const generateToken = (user) => {
  return jwt.sign(
    { id: user._id, username: user.username },
    process.env.JWT_SECRET,
    { expiresIn: '1h' }
  );
};

app.post('/login', loginLimiter, async (req, res) => {
  try {
    const { username, password } = req.body;

    // input validation
    if (!username || !password) {
      return res.status(400).json({
        error: 'All fields are required'
      });
    }

    // username validation
    if (!validator.isAlphanumeric(username) || username.length < 3) {
      return res.status(400).json({
        error: 'Invalid username format'
      });
    }

    const sanitizedUsername = validator.escape(username);
    const db = dbClient.db(process.env.DB_NAME);

    // grab user by username
    const user = await db.collection('users').findOne({
      username: sanitizedUsername
    });
    if (!user) {
      logger.warn(`Failed login attempt for username: ${sanitizedUsername}`);
      return res.status(401).json({
        error: 'Invalid credentials'
      });
    }

    // compare password hash
    const validPassword = await bcrypt.compare(password, user.password);
    if (!validPassword) {
      logger.warn(`Invalid password for username: ${sanitizedUsername}`);
      return res.status(401).json({
        error: 'Invalid credentials'
      });
    }

    // generate token
    const token = generateToken(user);
    res.cookie('token', token, {
      httpOnly: true,
      secure: process.env.NODE_ENV === 'production',
      sameSite: 'strict',
      maxAge: 3600000 // 1 hour
    });

    // logging
    logger.info(`Successful login for user: ${sanitizedUsername}`);
    res.status(200).json({
      message: 'Login successful',
      csrfToken: req.csrfToken()
    });
  } catch (error) {
    logger.error('Login error:', error);
    res.status(500).json({
      error: 'Internal server error'
    });
  }
});

// error handling middleware
app.use((err, req, res, next) => {
  logger.error(err.stack);
  res.status(500).json({
    error: 'Something went wrong!'
  });
});

// init db and start server
const PORT = process.env.PORT || 3000;
initDbConnection().then(() => {
  app.listen(PORT, () => {
    logger.info(`Server running on port ${PORT}`);
  });
});
Following the initial code review and going through a list of checks that I use to make an application ready for deployment, the changes I have made are:
- Password hashing with bcrypt
- Input validation and sanitization
- Rate limiting
- CSRF token
- Session handling
- Security headers with helmet
- CORS
- DB connection pooling
- Logging and error logging
- Using environment vars instead of hard-coded values
- Password complexity
Final thoughts:
A weak JWT secret (e.g. secret123) could be used when deploying this app; for more info on this, you can check out this video https://www.youtube.com/watch?v=2RKCDhH6dyA
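To show why a weak secret matters, here is a minimal sketch of offline secret guessing against an HS256-signed token. It builds the token by hand with Node's built-in crypto module instead of the jsonwebtoken package so that it runs without dependencies; the function names and the tiny candidate list are hypothetical, and the base64url encoding assumes a reasonably recent Node version (16 or later).

```javascript
const crypto = require('crypto');

const base64url = (input) => Buffer.from(input).toString('base64url');

// HMAC-SHA256 over "header.payload", encoded the way JWT signatures are.
function hmacSignature(signingInput, secret) {
  return crypto.createHmac('sha256', secret).update(signingInput).digest('base64url');
}

// Build an HS256 token by hand so the sketch needs no third-party packages.
function signHs256(payload, secret) {
  const header = base64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const body = base64url(JSON.stringify(payload));
  return `${header}.${body}.${hmacSignature(`${header}.${body}`, secret)}`;
}

// Anyone holding a token can test candidate secrets offline, with no rate limiting.
function guessSecret(token, candidates) {
  const [header, body, signature] = token.split('.');
  for (const candidate of candidates) {
    if (hmacSignature(`${header}.${body}`, candidate) === signature) {
      return candidate;
    }
  }
  return null;
}

// A long random value is far outside the reach of any wordlist.
const strongSecret = crypto.randomBytes(64).toString('hex');
```

A token signed with secret123 falls to a wordlist that contains it, after which an attacker can sign tokens for any user. A value like strongSecret, supplied through JWT_SECRET, avoids that outcome.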
Additional Tips for Effective Code Reviews
Know the Environment
Understanding the framework and libraries in use can give you a heads-up on common pitfalls. For instance, NodeJS applications might misuse asynchronous patterns, while PHP applications could be prone to session management issues.
Automate Where Possible
While manual review is essential, don’t shy away from using static analysis tools. They can help flag common mistakes or help you identify areas that need closer manual inspection.
Learn From Real-World Examples
More and more information is becoming readily available, from bug bounty reports and blog posts to simply searching for git commits tied to a specific bug or vulnerability. When you encounter an issue or want to learn more about the code behind a specific vulnerability, search for examples and review them thoroughly.
Consistency and Practice
If you’re new to code review, practice on open-source projects or past code snippets. Over time, you’ll start recognizing patterns more quickly, making your reviews more efficient and effective.
Conclusion
Code review is a valuable skill for web app pentesters. It provides a deeper insight into how applications function and uncovers vulnerabilities that automated scanners or black box testing might miss. By understanding what code review is and following a structured methodology, you can significantly enhance your pentesting capabilities.
If you are looking for training in some complex and sophisticated pentesting methods, take a look at the Advanced Web Hacking course at the TCM Academy, which includes a detailed module that covers code review. For certification of those advanced web pentesting skills, check out the Practical Web Penetration Professional (PWPP) exam.

About the Author: Alex Olsen
Alex is a Web Application Security specialist with experience working across multiple sectors, from single-developer applications all the way up to enterprise web apps with tens of millions of users. He enjoys building applications almost as much as breaking them and has spent many years supporting the shift-left movement by teaching developers, infrastructure engineers, architects, and anyone who would listen about cybersecurity. He created many of the web hacking courses in TCM Security Academy, as well as the PWPA and PWPP certifications.
Alex holds a Master’s Degree in Computing, as well as the PNPT, CEH, and OSCP certifications.
About TCM Security
TCM Security is a veteran-owned, cybersecurity services and education company founded in Charlotte, NC. Our services division has the mission of protecting people, sensitive data, and systems. With decades of combined experience, thousands of hours of practice, and core values from our time in service, we use our skill set to secure your environment. The TCM Security Academy is an educational platform dedicated to providing affordable, top-notch cybersecurity training to our individual students and corporate clients including both self-paced and instructor-led online courses as well as custom training solutions. We also provide several vendor-agnostic, practical hands-on certification exams to ensure proven job-ready skills to prospective employers.
Pentest Services: https://tcm-sec.com/our-services/