
As cybersecurity professionals, we often find ourselves drowning in a sea of log data, unable to extract meaning and insight from it. This is where the great trio of sed, awk, and grep comes into action. Although these three command-line utilities may seem like relics from the past, they are the unsung heroes of the log analysis world.

Sed: The Stream Editor

Sed, short for Stream Editor, is a command-line tool for manipulating and processing text data. Sed is like the Swiss Army knife of text manipulation: it edits streams of text as they flow through a pipeline, which makes it well suited to analyzing log files in real time. The following are some of the things you can do with sed:

  • Filter out irrelevant log entries
  • Extract specific fields or patterns
  • Perform complex text substitutions

History of Sed

Sed was developed in the early days of Unix, when text processing was a central part of what computers were used for. Lee McMahon of Bell Labs created sed as a more flexible and efficient way to apply ed-style editing commands to streams of text. Over time, sed became a standard part of Unix systems, and various implementations and extensions have appeared since.

Sed: Key Features

Sed’s core functionality revolves around text manipulation, offering a range of features, including:

Text filtering: Sed can select specific lines or patterns from input streams.

Text transformation: Sed can modify text by substituting, deleting, or inserting characters.

Regular expressions: Sed supports regular expressions for advanced pattern matching.

Scriptability: Sed allows users to write scripts to automate complex text processing tasks.

Here are a few examples of what you can do with sed:

Extract all log entries containing the word "error"

sed -n '/error/p' log.txt

Replace all occurrences of "oldstring" with "newstring"

sed 's/oldstring/newstring/g' log.txt
or, using a different delimiter:

sed 's:oldstring:newstring:g' log.txt

Delete all lines starting with "#" (comments)

sed '/^#/d' log.txt  

Sed can also be used in incident response to detect SQL injection attacks. For example:

Extract all log entries containing the word "SELECT" to detect potential SQL injection attacks

sed -n '/SELECT/p' access.log 

It can also be used by penetration testers to extract password hashes. For example:

Extract all log entries containing the word "password" to extract password hashes

sed -n '/password/p' auth.log

Awk: The Pattern Processing Powerhouse

The Awk programming language, named after its creators Alfred Aho, Peter Weinberger, and Brian Kernighan, is both a command-line utility and a programming language commonly used for data analysis and text processing. It lets you search, manipulate, and report on structured data, making it a powerful tool for log analysis, data mining, and file manipulation.

History of Awk

Awk’s development began in the 1970s at Bell Labs, where Aho, Weinberger, and Kernighan designed the language to handle text processing tasks that outgrew earlier tools such as grep and sed. Awk’s design focused on simplicity, flexibility, and efficiency, allowing users to write concise and powerful text processing programs. Over time, Awk evolved and became a standard component of Unix systems, with various implementations and extensions emerging.

Awk: Key Features

Awk is a programming language in its own right, designed specifically for text processing. It’s like a supercharged version of sed, with added features like:

Pattern matching: Awk can search for specific patterns in input data.

Field manipulation: Awk can extract, manipulate, and reformat fields within structured data.

Conditional statements: Awk supports if-else statements and loops for conditional processing.

Functions: Awk allows users to define custom functions for reusable code.

Regular expressions: Awk supports regular expressions for advanced pattern matching.

Here are a few examples of what you can do using awk:

Extract all log entries with a specific status code (404), assuming the status code is the fifth field in your log format

awk '$5 == "404"' access.log

Print the first and third fields of each line

awk '{print $1, $3}' log.txt

Calculate the sum of the second field across all lines

awk 'BEGIN {sum=0} {sum+=$2} END {print sum}' log.txt

Awk is a useful tool in incident response and can help identify brute force attacks. For example:

Extract all log entries with a 401 status code to identify potential brute force attacks

awk '$5 == "401"' access.log 

Penetration testers can also use awk to extract sensitive data. Take the following example:

Extract the second field of each line, which contains sensitive data

awk '{print $2}' sensitive_data.log 

Grep: The Guardian of Patterns

Grep, which stands for Global Regular Expression Print, is a command-line utility for searching text files for patterns. Its primary purpose is to find and display lines that contain a given pattern, making it one of the most important tools for log analysis, data mining, and file searching.

History of Grep

Ken Thompson, a member of the Unix development team at Bell Labs, created the first version of grep in the early 1970s, inspired by the regular expression-based search command in the earlier ed editor. Over time, grep became a standard component of Unix systems, and a variety of implementations and extensions have evolved.

Grep: Key Features

The core functionality of grep revolves around pattern matching and searching, offering a range of features, including:

Regular expressions: Grep supports regular expressions for advanced pattern matching.

Pattern searching: Grep can search for specific patterns in text files.

File searching: Grep can search for files containing specific patterns.

Line matching: Grep can display lines that contain a specific pattern.

Color highlighting: Grep can highlight matched patterns in color.

Some common examples of how grep is used include:

Find all log entries containing a specific IP address

grep '192\.168\.1\.100' log.txt

Search for lines containing either "error" or "warning"

grep -E 'error|warning' log.txt 

Recursively search for a pattern in all files within a directory

grep -R 'pattern' /path/to/directory 

Grep is also useful in incident response and can help detect malware communication. For example:

Search for log entries containing suspicious patterns indicative of malware communication

grep -E 'command\.php|eval\(' access.log 

In addition, penetration testers can use grep to find hidden backdoors in systems. For example:

Recursively search for files containing the string "bash.history" to find hidden backdoors

grep -R 'bash\.history' /home/user

RegEx: The Pattern Matching Powerhouse

Regular Expressions, or RegEx, are sequences of characters that define a search pattern used to match and manipulate text. They are a powerful tool for finding, validating, and extracting data from text files, log files, and strings. Because programmers, data analysts, and system administrators all need to express exactly what text they are looking for, regex is an indispensable skill.

Basic regular expression patterns are the building blocks for describing the text you want to match and manipulate.

Here are a few basic regex patterns to get you started:

  • . (dot) matches any single character

  • * (star) matches zero or more of the preceding element

  • + (plus) matches one or more of the preceding element

  • ? (question mark) matches zero or one of the preceding element

  • ^ (caret) matches the start of a line

  • $ (dollar sign) matches the end of a line

  • [abc] (square brackets) matches any character within the brackets

  • (abc) (parentheses) groups elements and captures matches

Examples:

  • grep 'hello*' file.txt matches lines containing "hell" followed by zero or more "o" characters (for example "hell", "hello", or "hellooo")

  • grep '^hello' file.txt matches lines starting with "hello"

  • grep 'hello$' file.txt matches lines ending with "hello"

  • grep '[a-zA-Z]' file.txt matches lines containing any letter (uppercase or lowercase)

RegEx in Action

RegEx is widely used in various programming languages, command-line utilities, and text editors. Some popular use cases include:

  • Validating user input (e.g., email addresses, phone numbers)
  • Extracting data from logs and text files
  • Searching and replacing text in files and strings
  • Parsing HTML and XML documents
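For instance, a rough (and deliberately simplified) email check can be written as a single grep pattern. This is a hedged sketch, assuming a hypothetical users.txt file with one value per line; real-world email validation is more involved:

grep -E '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$' users.txt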

Combining Sed, Awk, Grep and RegEx

Combining sed, awk, grep, and regex allows for powerful text processing and manipulation. By chaining these tools together, you can perform complex tasks such as data extraction, formatting, and filtering. For example, you can use grep to search for patterns in a file, then pipe the output to awk for further processing and formatting, and finally use sed to replace or delete text. RegEx can be used throughout the process to specify patterns and match text.

Here’s an example command that combines these tools:

grep -o '<pattern>' file.txt | awk '{print $2}' | sed 's/<pattern_to_replace>/<replacement>/' | grep '<final_pattern>'

In this command, grep searches for a pattern in a file, awk extracts the second field, sed replaces text, and finally, grep searches for a final pattern. By combining these tools, you can perform complex text processing tasks with ease.
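As a hedged, hypothetical illustration (assuming an Apache-style access.log where the request line is quoted), the same shape of pipeline could pull out requested URL paths, strip any query string, and keep only requests to an /admin area:

grep -oE '"GET [^"]+"' access.log | awk '{print $2}' | sed 's/?.*//' | grep '/admin'

Here grep -o isolates the quoted request, awk keeps just the path, sed removes everything from the first ? onward, and the final grep filters for the paths of interest.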

Here are a few other examples of tasks you can perform by combining sed, awk, grep, and regex:

Replace a string and then search for a pattern

sed 's/oldstring/newstring/g' log.txt | grep 'pattern'

or, using a different delimiter:

sed 's:oldstring:newstring:g' log.txt | grep 'pattern'

Print specific fields and then search for a pattern

awk '{print $1, $3}' log.txt | grep 'pattern'

Detecting SQL injection attacks
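One hedged starting point, assuming request query strings end up in access.log (a naive pattern like this will need tuning and will generate false positives):

grep -iE 'union[ +]select|or 1=1|information_schema' access.log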

Identifying brute force attacks

grep -i "failed login" log.txt | grep -c "<IP_ADDRESS>"

Detecting malware communication

grep -i "tcp connection" log.txt | grep -i "unknownport"

Investigating incident response (group logs by timestamp)

awk '{count[$3]++} END {for (ts in count) print ts, count[ts]}' log.txt

Extracting sensitive data for compliance (search for "password" or "credential")

grep -oE '(password|credential)' file.txt

Monitoring system logs for suspicious activity (search logs for a specific username)

grep -i "<USER_NAME>" log.txt

Analyzing network logs for traffic patterns

awk '{count[$11]++} END {for (proto in count) print proto, count[proto]}' log.txt

These tools can also be applied to incident response investigations by extracting log entries containing a specific IP address, printing specific fields, and replacing sensitive information:

grep '192\.168\.1\.100' access.log | awk '{print $1, $3}' | sed 's/oldstring/newstring/g'

There are also use cases for penetration testing. For example:

grep -oE "username=[^&]+" access.log | sed 's/username=//' | grep -v false

This command is like a super-powerful search tool that helps you find specific information in a huge log file called access.log.

Here’s what it does step by step:

grep -oE "username=[^&]+" access.log 

This part searches for lines in the access.log file that contain the word “username” followed by an equals sign and some characters that aren’t an ampersand (&).

-o flag tells it to only show the part of the line that matches the search, rather than the whole line.

-E flag lets us use extended regular expression (ERE) search patterns.

The | (pipe character) sends the output of the grep command to the next command following the pipe character.

sed 's/username=//' 

This takes the results from the previous search and removes the “username=” part from each line, leaving just the username itself.

Again, the | (pipe character) sends the output of the sed command to the next command in the pipeline.

grep -v false

Finally, this filters out any lines that contain the word “false”. The -v flag inverts the search, so it shows everything except the lines with “false.”

In short, this command digs through a log file and pulls out all the usernames, excluding any entries that contain the word “false.”

Enhancing Automated Scripting

By combining these tools and techniques, you can create powerful automated scripts to process and analyze data. Here are some examples of how these tools can be used to enhance automated scripts:

Grep:

  • Use regular expressions to match complex patterns
  • Use -v option to invert matching
  • Use -A and -B options to print surrounding lines
  • Use -f option to read patterns from a file
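For example, here is a hedged sketch that reads patterns from a hypothetical patterns.txt file (one pattern per line), prints two lines of context around each match, and filters out debug noise:

grep -f patterns.txt -A 2 -B 2 auth.log | grep -v 'debug'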

Awk:

  • Use conditional statements (if/else) to manipulate data
  • Use loops (for/while) to process data
  • Use arrays to store and manipulate data
  • Use functions to reuse code
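A hedged sketch, assuming the HTTP status code sits in the ninth field (as in a combined Apache log), that uses an array, a loop, an if/else, and a small custom function to summarize status codes:

awk 'function label(code) { if (code+0 >= 400) return "FAIL"; else return "OK" } { count[$9]++ } END { for (c in count) print c, count[c], label(c) }' access.log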

Sed:

  • Use regular expressions to match and replace patterns
  • Use -e option to execute multiple commands
  • Use -f option to read commands from a file
  • Use -i option to edit files in place
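For example, -e lets you chain several editing commands in a single pass over a file (add -i, carefully, if your sed supports in-place editing):

sed -e 's/oldstring/newstring/g' -e '/^#/d' log.txt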

Combining tools:

  • Use grep to filter data, awk to manipulate data, and sed to transform data
  • Use pipes (|) to chain commands together
  • Use redirection (>, >>, <) to read and write files
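Putting those roles together, a hedged sketch (app.log and its field layout are assumptions) might filter error lines with grep, keep the date, time, and last field with awk, redact anything that looks like an IPv4 address with sed, and redirect the result to a new file:

grep -i 'error' app.log | awk '{print $1, $2, $NF}' | sed 's/[0-9]\{1,3\}\(\.[0-9]\{1,3\}\)\{3\}/<REDACTED_IP>/g' > error_summary.txt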

Scripting tips:

  • Use variables to store and reuse values
  • Use conditionals (if/else) to make decisions
  • Use loops (for/while) to repeat tasks
  • Use functions to reuse code
  • Use comments (#) to document code
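Those tips come together in even a very small script. Here is a minimal sketch (the log path, log format, and threshold are all assumptions) that counts failed SSH logins per source IP and flags the noisy ones:

#!/bin/bash
# Hypothetical helper: count failed SSH logins per source IP and flag noisy ones.
# Assumes an auth.log format where the source IP is the fourth-from-last field.

LOGFILE="/var/log/auth.log"   # variable storing the log path
THRESHOLD=10                  # variable storing the alert threshold

# Function so the reporting step can be reused
report() {
    echo "Suspicious IP: $1 ($2 failed logins)"
}

# Pipeline: grep filters, awk extracts the IP, sort and uniq count occurrences
grep -i 'failed password' "$LOGFILE" | awk '{print $(NF-3)}' | sort | uniq -c |
while read -r count ip; do
    # Conditional: only report IPs above the threshold
    if [ "$count" -gt "$THRESHOLD" ]; then
        report "$ip" "$count"
    fi
done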

Conclusion

The sed, awk, and grep commands, along with regex skills, are among the most powerful tools a cybersecurity professional can acquire to extract valuable insights from data and stay on top of potential security threats.

These utilities give you a solid foundation for log analysis, and with practice you will become an expert at using them.

About TCM Security

TCM Security is a veteran-owned, cybersecurity services and education company founded in Charlotte, NC. Our services division has the mission of protecting people, sensitive data, and systems. With decades of combined experience, thousands of hours of practice, and core values from our time in service, we use our skill set to secure your environment. The TCM Security Academy is an educational platform dedicated to providing affordable, top-notch cybersecurity training to our individual students and corporate clients including both self-paced and instructor-led online courses as well as custom training solutions. We also provide several vendor-agnostic, practical hands-on certification exams to ensure proven job-ready skills to prospective employers.

Pentest Services: https://tcm-sec.com/our-services/
Follow Us: Blog | LinkedIn | YouTube | Twitter | Facebook | Instagram
Contact Us: sales@tcm-sec.com
