Overview
Version control systems, such as Git, are essential tools in software development, enabling seamless collaboration and change tracking. However, their widespread use can sometimes lead to unintended security oversights. While Git excels in managing code changes and collaboration, it can also inadvertently harbor sensitive information, like API keys or passwords. Often, developers may unintentionally commit confidential data, assuming its safety in private repositories. This article delves into the intricacies of Git, explores how sensitive information can end up in logs, and offers techniques and tools to unearth and prevent such vulnerabilities.
What is Git?
Git, at its heart, serves as a distributed version control system that tracks changes in any set of files. While developers primarily use it to coordinate work on large software projects, it can unintentionally store sensitive information like API keys, passwords, and secret tokens.
So, how does this oversight occur? Engineers might mistakenly commit sensitive data, assuming a temporary need or deeming it safe within a private repository. At times, they might commit secrets with plans to remove them later. However, deleting a secret or updating a file only changes the latest snapshot. Every earlier version remains in the repository's history and can still be viewed through the Git log.
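To see this persistence in practice, here is a minimal sketch that builds a throwaway repository, commits a fake key, "removes" it in a second commit, and then looks for it again. The file name, key value, and commit identity are all made up for illustration.

```shell
#!/bin/sh
# Demo: a secret deleted in a later commit is still recoverable from history.
# File name, key value and identity below are invented for this example.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Helper so the demo does not depend on a configured Git identity.
commit() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c commit.gpgsign=false commit -q -m "$1"
}

# Commit 1: a config file containing a fake API key.
echo 'API_KEY=FAKE-KEY-12345' > config.env
git add config.env
commit 'add config'

# Commit 2: the key is "removed".
echo 'API_KEY=' > config.env
git add config.env
commit 'remove api key'

# Count matching lines in the current file and in the full patch history.
in_worktree=$(grep -c 'FAKE-KEY-12345' config.env)
in_history=$(git log -p | grep -c 'FAKE-KEY-12345')

echo "matches in working tree: $in_worktree"
echo "matches in history:      $in_history"
```

The working tree no longer contains the key, yet the patch output of git log still does, which is exactly what an attacker with read access to the repository would look for.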
Basic Log Access
To access the logs, use the git log command. This displays all commits along with their messages. Scanning for messages like ‘updated API key’ can quickly reveal sensitive data.
git log
Basic Log Access with grep
grep, a robust command-line utility, pairs effectively with git log to pinpoint sensitive details:
git log -p -i --grep=password
git log -p | grep -i -B2 -A2 'password'
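Here, -p shows the diff introduced by each commit, and -i makes the search case-insensitive. Note that --grep in the first command matches commit messages only. The second command pipes the full patch output through grep, so it also searches the changed file contents, printing two lines of context before and after each match (-B2 -A2).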
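Piping the patch output through grep finds the matching lines but loses track of which commit they belong to. As a sketch of one alternative, git log's -G option limits the log to commits whose diff adds or removes a line matching a regular expression, and --all extends the search to every branch and tag. The function name commits_touching is a hypothetical helper, not part of Git.

```shell
# commits_touching PATTERN [REPO]
# Hypothetical helper: prints "<short-hash> <subject>" for every commit,
# on any ref, whose diff adds or removes a line matching PATTERN.
# -i makes the match case-insensitive; REPO defaults to the current directory.
commits_touching() {
    pattern=$1
    repo=${2:-.}
    git -C "$repo" log --all -i -G"$pattern" --format='%h %s'
}

# Example: commits_touching 'password' /path/to/repo
```

Each hash it prints can then be inspected with git show to see the exact change.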
Discovering Large Commits
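Large commits can sometimes indicate that a big file, e.g. a database dump or configuration file, has been added to the repository. The following command ranks commits by the number of inserted lines, largest first. Line counts ignore binary files, so this works best for text such as configuration files or SQL dumps.
git log --format='%h %s' --shortstat | awk '/^ [0-9]+ files? changed.*insertion/ { print $4, commit; next } !/^ / && NF { commit = $0 }' | sort -rn | head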
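Commit statistics count lines, so a large binary file can slip past them. A complementary sketch is to list the largest stored file versions (blobs) directly, including files that were later deleted. It relies on git rev-list --objects and git cat-file --batch-check. The function name largest_blobs is a hypothetical helper.

```shell
# largest_blobs [REPO] [N]
# Hypothetical helper: lists the N largest blobs reachable from any ref as
# "<size-in-bytes> <object-id> <path>", biggest first.
largest_blobs() {
    repo=${1:-.}
    count=${2:-10}
    git -C "$repo" rev-list --objects --all |
        git -C "$repo" cat-file \
            --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        # Keep blobs only and drop the leading "blob " (5 characters).
        awk '$1 == "blob" { print substr($0, 6) }' |
        sort -rn |
        head -n "$count"
}

# Example: largest_blobs /path/to/repo 5
```

Any suspicious entry can be dumped with git cat-file -p followed by its object ID.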
Want to Try Hands-On?
TryHackMe and HackTheBox both have some hands-on labs to try these techniques out for yourself. Start with the easy machines and work your way up from there.
Other Tools
While manual exploration is valuable, there are dedicated tools designed to trawl through Git repositories to find secrets.
- TruffleHog: Originally a Python tool (later versions were rewritten in Go), TruffleHog searches through Git repositories for secrets, digging deep into commit history and branches. It has used entropy calculations, alongside pattern matching, to find potentially sensitive strings.
- GitLeaks: Another powerful tool, GitLeaks provides a wide array of options to search for secrets and sensitive information in Git repositories.
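To give a feel for what an entropy check measures, here is a rough sketch that computes the Shannon entropy of a string in bits per character. Random-looking tokens tend to score higher than ordinary words. This is only the core idea, not the actual implementation of any of the tools above, and the function name shannon_entropy is made up for this example.

```shell
# shannon_entropy STRING
# Hypothetical helper: prints the Shannon entropy of STRING in bits per
# character, rounded to two decimals.
shannon_entropy() {
    printf '%s\n' "$1" | awk '
        {
            # Count how often each character occurs.
            len = length($0)
            n += len
            for (i = 1; i <= len; i++) freq[substr($0, i, 1)]++
        }
        END {
            h = 0
            for (c in freq) {
                p = freq[c] / n
                h -= p * log(p) / log(2)   # log() is natural log in awk
            }
            printf "%.2f\n", h
        }'
}

# Example: compare an ordinary word with a random-looking token.
# shannon_entropy 'password'
# shannon_entropy 'q8Zr2LxT7vKp0WmB'
```

A scanner built on this idea would flag strings above some threshold, which is why such tools can also produce false positives on hashes or minified code.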
Preventing Sensitive Data in Git Logs
There are a few ways to help prevent sensitive data from ending up in Git logs.
- Pre-Commit Hooks: Use Git hooks to prevent secrets from being committed. Tools like pre-commit can integrate checks before each commit is made.
- Secrets Management Solutions: Use tools like HashiCorp’s Vault or AWS Secrets Manager to manage sensitive information outside of codebases.
- Regular Audits: Regularly audit code repositories with dedicated tools to ensure no new sensitive data has been committed.
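As an illustration of the pre-commit idea, the sketch below could be saved as .git/hooks/pre-commit and made executable. It inspects only the lines being added in the staged changes and aborts the commit when one matches a pattern. The pattern is a deliberately small, hypothetical example, and the function names are made up. A dedicated scanner will catch far more.

```shell
#!/bin/sh
# Sketch of a pre-commit hook: refuse the commit when a staged, added line
# matches a (hypothetical, deliberately small) secret pattern.
secret_pattern='(password|passwd|secret|api[_-]?key|token)[[:space:]]*[:=]'

staged_secret_lines() {
    # -U0 drops context lines; keep added lines, skip "+++ b/file" headers.
    git diff --cached -U0 |
        grep -E '^\+' | grep -Ev '^\+\+\+ ' |
        grep -Ei "$secret_pattern"
}

check_staged() {
    if hits=$(staged_secret_lines); then
        echo 'Possible secret in staged changes:' >&2
        echo "$hits" >&2
        return 1
    fi
    return 0
}

# The hook's exit status decides whether Git proceeds with the commit.
check_staged
```

Git runs the hook before creating each commit, and a non-zero exit status stops the commit. Hooks live in the local .git directory and can be bypassed with git commit --no-verify, so they complement rather than replace regular audits.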
Conclusion
In the intricate world of software development, tools like Git play a pivotal role, streamlining processes and fostering collaboration. However, with the immense benefits come inherent risks. As developers, it’s paramount to remain vigilant and adopt proactive measures to safeguard sensitive information inadvertently stored in version control systems. By coupling Git’s powerful features with the best practices and tools discussed, we can maintain the integrity of our codebases and fortify them against potential breaches. In an era where data is gold, ensuring its security is not just good practice—it’s a necessity.
About the Author: Alex Olsen
Alex is a Web Application Security specialist with experience working across multiple sectors, from single-developer applications all the way up to enterprise web apps with tens of millions of users. He enjoys building applications almost as much as breaking them and has spent many years supporting the shift-left movement by teaching developers, infrastructure engineers, architects, and anyone who would listen about cybersecurity. He created many of the web hacking courses in TCM Security Academy, as well as the PJWT and PWPT certifications.
Alex holds a Master’s Degree in Computing, as well as the PNPT, CEH, and OSCP certifications.
About TCM Security
TCM Security is a veteran-owned, cybersecurity services and education company founded in Charlotte, NC. Our services division has the mission of protecting people, sensitive data, and systems. With decades of combined experience, thousands of hours of practice, and core values from our time in service, we use our skill set to secure your environment. The TCM Security Academy is an educational platform dedicated to providing affordable, top-notch cybersecurity training to our individual students and corporate clients including both self-paced and instructor-led online courses as well as custom training solutions. We also provide several vendor-agnostic, practical hands-on certification exams to ensure proven job-ready skills to prospective employers.
Pentest Services: https://tcm-sec.com/our-services/