Tech News

Public GitLab repositories exposed more than 17,000 secrets


After scanning all 5.6 million public repositories on GitLab Cloud, a security engineer discovered more than 17,000 exposed secrets across over 2,800 unique domains.

Luke Marshall used the TruffleHog open-source tool to check the code in the repositories for sensitive credentials like API keys, passwords, and tokens.

The researcher previously scanned Bitbucket, where he found 6,212 secrets spread over 2.6 million repositories. He also checked the Common Crawl dataset, which is used to train AI models, and found 12,000 valid secrets in it.

GitLab is a web-based Git platform that software developers, maintainers, and DevOps teams use for code hosting, CI/CD operations, development collaboration, and repository management.

Marshall enumerated every public GitLab Cloud repository through a public GitLab API endpoint, using a custom Python script to paginate through all results, sorted by project ID.
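The article does not include Marshall's script, so the following is only a minimal sketch of how such an enumeration could work. It assumes the GitLab REST `projects` endpoint with the `visibility`, `order_by`, `sort`, `per_page`, and `id_after` query parameters. The helper names `fetch_page` and `iter_public_projects` are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator

# Assumption: the public projects endpoint of GitLab's REST API (v4).
API_URL = "https://gitlab.com/api/v4/projects"


def fetch_page(id_after: int, per_page: int = 100) -> list[dict]:
    """Fetch one page of public projects whose ID is greater than id_after."""
    params = urllib.parse.urlencode({
        "visibility": "public",
        "order_by": "id",
        "sort": "asc",
        "per_page": per_page,
        "id_after": id_after,
    })
    with urllib.request.urlopen(f"{API_URL}?{params}", timeout=30) as resp:
        return json.load(resp)


def iter_public_projects(
    fetch: Callable[[int], list[dict]] = fetch_page,
    start_after: int = 0,
) -> Iterator[dict]:
    """Yield projects in ascending ID order until an empty page is returned."""
    last_id = start_after
    while True:
        page = fetch(last_id)
        if not page:
            return
        for project in page:
            yield project
        # Resume from the highest ID seen so far, so no project is
        # skipped or repeated between pages.
        last_id = page[-1]["id"]
```

Walking the list by ascending project ID, rather than by page number, keeps each request cheap even millions of entries into the listing and makes the run easy to resume from the last ID processed.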

This process returned 5.6 million non-duplicate repositories, and their names were sent to an AWS Simple Queue Service (SQS) queue.

Next, an AWS Lambda function pulled the repository name from SQS, ran TruffleHog against it, and logged the results.
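A Lambda worker of this kind might look roughly like the sketch below. This is not Marshall's code. It assumes that each SQS message body holds a clonable repository URL, that the `trufflehog` binary is available in the Lambda environment, and that `trufflehog git <url> --json` prints one JSON object per finding. The names `run_trufflehog`, `scan_records`, and `handler` are hypothetical.

```python
import json
import subprocess
from typing import Callable


def run_trufflehog(repo_url: str) -> str:
    """Run a TruffleHog git scan and return its raw stdout."""
    result = subprocess.run(
        ["trufflehog", "git", repo_url, "--json"],
        capture_output=True,
        text=True,
        timeout=600,   # assumption: stay well under the Lambda time limit
        check=False,   # a failed scan should not crash the whole batch
    )
    return result.stdout


def scan_records(event: dict, run: Callable[[str], str] = run_trufflehog) -> list[dict]:
    """Scan every repository named in an SQS-triggered Lambda event."""
    findings = []
    for record in event.get("Records", []):
        repo_url = record["body"].strip()
        for line in run(repo_url).splitlines():
            line = line.strip()
            if not line.startswith("{"):
                continue  # skip anything that is not a JSON object
            try:
                findings.append({"repo": repo_url, "finding": json.loads(line)})
            except json.JSONDecodeError:
                continue
    return findings


def handler(event, context):
    """Lambda entry point: log each finding as one JSON line."""
    findings = scan_records(event)
    for item in findings:
        print(json.dumps(item))  # stdout from Lambda ends up in CloudWatch Logs
    return {"findings": len(findings)}
```

Passing the scanner in as the `run` argument keeps the queue handling testable without TruffleHog installed.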

“Each Lambda invocation executed a simple TruffleHog scan command with concurrency set to 1000,” Marshall explains.

“This setup allowed me to complete the scan of 5,600,000 repositories in just over 24 hours.”

The total cost of scanning all public GitLab Cloud repositories with this method was $770.
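To put the reported figures in perspective, here is a back-of-the-envelope calculation. It rounds the runtime down to exactly 24 hours and assumes the quoted concurrency of 1000 means 1000 scans running in parallel, so the results are rough estimates rather than numbers from Marshall's report.

```python
TOTAL_REPOS = 5_600_000   # repositories scanned, per the article
ELAPSED_HOURS = 24        # "just over 24 hours", rounded down
TOTAL_COST_USD = 770      # reported total cost
CONCURRENCY = 1000        # assumption: number of parallel scans


def repos_per_second(total_repos: int, hours: float) -> float:
    """Average overall throughput of the scan."""
    return total_repos / (hours * 3600)


def cost_per_repo(total_cost: float, total_repos: int) -> float:
    """Average cost in USD of scanning one repository."""
    return total_cost / total_repos


def seconds_per_repo_per_worker(total_repos: int, hours: float, workers: int) -> float:
    """Average wall-clock time one worker spends on one repository."""
    return workers * hours * 3600 / total_repos


if __name__ == "__main__":
    print(f"{repos_per_second(TOTAL_REPOS, ELAPSED_HOURS):.1f} repos/second")
    print(f"${cost_per_repo(TOTAL_COST_USD, TOTAL_REPOS):.6f} per repository")
    print(f"{seconds_per_repo_per_worker(TOTAL_REPOS, ELAPSED_HOURS, CONCURRENCY):.1f} s per repo per worker")
```

Under these assumptions, the scan averaged about 65 repositories per second at a cost of roughly $0.00014 per repository, with each parallel worker spending about 15 seconds per repository.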
