Show HN: Pgit – A Git-like CLI backed by PostgreSQL

Why This Matters

Pgit stores repositories in PostgreSQL instead of on the filesystem, which gives full SQL access to commit history and file changes. In the author's benchmark, its delta compression produced smaller output than git gc --aggressive on 12 of 20 repositories. The author also reports that an AI agent, given a single prompt, used pgit to produce a codebase health report in under 10 minutes. The practical result is version history that can be queried and analyzed like any other database.

Key Takeaways

TL;DR: Built a Git-like CLI backed by PostgreSQL with automatic delta compression. Import any git repo, query its entire history with SQL. Benchmarked on 20 real repositories (273,703 commits): pgit outcompresses git gc --aggressive on 12 out of 20 repositories, while giving you full SQL access to every commit, file version, and change pattern. Then I gave an AI agent a single prompt and it produced a full codebase health report on Neon's own repo in under 10 minutes.

What is pgit?

pgit is a Git-like version control CLI where everything lives in PostgreSQL instead of the filesystem. You get the familiar workflow (init, add, commit, push, pull, diff, blame), but your repository is a database. And that means your entire commit history is queryable.

    pgit init
    pgit import /path/to/your/repo --branch main
    pgit analyze coupling

    file_a              file_b                  commits_together
    ──────────────────  ──────────────────────  ────────────────
    src/parser.rs       src/lexer.rs            127
    src/db/schema.go    src/db/migrations.go    84
    README.md           CHANGELOG.md            63

No scripts. No parsing git log output. No piping things through awk. Just answers.

The most common analyses are built in as single commands: churn, coupling, hotspots, authors, activity, and bus-factor. All support --json for programmatic consumption and --raw for piping, and all display results in an interactive table with search and clipboard copy.

But everything is PostgreSQL underneath. When the built-in analyses aren't enough, drop down to raw SQL:

The coupling analysis above, as raw SQL:

    SELECT pa.path, pb.path, COUNT(*) AS times_together
    FROM pgit_file_refs a
    JOIN pgit_paths pa ON pa.path_id = a.path_id
    JOIN pgit_file_refs b ON a.commit_id = b.commit_id AND a.path_id < b.path_id
    JOIN pgit_paths pb ON pb.path_id = b.path_id
    GROUP BY pa.path, pb.path
    ORDER BY times_together DESC;

This finds every pair of files changed in the same commit, counts co-occurrences, and ranks by frequency. The a.path_id < b.path_id condition avoids counting the same pair twice. pgit analyze coupling optimizes this further: it computes pairs in memory and filters out bulk reformats (commits touching 100+ files) that produce noise, not signal.
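To make the in-memory variant concrete, here is a minimal Python sketch of pair counting with a bulk-commit filter. It is not pgit's actual implementation: the function name count_coupling, the input shape (a mapping from commit id to changed paths), and the exact cutoff handling are assumptions for illustration. Sorting paths before pairing plays the same role as the a.path_id < b.path_id condition in the SQL, although it orders by path string rather than by path id.

```python
from collections import Counter
from itertools import combinations

# Assumed cutoff, based on the "100+ files" figure in the text.
BULK_COMMIT_THRESHOLD = 100


def count_coupling(commits, bulk_threshold=BULK_COMMIT_THRESHOLD):
    """Count how often each pair of paths changes in the same commit.

    `commits` maps a commit id to an iterable of the paths it changed
    (a hypothetical input shape, not pgit's internal representation).
    """
    pair_counts = Counter()
    for paths in commits.values():
        # Deduplicate and sort so each pair has one canonical order,
        # which prevents counting (a, b) and (b, a) separately.
        unique_paths = sorted(set(paths))
        # Skip bulk commits (mass reformats, renames): they pair every
        # file with every other file and drown out real coupling.
        if len(unique_paths) >= bulk_threshold:
            continue
        pair_counts.update(combinations(unique_paths, 2))
    return pair_counts


if __name__ == "__main__":
    history = {
        "c1": ["src/lexer.rs", "src/parser.rs"],
        "c2": ["src/parser.rs", "src/lexer.rs", "README.md"],
        "c3": ["README.md", "CHANGELOG.md"],
        "c4": [f"src/file_{i}.rs" for i in range(150)],  # bulk commit, skipped
    }
    for (file_a, file_b), n in count_coupling(history).most_common(3):
        print(file_a, file_b, n)
```

Filtering before pairing matters for cost as well as signal: a commit touching n files yields n(n-1)/2 pairs, so a single 150-file reformat would otherwise add 11,175 pairs to the counter.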

Want to know your maintenance hotspots? That's pgit analyze churn. Or as SQL:

... continue reading