pg_lake: Postgres for Iceberg and Data lakes
pg_lake integrates Iceberg and data lake files into Postgres. With the pg_lake extensions, you can use Postgres as a stand-alone lakehouse system that supports transactions and fast queries on Iceberg tables, and can directly work with raw data files in object stores like S3.
At a high level, pg_lake lets you:
Create and modify Iceberg tables directly from PostgreSQL with full transactional guarantees, and query them from other engines
Query and import data from Parquet, CSV, JSON, and Iceberg files stored in S3 or other compatible object stores
Export query results back to S3 in Parquet, CSV, or JSON formats using COPY commands
Read geospatial formats supported by GDAL, such as GeoJSON and Shapefiles
Use compression transparently with .gz and .zst
Use the built-in map type for semi-structured or key–value data
Combine heap, Iceberg, and external Parquet/CSV/JSON files in the same SQL queries and modifications, all with full transactional guarantees and no SQL limitations
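As a rough illustration of the capabilities listed above, the following sketch creates an Iceberg table, exports it to Parquet with COPY, and joins it against external Parquet files. It assumes the pg_lake extensions are installed and S3 credentials are configured. The bucket `my-bucket`, the table names, and the columns are hypothetical, and the `USING iceberg` and `SERVER pg_lake` forms are assumed to match the project's documentation, so check them against your installed version.

```sql
-- Hypothetical names throughout: my-bucket, events, events_ext.

-- Create an Iceberg table managed by Postgres and load some rows.
CREATE TABLE events (
    id         bigint,
    kind       text,
    created_at timestamptz
) USING iceberg;

INSERT INTO events
SELECT i, 'kind_' || (i % 3), now()
FROM generate_series(1, 100) AS i;

-- Export query results to a Parquet file in object storage.
COPY (SELECT * FROM events WHERE kind = 'kind_0')
TO 's3://my-bucket/exports/events_kind_0.parquet';

-- Expose external Parquet files as a table; the empty column list
-- assumes the columns are inferred from the files.
CREATE FOREIGN TABLE events_ext ()
SERVER pg_lake
OPTIONS (path 's3://my-bucket/exports/*.parquet');

-- Combine the Iceberg table and the external files in one query.
SELECT e.kind, count(*) AS matches
FROM events AS e
JOIN events_ext AS x USING (id)
GROUP BY e.kind;
```

Because the Iceberg table is an ordinary table from Postgres's point of view, the INSERT and any later UPDATE or DELETE can run inside a regular transaction block alongside changes to heap tables.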