Iceberg Table Maintenance
Over time, Iceberg tables accumulate small data files, stale snapshots, orphaned files, and fragmented manifests — all of which slow down query planning and waste storage. SeaweedFS keeps the tables behind its built-in Iceberg REST catalog healthy with an automated maintenance worker, so you don’t need a separate compaction service or external orchestration. Any engine that queries through the catalog — Spark, Trino, Dremio, DuckDB, and more — benefits automatically.
What It Does
The maintenance worker runs five operations, applied in this order:
- Compact data files: Merges small Parquet data files within a partition into larger ones, grouped by partition spec and partition key. Fewer, larger files mean faster scans and less per-file overhead. Compaction bin-packs by default, or can re-sort rows by the table’s sort order for tighter data clustering.
- Rewrite delete files: Consolidates Iceberg position delete files so reads spend less time reconciling merge-on-read deletes.
- Expire snapshots: Removes old table snapshots and cleans up their manifest-list files, always keeping the newest snapshots (5 by default, set by max_snapshots_to_keep) regardless of age.
- Remove orphan files: Collects every file still referenced by a live snapshot and deletes unreferenced leftovers from previous writes and interrupted commits.
- Rewrite manifests: Consolidates many small manifest files into fewer, larger ones to cut query-planning overhead.
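To make the default bin-packing behaviour of the compaction step concrete, here is a minimal planning sketch. It is illustrative only and is not SeaweedFS's implementation: the `DataFile` type, the `plan_compaction` function, and the first-fit-decreasing heuristic are assumptions made for the example. Only the parameter names and defaults (`target_file_size_mb`, `min_input_files`) come from the settings described on this page.

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass(frozen=True)
class DataFile:
    """Hypothetical description of one Parquet data file."""
    path: str
    partition: str      # e.g. "day=2024-05-01"
    size_bytes: int


def plan_compaction(files, target_file_size_mb=256, min_input_files=5):
    """Return {partition: [group, ...]}, where each group is a list of
    small files whose combined size fits within the target file size."""
    target = target_file_size_mb * MB

    # Only files smaller than the target are merge candidates.
    small_by_partition = {}
    for f in files:
        if f.size_bytes < target:
            small_by_partition.setdefault(f.partition, []).append(f)

    plan = {}
    for partition, small in small_by_partition.items():
        # Skip partitions that have not accumulated enough small files.
        if len(small) < min_input_files:
            continue

        # First-fit decreasing: place each file, largest first, into the
        # first bin that still has room; open a new bin otherwise.
        bins = []  # each entry is [total_bytes, [files]]
        for f in sorted(small, key=lambda x: x.size_bytes, reverse=True):
            for b in bins:
                if b[0] + f.size_bytes <= target:
                    b[0] += f.size_bytes
                    b[1].append(f)
                    break
            else:
                bins.append([f.size_bytes, [f]])

        # Rewriting a single file on its own gains nothing, so drop such bins.
        plan[partition] = [b[1] for b in bins if len(b[1]) > 1]
    return plan
```

Each returned group would become one rewritten output file. A sort-based strategy would additionally reorder rows inside each group by the table's sort order, which this sketch does not attempt.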
How It Works
- Automatic detection: A detection scan runs periodically (hourly by default) across your table buckets and flags tables that have crossed configurable thresholds, such as too many small files, snapshots, or manifests. The admin server then schedules the maintenance jobs.
- Runs on workers, not query engines: Maintenance executes on dedicated worker nodes, so it never competes with your query workloads.
- Catalog-native and concurrency-safe: Each operation commits a new snapshot through the catalog’s normal metadata path and bails out if the table head moved during planning — so maintenance stays consistent with concurrent reads and writes.
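The "bail out if the table head moved" behaviour is a form of optimistic concurrency control. The sketch below shows the general pattern with a toy in-memory catalog. `InMemoryCatalog`, `CommitConflict`, and `run_maintenance` are hypothetical names invented for this example and do not reflect the SeaweedFS or Iceberg REST APIs.

```python
import threading


class CommitConflict(Exception):
    """Raised when the table head changed between planning and commit."""


class InMemoryCatalog:
    """Hypothetical stand-in for a catalog tracking one table's head snapshot."""

    def __init__(self, snapshot_id=1):
        self._lock = threading.Lock()
        self._snapshot_id = snapshot_id

    def current_snapshot_id(self):
        with self._lock:
            return self._snapshot_id

    def commit(self, expected_snapshot_id, new_snapshot_id):
        # Compare-and-swap: only advance the head if nobody else moved it.
        with self._lock:
            if self._snapshot_id != expected_snapshot_id:
                raise CommitConflict(
                    f"expected head {expected_snapshot_id}, "
                    f"found {self._snapshot_id}"
                )
            self._snapshot_id = new_snapshot_id


def run_maintenance(catalog, plan_fn):
    """Plan against the current head, then commit only if the head is unchanged.

    plan_fn takes the base snapshot id and returns the new snapshot id.
    Returns True on success, False if the job bailed out.
    """
    base = catalog.current_snapshot_id()
    new_snapshot_id = plan_fn(base)       # potentially slow planning work
    try:
        catalog.commit(base, new_snapshot_id)
    except CommitConflict:
        return False                      # a writer got there first; skip this round
    return True
```

Under this pattern a bailed-out job leaves the table untouched, so a later detection scan can simply pick the table up again.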
Configuration
Maintenance is threshold-driven. The most common settings (defaults shown):
| Setting | Default | Purpose |
|---|---|---|
| operations | all | Which operations to run (compact, rewrite_position_delete_files, expire_snapshots, remove_orphans, rewrite_manifests) |
| target_file_size_mb | 256 | Target size for compacted data files; smaller files are merge candidates |
| min_input_files | 5 | Minimum small files in a partition before compaction runs |
| rewrite_strategy | binpack | binpack, or sort to cluster rows by the table’s sort order |
| snapshot_retention_hours | 168 | Age (7 days) past which snapshots may be expired |
| max_snapshots_to_keep | 5 | Newest snapshots always kept, regardless of age |
| orphan_older_than_hours | 72 | Minimum age before an unreferenced file is treated as an orphan |
| min_manifests_to_rewrite | 5 | Minimum manifests before they are consolidated |
Delete-file compaction has its own thresholds (delete_target_file_size_mb, delete_min_input_files, and related knobs) for grouping and sizing rewritten delete files.
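The two snapshot settings interact: age makes a snapshot eligible for expiry, while the keep-count protects the newest snapshots no matter how old they are. The sketch below is one plausible reading of that rule, not the worker's actual code. The `snapshots_to_expire` function and the `(snapshot_id, committed_at)` tuple shape are hypothetical, and it assumes that snapshots younger than the retention window are kept even when more than `max_snapshots_to_keep` exist.

```python
from datetime import datetime, timedelta, timezone


def snapshots_to_expire(snapshots, now,
                        snapshot_retention_hours=168,
                        max_snapshots_to_keep=5):
    """Return ids of snapshots that are eligible for expiry, newest first.

    snapshots: iterable of (snapshot_id, committed_at) tuples, where
    committed_at is a timezone-aware datetime.
    """
    cutoff = now - timedelta(hours=snapshot_retention_hours)
    newest_first = sorted(snapshots, key=lambda s: s[1], reverse=True)

    # The newest N snapshots are always protected, regardless of age.
    protected = {sid for sid, _ in newest_first[:max_snapshots_to_keep]}

    # Everything else expires only once it is older than the retention window.
    return [sid for sid, committed_at in newest_first
            if committed_at < cutoff and sid not in protected]


if __name__ == "__main__":
    now = datetime(2024, 6, 30, tzinfo=timezone.utc)
    # Ten daily snapshots: id 1 is one day old, id 10 is ten days old.
    daily = [(i, now - timedelta(days=i)) for i in range(1, 11)]
    print(snapshots_to_expire(daily, now))
```

With the defaults, a table that is written rarely still keeps its five most recent snapshots even if all of them are older than seven days.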
Engine Compatibility
Maintenance works with any engine that reads Iceberg tables through the SeaweedFS Iceberg REST catalog, including Apache Spark, Trino, Dremio, and DuckDB.
For the full configuration reference and the latest options, see the Iceberg Table Maintenance wiki.