Iceberg Table Maintenance

Over time, Iceberg tables accumulate small data files, stale snapshots, orphaned files, and fragmented manifests — all of which slow down query planning and waste storage. SeaweedFS keeps the tables behind its built-in Iceberg REST catalog healthy with an automated maintenance worker, so you don’t need a separate compaction service or external orchestration. Any engine that queries through the catalog — Spark, Trino, Dremio, DuckDB, and more — benefits automatically.

What It Does

The maintenance worker runs five operations, applied in this order:

  • Compact data files: Merges small Parquet data files within a partition into larger ones, grouped by partition spec and partition key. Fewer, larger files mean faster scans and less per-file overhead. Compaction bin-packs by default, or can re-sort rows by the table’s sort order for tighter data clustering.
  • Rewrite delete files: Consolidates Iceberg position delete files so reads spend less time reconciling merge-on-read deletes.
  • Expire snapshots: Removes table snapshots older than the retention window (7 days by default) and cleans up their manifest-list files, always keeping the newest few (5 by default) regardless of age.
  • Remove orphan files: Collects every file still referenced by a live snapshot, then deletes unreferenced leftovers from previous writes and interrupted commits once they are older than a safety window (72 hours by default).
  • Rewrite manifests: Consolidates many small manifest files into fewer, larger ones to cut query-planning overhead.
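
As a rough illustration of the bin-packing step, the Python sketch below groups small files per partition and packs them into bins of roughly the target size. It is not the SeaweedFS implementation: `DataFile`, `plan_binpack`, and the single-string partition key are hypothetical names chosen for this example, and the defaults only mirror the `target_file_size_mb` and `min_input_files` settings listed under Configuration.

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass(frozen=True)
class DataFile:
    path: str
    partition: str  # hypothetical stand-in for (partition spec, partition key)
    size_bytes: int


def plan_binpack(files, target_file_size_mb=256, min_input_files=5):
    """Return {partition: [groups of files to merge]} using greedy next-fit packing."""
    target = target_file_size_mb * MB

    # Only files smaller than the target are merge candidates.
    small_by_partition = {}
    for f in files:
        if f.size_bytes < target:
            small_by_partition.setdefault(f.partition, []).append(f)

    plan = {}
    for partition, small in small_by_partition.items():
        # Skip partitions that have not accumulated enough small files yet.
        if len(small) < min_input_files:
            continue
        bins, current, current_size = [], [], 0
        for f in sorted(small, key=lambda f: f.size_bytes, reverse=True):
            # Close the current bin when adding this file would exceed the target.
            if current and current_size + f.size_bytes > target:
                bins.append(current)
                current, current_size = [], 0
            current.append(f)
            current_size += f.size_bytes
        if current:
            bins.append(current)
        # Rewriting a single file on its own gains nothing, so drop such bins.
        groups = [b for b in bins if len(b) > 1]
        if groups:
            plan[partition] = groups
    return plan


# Example: six 100 MB files and one large file in one partition, two small files in another.
example_files = (
    [DataFile(f"a/{i}.parquet", "day=2024-01-01", 100 * MB) for i in range(6)]
    + [DataFile("a/big.parquet", "day=2024-01-01", 300 * MB)]
    + [DataFile(f"b/{i}.parquet", "day=2024-01-02", 10 * MB) for i in range(2)]
)
example_plan = plan_binpack(example_files)
```

In this example only the first partition qualifies: its six small files are packed two per group, the 300 MB file is left alone, and the second partition stays below the `min_input_files` threshold.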

How It Works

  • Automatic detection: A detection scan runs periodically (hourly by default) across your table buckets and flags tables that have crossed configurable thresholds, such as too many small files, too many snapshots, or too many manifests. The admin server then schedules maintenance jobs for the flagged tables.
  • Runs on workers, not query engines: Maintenance executes on dedicated worker nodes, so it never competes with your query workloads.
  • Catalog-native and concurrency-safe: Each operation commits a new snapshot through the catalog’s normal metadata path and aborts without committing if the table head moved during planning, so maintenance stays consistent with concurrent reads and writes.
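
The "abort if the table head moved" behavior is a form of optimistic concurrency control. The Python sketch below shows the general pattern only. `ToyCatalog` and `run_maintenance` are hypothetical in-memory stand-ins written for this example, not the SeaweedFS or Iceberg REST catalog API.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """In-memory stand-in for a catalog (hypothetical, for illustration only)."""
    heads: dict = field(default_factory=dict)  # table name -> current snapshot id

    def current_snapshot(self, table):
        return self.heads[table]

    def commit(self, table, expected_head, new_snapshot):
        # Compare-and-swap: only advance the head if nobody else moved it.
        if self.heads[table] != expected_head:
            return False
        self.heads[table] = new_snapshot
        return True


def run_maintenance(catalog, table, plan_fn):
    """Plan against the current head, then commit only if the head is unchanged."""
    base = catalog.current_snapshot(table)
    new_snapshot = plan_fn(base)  # planning may take a while
    return catalog.commit(table, base, new_snapshot)


catalog = ToyCatalog({"events": 1})

# No concurrent writer: the maintenance commit succeeds.
ok = run_maintenance(catalog, "events", lambda base: base + 1)


def racing_plan(base):
    # Simulate a writer committing snapshot 99 while maintenance is still planning.
    catalog.heads["events"] = 99
    return base + 1


# The head moved during planning, so the maintenance job gives up without committing.
raced = run_maintenance(catalog, "events", racing_plan)
```

The design choice is that maintenance never overwrites a concurrent writer's commit. A job that loses the race discards its plan and can be picked up again by a later detection scan.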

Configuration

Maintenance is threshold-driven. The most common settings (defaults shown):

| Setting | Default | Purpose |
| --- | --- | --- |
| operations | all | Which operations to run (compact, rewrite_position_delete_files, expire_snapshots, remove_orphans, rewrite_manifests) |
| target_file_size_mb | 256 | Target size for compacted data files; smaller files are merge candidates |
| min_input_files | 5 | Minimum small files in a partition before compaction runs |
| rewrite_strategy | binpack | binpack, or sort to cluster rows by the table’s sort order |
| snapshot_retention_hours | 168 | Age (7 days) past which snapshots may be expired |
| max_snapshots_to_keep | 5 | Newest snapshots always kept, regardless of age |
| orphan_older_than_hours | 72 | Minimum age before an unreferenced file is treated as an orphan |
| min_manifests_to_rewrite | 5 | Minimum manifests before they are consolidated |
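
To make the interaction between `snapshot_retention_hours` and `max_snapshots_to_keep` concrete, here is a small Python sketch of one plausible reading of the two settings: a snapshot is expired only if it is older than the retention window and is not among the newest N. The function name and the `(snapshot_id, committed_at)` tuple shape are assumptions for this example, not SeaweedFS code.

```python
from datetime import datetime, timedelta, timezone


def snapshots_to_expire(snapshots, now,
                        snapshot_retention_hours=168,
                        max_snapshots_to_keep=5):
    """snapshots: list of (snapshot_id, committed_at) tuples; returns ids to expire."""
    cutoff = now - timedelta(hours=snapshot_retention_hours)
    newest_first = sorted(snapshots, key=lambda s: s[1], reverse=True)
    # The newest N snapshots are protected regardless of their age.
    protected = {sid for sid, _ in newest_first[:max_snapshots_to_keep]}
    return [sid for sid, committed_at in newest_first
            if sid not in protected and committed_at < cutoff]


now = datetime(2024, 1, 31, tzinfo=timezone.utc)

# Ten snapshots, committed 1 to 10 days ago (id 1 is the newest).
history = [(i, now - timedelta(days=i)) for i in range(1, 11)]
expired = snapshots_to_expire(history, now)

# Three snapshots that are all 30 days old: still kept, because they are the newest three.
stale_but_few = [(i, now - timedelta(days=30)) for i in range(1, 4)]
kept_all = snapshots_to_expire(stale_but_few, now)
```

Under these assumptions, a rarely written table never loses its last few snapshots, while a busy table is trimmed back to the retention window.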

Delete-file compaction has its own thresholds (delete_target_file_size_mb, delete_min_input_files, and related knobs) for grouping and sizing rewritten delete files.

Engine Compatibility

Works with any engine that reads Iceberg tables through the SeaweedFS Iceberg REST catalog, including Apache Spark, Trino, Dremio, and DuckDB.


For the full configuration reference and the latest options, see the Iceberg Table Maintenance wiki.