Automatic EC Vacuum — Internals, API & Limits

Technical reference for Automatic EC Vacuum — the detection and compaction phases, safety guarantees, the full configuration table, worker flags, Helm values, health checks, and metrics to plan around.

The worker collects all EC shards for a target volume into its local -workingDir, vacuums the data locally, and then distributes the compacted shards back to volume servers. Volume servers are spared the CPU and disk work of compaction; they only serve controlled shard transfers and receive the replacement shards.

What Can It Do?

  • Detect Deleted Data: Identifies EC volumes where deleted needles are consuming significant space (configurable threshold).
  • Compact Volumes: Removes deleted needles from EC shards, reclaiming storage space across the cluster.
  • Offload Heavy Work: Performs shard compaction in the worker’s local working directory instead of on volume servers.
  • Improve Efficiency: Reduces unnecessary parity shard overhead by eliminating deleted data before the shard is fully utilized.
  • Reduce Volume Server Load: Keeps volume servers focused on normal serving work while workers handle vacuum CPU, temporary disk, and coordination.
  • Rack-Aware Optimization: Maintains shard distribution across failure domains during compaction.

Why Do You Need It?

In erasure-coded systems, when files are deleted, the space they occupied in the EC shards remains allocated. Over time, this creates “storage waste”:

Scenario        Deleted Ratio   Problem                                                      EC Vacuum Solution
Fresh volume    5%              Minimal waste, no action needed                              No vacuum triggered
Aging dataset   30%             30% of shard space wasted on deleted data                    Detects and triggers vacuum
Old archive     50%             Half the volume is deleted, still consumes parity overhead   Compacts to reclaim 50% space

For example, take a 100 GB EC 10+4 volume with 40% deleted data:

  • Without vacuum: Still consumes 100 GB of data plus parity (140 GB total at a 10+4 ratio)
  • After vacuum: Compacted to ~60 GB of data plus parity (~84 GB total), a 40% space saving
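
The arithmetic in this example can be reproduced with a short Python sketch. The helper name ec_footprint_gb is illustrative and not part of SeaweedFS, and the calculation assumes parity scales linearly with data size, ignoring per-shard padding and index overhead.

```python
def ec_footprint_gb(logical_gb: float, data_shards: int, parity_shards: int) -> float:
    """Raw cluster space used by an EC volume: data plus parity."""
    return logical_gb * (data_shards + parity_shards) / data_shards


volume_gb = 100
deleted_ratio = 0.40

before = ec_footprint_gb(volume_gb, 10, 4)          # data + parity before vacuum
live_gb = volume_gb * (1 - deleted_ratio)           # data that survives compaction
after = ec_footprint_gb(live_gb, 10, 4)             # data + parity after vacuum
savings = 1 - after / before                        # fraction of raw space reclaimed

print(f"before={before:.0f} GB after={after:.0f} GB savings={savings:.0%}")
```

Running it prints before=140 GB after=84 GB savings=40%, matching the figures above. Because the parity multiplier is the same before and after, the saving equals the deleted ratio.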

How Does It Work?

The EC Vacuum worker runs as a scheduled background task on your cluster. When a volume crosses the deleted-data threshold, the worker pulls the full EC shard set into local storage, compacts it, verifies the result, and places the compacted shards back onto the cluster.

              EC Volume Vacuum: worker-local compaction

   Volume Servers                                      Volume Servers
  holding old shards                                  receiving shards

  +-----------+        1. stream EC shards             +-----------+
  | volume A  | -----\                                /| volume D  |
  +-----------+      \                              /  +-----------+
  +-----------+       \                            /   +-----------+
  | volume B  | --------> +--------------------+ ----> | volume E  |
  +-----------+          | EC Vacuum Worker    |       +-----------+
  +-----------+       /  | - collect shards    |   \   +-----------+
  | volume C  | -----/   | - vacuum locally    |    \> | volume F  |
  +-----------+          | - verify output     |       +-----------+
                         | - distribute shards |
                         +--------------------+
                             local -workingDir

       2. remove deleted needles locally
       3. rebuild compacted EC shards
       4. distribute compacted shards back with rack-aware placement

Detection Phase

The worker periodically scans EC volumes in the cluster:

  1. Analyzes each EC volume’s shard composition
  2. Calculates the ratio of deleted needles vs. total data
  3. Compares against the configured threshold (default: 30%)
  4. Only processes volumes whose deleted ratio meets or exceeds the threshold
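
The selection logic of the detection phase can be pictured with the sketch below. It is not the worker's implementation: EcVolumeStats and select_for_vacuum are hypothetical names, the ratio is assumed to be measured in bytes, the comparison uses >= to follow the configuration table, and sorting the most wasteful volumes first is a design choice of this example only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EcVolumeStats:
    """Hypothetical per-volume summary; field names are illustrative."""
    volume_id: int
    total_bytes: int
    deleted_bytes: int


def deleted_ratio(v: EcVolumeStats) -> float:
    # Guard against empty volumes to avoid division by zero.
    return v.deleted_bytes / v.total_bytes if v.total_bytes else 0.0


def select_for_vacuum(volumes: List[EcVolumeStats],
                      threshold: float = 0.30,
                      max_jobs: int = 100) -> List[EcVolumeStats]:
    """Keep volumes at or above the threshold, worst first, capped per cycle."""
    candidates = [v for v in volumes if deleted_ratio(v) >= threshold]
    candidates.sort(key=deleted_ratio, reverse=True)
    return candidates[:max_jobs]


volumes = [
    EcVolumeStats(volume_id=1, total_bytes=100, deleted_bytes=5),
    EcVolumeStats(volume_id=2, total_bytes=100, deleted_bytes=30),
    EcVolumeStats(volume_id=3, total_bytes=100, deleted_bytes=50),
]
print([v.volume_id for v in select_for_vacuum(volumes)])
```

With the three sample volumes, only volumes 3 and 2 are selected; volume 1 at 5% stays below the default threshold.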

Compaction Phase

When a volume triggers vacuum:

  1. Analyze: Determines which shards can be safely compacted
  2. Collect: Streams all required EC shards from volume servers into the worker’s local -workingDir
  3. Vacuum Locally: Removes deleted needles and rebuilds compacted shard files on the worker
  4. Verify: Validates that the compacted shards preserve the live (non-deleted) data of the originals
  5. Distribute: Places compacted shards back across the cluster using rack-aware placement
  6. Cleanup: Removes the old, uncompacted shards and temporary data
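
The "Vacuum Locally" and "Verify" steps can be illustrated with a toy in-memory model. This is only a sketch under simplifying assumptions: real EC shards are binary files, not Python lists, and the Needle class, the vacuum function, and the SHA-256 comparison are all hypothetical stand-ins rather than the worker's actual verification method.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Needle:
    """Toy stand-in for a stored object; not the on-disk needle format."""
    needle_id: int
    data: bytes
    deleted: bool = False


def vacuum(needles: List[Needle]) -> List[Needle]:
    """Drop deleted needles, keeping live ones in their original order."""
    return [n for n in needles if not n.deleted]


def live_digest(needles: List[Needle]) -> str:
    """Hash the live needles in id order so two sets can be compared."""
    h = hashlib.sha256()
    live = sorted((n for n in needles if not n.deleted), key=lambda n: n.needle_id)
    for n in live:
        h.update(n.needle_id.to_bytes(8, "big"))
        h.update(len(n.data).to_bytes(8, "big"))
        h.update(n.data)
    return h.hexdigest()


def vacuum_and_verify(original: List[Needle]) -> List[Needle]:
    """Compact, then refuse the result unless the live data is unchanged."""
    compacted = vacuum(original)
    if live_digest(compacted) != live_digest(original):
        raise ValueError("compacted output does not match the live data")
    return compacted


original = [Needle(1, b"alpha"), Needle(2, b"beta", deleted=True), Needle(3, b"gamma")]
compacted = vacuum_and_verify(original)
print([n.needle_id for n in compacted])
```

The point of the model is the ordering: the originals stay untouched until the compacted copy has passed the check, which mirrors the sequence of steps 3 through 6.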

Technology Advantages

  • No in-place volume-server vacuum: Volume servers are not asked to compact EC shards on their own disks. They serve shard data to the worker and receive compacted replacements.
  • Resource isolation: CPU, temporary disk I/O, and shard rewrite work are concentrated on worker nodes that operators can size independently.
  • Predictable cluster impact: Global and per-worker concurrency limits control how many volumes are vacuumed at once.
  • Network-efficient placement: The worker can distribute compacted shards back to appropriate target nodes while preserving rack-aware EC placement.
  • Operational simplicity: Adding more worker capacity increases vacuum throughput without changing the volume server role.

Safety Guarantees

  • Only proceeds when sufficient healthy shards exist (>= data shard count)
  • Validates data integrity before replacing original shards
  • Keeps original shards available until compacted replacements are verified and distributed
  • Shard distribution follows the same rack-aware rules as EC repair
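
The first guarantee is a simple precondition. A minimal sketch, assuming the worker knows how many shards of the volume are currently healthy (the function name can_vacuum is hypothetical):

```python
def can_vacuum(healthy_shards: int, data_shards: int) -> bool:
    """A volume is only eligible when enough shards exist to read all data."""
    if data_shards <= 0:
        raise ValueError("data_shards must be positive")
    return healthy_shards >= data_shards


# For a 10+4 layout, up to 4 of the 14 shards may be missing or unhealthy.
print(can_vacuum(healthy_shards=11, data_shards=10))
print(can_vacuum(healthy_shards=9, data_shards=10))
```

The first call reports an eligible volume and the second does not; a volume in the second state would first need its missing shards restored.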

Configuration

EC Vacuum runs as a plugin worker with configurable thresholds:

Setting                   Default   Description
Detection interval        30 min    How often to scan for high-deletion volumes
Detection timeout         10 min    Maximum time for a detection scan
Min interval              300 s     Minimum seconds between detection runs
Deleted ratio threshold   0.30      Trigger vacuum if >= 30% of volume is deleted
Max jobs per cycle        100       Maximum vacuum jobs per detection cycle
Global concurrency        4         Total concurrent vacuum jobs across cluster
Per-worker concurrency    1         Concurrent vacuum jobs per worker node

You can also filter vacuums by collection to focus on specific data sets.
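
For capacity planning, the two concurrency settings can be combined into a rough upper bound on simultaneous vacuum jobs. The sketch below assumes the limits interact as a simple minimum, which is an assumption of this example and not a documented formula; the function name is hypothetical.

```python
def effective_vacuum_concurrency(workers: int,
                                 per_worker: int = 1,
                                 global_limit: int = 4) -> int:
    """Upper bound on simultaneous vacuum jobs under both limits."""
    if workers < 0 or per_worker < 0 or global_limit < 0:
        raise ValueError("limits must be non-negative")
    return min(global_limit, workers * per_worker)


# With the defaults, two workers run at most 2 jobs at once,
# while eight workers are capped at 4 by the global limit.
print(effective_vacuum_concurrency(workers=2))
print(effective_vacuum_concurrency(workers=8))
```

Under this reading, adding workers beyond the global limit divided by the per-worker limit does not raise vacuum throughput unless the global limit is raised as well.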

Deployment

EC Vacuum runs as a plugin worker process, integrated with your EC infrastructure. Start one or more weed worker processes that connect to the admin server.

Starting a Worker

EC Vacuum is automatically included when the erasure_coding handler is enabled:

# Start a worker for EC tasks (includes EC encoding, EC repair, and EC vacuum)
weed worker -admin=admin.example.com:23646 -jobType=erasure_coding \
  -workingDir=/var/lib/seaweedfs-plugin -maxExecute=2

# Start a worker handling all heavy tasks
weed worker -admin=admin.example.com:23646 -jobType=heavy \
  -workingDir=/var/lib/seaweedfs-plugin -maxExecute=4

# Start a worker handling all available task types
weed worker -admin=admin.example.com:23646 -jobType=all \
  -workingDir=/var/lib/seaweedfs-plugin

Key Options

Flag           Description
-admin         Admin server gRPC address (required)
-jobType       Task types: erasure_coding (or ec), heavy, all
-workingDir    Directory for collected input shards and compacted output during vacuums
-maxExecute    Max concurrent job executions per worker (default: 4)
-metricsPort   Prometheus metrics port for monitoring
-id            Stable worker ID across restarts; auto-generated if omitted

Production Recommendations

  • Run at least 2 worker instances for high availability
  • Allocate sufficient -workingDir disk space for the largest concurrent EC vacuum job, including collected input shards and compacted output
  • Set -metricsPort to monitor vacuum progress and troubleshoot issues
  • Consider grouping EC workers by region/data center to minimize cross-network transfers
  • Vacuum works alongside EC repair without coordination
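
To size -workingDir, a rough worst-case estimate can help. The sketch below assumes the worker holds every collected shard (data plus parity) and the full compacted output on disk at the same time; that is a conservative planning assumption, not a documented formula, and the function name is hypothetical.

```python
def working_dir_estimate_gb(volume_gb: float,
                            deleted_ratio: float,
                            data_shards: int = 10,
                            parity_shards: int = 4,
                            concurrent_jobs: int = 1) -> float:
    """Worst-case scratch space: collected input shards plus compacted output."""
    total_shards = data_shards + parity_shards
    input_gb = volume_gb * total_shards / data_shards
    output_gb = volume_gb * (1 - deleted_ratio) * total_shards / data_shards
    return concurrent_jobs * (input_gb + output_gb)


# A 100 GB 10+4 volume with 40% deleted data, one job at a time.
print(f"{working_dir_estimate_gb(100, 0.40):.0f} GB")
```

For the sample volume this prints 224 GB (140 GB of input shards plus 84 GB of compacted output). Multiply by the number of jobs a worker may run at once, and leave headroom on top.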

Kubernetes

Workers are supported in the SeaweedFS Helm chart:

worker:
  enabled: true
  replicas: 2
  jobType: "heavy"
  maxExecute: 2
  workingDir: "/var/lib/seaweedfs-plugin"
  metricsPort: 9327

Health Checks

Workers expose HTTP endpoints when -metricsPort is set:

  • /health — always returns 200
  • /ready — returns 200 only when connected to admin
  • /metrics — Prometheus metrics
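
A deployment script might poll /ready before sending work to a new worker. The following is a minimal sketch using only the Python standard library; the function names, the retry parameters, and the example address are assumptions of this example. The opener parameter exists so the check can be exercised without a running worker.

```python
import time
import urllib.request


def is_ready(base_url: str, timeout: float = 2.0,
             opener=urllib.request.urlopen) -> bool:
    """Return True only if GET <base_url>/ready answers with HTTP 200."""
    try:
        with opener(f"{base_url}/ready", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses; this covers
        # connection failures, timeouts, and non-2xx responses.
        return False


def wait_until_ready(base_url: str, attempts: int = 10, delay_s: float = 3.0,
                     opener=urllib.request.urlopen) -> bool:
    """Poll /ready a fixed number of times, sleeping between attempts."""
    for attempt in range(attempts):
        if is_ready(base_url, opener=opener):
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False


# Example (hypothetical host; port matches the Helm metricsPort above):
# wait_until_ready("http://worker-0.example.com:9327")
```

Since /health always returns 200, it is suited to liveness checks, while /ready is the endpoint that reflects the admin connection.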

Monitoring and Observability

EC Vacuum integrates with SeaweedFS observability:

Metric                      Purpose
vacuum_jobs_detected        Number of volumes requiring vacuum in the last detection cycle
vacuum_jobs_executed        Number of successful compactions
vacuum_jobs_failed          Number of failed compactions
vacuum_bytes_reclaimed      Total storage space freed across all compactions
vacuum_shard_rebuild_time   Time taken to rebuild shards during compaction
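
As one way to use these metrics, the sketch below derives a failure ratio from scraped /metrics text. It assumes the metrics appear under exactly the names listed above, with no prefix and no trailing timestamps; the exposed names and labels may differ in practice, so treat the parser and the sample text as illustrative.

```python
from typing import Dict


def parse_samples(text: str) -> Dict[str, float]:
    """Sum sample values per metric name from Prometheus text output."""
    totals: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name_part, _, value = line.rpartition(" ")
        name = name_part.split("{", 1)[0].strip()  # drop any label set
        totals[name] = totals.get(name, 0.0) + float(value)
    return totals


def failure_ratio(totals: Dict[str, float]) -> float:
    """Failed compactions as a share of all finished compactions."""
    ok = totals.get("vacuum_jobs_executed", 0.0)
    failed = totals.get("vacuum_jobs_failed", 0.0)
    finished = ok + failed
    return failed / finished if finished else 0.0


# Hand-written sample, not real worker output.
sample = """\
# TYPE vacuum_jobs_executed counter
vacuum_jobs_executed 18
vacuum_jobs_failed 2
"""
print(f"{failure_ratio(parse_samples(sample)):.0%}")
```

For the sample text this prints 10%. In a real deployment the same ratio would more commonly be expressed as an alerting rule in the monitoring system.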

How EC Vacuum Complements Other Enterprise Features

  • With EC Repair: EC repair restores fault tolerance; EC vacuum improves efficiency
  • With Self-Healing: Self-healing detects shard issues; EC vacuum handles scheduled compaction
  • With Custom EC Ratios: Higher-density ratios like 20+4 benefit most from vacuum’s space reclamation

Requirements & Limits

  • Requires a valid SeaweedFS Enterprise license; runs as a plugin worker connected to the admin server.
  • Only proceeds when sufficient healthy shards exist (>= data shard count).
  • Vacuum is only triggered for volumes at or above the deleted-ratio threshold (default 0.30).
  • Global and per-worker concurrency limits cap how many volumes are vacuumed at once (defaults: 4 global, 1 per worker).