Data Recovery — Internals, API & Limits

Technical reference for Data Recovery — how retention, the metadata log, and restore work, plus the full HTTP API and the limits to plan around.

How it works

1. Delayed garbage collection (retention)

Normally, when a file is deleted, SeaweedFS marks its needles as tombstones and reclaims the space on a later volume vacuum. With Data Recovery enabled, the master enforces a retention window: tombstones younger than the window are preserved through vacuum, so the underlying bytes remain readable and recoverable.

Enable it by setting a non-zero retention on the master:

weed master -deletionRetention=72h
# or, when running the combined server:
weed server -master.deletionRetention=72h

Notes:

  • The default is 0, which disables Data Recovery.
  • The value must be non-negative and is not hot-reloadable — restart the master to change it.
  • The auto-installed default 25 TB trial license caps retention at 1 hour; install a full license to use longer windows.

2. Metadata log analysis

The Admin server subscribes to the filer’s metadata change log and replays events within the retention window on demand. Each deletion is classified as either:

  • A normal deletion: the original file entry was removed.
  • An S3 delete marker: a versioned object was hidden behind a delete marker under <bucket>/<object>/.versions/.

The scan is stateless and paginated. Each request reads a fresh window from the log, so the Admin server does not maintain a separate recovery catalog.

3. Restore

For a normal deletion, SeaweedFS rereads the retained chunks from the volumes, including data still marked deleted, uploads fresh copies, and recreates the file entry at the chosen path. The restore path preserves inline data, chunk manifests, and server-side encryption metadata. For an S3 delete marker, SeaweedFS removes the marker so the previous version becomes visible again.

Options at restore time:

  • Target path: restore in place or to a new location.
  • Conflict mode: fail (leave an existing target untouched), overwrite, or rename (append a timestamp suffix).
  • Parent recovery: recreate missing parent directories, restoring their original metadata where the log still has it.

Object versioning, delete markers, and version-specific deletes

Data Recovery enhances SeaweedFS’s existing protections. Object Lock, versioning, and governance retention already guard your data: versioning keeps prior versions behind delete markers, and Object Lock can hold objects immutable for a defined retention period. Data Recovery adds a recovery layer for the deletes those controls don’t stop.

In a versioned bucket, an ordinary delete only inserts a delete marker that hides the object, and clearing it brings the previous version back. The harder case is a version-specific delete — a DELETE that names a versionId, a lifecycle rule that expires noncurrent versions, a privileged user bypassing governance retention, a script that empties a bucket version by version, or an attacker who sweeps every version of every object. Those operations permanently remove the underlying object, and versioning alone cannot undo them — as AWS puts it, “you cannot recover permanently removed objects.” Because Data Recovery retains deleted bytes at the storage layer within the retention window, it can recover a specific version that was permanently deleted, not just clear a delete marker.

HTTP API

The Admin UI operations are also available over HTTP for automation:

Endpoint Method Purpose
/api/undelete/settings GET Whether recovery is enabled, and the retention window
/api/undelete/search GET List recoverable deletions (filterable, streamable)
/api/undelete/event/{eventId} GET Details for a single deletion
/api/undelete/event/{eventId}/restore POST Restore one file
/api/undelete/restore-batch POST Restore many files at once

Requirements & limits

  • Requires a valid SeaweedFS Enterprise license and -deletionRetention greater than 0.
  • Only deletions inside the retention window are recoverable; older events age out of the log and their data is vacuumed away.
  • Objects with a TTL may not be recoverable. TTL expiry runs on its own schedule, independent of the deletion-retention window, so a TTL object’s data can be reclaimed once its TTL elapses — even inside the recovery window.
  • fs.log.purge will not prune metadata-log entries inside the retention window by default (use -force to override), preserving what Data Recovery depends on.
  • Data Recovery complements backups and disaster recovery. Keep independent backups for storage failure, cluster loss, regulatory retention, and security incidents that go beyond recoverable delete events.

License expiry: if your enterprise license expires, SeaweedFS falls back to the open-source behavior and Data Recovery is disabled. Data already retained on disk remains until the normal vacuum reclaims it.