Backup Failure - Staging Storage full

Steps to manually clear stale staged data for aborted/failed backup runs

Problem: 

In Zmanda, the staging area is sometimes not cleared after a backup run, so stale staged data remains and occupies storage. This happens after failed or aborted backup runs: the staged data accumulates and the space is never freed for subsequent backups. In normal operation, the staging area is cleared after a successful backup run so that it is ready for the next one.

Impact: 

  • Staging area disk space utilization continuously increases. 
  • Subsequent backup runs fail due to the lack of available space in the staging area. 
  • Failed or aborted backup runs do not trigger automatic clearing of the staging area. 
  • Affects versions 4.0 through 5.1

Cause: 

  1. A backup run to cloud storage was aborted during the writing stage. The staged data then remains unchanged in the staging path (usually /var/lib/amanda/staging/<backup-set-name>).
  2. Multiple aborted or failed backups can eventually fill the staging area completely, making it unusable for future backup runs. To prevent this, the stale staging data must be cleared after such runs so that the staging area is ready for the next backup.
  3. Flushing this data does not work: the backup run is treated as a completed task, so it is not resumed. Flushing manually from the UI also fails, because manual flush is not implemented for cloud backups.
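
To confirm that stale staged data is what fills the disk, the staging path can be inspected directly. The following is a minimal sketch, not part of Zmanda: the helper name fs_used_percent and the STAGING_DIR variable are illustrative, and the default path is the usual location mentioned above, which may differ on your server.

```shell
#!/bin/sh
# Print the used percentage of the filesystem that contains the given path.
# df -P forces the POSIX one-line-per-filesystem format; column 5 is capacity.
fs_used_percent() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Assumed default; override with STAGING_DIR=/your/path if it differs.
STAGING_DIR="${STAGING_DIR:-/var/lib/amanda/staging}"

if [ -d "$STAGING_DIR" ]; then
    echo "Filesystem holding $STAGING_DIR is $(fs_used_percent "$STAGING_DIR")% full"
    # Usage per backup set in KiB, largest last.
    du -sk "$STAGING_DIR"/* 2>/dev/null | sort -n
else
    echo "Staging directory $STAGING_DIR not found; set STAGING_DIR to the correct path"
fi
```

A backup set directory that keeps its size after an aborted run is a candidate for the cleanup described in the Solution section.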

Solution: 

This procedure walks through deleting stale staging data for a Zmanda backup set, step by step. 

Step 1: List the data that needs to be deleted 

Note: Start as the root user so that you can switch to the amandabackup user. 

  • Switch to the amandabackup user – su amandabackup 
  • List the holding disk (staging) data of your backup set – amadmin <backup-set-name> holding list 
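
The two commands above can also be combined into one call from a root shell. This is a rough sketch under assumptions: holding_list_cmd and the RUN switch are illustrative helpers, not Zmanda features, and resfl3 is only the example backup set name used later in this article.

```shell
#!/bin/sh
# Build the "holding list" command for a backup set.
holding_list_cmd() {
    printf 'amadmin %s holding list\n' "$1"
}

# Example backup set name; replace with your own.
BACKUP_SET="${BACKUP_SET:-resfl3}"
CMD=$(holding_list_cmd "$BACKUP_SET")
echo "Would run as amandabackup: $CMD"

# Execute only when RUN=1 is set and amadmin is installed (run this as root).
if [ "${RUN:-0}" = "1" ] && command -v amadmin >/dev/null 2>&1; then
    su amandabackup -c "$CMD"
fi
```

Without RUN=1 the script only prints the command, which makes it safe to check the backup set name first.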

Step 2: Delete stale staging contents 

  • Execute this command to delete the staging data: 
    amadmin <backup-set-name> holding delete <host> <disk> <datestamp> 
    <backup-set-name>: Replace this with the name of your specific backup set. 
    <host>: Specify the host for which you want to clear the staging area. 
    <disk>: Optional parameter. If provided, it specifies the disk associated with the staging area. 
    <datestamp>: Optional parameter. If provided, it specifies a specific datestamp associated with the staged contents. 
  • With <disk> and <datestamp> supplied, the above command deletes the staging data of one specific backup run. 
  • To delete all staging data for a host, omit the optional parameters: 
    amadmin <backup-set-name> holding delete <host> 
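
Because the deletion is permanent, it can help to print the exact command before running it. The sketch below is one possible dry-run wrapper, under stated assumptions: holding_delete_cmd, the RUN switch, and the variable names are hypothetical, and it assumes the arguments contain no spaces. A datestamp is only appended when a disk is also given, since the arguments are positional.

```shell
#!/bin/sh
# Build a "holding delete" command. Arguments: backup set, host, [disk], [datestamp].
holding_delete_cmd() {
    cmd="amadmin $1 holding delete $2"
    if [ -n "${3:-}" ]; then
        cmd="$cmd $3"
        # The datestamp is positional, so it is only valid after a disk.
        if [ -n "${4:-}" ]; then
            cmd="$cmd $4"
        fi
    fi
    printf '%s\n' "$cmd"
}

# Placeholder values; replace with your backup set, host, disk and datestamp.
BACKUP_SET="${BACKUP_SET:-resfl3}"
HOST_NAME="${HOST_NAME:-localhost}"
CMD=$(holding_delete_cmd "$BACKUP_SET" "$HOST_NAME" "${DISK:-}" "${DATESTAMP:-}")
echo "Would run as amandabackup: $CMD"

# Execute only when RUN=1 is set and amadmin is installed (run this as root).
if [ "${RUN:-0}" = "1" ] && command -v amadmin >/dev/null 2>&1; then
    su amandabackup -c "$CMD"
fi
```

Running it once without RUN=1 and comparing the printed command against the output of the holding list step reduces the risk of deleting the wrong staged data.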

Example:  

  • Backup set name - resfl3 
  • Host - localhost 
  • Disk - /home/phanish.sn/flushbug 
  • Command - amadmin resfl3 holding delete localhost fl3

More Information: 

  • amadmin is an administrative tool in Amanda that supports various backup management tasks. 
  • The holding delete subcommand deletes the contents of the staging area. 
  • Caution: The holding delete command permanently removes data from the staging area. Verify the backup set, host, disk, and datestamp before running it to avoid unintended data loss.