How to Use the Tape Span Script

This script can be used directly within Zmanda Pro to copy storage vault contents from disk to tape. It automatically handles switching tapes if the data is too large for a single tape.

Use Case

Automatic archival of a specified path to one or more tapes within a tape library. If the specified path is too large for one tape, the program automatically exchanges tapes and continues writing to the next tape until all data is written.

For more information on how to use this script within Zmanda Pro, see this KB article.

High-Level Details

  • The tool is called using a single command so that it can be used easily within Zmanda Pro pre- and post-job commands
  • The exchange of tapes happens automatically, without any user input
  • The tool works with a variety of tape libraries (uses standard tar, mt, and mtx commands)
  • The tool informs the user which tapes were used for the job
  • The data is recoverable using standard Linux commands (tar)
  • IMPORTANT: It is the user's responsibility to keep track of which tapes have data and when they can be overwritten. It is also the user's responsibility to manage retention policies and keep track of when jobs were performed.

Dependencies

  • python3 (Developed using 3.4.10)
  • tar (Developed using 1.23)
  • mt-st (Developed using 1.1)
  • mtx (Developed using 1.3.12)

Example Usage

Download: tape_span.tar.gz (SHA-256 checksum: b3ca3a9e26963800504877ffd6394aa6c0e73dae02d86f26575ba08192224968)
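Before extracting the archive, it can be worth confirming that the download matches the published checksum. The following is a minimal sketch, not part of the tool itself; the function names sha256_of_file and matches_published_checksum are illustrative, and it assumes the archive was saved locally as tape_span.tar.gz.

```python
import hashlib

# Checksum published alongside the download link above.
EXPECTED_SHA256 = "b3ca3a9e26963800504877ffd6394aa6c0e73dae02d86f26575ba08192224968"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_SHA256):
    """True if the file's digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

Calling matches_published_checksum("tape_span.tar.gz") should return True for an intact download. The same check can be done on the command line with sha256sum.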

Generic Usage Example:
  • python3 multi_tape_backup.py <volume/tape size in MB> <path to source> <tape library device> <tape drive device> <drive number>
Specific Usage Example:
  • python3 multi_tape_backup.py 1500000 /home /dev/sg2 /dev/st0 0
Definition of Parameters:
  • Volume/tape size in MB -> The capacity of each tape. Example: LTO-5 = 1.5TB = 1500000 MB
    • Used to estimate whether the library holds enough tapes, so use the native (uncompressed) capacity of each tape rather than the compressed capacity
  • path to source -> The path to the file or folder that you wish to archive
  • tape library device -> Example: /dev/sg2
    • Find this using lsscsi -g
  • tape drive device -> Example: /dev/st0
    • Find this using lsscsi -g
  • drive number -> The Data Transfer Element number that corresponds to the tape drive device above
    • Find this using mtx -f <tape library device> status
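To illustrate how the volume size parameter feeds the tape estimate, here is a minimal sketch of the arithmetic. It is an assumption about the general approach, not the tool's actual calculate_tapes_needed function; the names source_size_bytes and estimate_tapes_needed are hypothetical, and decimal megabytes are assumed to match the "1.5TB = 1500000 MB" convention above.

```python
import os


def source_size_bytes(path):
    """Total size of a single file, or of all regular files under a directory."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full_path = os.path.join(root, name)
            if not os.path.islink(full_path):  # skip symlinks to avoid double counting
                total += os.path.getsize(full_path)
    return total


def estimate_tapes_needed(size_bytes, volume_size_mb):
    """Round up: a partially filled tape still occupies a whole tape."""
    volume_bytes = volume_size_mb * 1000 * 1000  # decimal MB (assumption)
    tapes = -(-size_bytes // volume_bytes)  # integer ceiling division
    return max(1, tapes)
```

For example, 4 TB of source data with LTO-5 tapes (1500000 MB) gives an estimate of 3 tapes. The result is only an estimate: archive overhead and the real capacity of each tape can change how many tapes are actually consumed.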

Example Output

[root@mhvtl tape]# python3 multi_tape_backup.py 10000 /mnt/NAS/5gb1/ /dev/sg12 /dev/st6 2

INFO: The first available tapes will be used for this job and any current data will be OVERWRITTEN. The job will proceed in 15 seconds

1 tapes will be used for this job.

Here are the labels of the tapes that will be used:

Storage Element 1, Label: G03001TA

Loading media from Storage Element 1 into drive 2...done

Executing tar command: tar -cvMf /dev/st6 /mnt/NAS/5gb1/ --new-volume-script=./change_tape.sh

tar: Removing leading `/' from member names

/mnt/NAS/5gb1/

/mnt/NAS/5gb1/Largefile.txt

Unloading drive 2 into Storage Element 1...done

Loading media from Storage Element 2 into drive 2...done

Unloading drive 2 into Storage Element 2...done

Loading media from Storage Element 3 into drive 2...done

Unloading drive 2 into Storage Element 3...done

Loading media from Storage Element 4 into drive 2...done

Unloading drive 2 into Storage Element 4...done

Loading media from Storage Element 5 into drive 2...done

Unloading drive 2 into Storage Element 5...done

Loading media from Storage Element 6 into drive 2...done

Unloading drive 2 into Storage Element 6...done

Loading media from Storage Element 7 into drive 2...done

Unloading drive 2 into Storage Element 7...done

Loading media from Storage Element 8 into drive 2...done

Unloading drive 2 into Storage Element 8...done

Loading media from Storage Element 9 into drive 2...done

Unloading drive 2 into Storage Element 9...done

Loading media from Storage Element 10 into drive 2...done

Unloading drive 2 into Storage Element 10...done

Loading media from Storage Element 11 into drive 2...done

/mnt/NAS/5gb1/zmc_restored_2023-12-02_01-59-45.Backupset19.default.admin.281463.log

/mnt/NAS/5gb1/zmc_restored_2023-11-24_12-57-18.minio.default.admin.327640.log

/mnt/NAS/5gb1/zmc_restored_2023-11-24_10-34-45.minio.default.admin.273056.log

Backup completed successfully.

Unloading drive 2 into Storage Element 11...done

Tapes used:

Storage Element StorageElement, Label: Label

Storage Element 1, Label: G03001TA

Storage Element 2, Label: G03002TA

Storage Element 3, Label: G03003TA

Storage Element 4, Label: G03004TA

Storage Element 5, Label: G03005TA

Storage Element 6, Label: G03006TA

Storage Element 7, Label: G03007TA

Storage Element 8, Label: G03008TA

Storage Element 9, Label: G03009TA

Storage Element 10, Label: G03010TA

Storage Element 11, Label: G03011TA

Recovery Instructions (manual process)

Prerequisites

  • The tapes with the data to be restored must be in the library (if using a library to do the restore)
  • You can use any drive (even a standalone one) as long as its LTO generation matches that of the tapes
Steps:
  1. Put the first tape in the desired drive (mtx load command if using a library)
  2. Run tar -xvMf <tape drive> -C <optional output directory>
  3. Note: The "-C" flag and output directory are optional. If they are omitted, data is restored to the current working directory
  4. When prompted, unload the current tape and load the next tape. THIS MUST BE DONE IN THE SAME ORDER THAT THE TAPES WERE USED. Pay attention to the labels. You can find the order from the output of the backup job (See example output above).
  5. The restore otherwise completes on its own; your only task is to switch the tapes (if using a tape library, this requires two terminals open on the same system, one for tar and one for the tape exchange)
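When a library is used for a manual restore, each tape exchange in step 4 comes down to one mtx unload followed by one mtx load, run from the second terminal. The sketch below only builds those command lines so they can be reviewed before being run; the helper names mtx_command, swap_commands, and format_for_shell are hypothetical, and the device and slot values are examples.

```python
import shlex


def mtx_command(action, library_device, slot_number, drive_number):
    """Build an mtx load/unload command as an argument list."""
    if action not in ("load", "unload"):
        raise ValueError("action must be 'load' or 'unload'")
    return ["mtx", "-f", library_device, action, str(slot_number), str(drive_number)]


def swap_commands(library_device, drive_number, current_slot, next_slot):
    """Commands to return the current tape to its slot and load the next one."""
    return [
        mtx_command("unload", library_device, current_slot, drive_number),
        mtx_command("load", library_device, next_slot, drive_number),
    ]


def format_for_shell(command):
    """Render an argument list as a copy-and-paste shell line."""
    return " ".join(shlex.quote(part) for part in command)
```

For instance, swap_commands("/dev/sg2", 0, 1, 2) yields the unload of slot 1 followed by the load of slot 2 for drive 0. The argument lists could also be passed to subprocess.run, but printing them first makes it easier to check the slot order against the labels from the backup output.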

Recovery Instructions (semi-automated)

This is a two-part process. First, you must run the restore.py script to generate helper files required to automate the exchange of tapes. Then, run the tar command given by the restore.py program to restore data automatically.

Prerequisites

  • The tapes with the data to be restored must be in the library AND they must be slotted in the same sequential order as they were written to during the backup job

Steps:

  1. Run python3 restore.py <tape library device> <tape drive device> <drive number> which will generate the following helper files:
    1. available_tapes.csv
    2. used_tapes.csv
    3. devices.txt
  2. Follow the remaining steps as described by the output of restore.py. The commands are created dynamically.
    Example Output:
    [root@mhvtl tape_span]# python3 restore.py /dev/sg11 /dev/st0 0
    Loading media from Storage Element 1 into drive 0...done

    INFO: Ready to restore. Tapes will be read sequentially as found in the library. Ensure the tapes are in the same sequence as they were written to for the backup, otherwise, data WILL be corrupted.

    INFO: Run the following command to restore data to the current working directory:
    tar -xvMf /dev/st0 --new-volume-script=./change_tape.sh

    INFO: If you wish to restore to a different directory, run the following commands:
    cp change_tape.sh change_tape.py /path/to/restore/directory
    mv available_tapes.csv used_tapes.csv devices.txt /path/to/restore/directory
    cd /path/to/restore/directory
    tar -xvMf /dev/st0 --new-volume-script=./change_tape.sh

    INFO: When the job completes, remember to unload the last tape, and run this command to remove the temporary files:
    rm -rf available_tapes.csv used_tapes.csv devices.txt
  3. IMPORTANT: Once the restore is complete, unload the last tape and replace it in its original slot. Run the given command to delete the helper files. If you do not do this, there is a chance that future backup/restore jobs using these scripts will fail.

Additional Information

Components

  • multi_tape_backup.py
    • Entry point for the program
    • Includes the Tape class
    • Includes the following functions:
      • get_available_tapes(tape_device)
      • validate_inputs(volume_size_mb, repo_path, tape_device, tape_drive)
      • calculate_tapes_needed(volume_size_mb, repo_path, tape_device, tape_drive, available_tapes)
      • is_tape_in_drive(tape_drive)
      • write_to_tapes(repo_path, tape_device, tape_drive, available_tapes, drive_number)
      • load_tape_into_drive(tape_device, slot_number, drive_number)
      • unload_last_tape(tape_device, drive_number)
    • Uses the following libraries:
      • os
      • subprocess
      • sys
      • math
      • time
      • csv
  • change_tape.py
    • Responsible for loading and unloading tapes for the tar command
    • Reads the files available_tapes.csv, used_tapes.csv, and devices.txt (the filenames are hardcoded)
    • Includes the following functions:
      • load_tape(devices_file, tapes_file)
      • unload_tape(tape_device, slot_number, drive_number)
      • load_tape_into_drive(tape_device, slot_number, drive_number)
    • Uses the following libraries:
      • sys
      • csv
      • subprocess
      • os
  • change_tape.sh
    • Shell script that simply runs change_tape.py with the appropriate parameters
  • README.md
    • Usage Instructions
  • TEMPORARILY CREATED FILES (DELETED AFTER PROGRAM COMPLETION)
    • available_tapes.csv
      • csv file of available tape labels and storage element numbers
    • used_tapes.csv
      • csv file of used tape labels and storage element numbers
    • devices.txt
      • text file containing the user-provided parameters
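As an illustration of how a list like available_tapes.csv could be derived, the sketch below parses the text of mtx status and keeps the slots that hold a labelled tape. This is not the tool's get_available_tapes implementation. The regular expression assumes the common "Storage Element N:Full :VolumeTag=LABEL" layout, which may vary between mtx versions and libraries, and the StorageElement/Label column names are a guess based on the header row visible in the example output.

```python
import csv
import io
import re

# Matches lines such as "      Storage Element 1:Full :VolumeTag=G03001TA".
# Import/export slots and empty slots are deliberately not matched.
SLOT_PATTERN = re.compile(
    r"^\s*Storage Element (\d+):Full\s*:VolumeTag\s*=\s*(\S+)", re.MULTILINE
)


def parse_full_slots(mtx_status_text):
    """Return [(slot_number, label), ...] for every full, labelled slot."""
    return [(int(slot), label) for slot, label in SLOT_PATTERN.findall(mtx_status_text)]


def slots_to_csv(slots):
    """Serialize the slot list as CSV text with an assumed header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["StorageElement", "Label"])
    writer.writerows(slots)
    return buffer.getvalue()
```

In practice the input text would come from running mtx -f <tape library device> status, for example through subprocess.check_output, and the CSV text would be written to a file.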

Explanation of Routine

  1. Take in inputs and validate
  2. Copy the inputs to devices.txt
  3. Check if a tape is already in the specified drive (if there is, prompt the user to remove it and re-run the program)
  4. Get the available tapes and record to available_tapes.csv
  5. Print a warning message saying this is a destructive script and will overwrite data currently present (wait 15 seconds)
  6. Calculate how many tapes are needed and print which ones will probably be used
  7. Write to tapes
    1. Load the first available tape
    2. Record the label and storage element number in used_tapes.csv (this is done for every new tape loaded into the drive)
    3. Begin the tar command (tar -cvMf <tape drive> <repo path> --new-volume-script=./change_tape.sh)
      1. Tar command calls change_tape.sh each time the end of the current tape is reached (this is what the --new-volume-script flag does)
        1. change_tape.py is called and switches the tapes, making sure to make the appropriate changes in available_tapes.csv and used_tapes.csv each time
    4. Tar command ends. Last tape is unloaded
  8. The tapes that are used are printed to the terminal (yes, this can differ from the predicted tapes in step 6)
  9. available_tapes.csv, used_tapes.csv, and devices.txt are deleted from the current directory
  10. Program Ends
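The bookkeeping in step 7 (moving a tape from the available list to the used list each time tar asks for a new volume) can be sketched as follows. This is a simplified assumption about what change_tape.py does, not its actual code: the function names are hypothetical, the CSV files are assumed to hold one "slot,label" row per tape with no header row, and the actual mtx unload/load calls are left out.

```python
import csv


def read_rows(path):
    """Read a CSV file into a list of rows (each row is a list of strings)."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle))


def write_rows(path, rows):
    """Overwrite a CSV file with the given rows."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)


def advance_tape(available_path, used_path):
    """Pick the next tape and update both lists.

    Returns (finished_tape, next_tape): the tape to unload and the tape to load.
    """
    available = read_rows(available_path)
    used = read_rows(used_path)
    if not used:
        raise RuntimeError("No tape recorded as currently loaded")
    if not available:
        raise RuntimeError("No tapes left in the available list")
    finished_tape = used[-1]         # most recently loaded tape
    next_tape = available.pop(0)     # first tape still available
    used.append(next_tape)
    write_rows(available_path, available)
    write_rows(used_path, used)
    return finished_tape, next_tape
```

A volume-change script built this way would unload finished_tape, load next_tape, and then exit with status 0 so that tar continues writing. GNU tar stops the archive operation if the script passed to --new-volume-script fails, which is why running out of tapes raises an error here instead of being ignored.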