Managing server storage: Resolving full capacity issues with logical partitioning

Have you ever encountered a situation where critical backup operations on your Zmanda Management Console (ZMC) grind to a halt? A common cause is the server's storage reaching full capacity, which can prevent you from performing essential tasks such as manually pruning old backups or starting new ones.

This article explains the cause of this issue and provides a step-by-step solution to free up space and get your ZMC back on track.

Understanding the issue:

When your server storage fills up entirely, it disrupts ZMC's operations in several ways:

  • Prevents backup operations: With no free space left, new backups cannot start, which risks data loss if a system failure or data corruption occurs.

  • Inability to prune backups: Expired backups cannot be pruned manually, so the storage stays cluttered.

  • Configuration issues: Retention periods and backup cycles can no longer be updated through ZMC.

  • Unresponsive ZMC: The ZMC interface may become unresponsive due to the underlying storage issue.

Cause: RabbitMQ queue overload

The root cause of this problem lies with RabbitMQ, a messaging system used by ZMC. When the storage becomes full, RabbitMQ gets blocked, preventing publishers from posting tasks onto its queues. This creates a bottleneck, hindering ZMC's functionality.
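
As a rough way to see how close the filesystem holding RabbitMQ's data is to full, the sketch below reads the available space from df. The function names fs_avail_kib and warn_if_low and the 50 MiB warning threshold are illustrative assumptions, not ZMC settings.

```shell
#!/bin/sh
# Print the available space (in KiB) on the filesystem that holds a path.
# df -P prints one POSIX-format line per filesystem; -k forces 1024-byte units.
fs_avail_kib() {
    df -Pk "$1" 2>/dev/null | awk 'NR == 2 { print $4 }'
}

# Hypothetical threshold for illustration only: warn below 50 MiB (51200 KiB) free.
warn_if_low() {
    avail=$(fs_avail_kib "$1")
    [ -n "$avail" ] || return 2    # df could not report on the path
    if [ "$avail" -lt 51200 ]; then
        echo "LOW: only ${avail} KiB free on the filesystem of $1"
        return 1
    fi
    echo "OK: ${avail} KiB free on the filesystem of $1"
}

# Example (assumed default data path):
#   warn_if_low /var/lib/amanda/rabbitmq
```

Running the check before and after applying the solution below shows whether RabbitMQ's directory still shares free space with the backup storage.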

Solution: Separating RabbitMQ's queue data

To resolve this issue, create a separate logical partition dedicated to RabbitMQ's queue data. This ensures that even if the main server storage fills up, RabbitMQ won't be affected, allowing ZMC to function normally.

Below are the step-by-step instructions to implement this solution:

  1. Creating a loop device:

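    • To begin, create a 100 MB file and attach it to a loop device so that the file can be used as a block device (a pseudo-device or logical device).

      • Execute the following commands:

        • mkdir -p /mnt/disk1

        • fallocate -l $((100*1024*1024)) /mnt/disk1/file1

        • losetup -f --show /mnt/disk1/file1

      • The last command prints the loop device it assigned. The remaining steps assume /dev/loop0; if a different device is printed, substitute it wherever /dev/loop0 appears.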

  2. Mapping the device using dmsetup:

    • Use dmsetup to map the loop device to a device-mapper target, which can then be formatted and mounted like any other block device.

      • The length in the table is given in 512-byte sectors, not bytes: 100 MB = 100 * 1024 * 2 = 204800 sectors.

      • Create a temporary directory on which to mount the new logical partition for the next step.

      • Execute the following commands:

        • dmsetup create my_device --table "0 204800 linear /dev/loop0 0"

        • mkfs.ext4 /dev/mapper/my_device

        • mkdir -p /mnt/my_device

        • mount -o rw /dev/mapper/my_device /mnt/my_device
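
The start and length fields of a device-mapper table are expressed in 512-byte sectors, which is where the value 204800 in step 2 comes from. A minimal sketch of that arithmetic follows; the helper names sectors_for_mib and linear_table are hypothetical, and /dev/loop0 is only the example device used in this article.

```shell
#!/bin/sh
# Convert a size in MiB to a count of 512-byte sectors,
# the unit dmsetup uses for the start and length fields of a table.
sectors_for_mib() {
    echo $(( $1 * 1024 * 1024 / 512 ))
}

# Build a "linear" table line that maps the whole backing device.
# $1 = backing device (for example the loop device from step 1)
# $2 = size of the backing file in MiB
linear_table() {
    echo "0 $(sectors_for_mib "$2") linear $1 0"
}

sectors_for_mib 100            # prints 204800
linear_table /dev/loop0 100    # prints: 0 204800 linear /dev/loop0 0
```

If a different file size is chosen, the same calculation gives the length to pass to dmsetup. On a live system the sector count of an attached device can also be read with blockdev --getsz.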

  3. Transferring RabbitMQ files:

    • Identify the directory where RabbitMQ stores its data, typically /var/lib/amanda/rabbitmq/.

    • Remove any existing files from the newly mounted device (for example, the lost+found directory created by mkfs.ext4), then copy all files from /var/lib/amanda/rabbitmq/ to it.

      • Execute the following commands:

        • rm -rf /mnt/my_device/*

        • cp -r /var/lib/amanda/rabbitmq/* /mnt/my_device/

  4. Mounting the logical device:

    • Unmount the logical device from its temporary mount point and mount it at /var/lib/amanda/rabbitmq/.

    • Because the main storage and the logical partition are now separate filesystems, backups cannot fill up the partition. Backup data can no longer consume the space reserved for RabbitMQ's database, so publishers remain free to post tasks onto the queues.

      • Execute the following commands:

        • umount /mnt/my_device

        • mount -o rw /dev/mapper/my_device /var/lib/amanda/rabbitmq/

  5. Adjusting permissions:

    • Grant the necessary permissions to the logical device for normal functioning by assigning ownership to the amandabackup user.

      • Execute the following command:

        • chown -R amandabackup:amandabackup /var/lib/amanda/rabbitmq

  6. Ensuring operational continuity:

    • With this setup, even if the storage is completely full, users can still manually prune backups, trigger auto-pruning jobs, and update the media page without any lag.

  7. Verifying success:

    • After completing the steps, run the df command to confirm that the logical partition is listed and has sufficient free space.
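
One way to script the check in step 7 is to ask df which mount point serves the RabbitMQ directory. If the logical partition is in place, the directory is its own mount point rather than part of the main filesystem. The sketch below uses the hypothetical helper name is_own_mount and assumes mount points without spaces in their names.

```shell
#!/bin/sh
# Succeed if the given directory is itself a mount point, judged by the
# "Mounted on" column (field 6) of POSIX-format df output.
is_own_mount() {
    dir=${1%/}                 # drop one trailing slash, if any
    [ -n "$dir" ] || dir=/     # "/" would otherwise become empty
    mounted_on=$(df -P "$dir" 2>/dev/null | awk 'NR == 2 { print $6 }')
    [ "$mounted_on" = "$dir" ]
}

# Example (assumed default path from this article):
#   is_own_mount /var/lib/amanda/rabbitmq && df -h /var/lib/amanda/rabbitmq
```

If the function fails for the RabbitMQ directory, the mount from step 4 is not active and the directory still shares space with the main storage.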


Removing the logical partition:

If you're uninstalling Zmanda Classic and want to remove this logical partition, you can use these steps to reintegrate the 100MB storage into the main server storage.

  1. Begin by unmounting the logical partition from /var/lib/amanda/rabbitmq.

    • Execute the following command:

      • umount /var/lib/amanda/rabbitmq

  2. Next, remove the device mapper associated with the logical partition.

    • Execute the following command:

      • dmsetup remove my_device

  3. Detach the loop device that backs the logical partition (/dev/loop0 here; substitute the device that was assigned during setup if it differs).

    • Execute the following command:

      • losetup -d /dev/loop0

  4. Finally, delete the 100MB file created earlier to serve as the pseudo-device.

    • Execute the following command:

      • rm /mnt/disk1/file1
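
The removal steps assume the backing file was attached as /dev/loop0. If losetup -f picked a different device, it can be looked up from the backing file with losetup -j. The sketch below parses that output; the helper name loop_dev_from_listing is hypothetical, and the sample line shows the output format assumed here.

```shell
#!/bin/sh
# Extract the device name from the first line of `losetup -j FILE` output,
# which is assumed to look like: /dev/loop3: [2049]:131 (/mnt/disk1/file1)
loop_dev_from_listing() {
    sed -n '1s/^\([^:]*\):.*/\1/p'
}

# Example with a captured sample line (no root needed):
echo '/dev/loop3: [2049]:131 (/mnt/disk1/file1)' | loop_dev_from_listing
# prints /dev/loop3

# On the real server (as root), the lookup and detach would look like:
#   dev=$(losetup -j /mnt/disk1/file1 | loop_dev_from_listing)
#   losetup -d "$dev"
```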

Note: Restarting services is not required after implementing this solution. Once the logical partition is mounted, Zmanda Classic will resume normal operations.

By following these steps, you can address the server storage full issue and ensure that ZMC functions smoothly, allowing you to manage your backups efficiently.