Questions about backups and transaction logging in a large company

We are moving from a distributed environment with 20+ mail servers to 2 large clustered mail servers. We will eventually have 4+ TB of mail files with circular transaction logging running on each. We currently use CommVault to back up our existing servers, but we are trying to formulate a realistic backup and recovery method for the centralized mail cluster.

  1. What is the best way to back up this data in a reasonable amount of time (ideally less than 24 hours)?

  2. The backup guy at my company says restores take too long using transaction logging backups. Is this true? Or are we doing something wrong with CommVault?

  3. Instead he wants to perform regular file level backups. The problem is, how do I reset my transaction logs? With circular logging, the server started failing when the logs got full (which I didn’t think was possible with circular logging). Is my only option to have scheduled reboots or use a proper backup for trans logs?

  4. We planned to use 400 GB LUNs on our SAN and use directory links (i.e. Mail\01, Mail\02, Mail\03, etc.). Then they would make snapshots of each LUN, mount the snapshot to the backup server, and back it up (not sure whether to tape or disk). When we tested this, the snap cache was filling up at a ridiculously quick rate. We are doing this on another Domino server and have no problem with the snapshot there. The only differences are clustering and Windows 2003 64-bit on the new server. Would either of these cause a snapshot to have this problem?

Bottom line: Is anyone backing up multiple TBs of clustered, transaction-logged mail? If so, what backup strategy do you use that is reliable and efficient?

Subject: Questions about backups and transaction logging in a large company

  1. A translog-based backup will be the fastest way to get daily backups, since it backs up the smallest amount of data per day.

  2. Yes, restores take longer. How much longer depends on how far your point-in-time restore is from the last full backup of the file being restored, since every log written in between has to be replayed.

  3. Your server shouldn’t run out of space with circular logging: the logs should be set to a static size, and you wouldn’t back up circular translog files. It sounds instead like you’re describing archive-style translog files, which you should use only if your backup method actually backs up those files. Did you check the server document and the server’s notes.ini to make sure both reflect circular logging (for the ini: TransLog_Style=0)? If everything seems set correctly and the logs still grow when configured for circular, open a PMR with Lotus.

  4. I’ve worked with this type of setup before, and while I didn’t monitor the SAN side of things, we got good backups with it. Are the backups being taken during peak hours, or while scheduled compacts or updalls are running? Since the snap has to keep up with all changes, schedule the backup for a time when the fewest changes are occurring.
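If you want a quick way to confirm what the ini says, something along these lines would do it. This is only a rough sketch in Python: it scans the text of a notes.ini for the TransLog_Style line and ignores case in the key name. The 0 = circular mapping is the one mentioned above; 1 = archive and 2 = linear are my understanding of the Domino settings, so verify them against your version's documentation. The function names are made up for illustration.

```python
# Assumed mapping: 0 = circular (as noted above); 1 and 2 are my understanding
# of the Domino values and should be checked against your documentation.
STYLE_NAMES = {"0": "circular", "1": "archive", "2": "linear"}


def translog_style(ini_text):
    """Return the raw TransLog_Style value from notes.ini text, or None."""
    for line in ini_text.splitlines():
        key, sep, value = line.partition("=")
        # Compare case-insensitively so TRANSLOG_Style and TransLog_Style both match.
        if sep and key.strip().lower() == "translog_style":
            return value.strip()
    return None


def describe_style(ini_text):
    """Translate the TransLog_Style value into a readable name."""
    value = translog_style(ini_text)
    if value is None:
        return "not set"
    return STYLE_NAMES.get(value, "unknown (%s)" % value)
```

Read the file yourself (for example with `open(path).read()`) and pass the text to `describe_style`. If it comes back as anything other than circular, that would explain logs that keep growing.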

My suggestion would be to use archive-style translogs and do translog backups for your dailies, running those jobs several times per day to prune the logs, with weekly full backups. Also, I’d suggest having a separate server to restore to, and testing your restore process thoroughly to make sure you’re getting good full backups, translog backups, and restores.
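To see why the weekly-full plus translog-daily split helps with your 24-hour target, here is a back-of-the-envelope calculation in Python. The throughput and daily log volume in the usage note are placeholders I made up, not measurements from your environment or from CommVault; plug in your own numbers.

```python
def backup_hours(data_gb, throughput_mb_per_s):
    """Hours needed to move data_gb at a sustained throughput in MB/s."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = data_gb * 1024 / throughput_mb_per_s
    return seconds / 3600


def required_mb_per_s(data_gb, window_hours):
    """Sustained MB/s needed to move data_gb within window_hours."""
    if window_hours <= 0:
        raise ValueError("window must be positive")
    return data_gb * 1024 / (window_hours * 3600)
```

For example, `required_mb_per_s(4096, 24)` gives the sustained rate a 4 TB full backup needs to finish in a day, while `backup_hours(30, 100)` shows how small a hypothetical 30 GB daily log backup is by comparison at a hypothetical 100 MB/s.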

If you do go to archive translog backups, try to get an estimate of how much data your server will be writing out per day and give your translog volume enough space to go several days without a backup job pruning them. That will give you a good cushion in the event there’s a problem with your backup system.
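The sizing itself is simple arithmetic; a minimal sketch in Python, assuming you have already measured your daily log volume. The 20% headroom default is my own arbitrary choice, and the function name is just for illustration.

```python
def translog_volume_gb(daily_log_gb, cushion_days, headroom=0.2):
    """Space needed to hold cushion_days of unpruned archive logs.

    daily_log_gb: measured log volume written per day (your own figure)
    cushion_days: days the volume should survive without a pruning backup
    headroom:     extra fraction for busy days (assumed default, adjust freely)
    """
    if daily_log_gb < 0 or cushion_days < 0 or headroom < 0:
        raise ValueError("inputs must be non-negative")
    return daily_log_gb * cushion_days * (1 + headroom)
```

So with a made-up figure of 30 GB of logs per day and a three-day cushion, `translog_volume_gb(30, 3)` comes to roughly 108 GB.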

Good luck and I’d be interested to know what you find works best for you.