Hello there.
I have been experiencing database corruption problems on my Domino server for something like 18 months now. It’s not just 1 or 2 databases either; each time it usually affects the databases of about 30-40 users! I know how to fix them each time now, but I would like to know what is actually causing this.
Previously, I had put it down to the old Domino server hardware crashing or Domino not being closed properly, both of which had happened on occasion. However, about 2 months ago I migrated the server to new hardware, and the other day it happened again.
This time, there was no server crash. Domino was still up when we started seeing problems. The first symptom was users getting a dialog saying they could not access the server due to insufficient memory (this server has a pretty decent spec, with 4 GB of RAM). When I looked on my Domino server I noticed this error, as well as corrupt DB errors… I quit Domino from the console and rebooted the box. Domino restarted OK, and I fixed my log.nsf, activity.nsf and catalog.nsf using ncompact. Then I had about 25 users with corrupt databases, which I’ve just finished fixing.
This has happened on a 6.5.5 server and on our current version 7.0.2 server.
So does anyone have any idea what could actually be causing this? It’s becoming rather an inconvenience having to fix dozens of corrupt databases every few months!!
(Sorry for long post)
Subject: Database Corrupt - Causes??
Hi Chris,
Just some thoughts (I may have missed other possibilities so don’t assume this is a complete list of things to check):
Are you running any anti-virus or backup software on the server? If you are you may want to check whether a) they are compatible with the version of Domino that you are running and b) they are compatible with each other.
Also are there any add-ins running on the Domino server?
Since the problem persists after the hardware change, it must be a software issue. And if it’s a software issue, I doubt that the core Domino code is causing it, as it’s not something affecting other customers (hence my suspicions would lie with AV / backup software / add-ins).
Hope this helps.
Cheers,
Phil
Subject: RE: Database Corrupt - Causes??
Well we have McAfee AV and a disk-to-disk backup system called Attix5. I’d like to think it wasn’t those things as we had been using them for a long while on our email server without problems. The Attix5 software has a Domino plug-in which should be fully compatible with Domino 7. McAfee should also be compatible.
If it was going to be either I would suspect the backup software ahead of the AV. It’s due an upgrade anyway so maybe I’ll try that.
I have seen lots of people on this forum and others report problems with database corruption. Perhaps not as bad as ours though!
Subject: RE: Database Corrupt - Causes??
I would add to Phil’s questions:
Do you have any custom applications with agents that run on a schedule on this server?
Custom apps might not be 100% compatible with new releases of Notes, and if an agent running on the server has a problem, it can “eat” all the available memory and lead to a server crash.
Renaud
Subject: RE: Database Corrupt - Causes??
Hmmm, there’s nothing like that. We’re pretty much ‘one app one box’. It’s only the Domino server then AV and the backup software.
Thanks for your help btw guys.
Subject: RE: Database Corrupt - Causes??
Chris,
When I said custom apps, I was thinking of a Domino application (in other words, a Domino database) other than the mail databases, not a custom app on the OS.
Renaud
Subject: RE: Database Corrupt - Causes??
Oh sorry. No, not as far as I’m aware. We have a fairly standard Domino setup I think - email only.
Subject: RE: Database Corrupt - Causes??
Ok. Then I would also bet on a problem with the antivirus and the backup software as described before.
Renaud
Subject: RE: Database Corrupt - Causes??
Look for anything that may be running simultaneously against the affected databases.
For instance, anything else that might be running when the “compact” task runs against all databases.
If, for example, that is running at the same time as some other task that’s working on the databases (perhaps even an external one, like a backup process), it could lead to issues.
I spent the better part of a year* chasing down random database corruption issues, and it turned out to be a case of a couple of things running at once that didn’t get along with each other. It’s been a while, but I think it was “compact” being fussy about a LotusScript agent-based process we have that purges old junk from email files. We moved compact to the weekend and prevented our purge from running on the weekend. Voila, our problem pretty much** went away.
…Rob
*I didn’t work on it full time, but it was a difficult thing to chase down and there was a lot of trial & error involved.
**pretty much: we still have some occasional issues, but they have tended to be rather uncommon. Before, it was many db corruptions per week.
Subject: RE: Database Corrupt - Causes??
Hmmm you might be on to something there.
Our compact task is the only thing in the scheduler. It runs at 4am each night. Occasionally our backup software goes a bit weird and overruns (it’s meant to finish around 1am). It did this at the weekend and it did last night - not finishing until 7.30 (we had more corrupt databases this morning).
I know we haven’t had these problems every single time the backups have overrun, but it could be linked to these latest occurrences.
I’ll tweak the schedules and see if it helps. Unfortunately I may not be able to tell whether it’s worked, because this problem usually only crops up every few months!
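
For anyone wanting to sanity-check schedules like these, here is a rough Python sketch of the overlap check being discussed. The 4am compact start, the 1am expected backup finish and the 7.30 overrun come from this thread; the backup start time (22:00 the evening before), the compact duration (2 hours) and the date are assumptions made up purely for illustration, and `overlaps` is just a hypothetical helper name.

```python
from datetime import datetime, timedelta


def overlaps(start_a, end_a, start_b, end_b):
    """Return True if the half-open windows [start, end) share any time."""
    return start_a < end_b and start_b < end_a


# Arbitrary date, used only to anchor the clock times below.
day = datetime(2007, 1, 2)

# ASSUMPTION: the backup starts at 22:00 the previous evening.
backup_start = day - timedelta(hours=2)
# From the thread: the backup should finish around 01:00,
# but it overran until 07:30.
normal_backup_end = day + timedelta(hours=1)
overrun_backup_end = day + timedelta(hours=7, minutes=30)

# From the thread: compact starts at 04:00.
compact_start = day + timedelta(hours=4)
# ASSUMPTION: compact takes about 2 hours.
compact_end = compact_start + timedelta(hours=2)

normal_clash = overlaps(backup_start, normal_backup_end,
                        compact_start, compact_end)
overrun_clash = overlaps(backup_start, overrun_backup_end,
                         compact_start, compact_end)

print("Normal night clashes with compact: ", normal_clash)
print("Overrun night clashes with compact:", overrun_clash)
```

With these assumed figures only the overrun night collides with compact. Plugging in the real start and finish times from the backup and Domino logs would show which nights the two actually ran together, which could then be compared against the mornings with corrupt databases.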
Thanks for everyone’s help.