Database corruption and server crash during ncompact -c -D

Hi there, I do have an open ticket with IBM for this, but I would like to post it here too, to get all possible feedback.

Situation :

OS :

P4, 2.8 GHz, 2 GB RAM, 20 GB hard disk free, SCSI system (AHA), W2k + SP3

Notes :

Fresh and clean install of 6.0.1. The databases were converted from R5 over the last year, through various R6 betas up to R6.0.1.

(CF1 cannot be applied to the server because of checksum errors, even though this is a fresh 6.0.1 install from the PartnerWorld software. I hate that, and I would be pleased to have a full 6.0.1 CF1 download for partners!)

Error description :

After an installation we always run ncompact -c -D to be sure that all databases are on the right DB version. This time some databases had problems (non-existing documents, container errors, etc.). We ran fixup on these databases and then ncompact again.

But even though fixup did not complain about any documents, ncompact still found several errors, and we had crashes on the same databases and even on other databases that seemed to be OK on the first run.

To get reproducible errors I simply run an endless loop on one database (nfixup, then ncompact), but it never gets through cleanly. Sometimes even the OS itself crashed.
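For anyone who wants to repeat the test, the loop can be sketched roughly like this in Python. This is only a sketch under assumptions, not the script actually used: the database name test.nsf is made up, the order of the arguments on the ncompact line is a guess, and it assumes both tools return a non-zero exit code when they fail, which may not hold for every kind of error.

```python
import subprocess
from typing import Callable, List, Optional, Tuple

Runner = Callable[[List[str]], int]


def run_tool(argv: List[str]) -> int:
    """Run one command for real and return its exit code."""
    return subprocess.run(argv).returncode


def stress_loop(
    db_path: str, max_runs: int, runner: Runner = run_tool
) -> Optional[Tuple[int, str, int]]:
    """Repeat fixup-then-compact on one database.

    Returns (run number, tool name, exit code) for the first failing
    step, or None if every run completed with exit code 0.
    """
    steps = [
        ("nfixup", ["nfixup", db_path]),
        # Assumption: database path first, then the -c -D options from the post.
        ("ncompact", ["ncompact", db_path, "-c", "-D"]),
    ]
    for run in range(1, max_runs + 1):
        for tool, argv in steps:
            code = runner(argv)
            if code != 0:
                return (run, tool, code)
    return None


if __name__ == "__main__":
    # Hypothetical database name; a bounded run count instead of a true endless loop.
    print(stress_loop("test.nsf", max_runs=100))
```

The runner is passed in as a parameter so the loop logic can be checked with a fake command before pointing it at a real server. A hard crash of the server or the OS will of course not show up as a return value, only in the NSD or the event log.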

We found that the W2k policy setting "Lock pages in memory" needs to be set to get at least 50% of all ncompact runs to work, but even with it set ncompact sometimes marks databases as corrupt (all kinds of different database errors).

One obvious cause has been eliminated:

After all these problems the server went through a heavy burn-in test, to be sure that no faulty memory, hardware or misconfiguration causes these errors. The burn-in test went through fine (3 days without any errors).

Any ideas would be highly appreciated.

Regards

TW

Subject: Database corruption and server crash during ncompact -c -D

Do you have quotas on any of the databases, and are they enforced?

We have upgraded servers and are now starting to see some issues: server crashes and database corruption. It appears that fixup and compact do not like the old 41 ODS, and only work properly on 4. We are now able to crash a server on demand by taking a DB on the 41 ODS over quota and running fixup. The server panics and dies right then. We are talking with IBM now… hopefully they can fix this rather quickly. With 6.02 in code freeze and 6.5 in the pipeline, I would hope they saw this bug earlier.

Subject: RE: Database corruption and server crash during ncompact -c -D

Hi, thanks for the info.

There are no quotas involved here, and the databases seem to be on ODS 43. For one of them I even tried creating a local replica on the workstation and then a clean replica on the server, but that doesn't help either.

The last statement from Lotus support was that they haven't seen this crash so far (based on the NSD analysis and the attempt to escalate it to the developers).

I wouldn't expect a fix to be in the 6.02 code or in MS1 for 6.5, and I'm also a little bit …, but I'll keep you posted about any progress and I would be glad if you could do the same :wink:

Regards

TW

Email : Thomas_Weber@DicomGroup.com