Server Crashes after 6.0.1 CF1 Install

I’m testing the upgrade from 5.0.12 to 6.0.1 CF1 in our isolated test environment. I’m glad it’s isolated, because we’re having real problems with CF1. Every time the server starts, before it gets a screen full of info on the console, it NSDs.

After reading Debra Alioto’s experiences (what a nightmare that must have been!), I went through these steps to upgrade the test version of our main web server:

  1. Upgraded the code to 6.0.1.

  2. From the program directory, ran NCOMPACT -c -i to upgrade all NSFs to the newest ODS version.

  3. Ran NUPDALL names.nsf -t "($ServerAccess)" -r

  4. Ran NUPDALL names.nsf -t "($Users)" -r

  5. Started the server long enough to upgrade NAMES.NSF’s design, then stopped the server with the standard quit command

  6. Ran NFIXUP -f -j -l -Y

  7. Ran NUPDALL
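
For anyone repeating these steps on several servers, here is a minimal sketch of how the command-line parts could be scripted. The commands and switches are copied from the list above; everything else is an assumption on my part: the `run_phase` helper and the program directory are hypothetical, and I'm assuming each tool returns a non-zero exit code on failure, which I haven't verified. Step 5 (starting and stopping the server) stays manual, so the commands are split into two phases around it.

```python
import subprocess
from pathlib import Path

# Commands exactly as listed in the post. Step 5 (start the server, let it
# upgrade the NAMES.NSF design, then quit) is manual and sits between phases.
BEFORE_SERVER_START = [
    ["ncompact", "-c", "-i"],
    ["nupdall", "names.nsf", "-t", "($ServerAccess)", "-r"],
    ["nupdall", "names.nsf", "-t", "($Users)", "-r"],
]
AFTER_SERVER_STOP = [
    ["nfixup", "-f", "-j", "-l", "-Y"],
    ["nupdall"],
]


def run_phase(commands, program_dir, runner=subprocess.run):
    """Run each command from program_dir, stopping at the first failure.

    Assumption: a non-zero exit code means the tool failed.
    Returns the names of the tools that completed.
    """
    completed = []
    for cmd in commands:
        # Use the full path so the tool is found regardless of PATH.
        exe = str(Path(program_dir) / cmd[0])
        result = runner([exe] + cmd[1:], cwd=str(program_dir))
        if result.returncode != 0:
            raise RuntimeError(f"{' '.join(cmd)} exited with {result.returncode}")
        completed.append(cmd[0])
    return completed
```

You would call `run_phase(BEFORE_SERVER_START, ...)` with your own program directory, do step 5 by hand, and then call `run_phase(AFTER_SERVER_STOP, ...)`. Stopping at the first failure is deliberate: there is little point running NUPDALL if NFIXUP did not finish.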

At that point, I fired up 6.0.1, and it seemed to run without complaint. I performed basic testing with our web application, and that testing included firing COM objects within the HTTP task, running a variety of WebQuerySave and WebQueryOpen LotusScript agents, and accessing RDBMS data via Microsoft ADO and @Formula commands to SQL Server and Oracle. So far, so good.

I brought the server down and installed CF1. When I restarted the server, it crashed. The NSD log doesn’t show anything that I can decipher (though I haven’t really dissected crash dumps since my OS/2 days).

I tried rerunning the NCOMPACT command just against NAMES.NSF; I tried creating a new NAMES.NSF replica and replicating it over itself to give me a fresh NAMES.NSF copy; I even tried running NDESIGN to update the design of NAMES.NSF (all based on suggestions here for similar problems). The server still dies.

I have a second server that I upgraded by going straight from 5.0.12 to 6.0.1 and 6.0.1 CF1. It dies soon after starting as well.

Both servers have been in service since the early Notes 4.x days.

We’re stumped and somewhat surprised. We’ve never had a Lotus Domino release behave this way before.

What have I missed?

Thanks in advance for any suggestions. Even if no one has any, I think it’s worthwhile to have this on record.

Thanks.

Subject: Server Crashes after 6.0.1 CF1 Install

Terry, we have seen this in the AIX world. What we have heard from our LSM is to try upgrading from 5.x to 6.0, then to 6.0.1 / CF1 (if needed). This is a time-consuming install process and we just received this tidbit, so I can’t verify it currently, but it might be worth a try.

Subject: RE: Server Crashes after 6.0.1 CF1 Install

The servers both ran over the weekend. The main mail server crashed early this morning with memory allocation errors during indexing, but I think that’s a separate issue.

I wonder why Report and/or Event would so quickly crash the server?

In any event, thanks to Matt Hays and Eric Eskam for their replies and suggestions.

Subject: RE: Server Crashes after 6.0.1 CF1 Install

Interesting – thanks for the suggestion. If you get time and think about it when your upgrade’s done, would you please post your results?

Before I go that route, I’m going to try running FileMon and seeing if I can understand what Domino’s doing just before it crashes. That might give me some insight.

Like I said before, this kind of instability is really unusual for a Lotus product.

Thanks!

Subject: RE: Server Crashes after 6.0.1 CF1 Install

Using FileMon, I saw that the server typically began its dance of death just as NREPORT started to run. I removed all ServerTasks from NOTES.INI and noticed NREPORT would still try to run. So I removed it from the Program list in NAMES.NSF (along with Event).
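
For reference, here is a minimal sketch of the NOTES.INI half of that test: emptying the ServerTasks line, or dropping only named tasks, while keeping the original list so it can be put back afterwards. The `strip_server_tasks` function is hypothetical, and it only touches the `ServerTasks=` line. It does nothing about Program documents in NAMES.NSF, which is where NREPORT was still being launched from in my case.

```python
def strip_server_tasks(ini_text, drop=None):
    """Return (new_text, original_tasks) for the contents of a NOTES.INI.

    drop=None removes every task from the ServerTasks line; otherwise only
    the named tasks are removed (case-insensitive). Other lines, including
    keys that merely start with "ServerTasks", are left unchanged.
    """
    out = []
    original = []
    for line in ini_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "servertasks":
            original = [t.strip() for t in value.split(",") if t.strip()]
            if drop is None:
                kept = []
            else:
                dropped = {d.lower() for d in drop}
                kept = [t for t in original if t.lower() not in dropped]
            line = "ServerTasks=" + ",".join(kept)
        out.append(line)
    return "\n".join(out) + "\n", original
```

Work on a copy of NOTES.INI with the server down, and keep the returned task list so you can restore the line once you know which task is the culprit.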

Now, the server appears to be running.

I put the ServerTasks back and recycled the server. It still seems to be running.

This is tentative, of course. The server could be crashing as I type this. I’ll try to let it run over the weekend and will let you know how it goes.

Thanks.

Subject: Server Crashes after 6.0.1 CF1 Install

When you say you upgraded the code, did you just run the D6 installer on top of your D5 install?

Call me paranoid, but I always clean out the Domino directory of everything but the Data folder and the NOTES.INI. We have a couple of custom DLLs for add-in tasks, and I keep those too. I don’t worry about the antivirus software (Symantec AntiVirus); it’s trivial to re-run its setup after completing the upgrade.

I also move all the templates out of data, whack the \data\domino folder, \data\win32, \data\modems before running the D6 installer. I then move back the stuff I need after letting the server come back up.

The Domino folder itself is the critical one - I’ve had no end of .dll problems in the past from the stupid installers not cleaning up the previous version fully. And really, there is no reason not to start with a clean domino directory…

Just did all 18 of our production servers with only one problem - on one of them the NOTES.INI got truncated somewhere along the line - server wouldn’t start (not even the outline of the console screen). Restored NOTES.INI from backup and the server came right up.
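
A truncated NOTES.INI like that can be spotted before starting the server by comparing it with the backup copy. Here is a minimal sketch, assuming you have both files' contents as text; the function names are hypothetical, and it only compares setting names, not values.

```python
def ini_keys(text):
    """Collect the setting names (lower-cased) from NOTES.INI contents."""
    keys = set()
    for line in text.splitlines():
        key, sep, _ = line.partition("=")
        if sep and key.strip():
            keys.add(key.strip().lower())
    return keys


def missing_keys(current_text, backup_text):
    """List settings present in the backup but absent from the current file."""
    return sorted(ini_keys(backup_text) - ini_keys(current_text))
```

An empty result does not prove the file is intact, but a long list of missing settings after an upgrade is a good reason to restore from backup before trying to start the server.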

Eric

Subject: RE: Server Crashes after 6.0.1 CF1 Install

Yes, I installed over the old code. I never thought of just keeping the DATA directory. I’ll keep that in mind for the next set of tests.

Thanks for the suggestion.