Shutdown mystery

We are running Domino 6.0 on a Dell 2450 with dual processors and over 500MB of memory. The server ran fine until we upgraded to 6.0.1, when Domino began shutting itself down between 1 and 2 AM. We scrubbed the server, going as far as reformatting the drives, installing a fresh OS, etc. We restored the server from saved configuration documents. The server is still shutting itself down overnight. I can’t seem to find anything significant in the Domino or Windows logs. Does anyone have any ideas as to where I go from here?

Tim

Subject: Shutdown mystery

A couple of suggestions

  1. Look for a Program document that is scheduled to run between those hours and see if that is causing the shutdown. Also look at your notes.ini file to see which tasks are listed in the ServerTasksAtxx lines.

  2. Increase your logging level - maybe you can get an idea whether the server is shutting down or crashing, and what’s causing it.

  3. Look for NSD reports - they will have the date and time of the crash (if it is a crash) and may give you some pointers.

    Stephen Lister
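The notes.ini part of suggestion 1 can be partly automated. What follows is a minimal Python sketch, not part of the original reply, that lists the tasks scheduled by ServerTasksAtNN lines within a given hour window. The function names and the sample ini text are illustrative assumptions. It does not inspect Program documents, which are stored in the Domino Directory rather than in notes.ini.

```python
import re

# Matches lines such as "ServerTasksAt5=StatLog"; the number is the hour of day.
SERVER_TASKS_AT = re.compile(r"^\s*ServerTasksAt(\d{1,2})\s*=\s*(.*)$", re.IGNORECASE)


def scheduled_tasks(ini_text):
    """Return {hour: [task, ...]} for every ServerTasksAtNN line in ini_text."""
    schedule = {}
    for line in ini_text.splitlines():
        match = SERVER_TASKS_AT.match(line)
        if not match:
            continue  # plain "ServerTasks=" and unrelated settings are skipped
        hour = int(match.group(1))
        tasks = [t.strip() for t in match.group(2).split(",") if t.strip()]
        schedule.setdefault(hour, []).extend(tasks)
    return schedule


def tasks_in_window(schedule, start_hour, end_hour):
    """Return the entries whose hour falls in start_hour..end_hour inclusive."""
    return {h: t for h, t in sorted(schedule.items()) if start_hour <= h <= end_hour}


if __name__ == "__main__":
    # Illustrative sample only; for a real server, read the actual notes.ini file.
    sample = (
        "ServerTasks=Update,Replica,Router,AMgr\n"
        "ServerTasksAt1=Catalog,Design\n"
        "ServerTasksAt2=UpdAll\n"
        "ServerTasksAt5=StatLog\n"
    )
    print(tasks_in_window(scheduled_tasks(sample), 1, 5))
```

Pointing the window at the hours when the server goes down shows at a glance whether a scheduled task lines up with the outage.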

Subject: RE: Shutdown mystery

More to add in our mystery:

It appears that the server is crashing. After digging up the NSD, it appeared we were hitting a problem with the agent manager. We removed from the system all of our apps that make use of Java agents. I also turned on the auto-recovery feature and disabled all replication. It still crashed, and at the same time: 5 AM. Included is the section from the NSD that details the namgr problems. I think this is the cause, but if anyone has any other ideas I’d be glad to listen. Can anyone shed some light on this problem?

Thanks

Tim

############################################################

FATAL THREAD 12/13 [ nAMgr:08a4:04b8]

FP=00000001, PC=10021a1a, SP=0d1cfb10, stksize=-220003087

Exception code: c0000005 (ACCESS_VIOLATION)

############################################################

[ 1] 0x10021a1a jvm._IBM_GetInterface@4+19626

############################################################

PASS 2 : FATAL THREAD with STACK FRAMES 12/13 [ nAMgr:08a4:04b8]

FP=00000001, PC=10021a1a, SP=0d1cfb10, stksize=-220003087

Exception code: c0000005 (ACCESS_VIOLATION)

############################################################

---------- Top of the Stack ----------

   # 0d1cfb10  00000000 0cce9990 0d1cfc6c 00000000  |........l.......|

   # 0d1cfb20  00000000 6268d780 0cce9990 00000000  |......hb........|

   # 0d1cfb30  00000000 0cd0ea68 0cce9990 0d1cfcb8  |....h...........|

   # 0d1cfb40  62746348 00000000 0109d118 1001ec4c  |Hctb........L...|

   # 0d1cfb50  00000000 0ccddcb0 1001e6f5 0cce8a80  |................|

   # 0d1cfb60  0d1cfcb8 0ccddcb0 100db290 0d1cfbdc  |................|

   # 0d1cfb70  00000000 00000108 0cce9d54 00000000  |........T.......|

   # 0d1cfb80  0d1cfbc8 00000000 00000000 1001f0b8  |................|

   # 0d1cfb90  0cce9990 0d1cfcb8 00000000 100db290  |................|

   # 0d1cfba0  0d1cfbdc 00000000 0cce9990 00000020  |............ ...|

   # 0d1cfbb0  079cc780 00000000 07d29f48 0cce99b0  |........H.......|

   # 0d1cfbc0  10070295 0cce9990 079cc780 00000020  |............ ...|

   # 0d1cfbd0  00000000 079b04f0 00000000 00000000  |................|

   # 0d1cfbe0  0cce9990 1007133d 0cce9990 00000000  |....=...........|

   # 0d1cfbf0  00000024 00000000 0d1cfd04 0c14cf9f  |$...............|

   # 0d1cfc00  0d1cfcfc 00000000 07cc8f30 00000002  |........0.......|

   # 0d1cfc10  07a68cbc 0cce9990 00000000 00000024  |............$...|

   # 0d1cfc20  07a694f0 100d50bd 0cce9990 0d1cfc5c  |.....P......\...|

   # 0d1cfc30  00000000 0109bde8 00000000 0c309794  |..............0.|

   # 0d1cfc40  00000000 0ccdfd54 00000000 0109bde8  |....T...........|

   # 0d1cfc50  0109bed2 00000001 00000000 0d1cfb34  |............4...|

   # 0d1cfc60  0d1cfcc8 62729303 00000002 0cce9d18  |......rb........|

   # 0d1cfc70  100d5e40 0cce9990 0109cc1c 00000000  |@^..............|

   # 0d1cfc80  0cce9990 0d1cfcbc 0d1cfcf4 00000000  |................|

   # 0d1cfc90  100a9b63 07cccdb0 0cd0ea68 00000002  |c.......h.......|

   # 0d1cfca0  0cce9cd4 0d1cfcbc 0cd0ea68 100b1ebe  |........h.......|

   # 0d1cfcb0  0d1cfcf4 100b1ebe 00000000 07cccdb0  |................|

   # 0d1cfcc0  0d1cfcc0 0cce9990 0d1cfd14 100ab4b7  |................|

   # 0d1cfcd0  00000000 000c0000 0cd0f558 0cd0fada  |........X.......|

   # 0d1cfce0  0cd0e56c 00000000 00000000 00000002  |l...............|

   # 0d1cfcf0  000007d0 0d1cfd40 100b220e 00000013  |....@...."......|

   # 0d1cfd00  07cccdb0 00000000 07cccdb0 0d1cfd0c  |................|

   # 0d1cfd10  0cce9990 0d1cfd74 100ab4b7 00000000  |....t...........|

   # 0d1cfd20  000c0000 0ccefb08 0ccf0083 0ccee38c  |................|

   # 0d1cfd30  00000000 00000000 00000002 000006a4  |................|

   # 0d1cfd40  0d1cfda0 100b1ebe 07d17a10 07d17ae8  |.........z...z..|

   # 0d1cfd50  07d178e8 07d17948 07d17ce0 07ccb078  |.x..Hy...|..x...|

   # 0d1cfd60  00000000 07ccde60 07ccbb50 0d1cfd6c  |....`...P...l...|

   # 0d1cfd70  0cce9990 0d1cfdc0 100ab4b7 00000000  |................|

   # 0d1cfd80  000c0000 0ccef978 0ccefd0a 0ccee38c  |....x...........|

   # 0d1cfd90  00000000 00000000 00000002 000007d0  |................|

   # 0d1cfda0  0d1cfdec 100b1ebe 100b1ebe 07ccde60  |............`...|

   # 0d1cfdb0  07ccb910 07ccbb50 0d1cfdb8 0cce9990  |....P...........|

   # 0d1cfdc0  0d1cfe34 100ab4b7 00000000 000c0000  |4...............|

   # 0d1cfdd0  0c188888 0c188973 0c18877c 00000000  |....s...|.......|

   # 0d1cfde0  00000000 00000003 000007d0 0d1cfe20  |............ ...|

   # 0d1cfdf0  100b1dc3 07ccb078 07cbba08 07ae1518  |....x...........|

   # 0d1cfe00  00000000 0c301598 0c301b3a 0c30548c  |......0.:.0..T0.|

   # 0d1cfe10  00000000 00000000 00000001 000007d0  |................|

   # 0d1cfe20  0d1cfe60 100b1dc3 07ae1518 0d1cfe2c  |`...........,...|

   # 0d1cfe30  0cce9990 0d1cfe80 100ab4b7 00000000  |................|

   # 0d1cfe40  000c0000 0c306584 0c30699f 0c18827c  |.....e0..i0.|...|

   # 0d1cfe50  00000000 00000000 00000001 000007d0  |................|

   # 0d1cfe60  0d1cfeac 100ad049 00000000 100ab4b7  |....I...........|

   # 0d1cfe70  100ad049 07ae1518 100ad72c 0cce9990  |I.......,.......|

   # 0d1cfe80  0d1cffa4 100ab4b7 00000000 000c0000  |................|

   # 0d1cfe90  00000000 0d1cff20 0d1cff04 00000000  |.... ...........|

   # 0d1cfea0  00000000 00000001 0cce9990 deadbeef  |................|

   # 0d1cfeb0  0cce9990 079c52ad 0cce9cd4 00000004  |.....R..........|

   # 0d1cfec0  00000000 0d1cfcd4 0cce9cd4 0cce9d0c  |................|

   # 0d1cfed0  00000000 00000000 0d1cfe80 100da5ca  |................|

   # 0d1cfee0  0cce9990 0d1cff20 07ae1518 0cce9990  |.... ...........|

   # 0d1cfef0  0d1cffb4 0ccd4298 00000000 0d1c8a80  |.....B..........|

   # 0d1cff00  7800ffb8 0d1cfefc 0c306584 0d1cffb4  |...x.....e0.....|

[ 1] 0x10021a1a jvm._IBM_GetInterface@4+19626

Subject: RE: Shutdown mystery

Is there more data for the amgr in the NSD? There should be another section on the next page that shows which database was involved, etc.

Subject: RE: Shutdown mystery

Well, like I said, we don’t have any databases that are running any scheduled agents. I cannot seem to find the data you are talking about. Can you give me something a bit more specific? My NSD files are averaging about 3 MB of plain text; it’s a bit much to sort through.

Thanks

Tim
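For an NSD file of that size, a short script can pull out just the fatal-thread information. The following is a minimal Python sketch that relies only on the markers visible in the excerpt posted above (the "FATAL THREAD" header, the "Exception code:" line, and stack frame lines of the form "[ 1] 0x... symbol"). The function name and the dictionary layout are assumptions made for illustration. A full NSD contains other thread sections whose layout is not shown here, so the frame collection would need a stop condition before it is used on a whole file.

```python
import re

# Patterns taken from the NSD excerpt posted in this thread.
FATAL = re.compile(r"FATAL THREAD.*\[\s*(?P<process>\w+):[0-9a-fA-F]+:[0-9a-fA-F]+\]")
EXCEPTION = re.compile(r"Exception code:\s*\S+\s*\((\w+)\)")
FRAME = re.compile(r"^\s*\[\s*\d+\]\s+0x[0-9a-fA-F]+\s+(\S+)")


def summarize_fatal_threads(lines):
    """Collect process name, exception name, and frames per FATAL THREAD header."""
    reports = []
    current = None
    for line in lines:
        fatal = FATAL.search(line)
        if fatal:
            # Each header (including the "PASS 2" repeat) starts a new report.
            current = {"process": fatal.group("process"), "exception": None, "frames": []}
            reports.append(current)
            continue
        if current is None:
            continue  # nothing to attach to before the first header
        exc = EXCEPTION.search(line)
        if exc and current["exception"] is None:
            current["exception"] = exc.group(1)
            continue
        frame = FRAME.match(line)
        if frame:
            current["frames"].append(frame.group(1))
    return reports


if __name__ == "__main__":
    # Lines copied from the excerpt above; for a real file, iterate over open(path).
    excerpt = [
        "FATAL THREAD 12/13 [ nAMgr:08a4:04b8]",
        "FP=00000001, PC=10021a1a, SP=0d1cfb10, stksize=-220003087",
        "Exception code: c0000005 (ACCESS_VIOLATION)",
        "[ 1] 0x10021a1a jvm._IBM_GetInterface@4+19626",
    ]
    for report in summarize_fatal_threads(excerpt):
        print(report)
```

Running this over each night's NSD and comparing the process names and top frames would show whether every crash comes from the same place.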

Subject: RE: Shutdown mystery

we don’t have any databases that are running any scheduled agents

What is the result of

tell amgr sched

command?

Maybe some users have an agent triggered on “Before new mail” or similar?

Subject: Ok, what are your parameters in the Server doc for the Agent Manager?

Specifically, what is your Daytime vs. Nighttime schedule? Does it switch at 5:00 AM? Do you have different # of Max concurrent Agents? If so, try making them the same.

Subject: RE: Ok, what are your parameters in the Server doc for the Agent Manager?

It’s not our day vs. night agent schedule (at least I don’t think this is the problem). It changes over at 8 AM/PM. The difference in the number of agents is 4 during the day vs. 6 at night, with a delay for 40% memory usage during the day and 60% memory usage at night. The agent cache refreshes at 12 AM. None of these events seems to coincide with our 5 AM outage. Any other ideas?

Thanks for your input

Tim

Subject: RE: Ok, what are your parameters in the Server doc for the Agent Manager?

I am afraid I cannot shed any light on the essence of your problem, but I wanted to point out that the delay parameters have been obsolete since Release 5. They are present only for backward compatibility with releases prior to 5 (and the note next to these parameters marks them as obsolete). So they would not be causing a problem for you.