Domino Server Crashes REDHAT ES3

Hello,

My Domino server crashes after 50 to 60 days. In the NSD dump I find one fatal error. Does anyone have an idea what it means?

Here is the log:

----- Thread 8158 -----

0xb54b15f1: __GI_select + 0x61 (124f8, b74f2344)

0xb5ba22ee: OSDelayThread + 0x26 (4b, b74f2344, b47c2ecc, b47c2ee8)

0xb5b8afff: OSLockSpin + 0x63 (b47c2ee8, b74f2344, 5dc, b47c2ecc)

0xb5b8b5fd: SemAlloc + 0x2d (b47c2ecc, b74f2344, b47c2ea4, adb0fe4c, 1)

0xb5b8b44a: OSLockSemInt + 0xa2 (b47c2ecc, 1, b74f2344)

0xb5b8b37b: OSLockSem + 0x1b (b47c2ecc, b74f2344, 2, adb0fea8, b4f58000)

0xb5b963c6: LockHandle + 0x152 (19ab, adb0fe4c, adb0fe50, 1f4, 2, adb0fea8) + c

0xb5b95e02: OSLockObject + 0x26 (19ab, 1f4, 2, 3)

0x080672c5: GetTaskPtr + 0x49 (fc7d0010, 1, adb0fea8, b74f2344, b559d5d0, 3) + 38

0x08064272: Scheduler + 0x302 (0, b4edcf70, b559d398, b559d5d0, 8129a20, 0) + 8

0xb5ba267f: ThreadWrapper + 0xfb (0, 0, 8129b38, 0, 0, 0) + e0

0xb5593e21: pthread_start_thread + 0x181 (8129a20, 0, 0, 0, 0, 0) + 524efff4

----- Thread 8159 -----

0xb5484d56: __GI_nanosleep + 0x46 (0, b74f2344, adaccc9c, adacd0f4, adacc708, adaccb5c) + 558

0xb5ba6cc7: OSRunExternalScript + 0x207 (adaccc9c, adaccc9c, 0, b74f2344, adacd0f4, adacd0f4) + 34c

0xb5ba563d: OSFaultCleanup + 0x1d5 (0, 0, 0, b559d398, 0, b) + 20c

0xb5b866a9: fatal_error + 0x12d (b, adacd258, adacd2d8, 0, 0, 0) + 30

0xb5598f0a: __pthread_sighandler_rt + 0x7a (b, adacd258, adacd2d8, b, 0, 0) + 378

0xb5402d30: __GI___libc_sigaction + 0x130 (54016, b, b74f2344, adacd90c)

0xb559658b: __pthread_raise + 0x2b (b, b74f2344, 195, f8f5e000, b37b, adacd62c) + 75c

0xb5bbe6b1: Panic + 0x209 (b6e7f395, 195, b74f2344, 80)

0xb5bbe43f: Halt + 0x2f (195, b74f2344, b47c0800, 12d, 200, b5b80800)

0xb5b8b8f4: CreateNativeSemaphore + 0x7c (800, b74f2344, b47c2ecc, b47c2ee8)

0xb5b8b739: GetNativeSemaphore + 0xa9 (12d, 0, b74f2344, 0, b47c2ecc)

0xb5b8b614: SemAlloc + 0x44 (b47c2ecc, b74f2344, b47c2ea4, adacde4c, 1)

0xb5b8b44a: OSLockSemInt + 0xa2 (b47c2ecc, 1, b74f2344)

0xb5b8b37b: OSLockSem + 0x1b (b47c2ecc, b74f2344, 2, adacdea8, b4f58000)

0xb5b963c6: LockHandle + 0x152 (19ab, adacde4c, adacde50, 1f4, 2, adacdea8) + c

0xb5b95e02: OSLockObject + 0x26 (19ab, 1f4, 2, 3)

0x080672c5: GetTaskPtr + 0x49 (fc7d000a, 1, adacdea8, b74f2344, b559d5d0, 3) + 38

0x08064272: Scheduler + 0x302 (0, b4edcf70, b559d398, b559d5d0, 8129fe0, 0) + 8

0xb5ba267f: ThreadWrapper + 0xfb (0, 0, 812a0f8, 0, 0, 0) + e0

0xb5593e21: pthread_start_thread + 0x181 (8129fe0, 0, 0, 0, 0, 0) + 52531ff4

----- Thread 8160 -----

0xb54b15f1: __GI_select + 0x61 (f4240, b74f2344)

0xb5ba22ee: OSDelayThread + 0x26 (3e8, b74f2344, 2)

0xb5b8ddcb: StaticHangEnable + 0x53 (b74f2344, 2, ada8bea8, b74f2344)

0xb5b9629f: LockHandle + 0x2b (19ab, ada8be4c, ada8be50, 1f4, 2, ada8bea8) + c

0xb5b95e02: OSLockObject + 0x26 (19ab, 1f4, 2, 3)

0x080672c5: GetTaskPtr + 0x49 (fc7d000b, 1, ada8bea8, b74f2344, b559d5d0, 3) + 38

0x08064272: Scheduler + 0x302 (0, b4edcf70, b559d398, b559d5d0, 812a640, 0) + 8

crawl: Input/output error

Error tracing through thread 8168

0xb5ba267f: ThreadWrapper + 0xfb (0, 0, 812a758, 0, 0, 0) + e0

0xb5593e21: pthread_start_thread + 0x181 (812a640, 0, 0, 0, 0, 0) + 52573ff4

Thanks

Subject: Domino Server Crashes REDHAT ES3

Hi Christian,

You should add some semaphore debugging to see which semaphores are being locked. It looks like a semaphore is being held and never released, which could mean that, for example, memory is not being freed, and that will slowly bring your server to its knees…

//Kjeld

Subject: RE: Domino Server Crashes REDHAT ES3

Hi,

On the IBM page I found this text:

http://www-1.ibm.com/support/docview.wss?rs=899&uid=swg21192334

Does the update solve my problem?

The server has 2 GB of memory and needs 1.9 GB.

Subject: RE: Domino Server Crashes REDHAT ES3

Hi,

This does not seem to be the case, as the call stacks are very different. If it were the same crash, the stacks would be similar around the Panic/fatal_error calls. Your stack shows a thread waiting for a semaphore to be released, which never happens. Semaphore debugging and thread debugging will give you a lot of information about who the writer of the locked semaphore is; it could be antivirus or backup software, an agent, etc. What is the fatal task? Router, AdminP, Amgr, etc.?
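To compare two dumps "around the panic/fatal call", it can help to pull out only the threads whose stack contains fatal_error. The following is a minimal Python sketch, assuming the NSD excerpt has the same layout as the dumps posted in this thread ("----- Thread N -----" headers followed by "0xADDR: Function + 0xOFFSET (...)" frames). The function name fatal_threads and the sample text are illustrative only; they are not part of NSD or Domino.

```python
import re

# Layout assumed from the dumps in this thread, not from an NSD specification.
THREAD_HEADER = re.compile(r"^-+ Thread (\d+) -+$")
FRAME = re.compile(r"^0x[0-9a-f]+: (\w+) \+ 0x[0-9a-f]+")


def fatal_threads(nsd_text, marker="fatal_error"):
    """Return {thread_id: [function names, innermost frame first]}
    for every thread whose stack contains `marker`."""
    stacks = {}
    current = None
    for line in nsd_text.splitlines():
        line = line.strip()
        header = THREAD_HEADER.match(line)
        if header:
            current = header.group(1)
            stacks[current] = []
            continue
        frame = FRAME.match(line)
        if frame and current is not None:
            stacks[current].append(frame.group(1))
    return {tid: frames for tid, frames in stacks.items() if marker in frames}


# Shortened excerpt of the first dump, used only as sample input.
sample = """\
----- Thread 8158 -----
0xb54b15f1: __GI_select + 0x61 (124f8, b74f2344)
0xb5ba22ee: OSDelayThread + 0x26 (4b, b74f2344, b47c2ecc, b47c2ee8)
----- Thread 8159 -----
0xb5b866a9: fatal_error + 0x12d (b, adacd258, adacd2d8, 0, 0, 0) + 30
0xb5bbe6b1: Panic + 0x209 (b6e7f395, 195, b74f2344, 80)
0xb5b8b8f4: CreateNativeSemaphore + 0x7c (800, b74f2344, b47c2ecc, b47c2ee8)
"""

for tid, frames in fatal_threads(sample).items():
    print(tid, " <- ".join(frames))
```

Running this over the full dump of each crash and comparing the frames printed for the fatal thread is one quick way to check whether two crashes share the same call path.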

//Kjeld

Subject: RE: Domino Server Crashes REDHAT ES3

Hi,

I have had the semaphore settings in place for 3 months. Now the server crashes after 98 days.

I have set these parameters:

kernel.shmmni=8192

kernel.sem=250 32000 32 1024

vm.max-readahead = 512

vm.min-readahead = 512
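For readers unsure what the four numbers in kernel.sem mean: on Linux they are, in order, SEMMSL, SEMMNS, SEMOPM and SEMMNI. The Python sketch below only labels the value posted above and computes the product SEMMSL * SEMMNI, which is the most semaphores the sets could hold in total. The names SemLimits and parse_kernel_sem are made up for this example. Whether these limits are related to the crash is not established in this thread.

```python
from typing import NamedTuple


class SemLimits(NamedTuple):
    semmsl: int  # maximum semaphores per semaphore set
    semmns: int  # maximum semaphores system-wide
    semopm: int  # maximum operations per semop() call
    semmni: int  # maximum number of semaphore sets


def parse_kernel_sem(value):
    """Parse a kernel.sem value such as '250 32000 32 1024'."""
    fields = [int(f) for f in value.split()]
    if len(fields) != 4:
        raise ValueError("kernel.sem expects exactly four integers")
    return SemLimits(*fields)


# Value taken from the settings posted above.
limits = parse_kernel_sem("250 32000 32 1024")
print(limits)

# SEMMNS is normally kept at or below SEMMSL * SEMMNI.
print(limits.semmsl * limits.semmni)  # 256000
```

On a running system the current value can be read from /proc/sys/kernel/sem and passed to the same function.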

The server now crashes in the event task. Here is the log. What can I do?

2316: /opt/lotus/notes/latest/linux/event

----- Thread 2316 -----

0xb54b05f1: __GI_select + 0x61 (4c4b40, b74f2344)

0xb5ba22ee: OSDelayThread + 0x26 (1388, b74f2344, b4f0067c, 0, b60000)

0xb5b904cd: OSStaticMem + 0x115 (0, 8132, b74f2344, b4f0067c, 64, b5bae445) + 124

0xb5b8f803: OSStaticMemBeginInit + 0x6f (0, 8132, 698, bfffb3ac, b74f2344, b47728f4)

0xb5b7af30: NotesSDKData + 0x28 (b74f2344)

0xb5b7b5b1: AddInShouldTerminate + 0x15 (b4f00684, bfffdaa4, b46140f8, bfffdac4)

0xb5b7b64e: AddInIdleDelay + 0x62 (64, 808b6dc, 1, 1, bfffdb14, bfffb8a4) + 2a40

0x08058d66: AddInMain + 0x1236 (20, 1, 808b6dc, bfffde5c, 0, 0) + c

0x08066425: NotesMain + 0x35 (1, 808b6dc, b550e940, b7600600, 1, b53dc458) + fc

0x08066541: notes_main + 0xe5 (0, 0, 0, 1, 808b6dc)

0x08066456: main + 0x16 (1, bfffdfe4, bfffdfec, b7600af0, 1, 804eb00)

0xb53eeb77: __libc_start_main + 0xc7 (8066440, 1, bfffdfe4, 804d8a8, 80665fc, b75f7ad0) + 40002028

----- Thread 2320 -----

0xb54ae38a: __GI___poll + 0x7a (806e71c, 1, 7d0, 0, 806e758, 916) + 170

0xb5591d5e: __pthread_manager + 0x1ae (806e900, 0, 0, 0, 29, 0) + f7f91764

----- Thread 2321 -----

0xb54b05f1: __GI_select + 0x61 (2710, b559c398, b559c5d0, 806edc0, fffffbc1, b4f56ddc) + 140

0xb5babc3d: TimerTask + 0x511 (0, 0, 806eed8, 0, 0, 0) + e0

0xb5592e21: pthread_start_thread + 0x181 (806edc0, 0, 0, 7f397223, 0, 1) + 4b0a8ff4

----- Thread 2324 -----

0xb54b85f8: semop + 0x28 (80010, b13fde80, 1, b74f2344, b460af14, b460af14) + 2c

0xb5b8be53: WaitOnNativeSemaphore + 0x393 (108, 0, 1f4, 0, b74f2344, b460af14) + 4

0xb5b8adb4: OSWaitEvent + 0x30 (b460af14, 1f4, b74f2344, b559c5d0)

0xb5bb40d3: AutoFDGCleanupThreadProc + 0x37 (0, b4e5af70, b559c398, b559c5d0, 808c8a0, 0) + 8

0xb5ba267f: ThreadWrapper + 0xfb (0, 0, 808c9b8, 0, 0, 0) + e0

0xb5592e21: pthread_start_thread + 0x181 (808c8a0, 0, 0, 0, 2000, ffff) + 4ec01ff4

----- Thread 2325 -----

0xb54b05f1: __GI_select + 0x61 (4c4b40, b74f2344)

0xb5ba22ee: OSDelayThread + 0x26 (1388, b74f2344, b4ef8038, 0, b60000)

0xb5b904cd: OSStaticMem + 0x115 (0, 8132, b74f2344, b4ef8038, 64, b5bae445) + 124

0xb5b8f803: OSStaticMemBeginInit + 0x6f (0, 8132, 698, b13bbe88, b74f2344, b47728f4)

0xb5b7af30: NotesSDKData + 0x28 (b74f2344)

0xb5b7b5b1: AddInShouldTerminate + 0x15 (b4ef8040, b4efff00, b559c5d0, 3)

0xb5b7b64e: AddInIdleDelay + 0x62 (64, b74f2344, 0)

0x080651d2: LogToEventLogThread + 0x92 (0, b4e5af70, b559c398, b559c5d0, 808cd60, b4f000c0) + 8

0xb5ba267f: ThreadWrapper + 0xfb (0, 0, 808ce78, 0, 0, 0) + e0

0xb5592e21: pthread_start_thread + 0x181 (808cd60, 0, 0, 0, 0, 0) + 4ec43ff4

----- Thread 2326 -----

0xb54b05f1: __GI_select + 0x61 (4c4b40, b74f2344)

0xb5ba22ee: OSDelayThread + 0x26 (1388, b74f2344, b4ef4894, 0, b60000)

0xb5b904cd: OSStaticMem + 0x115 (0, 8132, b74f2344, b4ef4894, c8, b5bae445) + 124

0xb5b8f803: OSStaticMemBeginInit + 0x6f (0, 8132, 698, b1379c2c, b74f2344, b47728f4)

0xb5b7af30: NotesSDKData + 0x28 (b74f2344)

0xb5b7b5b1: AddInShouldTerminate + 0x15 (b4ef489c, b74f2344, 0, 3)

0xb5b7b64e: AddInIdleDelay + 0x62 (c8, b74f2344, b559c5d0, 3, b4e8b584, 0) + 250

0x08051734: LicEventThread + 0x124 (0, b4e5af70, b559c398, b559c5d0, 808d300, b4f00040) + 8

0xb5ba267f: ThreadWrapper + 0xfb (0, 0, 808d418, 0, 0, 0) + e0

0xb5592e21: pthread_start_thread + 0x181 (808d300, 0, 0, 0, 0, 0) + 4ec85ff4

----- Thread 2327 -----

0xb5483d56: __GI_nanosleep + 0x46 (0, b74f2344, b1334cbc, b1335114, b1334728, b1334b7c) + 558

0xb5ba6cc7: OSRunExternalScript + 0x207 (b1334cbc, b1334cbc, 0, b74f2344, b1335114, b1335114) + 34c

0xb5ba563d: OSFaultCleanup + 0x1d5 (0, 0, 0, b559c398, 0, b) + 20c

0xb5b866a9: fatal_error + 0x12d (b, b1335278, b13352f8, b133528c, 0, 1) + 30

0xb5597f0a: __pthread_sighandler_rt + 0x7a (b, b1335278, b13352f8, b, 0, 2) + 380

0xb5401d30: __GI___libc_sigaction + 0x130 (b1335b7a, b1335644, fffffffb, b1335b44, b1335a44, b1337b44) + 1cc

0x0804fe96: GetMonitorForm + 0x56 (0, 0, 0, 0, 0, 0) + 4ecca7f0

----- Thread 2328 -----

0xb54b05f1: __GI_select + 0x61 (4c4b40, b74f2344)

0xb5ba22ee: OSDelayThread + 0x26 (1388, b74f2344, b4ef4d00, 0, b60000)

0xb5b904cd: OSStaticMem + 0x115 (0, 8132, b74f2344, b4ef4d00, 12c, b5bae445) + 124

0xb5b8f803: OSStaticMemBeginInit + 0x6f (0, 8132, 698, b12f3814, b74f2344, b47728f4)

0xb5b7af30: NotesSDKData + 0x28 (b74f2344)

0xb5b7b5b1: AddInShouldTerminate + 0x15 (b4ef4d08, b74f2344, b559c5d0, 3)

0xb5b7b64e: AddInIdleDelay + 0x62 (12c, b74f2344, b559c5d0, 3, b12f5a74, b12f5974) + 2668

0x0805017a: MailEventThread + 0xda (0, b4e5af70, b559c398, b559c5d0, 808df40, b4efff80) + 8

0xb5ba267f: ThreadWrapper + 0xfb (0, 0, 808e058, 0, 0, 0) + e0

0xb5592e21: pthread_start_thread + 0x181 (808df40, 0, 0, 0, 0, 0) + 4ed09ff4

----- Thread 2330 -----

0xb54b85f8: semop + 0x28 (80010, b1232e6c, 1, b74f2344, b42e1c28, b42e1c18) + 2c

0xb5b8be53: WaitOnNativeSemaphore + 0x393 (10e, 0, 0, 0, b74f2344, b42e1be8) + 4

0xb5b8adb4: OSWaitEvent + 0x30 (b42e1c18, 0, b74f2344, b559c5d0, 3, 0) + c

0xb6575314: NAMELookupThread + 0xec (b42e1be8, b4e5af70, b559c398, b559c5d0, 808e4e0, 0) + 8

0xb5ba267f: ThreadWrapper + 0xfb (0, 0, 808e5f8, 0, 0, 0) + e0

0xb5592e21: pthread_start_thread + 0x181 (808e4e0, 0, 0, 0, 81002, dddd04d2) + 4edccff4