For a web service we receive about 4,000 requests per day from an external partner, all of which are answered correctly by our application. The web service performs a complex calculation using two lookups into other databases (LotusScript).
However, roughly every two weeks the HTTP task hangs and produces a fatal server crash. NSD analysis showed that the crashes were caused by the XYZ?WSDL requests which our external partner was sending in addition to the XYZ?OpenWebService requests, mostly at 1:00 a.m. or 7:00 a.m., when updall or the replicator is running. My assumption is that these server tasks somehow lock the lookup databases and the web service therefore crashes.
On the other hand, it is only the XYZ?WSDL request that leads to the crash, not the web service call itself, even though the WSDL request performs no calculation or lookup at all.
We have now asked the external partner to stop sending the additional XYZ?WSDL requests and hope that the server crashes will stop.
Any ideas or hints are welcome. I will report back in a few weeks whether this was successful.
It sounds like you are on the right track. If updall or the replicator has a view locked and your external app requests access to that view, the request can time out, and without proper error handling in the code a crash can result. Were you able to identify whether the crashes were happening in a particular database/document? Let us know if your workaround worked.
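To illustrate the "proper coding" point: wrapping the lookup in a LotusScript error handler lets the web service return an empty result instead of letting an error from a locked or unavailable view propagate into the HTTP task. This is only a sketch under assumed names; the database path, view name, and result field are hypothetical placeholders, not taken from the original application.

```lotusscript
' Hypothetical sketch: perform the lookup defensively so that a locked
' view or a timeout during updall/compact does not escalate into an
' unhandled error in the nhttp task.
Function SafeLookup(server As String, dbPath As String, viewName As String, key As String) As Variant
	On Error Goto ErrHandler

	Dim db As New NotesDatabase(server, dbPath)
	Dim view As NotesView
	Dim doc As NotesDocument

	Set view = db.GetView(viewName)
	Set doc = view.GetDocumentByKey(key, True)
	If Not doc Is Nothing Then
		' "ResultField" is a placeholder for whatever item the
		' calculation actually reads.
		SafeLookup = doc.GetItemValue("ResultField")
	End If
	Exit Function

ErrHandler:
	' Log the failure and return Empty instead of re-raising,
	' so the caller can answer the request with a fallback value.
	Print "SafeLookup failed: " & Error$ & " (error " & Err & ")"
	Exit Function
End Function
```

The caller would then check the returned value for Empty and respond with a defined fallback, so the web service degrades gracefully while UPDALL or COMPACT holds the lookup database.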
In fact, we do seem to be on the right track. Our admins confirmed that we had several more crashes while UPDALL -R or COMPACT -B was running on the databases in which the web service performs its lookups.
This time the crashes were caused by XYZ?OpenWebService URLs, so it is not a question of ?WSDL versus ?OpenWebService: the crash occurs whenever the request runs into a timeout. The NSD files show a 0 for the end time of the nhttp task that caused the fatal error.
So the effect is reproducible. I still do not understand why a timeout leads to a fatal server crash, though. We will try to program a workaround and escalate this issue to IBM support.