UPDATE: This has been reproduced by IBM support and escalated to Development as SPR DMNI7KH9ES.
We’ve had an issue at a customer this week with SNMP traps for task status, which resulted in their SNMP management software misinterpreting what it received. After some testing with other management software and a delve into domino.mib, we’ve found where part of the issue lies.
What any SNMP agent sends is defined for the management software by a Management Information Base (MIB). Management packages generally ship with many MIBs and let you load others; for Domino, the MIB is the file domino.mib in the program directory.
So, to the heart of the problem. We enabled task status event generators for http, router and smtp, with each generator set up to issue an SNMP trap. The main proxy agent was set up to forward to the appropriate destination. The Domino SNMP service was switched on, and ISpy, intrcpt and quryset (I can’t recall their full names) were all running.
Then we switched off the monitored tasks and saw the trap in the Domino SNMP agent window.
Problem: the SNMP management software appeared to receive only five parameters in the trap, and couldn’t interpret them. It was expecting six, because that is what domino.mib told it to expect for that particular type of Domino SNMP trap.
I set up SNMP in my own environment (“no boss, not in our live environment, honest, guv”) and tried to reproduce the problem. Only trouble was, I couldn’t. My management software (a 7-day eval version of OidView, from ByteSphere, if you’re interested) saw all six parameters as expected.
We then did the same in the customer’s live environment, on a couple of standby non-production servers, running OidView on a local machine and letting the SNMP project team know when we were generating traps. Same result: we saw six, they saw five. They were consistently missing a parameter described as EventType, while we were receiving the integer value 23 for it.
However, in the MIB, what we see for the task status trap is this:
lnEvtServer OBJECT-TYPE
    SYNTAX DisplayString
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION
        "Originating server for this event."
    ::= { lnInterceptor 1 }

lnEvtType OBJECT-TYPE
    SYNTAX INTEGER {
        unknown(0),
        communications(1),
        security(2),
        mail(3),
        replication(4),
        resource(5),
        miscellaneous(6),
        server(7),
        alarm(8)
    }
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION
        "Returns a value for the type of this event."
    ::= { lnInterceptor 2 }

lnEvtSeverity OBJECT-TYPE
    SYNTAX INTEGER {
        unknown(0),
        fatal(1),
        failure(2),
        warning1(3),
        warning2(4),
        normal(5)
    }
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION
        "Returns a value for the severity of this event."
    ::= { lnInterceptor 3 }

lnEvtWhen OBJECT-TYPE
    SYNTAX INTEGER (0..2147483647)
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION
        "Returns a value for the date and time of this event."
    ::= { lnInterceptor 4 }

lnEvtData OBJECT-TYPE
    SYNTAX DisplayString
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION
        "Information about this event."
    ::= { lnInterceptor 5 }

lnEvtSeq OBJECT-TYPE
    SYNTAX INTEGER (1..2147483647)
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION
        "Sequence number of this event for internal
        use only. This information is not available in
        this MIB variable. If you wish to determine the
        sequence number for the last trap, please see
        the variable lnControl.lnRecentTrapsTable.lnRecentTrapEntry.lnTrapSeq."
    ::= { lnInterceptor 6 }

It’s that second item, lnEvtType, that’s at issue:
lnEvtType OBJECT-TYPE
    SYNTAX INTEGER {
        unknown(0),
        communications(1),
        security(2),
        mail(3),
        replication(4),
        resource(5),
        miscellaneous(6),
        server(7),
        alarm(8)
    }
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION
        "Returns a value for the type of this event."
    ::= { lnInterceptor 2 }

The enumeration only defines the values 0 to 8, so the MIB gives 23 no meaning. OidView reported the value 23 but couldn’t show anything for it in its “this is what it refers to” pane. The other software, because it couldn’t interpret the value, dropped it, leaving five parameters where the MIB promised six, so the whole trap was treated as invalid.
This has now been resolved on the customer’s site by telling the management software not to drop lnEvtType simply because it can’t interpret the value, and the issue has been passed up to IBM so that the MIB can be expanded in future releases.
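To make the difference between the two behaviours concrete, here is a minimal Python sketch. It is not how either product actually works internally; decode_strict and decode_tolerant are hypothetical helpers, and every sample value except the lnEvtType of 23 is made up. The enumeration is copied from the domino.mib excerpt above.

```python
# Enumeration for lnEvtType, copied from the domino.mib excerpt in this post.
LN_EVT_TYPE = {
    0: "unknown",
    1: "communications",
    2: "security",
    3: "mail",
    4: "replication",
    5: "resource",
    6: "miscellaneous",
    7: "server",
    8: "alarm",
}


def decode_strict(varbinds):
    """Hypothetical strict manager: silently drops an lnEvtType
    value that is not listed in the MIB enumeration."""
    decoded = {}
    for name, value in varbinds:
        if name == "lnEvtType" and value not in LN_EVT_TYPE:
            continue  # the parameter disappears from the decoded trap
        decoded[name] = value
    return decoded


def decode_tolerant(varbinds):
    """Hypothetical tolerant manager: keeps the raw integer and
    marks it as unrecognised instead of dropping it."""
    decoded = {}
    for name, value in varbinds:
        if name == "lnEvtType":
            label = LN_EVT_TYPE.get(value, "unrecognised")
            decoded[name] = f"{label}({value})"
        else:
            decoded[name] = value
    return decoded


# Illustrative trap contents. Only the lnEvtType value of 23 comes from
# what we observed; the other values are placeholders.
SAMPLE_TRAP = [
    ("lnEvtServer", "ExampleServer/ExampleOrg"),
    ("lnEvtType", 23),
    ("lnEvtSeverity", 2),
    ("lnEvtWhen", 1200000000),
    ("lnEvtData", "placeholder event text"),
    ("lnEvtSeq", 1),
]

strict = decode_strict(SAMPLE_TRAP)
tolerant = decode_tolerant(SAMPLE_TRAP)
print(len(strict), "parameters after strict decoding")
print(len(tolerant), "parameters after tolerant decoding")
print("lnEvtType as seen by the tolerant decoder:", tolerant["lnEvtType"])
```

The strict path loses one parameter, while the tolerant path keeps all six and leaves the unexplained 23 visible for a human to chase up.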
I’ve posted this here because it caused us much head-scratching, both my company and IBM were under pressure to prove that it wasn’t Domino misbehaving, and detailed information in the public sphere was somewhat light.
I hope this helps someone.
Dave.
PS: If someone more knowledgeable spots any technical errors in the above, please post a correction. Any explanatory info or other feedback from the server team would also be very welcome.