Subject: re: DDM Health.MemoryUtil.Value
The Health statistics are generated by the Domino Administrator. Quite a bit of documentation can be found in the administrator help under Monitoring / Server Health Monitoring (SHM). This client-side feature was introduced with N/D 6 as a separate product named IBM Tivoli Analyzer for Lotus Domino. For N/D 7 this feature was rolled into the core product. See http://publib.boulder.ibm.com/infocenter/domhelp/v8r0/index.jsp
Specific details about how the SHM memory assessment is generated can be found at the bottom of this post. The algorithm's sensitivity can be adjusted via Configuration / Index Thresholds in the client's dommon.nsf. The SHM algorithms have not been updated since they were first introduced. If the SHM memory reports are misleading for a particular server, the memory component can be disabled for that server in dommon.nsf under Configuration / Server Components.
DDM is a completely separate server-side feature introduced in N/D 7. The DDM memory probes and their results are easier to understand than the SHM indices. Look in the administrator help under Monitoring / Domino Domain Monitoring for details.
Memory Utilization
When Memory Utilization is included in the Health Report
This component appears if ALL of the following are true:
1. Domino version is R5.0.2 or greater
2. Platform Stats are Enabled
3. OS is one of:
   Windows NT/2000
   OS/400
   Solaris (D6 only)
   AIX (D6 only)
Note: For Solaris version 5.8, the Memory component may always be 0 because the Scan Rate metric
used in Memory analysis appears to always be 0.
Windows NT and Windows 2000
Statistics used:
Amount of Free/Available Memory
Platform.Memory.KBFree R5.x
Platform.Memory.RAM.AvailMBytes R6
Amount of Installed Memory
Mem.PhysicalRAM R5.x
Platform.Memory.RAM.TotalMBytes R6
Note: For Win32 platforms, the Memory Utilization component of the Server Health Monitor
is based on available physical memory. For the sake of simplicity, call the Free Memory statistic
“RAM.Available” (in MB) and the Installed Memory stat “RAM.Total” (also in MB).
For a system with RAM.Total > 2 GB, the maximum usable amount of memory is actually about 2.1 GB,
in which case the reported RAM.Usable is misleading. For example, a system with 8 GB RAM and 1.9 GB used
will report RAM.Usable = 8 GB - 1.9 GB = 6.1 GB, but only a small amount of the 6.1 GB (~200 MB) is really usable.
So, if the reported RAM.Total > 2.1 GB, the SHM adjusts RAM.Available as follows:
RAM.Available = 2150 - (RAM.Total - RAM.Available)
Memory Utilization Rating =
0 if RAM.Available >= 100 MB
100 - RAM.Available if RAM.Available < 100 MB
Memory Utilization =
0 if RAM.Usable >= 100 MB
100 - RAM.Usable if RAM.Usable < 100 MB
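The Win32 calculation above can be sketched in a few lines of Python. This is only an illustration of the rule as described in this post, not the SHM source: the function name is made up, the 2150 MB cut-off is taken from the adjustment formula (the text says "2.1 GB"), and clamping a negative adjusted value to 0 is an assumption.

```python
def windows_memory_utilization(ram_total_mb, ram_available_mb):
    """Return the 0-100 Memory Utilization rating for Win32 (illustrative sketch)."""
    # Above roughly 2.1 GB installed, only about 2150 MB is treated as usable,
    # so availability is recomputed from the amount of memory actually in use.
    if ram_total_mb > 2150:
        used_mb = ram_total_mb - ram_available_mb
        ram_available_mb = 2150 - used_mb
    # Assumption: a negative result (more than 2150 MB in use) is clamped to 0.
    ram_available_mb = max(0, ram_available_mb)
    # 100 MB or more free rates 0; below that the rating is 100 minus the free MB.
    if ram_available_mb >= 100:
        return 0
    return 100 - ram_available_mb


# 8000 MB installed, 1900 MB in use: adjusted availability is 250 MB, rating 0.
print(windows_memory_utilization(8000, 6100))
# 1024 MB installed, 75 MB free: rating 25.
print(windows_memory_utilization(1024, 75))
```

With the default thresholds of 50 and 90, a rating of 25 is still healthy, while 2100 MB in use on the 8000 MB machine would leave 50 MB adjusted availability and a rating of 50.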
Server health component thresholds are the values at which a component reading is considered
Significant (Yellow) and Critical (Red). The Memory Utilization thresholds are defined in the Server
Health Profile documents. These values are initially set to platform-specific defaults, but are
modifiable (per-platform) by the Administrator. For the purpose of this document, let us identify
the Memory Utilization thresholds as YellowU and RedU. So, given threshold Memory Utilization
values of 50 and 90, which translates to 50 MB available/usable and 10 MB available/usable, we have
0 MB Usable <= Critical < 10 MB Usable <= Warning < 50 MB Usable
Solaris
For Solaris, a more useful metric for Memory analysis may be the “Scan Rate”, which
is provided in the Rnext Domino Platform Statistics for Solaris under the name
Platform.Memory.ScanRatePagesPerSec.
The threshold values for Scan Rate are YellowS(Significant) = 200, RedS(Critical) = 400. These
values are based on the experience of running performance tests, and examining Scan Rate
values as the load on the server is increased.
In order to normalize the Scan Rate to a 0 - 100 based value that is compatible with the
threshold settings for Memory Utilization, this metric must undergo a number of adjustments:
Memory Utilization =
    ScanRate * (YellowU / YellowS)
        if ScanRate <= YellowS (GREEN)
    YellowU + ((RedU - YellowU) * (ScanRate - YellowS) / (RedS - YellowS))
        if YellowS < ScanRate < RedS (YELLOW)
    MIN(97, RedU + ((ScanRate - RedS) * RedU) / RedS)
        if ScanRate >= RedS (RED)
Examples:
Scan Rate   Memory Utilization                             Condition
0           0 * (50/200) = 0                               Healthy
100         100 * (50/200) = 25                            Healthy
200         200 * (50/200) = 50                            Warning
300         50 + (90-50)*(300-200)/(400-200) = 70          Warning
400         min(97, 90 + (400-400)*90/400) = 90            Critical
500         min(97, 90 + (500-400)*90/400) = 97            Critical
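For anyone who wants to check other Scan Rate values, the Solaris normalization can be written as a small Python function. This is a sketch based only on the formula and thresholds quoted in this post. The function and parameter names are invented for illustration, and the defaults (YellowU = 50, RedU = 90, YellowS = 200, RedS = 400) are the example values used above.

```python
def solaris_memory_utilization(scan_rate, yellow_u=50, red_u=90,
                               yellow_s=200, red_s=400):
    """Map a Solaris Scan Rate onto the 0-100 Memory Utilization scale (sketch)."""
    if scan_rate <= yellow_s:
        # Green band: scale linearly from 0 up to YellowU.
        return scan_rate * (yellow_u / yellow_s)
    if scan_rate < red_s:
        # Yellow band: interpolate between YellowU and RedU.
        return yellow_u + (red_u - yellow_u) * (scan_rate - yellow_s) / (red_s - yellow_s)
    # Red band: keep growing from RedU, capped at 97.
    return min(97, red_u + (scan_rate - red_s) * red_u / red_s)


for rate in (0, 100, 200, 300, 400, 500):
    print(rate, solaris_memory_utilization(rate))
```

The loop walks through the same Scan Rate values as the example table.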
AIX
For AIX, a more useful metric for Memory analysis may be the ratio of “Scan Rate” to “PagesFreedRate”,
both of which are provided in the Rnext Domino Platform Statistics for AIX as
Platform.Memory.ScanRatePagesPerSec and Platform.Memory.PagesFreedRatePerSec.
For simplicity, call this ratio the “Scan Ratio”.
The threshold values for Scan Ratio are YellowS(Significant) = 5, RedS(Critical) = 9.
In order to normalize the Scan Ratio to a 0 - 100 based value that is compatible with the
threshold settings for Memory Utilization, this metric must undergo a number of adjustments:
Memory Utilization =
    ScanRatio * (YellowU / YellowS)
        if ScanRatio <= YellowS (GREEN)
    YellowU + ((RedU - YellowU) * (ScanRatio - YellowS) / (RedS - YellowS))
        if YellowS < ScanRatio < RedS (YELLOW)
    MIN(97, RedU + ((ScanRatio - RedS) * RedU) / RedS)
        if ScanRatio >= RedS (RED)
Examples:
Scan Ratio  Memory Utilization                             Condition
0           0 * (50/5) = 0                                 Healthy
2           2 * (50/5) = 20                                Healthy
4           4 * (50/5) = 40                                Healthy
6           50 + (90-50)*(6-5)/(9-5) = 60                  Warning
8           50 + (90-50)*(8-5)/(9-5) = 80                  Warning
9           min(97, 90 + (9-9)*90/9) = 90                  Critical
9.5         min(97, 90 + (9.5-9)*90/9) = 95                Critical
10          min(97, 90 + (10-9)*90/9) = 97                 Critical
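The AIX variant can be sketched the same way in Python. The names below are invented for illustration, the cap of 97 follows the worked examples, and raising an error when no pages are being freed is an assumption, since this post does not say how SHM handles a zero PagesFreedRate.

```python
def aix_scan_ratio(scan_rate_pages_per_sec, pages_freed_rate_per_sec):
    """Ratio of Scan Rate to PagesFreedRate (the "Scan Ratio")."""
    # Assumption: the zero case is not described in the post, so refuse to guess.
    if pages_freed_rate_per_sec <= 0:
        raise ValueError("PagesFreedRate must be positive to form a ratio")
    return scan_rate_pages_per_sec / pages_freed_rate_per_sec


def aix_memory_utilization(scan_ratio, yellow_u=50, red_u=90,
                           yellow_s=5, red_s=9, cap=97):
    """Map an AIX Scan Ratio onto the 0-100 Memory Utilization scale (sketch)."""
    if scan_ratio <= yellow_s:
        # Green band: scale linearly from 0 up to YellowU.
        return scan_ratio * (yellow_u / yellow_s)
    if scan_ratio < red_s:
        # Yellow band: interpolate between YellowU and RedU.
        return yellow_u + (red_u - yellow_u) * (scan_ratio - yellow_s) / (red_s - yellow_s)
    # Red band: keep growing from RedU, limited by the cap.
    return min(cap, red_u + (scan_ratio - red_s) * red_u / red_s)


# 600 pages scanned per second against 100 pages freed per second is a ratio of 6.
print(aix_memory_utilization(aix_scan_ratio(600, 100)))
```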
OS400
Calculate MemUtil =
10000 * Platform.Memory.FaultRate /
(Server.CPUCount * Platform.System.PctCombinedCpuUtil * (100 - Platform.LogicalDisk.Total.PctUtil))
MemUtil Threshold: Warning = 250, Critical = 350
then Memory Rating =
    0.2 * MemUtil                    if MemUtil <= 250
    0.4 * MemUtil - 50               if 250 < MemUtil < 350
    min(97, (3250 + MemUtil) / 40)   if MemUtil >= 350
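A short Python sketch of the OS/400 calculation, following the two steps above. The function names are invented, the statistic values are passed in as plain numbers, and rejecting a zero denominator (idle CPU or a 100% busy disk) is an assumption because the post does not cover that case.

```python
def os400_mem_util(fault_rate, cpu_count, pct_cpu_util, pct_disk_util):
    """Raw MemUtil value from the four OS/400 statistics (sketch)."""
    denominator = cpu_count * pct_cpu_util * (100 - pct_disk_util)
    # Assumption: the post does not say what SHM does when this is zero.
    if denominator <= 0:
        raise ValueError("CPU utilization must be > 0 and disk utilization < 100")
    return 10000 * fault_rate / denominator


def os400_memory_rating(mem_util):
    """Map MemUtil onto the 0-100 rating; 250 maps to 50 and 350 maps to 90."""
    if mem_util <= 250:
        return 0.2 * mem_util
    if mem_util < 350:
        return 0.4 * mem_util - 50
    return min(97, (3250 + mem_util) / 40)


# Fault rate 50, 2 CPUs, 50% CPU utilization, 50% disk utilization.
mem_util = os400_mem_util(50, 2, 50, 50)
print(mem_util, os400_memory_rating(mem_util))
```

The three pieces meet at the thresholds: the rating is 50 at MemUtil = 250 and 90 at MemUtil = 350, which lines up with the default Yellow and Red settings of 50 and 90.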
z/OS
First Supported in Domino 6
Statistics used:
Statistic                              Warning   Critical
Platform.Memory.AvailableFrameCount    4192      819
Platform.Memory.OutReadyQueue          1         6
Platform.Memory.PagesPerSec            50        90
AvailFrameCount rating =
    0                                                        if Platform.Memory.AvailFrameCount >= 8192
    100 - (100 * Platform.Memory.AvailFrameCount / 8192)     if Platform.Memory.AvailFrameCount < 8192
OutReadyQueue rating =
    50 * Platform.Memory.OutReadyQueue                       if Platform.Memory.OutReadyQueue <= 1
    50 + 8 * (Platform.Memory.OutReadyQueue - 1)             if Platform.Memory.OutReadyQueue > 1
PagesPerSec rating = Platform.Memory.PagesPerSec
The Memory Component Rating for z/OS is essentially the worst of the three calculated ratings
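The three z/OS ratings can be sketched in Python as follows. The function names are invented for illustration, and taking the maximum of the three ratings is only one reading of "essentially the worst of the three"; the real SHM logic may differ in detail.

```python
def avail_frame_count_rating(avail_frame_count):
    """0 at 8192 or more available frames, rising to 100 as frames run out."""
    if avail_frame_count >= 8192:
        return 0
    return 100 - (100 * avail_frame_count / 8192)


def out_ready_queue_rating(out_ready_queue):
    """50 at a queue length of 1, then 8 more points per additional entry."""
    if out_ready_queue <= 1:
        return 50 * out_ready_queue
    return 50 + 8 * (out_ready_queue - 1)


def zos_memory_rating(avail_frame_count, out_ready_queue, pages_per_sec):
    """Assumption: the component rating is the highest of the three ratings."""
    return max(
        avail_frame_count_rating(avail_frame_count),
        out_ready_queue_rating(out_ready_queue),
        pages_per_sec,  # PagesPerSec is used directly as its own rating
    )


# Plenty of frames, queue length 1, 70 pages/sec: paging dominates.
print(zos_memory_rating(8192, 1, 70))
```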
Sliding Scale
The sliding scale defines the conditions under which the designated weighting is applied to
the statistic, as calculated by the formula defined above, and conditions under which the
weighting mechanism is abandoned in favor of another method to "escalate" the metric.
Define: w      = weighting
        tamber = amber trigger
        tred   = red trigger
Sliding Scale Memory Utilization Rating =
    w * MemoryUtil
        if MemoryUtil < tamber (GREEN)
    (w * tamber) + ((100 - (w * tamber)) * (MemoryUtil - tamber)) / (tred - tamber)
        if tamber <= MemoryUtil <= tred (AMBER)
    100
        if MemoryUtil > tred (RED)
example: MemoryUtilization has 15% weight toward the blended stat
AMBER threshold = 50, RED threshold = 90:
Sliding Scale MemoryUtil Rating =
0.15 * MemoryUtil if MemoryUtil <= 50 GREEN
As MemoryUtil varies from 0 to 50, Sliding Scale varies from 0 to 7.5
7.5 + (92.5 * (MemoryUtil - 50) / 40) if 50 < MemoryUtil < 90 AMBER
As MemoryUtil varies from 50 to 90, Sliding Scale varies from 7.5 to 100
100 if MemoryUtil >= 90 RED
For NT we would have
MemoryFree (MB)   MemoryUtil   Sliding-Scale MemoryUtil
500               0            0
250               0            0
100               0            0
75                25           25 * .15 = 3.75
50                50           50 * .15 = 7.5
40                60           7.5 + 92.5*(60 - 50)/40 = 30.625
30                70           7.5 + 92.5*(70 - 50)/40 = 53.75
20                80           7.5 + 92.5*(80 - 50)/40 = 76.875
10                90           100
5                 95           100
For Solaris we would have
Scan Rate   MemoryUtil   Sliding-Scale MemoryUtil
40          10           1.5
80          20           3.0
120         30           4.5
160         40           6.0
200         50           7.5
250         60           7.5 + 92.5*(60 - 50)/40 = 30.625
300         70           7.5 + 92.5*(70 - 50)/40 = 53.75
350         80           7.5 + 92.5*(80 - 50)/40 = 76.875
400         90           100
500         97           100
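The sliding-scale rule itself is easy to reproduce in Python if you want to test other weights or thresholds. This is a sketch of the general formula given in the Sliding Scale section, with invented names; the defaults (15% weight, amber 50, red 90) are the example values from this post.

```python
def sliding_scale(memory_util, weight=0.15, t_amber=50, t_red=90):
    """Apply the sliding scale to a 0-100 Memory Utilization value (sketch)."""
    if memory_util < t_amber:
        # Green: the reading contributes only its weighted share.
        return weight * memory_util
    if memory_util <= t_red:
        # Amber: escalate linearly from the weighted amber value up to 100.
        base = weight * t_amber
        return base + (100 - base) * (memory_util - t_amber) / (t_red - t_amber)
    # Red: the weighting is abandoned and the component reports 100.
    return 100


for util in (0, 25, 50, 60, 70, 80, 90, 100):
    print(util, sliding_scale(util))
```

At the red trigger the amber formula already evaluates to 100, so it makes no difference whether a reading of exactly 90 is treated as amber or red.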