Cluster failover scenario

My infrastructure contains two Domino servers, Server A and Server B. Both servers are in a cluster, connected through a WAN link. On my Lotus Notes clients, the users access their mail using a local replica of the mail file. The client replicates with the server every 5 minutes. Connectivity is set up using connection documents on the clients. The connection documents for both servers have normal priority. Server A is the primary mail server of the users, and all clients are configured so that the client replicates with Server A first and, if it is not found, goes to the cluster server for replication.

Now I add another server to the scenario, Server C. Server A and Server C are on the same network. I break up the cluster between Server A and Server B, and make a new cluster group containing Server A and Server C. The connection document for Server C has been added on each Notes client with the normal priority setting. All clients have been restarted so that the cluster.ncf file is updated as well. The clients can access all three servers now.

Now what should happen is that if I close Server A, the Notes clients should start replicating with the server that is in the cluster group with Server A, namely Server C. But the clients are accessing Server B if they don't find Server A. I started troubleshooting the issue and found that this only happens in the case of local replicas. If I configure the client to access the mail file from the server instead of the local replica, the failover is as expected: it fails over to Server C if Server A is unavailable, and only if both Server A and Server C are down does the client go to Server B, with a prompt. But in the case of the local replica, when Server A is down, it tries the server that was used in the last failover. It seems that in the case of local replicas, the client doesn't read cluster.ncf while replicating to find the cluster server. This was verified by the following tests I conducted:

  1. I closed Server A, the client tried to access Server A but on unavailability, shifted to Server B because it was the last server it accessed when a previous failover was performed.

  2. I also closed Server B. Now the client tried to access Server A, then Server B, and on unavailability of both servers, shifted to Server C.

  3. Now I started Server A and Server B again so that the client could go to its primary server. After that I closed Server A again. Now the client shifted to Server C instead of Server B, because during the previous failover it accessed Server C. It didn't even try Server B. I thought that the problem was now solved and the client wouldn't access Server B again if Server C was available.

  4. Now I closed Server C as well. The client tried to replicate with Server A, then Server C, and on not finding either, it shifted to Server B.

  5. Finally, I started Server C in the hope that if I tried to replicate again, the client would try Server A first, then Server C, and since Server C was now available, it would replicate with it. But unfortunately, it tried to access Server A and on its unavailability directly tried Server B instead of Server C.
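
The five tests above can be replayed against a small model of the suspected rule: try the home server, then the server used in the last failover, then whatever else is left. This Python sketch is only a model of that hypothesis, not the documented Notes client algorithm. The function name `replicate` and the idea of a single remembered "last failover" server are assumptions made for illustration.

```python
def replicate(home, others, up, last_failover):
    """Model one replication attempt under the 'last failover wins' hypothesis.

    Returns (server_used, servers_tried, new_last_failover).
    """
    tried = [home]
    if home in up:
        # Home server answered, so the remembered failover target is unchanged.
        return home, tried, last_failover
    # Hypothesis: the previous failover target is tried before any other server.
    candidates = [s for s in others if s == last_failover]
    candidates += [s for s in others if s != last_failover]
    for server in candidates:
        tried.append(server)
        if server in up:
            return server, tried, server
    return None, tried, last_failover


# Each step lists which servers are running, following the tests in order.
steps = [
    ("test 1", {"B", "C"}),
    ("test 2", {"C"}),
    ("test 3, A and B restarted", {"A", "B", "C"}),
    ("test 3, A closed again", {"B", "C"}),
    ("test 4", {"B"}),
    ("test 5", {"B", "C"}),
]

last = "B"  # assumed starting state: the earlier failover went to Server B
results = []
for label, up in steps:
    used, tried, last = replicate("A", ["B", "C"], up, last)
    results.append((label, tried, used))
    print(f"{label}: tried {' -> '.join(tried)}, replicated with {used}")
```

Under these assumptions the model picks the same server in every step as the client did in the tests, which is consistent with the "last failover" explanation but does not prove it.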

I think this shows that in the local replica scenario, the client doesn't fail over to the cluster member on replication, but instead goes to the server it last replicated with when Server A was not found.

So, my question is: do you agree with my assessment? If not, what is wrong in my configuration? If so, is there a workaround for this?

My only requirement is that the Notes client first try Server A, then Server C, and only if both are unavailable should it go to Server B.

Subject: Something to Check

You said “1. I closed Server A, the client tried to access Server A but on unavailability, shifted to Server B because it was the last server it accessed when a previous failover was performed.” I would look at this in a slightly different way, which may help you see what is going on. Frequently, I have seen that when replication attempts point to a server that I did not expect, it is because the stacked icons point to the unexpected server. The stacking order of icons reflects the relative order in which replicas were opened, I believe. While this doesn’t help fix the issue, it might explain the behavior you are seeing.

Something else to check is the cluster.ncf file that is in the Notes client Data directory. It may still have information pointing to Server B as a cluster member (even though it shouldn’t). You should be able to remove the cluster.ncf file and it will be rebuilt with the appropriate cluster members (Servers A and C).
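
As a rough sketch of that step, the Python below looks at cluster.ncf and then renames it instead of deleting it, so the old copy can still be inspected. The data directory path is something you have to supply yourself, the function names are made up for this example, and it assumes the file is readable as plain text and that the Notes client is not running while the file is moved.

```python
from pathlib import Path
import time


def read_cluster_cache(data_dir):
    """Return the lines of cluster.ncf, or an empty list if the file is missing.

    Assumes the cache is plain text; undecodable bytes are replaced.
    """
    cache = Path(data_dir) / "cluster.ncf"
    if not cache.exists():
        return []
    return cache.read_text(errors="replace").splitlines()


def set_aside_cluster_cache(data_dir):
    """Rename cluster.ncf to a timestamped backup so the client can rebuild it.

    Returns the backup path, or None if there was no cluster.ncf to move.
    """
    cache = Path(data_dir) / "cluster.ncf"
    if not cache.exists():
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = cache.with_name(f"cluster.ncf.{stamp}.bak")
    cache.rename(backup)
    return backup
```

Calling `read_cluster_cache` first shows whether Server B is still listed. After `set_aside_cluster_cache`, start the client again and check the rebuilt file.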

Given all of that, I am not sure what process a Notes client goes through when looking for a replica of its mail file. I would have thought that it would be: Home Server, Home Server Clustermate(s), Other Replica Server. If that is true, then your situation could well be explained by a confused cluster.ncf.
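
To make that guess concrete, here is a small Python sketch of the order I would expect, and of how a stale clustermate list would change the result. It encodes my assumption about the lookup order, not confirmed client behavior, and the function names are invented for the example.

```python
def expected_order(home, clustermates, other_replicas):
    """Assumed lookup order: home server, its clustermates, then other replicas."""
    order = [home]
    for server in list(clustermates) + list(other_replicas):
        if server not in order:
            order.append(server)
    return order


def first_available(order, up):
    """Return the first server in the order that is running, or None."""
    return next((server for server in order if server in up), None)


up = {"B", "C"}  # Server A is down

# Stale cache: the client still believes Server B is the clustermate of A.
stale = expected_order("A", ["B"], ["C"])
# Rebuilt cache: Server C is the clustermate of A.
fresh = expected_order("A", ["C"], ["B"])

print("stale cache:", stale, "->", first_available(stale, up))
print("fresh cache:", fresh, "->", first_available(fresh, up))
```

With the stale list the sketch ends up on Server B, and with the rebuilt list it ends up on Server C, which is the order you are asking for.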

Subject: Free cluster manager tool

Maybe our free cluster manager tool can help you: http://www.lialis.com/Applications/EnglishSite.nsf/dx/Lialis_Cluster_Administration_Tool