Invitation to Open Mic: Clustering Domino Servers

IBM is hosting an Open Mic conference call with Lotus Development and Support Engineering to discuss Clustering Domino Servers (versions 6, 7 & 8).

The Open Mic call will be held on Wed, Sept 17, 2008. This call will take place in one session at 10:00 AM EDT. The call will last 60 minutes. Please dial into the call 5 minutes before the scheduled start. This conference call is designed to be an open question & answer format, so bring your questions.

We also encourage you to share your experiences and perspectives about security and other Notes/Domino topics in the Notes/Domino Best Practices Wiki.

Please refer to the Open Mic Tech Flash (to be published shortly) for details about the conference call numbers. Please post any advance questions within the ND8 forum by creating a response to this document. This Open Mic call will be recorded for future use, and will be made available via the Flash after the call.

Subject: couple of concerns related to mail cluster

Dear All, I have a couple of concerns related to the mail cluster.

We are running Lotus Domino 7.0.3 on SLES 9 SP2.

  1. In a few mail replicas on the mail cluster, messages that have been read appear as unread on the other replica. However, this does not happen for all the mail database replicas.

  2. I would like your expert opinion on how mail failover works for Internet messages arriving at a messaging cluster. How does my ISP recognise the other cluster member if my primary server is not responding?

Subject: Unread Marks - similar issue

We are running clustered mail servers on Solaris 10, and we have settings on each mail file to “push” unread marks across ALL servers.

This works well for the majority, but we have instances where users categorically state that they have read messages while the unread count says otherwise.

This has been an ongoing battle with Notes/Domino since late R4 version to R8.

Subject: MX records for SMTP delivery to clusters

SMTP mail delivery is based on MX records in your DNS. If your servers are named mail1.host.com and mail2.host.com, your MX records for host.com should be

@ MX 10 mail1

@ MX 20 mail2

With these records, MTAs on the Internet will try to deliver incoming mail to mail1 first (the lower preference value), and if that fails, deliver to mail2.
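The fallback order described above can be sketched in Python. This is only an illustration under stated assumptions: the record list is hard-coded from the example (no DNS lookup is performed), `delivery_order` and `deliver_with_fallback` are hypothetical names, and `try_deliver` is a stand-in callback for a real SMTP delivery attempt.

```python
from typing import Callable, Iterable, Optional

# (preference, host) pairs taken from the example MX records above.
MX_RECORDS = [(20, "mail2.host.com"), (10, "mail1.host.com")]


def delivery_order(records: Iterable[tuple[int, str]]) -> list[str]:
    """Return hosts sorted so that the lowest preference value comes first."""
    return [host for _, host in sorted(records)]


def deliver_with_fallback(
    records: Iterable[tuple[int, str]],
    try_deliver: Callable[[str], bool],
) -> Optional[str]:
    """Try each host in MX order; return the first host that accepts, else None.

    try_deliver is a hypothetical callback standing in for an SMTP attempt.
    """
    for host in delivery_order(records):
        if try_deliver(host):
            return host
    return None
```

For example, if `try_deliver` reports that mail1.host.com is down, the loop moves on to mail2.host.com, which is what a sending MTA is expected to do with the two records shown.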

Subject: question about cluster replication

  1. It looks to me as if “replication conflicts” can cause a lot of “document updates” all the time. We have 3 servers in a mail cluster, 2 in the office and 1 over the WAN, all on Windows 2k3. Two servers are on 8.0.2 and one is on 8.0.1.

In the cluster replication log there are a lot of document updates each hour. A few have about 1000 updates. (I find this strange.)

I have just done a test for one user with about 150 replication conflicts. I deleted most of the replication conflicts and the number of replication updates went down. I have now deleted all replication conflicts in the user's mail file, and when I replicate his mail file between the 2 servers in the office (or replicate to the Notes client) there is only 1 document update.

My question is: do you know whether replication conflicts in a mail file cause document updates in cluster replication and in normal replication?

  2. Could you PLEASE give an update on the status of “Cluster Streaming”? Are there, or will there be, hotfixes for 8.0.2? What is the downside (what problems may it cause)?

rgds

Subject: Not quite sure what the problem is that you’re describing

  1. Are you noticing that existing documents get updated during cluster replication only when you have conflict documents in the database? If so, have you investigated what the updates are? The presence or absence of conflict documents should not cause updates to different documents to occur or not occur, respectively.

  2. Fixes for SCR went into 8.0.3. SCR is now disabled by default in 8.0.2.

Subject: Cluster replication questions

1). Have they done any performance testing with Domino 8.0.2 clusters?

2). How can we take advantage of new streaming cluster replication features?

3). We have specified a cluster name as the destination in a replication connection doc and are seeing the same database on the same server taking up two threads at the same time during replication. Why does this happen and how can we prevent it?

Subject: Cluster Server recovery & Catch-up.

When the primary server crashes and the users fail over, we notice that after the primary comes back up it takes quite a bit of time for the secondary to sync back with the primary. Is there a way to restrict access to the primary until the mail files are back in sync?

Subject: Response

You can set SERVER_RESTRICTED=1, but you’ll need to monitor replication and disable it once things are normalized.

Subject: Clustering & db encryption

Will clustering work for a mail database that is encrypted?

Subject: Various domino cluster questions

Hi to all, I have two clustered Domino 8.0.1 servers on the Windows Server 2k3 platform, and we're fighting with some “problems”:

  1. Same as above: the unread marks problem for mailboxes.

  2. The balancing acts strangely: the 1st server always works more than the 2nd, and sometimes some Notes clients (8.0.1) get a “server busy” warning message without being redirected to the “available” server. How can we be sure that cluster balancing/failover works well?

  3. Is cluster analysis the only “tool” available for checking the cluster?

  4. Adding scheduled replication certainly helps database synchronization between the nodes, but in some tests we saw this strange behaviour:

For a clustered database, if I update a record on server 1, its replica on server 2 gets the update immediately; if I do the same on server 2, the update does not seem to arrive at the replica on server 1 (even though we don't provide scheduled replication for it).

The question is: what is the best way to debug this kind of situation?

  5. (And last!) ICM could be a way to do balancing/failover for the HTTP protocol on a cluster, but couldn't adding an ICM server be a SPF for our network?

Thanks in advance :)

Ciao

Alberto Ernestini

Subject: Some answers

  1. Not sure what issue you are referring to here.

  2. If the clustermate is unavailable, then the user should still gain access to the chosen destination server. Are you saying the user doesn’t get access to any server?

  3. I’m not aware of any other tool to monitor a cluster. You could monitor clustering stats with an event generator/handler.

  4. See if you can replicate manually from Server 2 to Server 1. If this works, make sure all clustering tasks are loaded and that cldbdir is up to date. See if anything cluster replicates or if it’s specific to a single database.

  5. I’m not sure what you’re asking here. If you can re-word the question, I can take a stab at it.

Subject: Server slow during restart even with Transaction logging enabled.

We wanted to know the best practice for forcing customers and mail delivery to a cluster mate after the server crashes and is attempting to restart. During the restart, consistency checks bring the server to its knees and customers complain because of the slowness. We also need mail to fail over to the cluster mate while the server is completing the consistency checks, because cluster replication and scheduled replication are too slow. Consistency checks can take several hours (over 20) to complete.

On the Transaction log tab, we have the following set:

Logging – Enabled,

Path – T:,

Logging Style – Archived,

Automatic fixup…. – Enabled,

Runtime/Restart performance: Standard,

Quota Enforcement - Check File Size when extending the file.

Thank you for your help.

Jeff

Subject: How about this?

Create a trivial script that sets SERVER_RESTRICTED=1 in the notes.ini, and specify this script to run after a fault (server doc → Run this script after server fault/crash).
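Such a script could look roughly like the Python sketch below. It is a minimal illustration, not a tested Domino procedure: the function names are hypothetical, the notes.ini path must be supplied by the caller, and the file is read and written with the platform's default text encoding, which is an assumption about how the file is stored.

```python
from pathlib import Path


def set_ini_value(ini_text: str, key: str, value: str) -> str:
    """Return ini_text with key=value set.

    An existing line for the key (matched case-insensitively) is replaced;
    otherwise a new line is appended at the end.
    """
    lines = ini_text.splitlines()
    prefix = key.lower() + "="
    replaced = False
    for i, line in enumerate(lines):
        if line.strip().lower().startswith(prefix):
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


def restrict_server(ini_path: str) -> None:
    """Write SERVER_RESTRICTED=1 into the notes.ini at ini_path (path is caller-supplied)."""
    path = Path(ini_path)
    path.write_text(set_ini_value(path.read_text(), "SERVER_RESTRICTED", "1"))
```

The same helper can be reused with the value "0" once replication has caught up, in line with the earlier advice to monitor replication and disable the restriction when things are normalized.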

Subject: Domino 7 on VMware

Hi,

We have a Domino Cluster running on VMware Infrastructure 3. All the disks on the Windows 2003 servers are connected via iSCSI to a Lefthand SAN.

Is there any reference on how this will perform? We notice that the AvailabilityIndex does not rise above 70%.

A. van der Reep

Subject: Tuning SAI

You will almost certainly need to tune the server to normalize your SAI.

http://bleedyellow.com/blogs/chadscott/entry/sai_made_easy

Subject: switching back to Main mail Server

We have many customers using Domino clusters. I have the following problem with the majority of them, because they use the Domino cluster for two purposes:

  • create a backup copy of Domino data

  • minimize response time

So the customer wants users to work primarily on the main mail server whenever it is responding. This cannot be achieved with the Notes client once it is using, for example, the replica on the second server because the principal mail server was down for some days.

Subject: Technote mentioned in the call?

Can you please post the number / link to the technote referenced in the discussion about how many cluster replicators to run based on work queue depth and seconds of queue?

Subject: Link

http://www.ibm.com/support/docview.wss?rs=899&uid=swg21139259

Subject: running agents within a cluster

Hi, what is the suggested way to run agents within a cluster? At the moment we disable amgr on the cluster server… I don't think this is the intended way.