Huge Problem on Clustered servers

Please help. I walked into a big mess and I’m trying to figure it out. We have 2 clustered Compaq R6 servers on Windows 2000 sharing a Compaq disk array (E:). The servers are set up to use 2 separate data directories on the E: drive (domino1 and domino2). We are having big problems with corrupted databases and databases disappearing. The OS is reporting chkdsk errors in the event log, and running chkdsk might be the cause of the deleted files. Is this a workable way of running a Domino cluster? Thanks for any help.

Subject: Huge Problem on Clustered servers

How are they sharing the disk array? Are they fibre connected (SAN), or are they using a SCSI channel?

If your issues are this big, I’d just take one of the servers down and see if the corruption issues go away. If they’re clustered, then your users shouldn’t notice anything, and this will give you time to figure out what’s happening.

Normally - you’re not going to use Microsoft Windows native clustering. I’ve never used it, for the reason you’re running into. Domino native clustering makes it redundant and a needless expense.

Jon Johnston

Creative Business Solutions

IBM, Lotus Premier Partner

http://www.cbsol.com

Subject: RE: Huge Problem on Clustered servers

I’m not using Microsoft clustering, just sharing the disk array via SAN. The Domino service on the first cluster has been down for over a month and the problems still occur.

Subject: RE: Huge Problem on Clustered servers

[The Domino service on the first cluster has been down for over a month
and the problems still occur.
]

Then this has nothing to do with the fact that your servers are clustered.
It’s instead related to some other SAN issue.

Subject: Huge Problem on Clustered servers

Whether this will work depends a lot on how that disk array is configured, but generally, this isn’t how you want to set up Domino clustering.

Since Domino clustering is application-level failover/load balancing, it’s designed with the expectation of independent storage systems. The cluster replicator ensures that redundant data on each cluster server is mirrored on the other members of the cluster.

You can get this to work with SANs, but I believe that each server has to see entirely independent volumes on the storage facility. From what you describe, you probably have considerable seek contention.

Subject: RE: Huge Problem on Clustered servers

So you would suggest partitioning the array into two separate drives and having each server access one of the partitions? What is the best way to set up a Domino cluster on Windows 2000?

Subject: Huge Problem on Clustered servers

We have a SAN here (I’m not sure exactly which one), and it had a serious problem when two machines had simultaneous access to a single partition on it. It’s as if each machine had its own differing FAT table or something. We tested by placing a file on the SAN partition from machine 1, and machine 2 could see the file. But when we deleted the file from machine 1, machine 2 could still see it and the data was still there. When we added a file to the partition from machine 2, machine 1 could not see it. It was useless.

I would never share a partition on a SAN anymore. Create another partition on the SAN for the second machine and use that instead. It’s what we do now, and we have not had any problems whatsoever with this configuration.

I would also check the SAN management software for alerts on the SAN itself, or look for orange lights on any of the drives or on the array itself.
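The visibility test described above (write a file from one machine, then see what the other machine reports) can be scripted so it is easy to repeat. This is only a minimal sketch: it assumes a Python 3 interpreter is available on both machines, and the marker file name and function names are made up for illustration. It does not fix anything; it only shows whether the two machines agree about what is on the volume.

```python
import os
import uuid
from pathlib import Path

# Hypothetical marker file name, chosen only for this sketch.
MARKER_NAME = "san_visibility_test.txt"


def write_marker(volume):
    """Create a marker file on the volume and return its unique token."""
    token = uuid.uuid4().hex
    marker = Path(volume) / MARKER_NAME
    with open(marker, "w") as fh:
        fh.write(token)
        fh.flush()
        os.fsync(fh.fileno())  # ask the OS to push the write out of its cache
    return token


def read_marker(volume):
    """Return the token this machine sees, or None if no marker is visible."""
    marker = Path(volume) / MARKER_NAME
    if not marker.exists():
        return None
    return marker.read_text()


def delete_marker(volume):
    """Remove the marker file if this machine can see it."""
    marker = Path(volume) / MARKER_NAME
    if marker.exists():
        marker.unlink()
```

Call `write_marker` on machine 1, then `read_marker` on machine 2 and compare the tokens. Then call `delete_marker` on machine 1 and `read_marker` on machine 2 again. If machine 2 still returns a token after the delete, the two machines are holding inconsistent views of the same partition.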

Subject: RE: Huge Problem on Clustered servers

That’s exactly what I saw, too. When I created or deleted a directory on one server, the other server didn’t see the change. I was going to do exactly what you said: partition the SAN and have each server access one of the partitions. Thanks for your response.

Subject: Storage Area Network(SAN) / Network Attached Storage(NAS) FAQ

Storage Area Network(SAN) / Network Attached Storage(NAS) FAQ

This FAQ document will answer the most common questions asked with regards to storage solutions and how they work with Lotus Domino.

Q1. What is Network Attached Storage (NAS)?

A1. Network attached storage (NAS) is hard disk storage that is set up with its own network address rather than being attached to the department computer that is serving applications to a network’s workstation users. By removing storage access and its management from the department server, both application programming and files can be served faster because they are not competing for the same processor resources. The network-attached storage device is attached to a local area network (typically, an Ethernet network) and assigned an IP address. File requests are then mapped by the main server to the NAS file server. This type of configuration is very similar to a network file server.

Network-attached storage consists of hard disk storage, including multi-disk RAID systems, and software for configuring and mapping file locations to the network-attached device.

A Lotus customer using Domino would typically use the following configuration when using a NAS solution.

Please note that using a NAS configuration is not optimal since it is highly dependent on the customer’s “public” network (bandwidth, network traffic, topology, etc.). The issues seen with this type of configuration are usually around network contention and network speed. Therefore, end users may experience poor response time in a NAS configuration if there are any network issues.

Here is a sample of how a request from a Notes client to open their mail file would flow:

It starts at the client, goes over the network to the Domino server

The Domino server then passes the request over to where the database is located (in this case over to the NAS)

This request then travels over the network to the NAS

The information requested is then sent back to the Domino Server from the NAS

The information is sent from the Domino Server to the Client

This entire process can be time consuming if the “public” network is busy.

Q2. What is a Storage Area Network Solution (SAN)?

A2. A storage area network (SAN) is a high-speed special-purpose “private” network (or subnetwork) that interconnects different kinds of data storage devices with an associated data server on behalf of a larger network of users. Typically, a storage area network is part of the overall network of computing resources for an enterprise. A storage area network is usually clustered in close proximity to other computing resources but may also extend to remote locations for backup and archival storage, using wide area network carrier technologies such as asynchronous transfer mode or Synchronous Optical Network.

A storage area network can use existing communication technology such as IBM’s optical fiber ESCON or it may use the newer Fibre Channel technology. These technologies are designed for high-speed data transfer. Having a “private” network design for a SAN reduces network contention and improves performance for applications and end users.

SANs support disk mirroring, backup and restore, archival and retrieval of archived data, data migration from one storage device to another, and the sharing of data among different servers in a network.

A SAN solution is more widely seen in Domino environments because of the speed and dedicated bandwidth. There are generally two configurations that you will find in a Domino deployment.

Configuration #1: Single Domino Server connected to a SAN solution

This configuration shows a Domino server utilizing a SAN solution for storage of the Domino Data directory. The connection between the Domino Server and SAN is usually a private network which limits, if not eliminates, network bandwidth and contention issues.

Configuration #2: Multiple Domino Servers sharing a single SAN.

This configuration has multiple Domino Data directories stored on a single SAN solution. Again the SAN is connected to the Domino Servers over a private network. This configuration is not as common but can be successful. Performance depends on the SAN solution’s ability to support high numbers of concurrent read and write requests from multiple servers.

Q3. Is there a specific version of Domino for a SAN/NAS solution?

A3. No. To date, Domino is not coded differently for a SAN/NAS environment. The only difference is the way in which you install Domino.

Q4. Who are some SAN/NAS vendors?

A4. Here are a few of the more common storage vendors (in no particular order). Many of the vendors provide both SAN and NAS solutions.

Network Appliance, Inc. - www.netapps.com/solutions/lotus_domino.html

IBM - www.storage.ibm.com

Compaq - www5.compaq.com/products/storageworks/solutions/eas/ldindex.html

EMC - www.emc.com/storagenetworking/

Q5. Why would a customer want to use a SAN/NAS?

A5. There may be many reasons why a customer would want to use a SAN/NAS, but the two most common are (1) centralized management and (2) disk requirements. If you have your data in one physical location, it is easier to perform maintenance and backup/recovery. Secondly, data and storage needs are becoming greater than you can meet with DASD.

Q6. Some of these vendors mention backup/recovery. How is that different than Lotus’ backup & recovery?

A6. Many SAN/NAS vendors mention backup & recovery but not all use Lotus’ backup and recovery APIs. In some cases, this may mean that the Domino Server has to be stopped in order to perform a backup.

Q7. Can I use Domino transaction logging in a SAN/NAS solution?

A7. Yes you can use transaction logging. However, Lotus recommends storing the transaction log extent files on a local mirrored disk (RAID1). Storing the extents in a SAN solution requires that the SAN have a very high degree of redundancy and concurrent read/write capability. Storing the extents in a NAS solution is not recommended.

Hans - IBM Norway

Subject: RE: Storage Area Network(SAN) / Network Attached Storage(NAS) FAQ

Hi there,

I’m currently involved in a Europe-wide data storage project where we’re looking at storing all our Notes data on Netapps Filers.

From what I understood from our Message Team Manager, it isn’t possible to use a “shared” volume for Notes R6 clusters, so they came up with the following solution:

4 Notes R6 clustered servers, each with a 1TB disk array attached, where 2 servers are located in Maidenhead (UK) and 2 in Maarssen (NL).

I personally think it’s crazy to have 4 Servers which are all serving the same data especially if it’s only for 850 users.

Load Balancing might be one of the benefits but consolidation is not an option.

Wouldn’t it be possible for both servers in Maidenhead to use one volume on their filer, and both servers in Maarssen to use one volume on their filer, in a sort of active/passive role (like MSCS)? One server would handle all requests, and if that server dies for some reason, the other one automatically becomes the active one. Then if both servers at a location die, the server in the other location automatically takes over the requests.

Because I can’t justify a really big filer only because Lotus Notes doesn’t allow us to use shared volumes.

Thanks

Jeroen

Subject: RE: Storage Area Network(SAN) / Network Attached Storage(NAS) FAQ

It is in the nature of Notes Clustering that the data is stored redundantly. That’s why it’s not necessary to get elaborate drive storage solutions for Domino servers. You can have pretty straight-forward RAID systems for each cluster member, because it’s going to automatically mirror data between cluster members.

If you’re going to have 2 UK servers and 2 NL servers, all in a 4-way cluster, you’ll probably need to do some pretty good management for that unless you have wide pipes. You’re going to have a much larger issue of “how do you keep UK users from going to NL for databases when one UK cluster member is down?”

Why, by the way, do you need a 1TB file system for 850 users? Do you really expect 1.2GB per user of storage requirements?
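The per-user figure above is just the array size divided by the user count. A quick sketch of the arithmetic, assuming 1 TB is counted as 1024 GB (the function name is made up for illustration):

```python
def per_user_gb(total_tb, users, gb_per_tb=1024):
    """Average storage per user in GB for a volume of total_tb terabytes."""
    return total_tb * gb_per_tb / users


# 1 TB shared by 850 users
print(round(per_user_gb(1, 850), 1))  # 1.2
```

Counting 1 TB as 1000 GB instead gives roughly the same answer after rounding.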

Subject: Huge Problem on Clustered servers

Dunno if this will help, but in my experience so far, some SAN solutions are a lot slower than internal SCSI disks. And in that instance, indexer, transaction logging and adminp all go very slow, and start to corrupt things. (In one instance, the SAN disk array was actually slower than a FLOPPY DRIVE… )
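If you want a rough number for how a SAN volume compares with a local disk, one option is to time small synchronous writes to each. The following is a minimal sketch, not a proper benchmark: the function name, block size, and total size are arbitrary choices for illustration, it assumes Python 3 is available on the server, and results will vary with caching and controller settings.

```python
import os
import tempfile
import time


def sync_write_mb_per_s(directory, total_mb=16, block_kb=64):
    """Write total_mb of data in block_kb chunks, syncing after each chunk,
    and return the observed throughput in MB per second."""
    block = b"\0" * (block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as fh:
            for _ in range(blocks):
                fh.write(block)
                fh.flush()
                os.fsync(fh.fileno())  # force each chunk to the device
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the scratch file
    return total_mb / max(elapsed, 1e-9)
```

Run it once against a directory on the SAN volume and once against a directory on a local disk, ideally while the server is quiet, and compare the two figures. A large gap in favor of the local disk would support keeping write-heavy files such as transaction logs off the SAN.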

I recommend moving your transaction logging onto a local hard drive. If you’re not running it, enabling it might dramatically reduce your disk I/O to the SAN.

Lastly, I’d echo everyone else’s recommendations to have each machine see its own SAN partition.

Good luck, and tell us how you get on.

—* Bill