Re: [Nolug] Quick question on clustering... from -ray on 2003-03-26 (nolugarchives)

From: -ray <ray_at_ops.selu.edu>
Date: Wed, 26 Mar 2003 15:04:29 -0600 (CST)
Message-ID: <Pine.LNX.4.44.0303261351001.27824-100000@romulus.csd.selu.edu>

I guess it depends on what you mean by "clustering", and what you want out
of it. The most common reasons are high availability, high performance,
or a combination of both.

On Wed, 26 Mar 2003, Scott Harney wrote:

> Yeah. I'm doing a SAN here. SAN's are generally not SCSI but fiber
> channel. A SAN involves attaching your FC adapters to a SAN
> switch. (I've not seen a SCSI SAN switch, but then I'd never looked to
> see if such a thing did/could exist) The switch is cabled to the
> drives as well. It's roughly analogous to a LAN, hence the name. I'm
> not a DB admin, but I don't think you'd have multiple database
> instances writing to the same database though. The locking could get
> pretty complex.

We use a Xiotech SAN here. The servers attach directly with FC, but the
drives themselves are SCSI... We don't have a SAN switch yet since you
can attach 8 servers directly, but I know the director-class FC switches
are pretty expensive (20k-30k).

> With clustering software you can do this with direct attached storage.
> One box has to have hold of the drives though -- you can't read write
> simultaneously. The software in conjunction with a private network
> and some sort of detection mechanism determines if a cluster node has
> failed and switches the mountpoint (and services if this is all
> implemented properly) to the other box.

This is how Red Hat Advanced Server works. You have multiple heartbeats
(network/serial) and a quorum partition. Both servers watch the quorum
partition, but the master periodically updates it with its current status.
When the slave sees that the master has stopped updating the quorum
partition and the heartbeats are down, it assumes the master is down and
initiates a takeover of the services (mounts the filesystems, takes over
IP addresses, runs startup scripts).
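
To make the takeover rule concrete, here is a rough Python sketch of the
decision the slave makes. This is not Red Hat's actual code: the names
(StandbyView, should_take_over, take_over) and the 10-second timeout are
made up for illustration. The only thing taken from the description above
is the rule itself: take over only when the quorum partition has gone
stale AND every heartbeat is down.

```python
from dataclasses import dataclass
from typing import List

# Assumed threshold in seconds; hypothetical, not a real product default.
QUORUM_TIMEOUT = 10.0


@dataclass
class StandbyView:
    """What the slave can observe about the master (hypothetical fields)."""
    quorum_age: float          # seconds since the master last updated the quorum partition
    net_heartbeat_ok: bool     # network heartbeat still answering
    serial_heartbeat_ok: bool  # serial heartbeat still answering


def should_take_over(view: StandbyView, timeout: float = QUORUM_TIMEOUT) -> bool:
    """Return True only if the quorum is stale and all heartbeats are down."""
    quorum_stale = view.quorum_age > timeout
    heartbeats_down = not (view.net_heartbeat_ok or view.serial_heartbeat_ok)
    return quorum_stale and heartbeats_down


def take_over() -> List[str]:
    """List the takeover steps in order (placeholders, nothing is executed)."""
    return ["mount filesystems", "take over IP addresses", "run startup scripts"]


if __name__ == "__main__":
    view = StandbyView(quorum_age=30.0, net_heartbeat_ok=False, serial_heartbeat_ok=False)
    if should_take_over(view):
        for step in take_over():
            print(step)
```

Requiring both conditions is the point of having several independent
channels: a single dead network link with a fresh quorum update does not
trigger a takeover in this sketch.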

As far as not being able to read/write from both nodes simultaneously,
that is more a limitation of the filesystem (ext3, for example) not being
"cluster" aware. We were able to mount an ext3 filesystem on the SAN from
both nodes, but neither node knew about the changes the other was making,
so the other node just thought the FS had been corrupted (I/O errors)...
The same thing happens when one is RW and one is RO: the RO node doesn't
know about the updates the RW node makes, so it doesn't work. :) There are
cluster-aware Linux filesystems out there: IBM GPFS, Sistina GFS, Oracle
OCFS. OCFS is GPL, and Red Hat has plans to include it with the next
Advanced Server release. Nice.
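
A toy Python model of the RW/RO case, in case it helps. This is only an
illustration of the stale-cache idea, not how ext3 or the kernel actually
works; SharedDisk and NaiveMount are invented names. Each "mount" keeps a
private cache and never hears about writes made through the other one.

```python
class SharedDisk:
    """Stands in for the SAN volume: block number -> contents."""

    def __init__(self):
        self.blocks = {}


class NaiveMount:
    """One node's mount with a private cache and no cross-node coordination."""

    def __init__(self, disk):
        self.disk = disk
        self.cache = {}

    def read(self, block):
        # Serve from the local cache if present; only go to disk on a miss.
        if block not in self.cache:
            self.cache[block] = self.disk.blocks.get(block)
        return self.cache[block]

    def write(self, block, data):
        # Write through to disk, but nothing tells the other node about it.
        self.cache[block] = data
        self.disk.blocks[block] = data


disk = SharedDisk()
disk.blocks[0] = "dir: a.txt"

rw_node = NaiveMount(disk)
ro_node = NaiveMount(disk)

ro_node.read(0)                        # RO node caches the old directory block
rw_node.write(0, "dir: a.txt b.txt")   # RW node updates the block on disk
stale = ro_node.read(0)                # RO node still returns its cached copy

print("on disk:", disk.blocks[0])
print("RO node sees:", stale)
```

A cluster-aware filesystem adds the missing piece: some form of locking
or cache invalidation between the nodes.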

As a side note... all this clustering stuff is nothing new. OpenVMS
has been doing stuff like this for years and years. From an OS clustering
standpoint, VMS is light years ahead of anything you'll see on a *nix
box. Too bad I don't like anything else about VMS, haha.

-ray

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Ray DeJean  				       	 http://www.r-a-y.org
Systems Engineer                    Southeastern Louisiana University
IBM Certified Specialist  	      AIX Administration, AIX Support
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Received on 03/26/03