Re: [Nolug] Quick question on clustering...

From: Ron Johnson <ron.l.johnson_at_cox.net>
Date: 26 Mar 2003 15:43:52 -0600
Message-Id: <1048715032.16060.96.camel@haggis>

On Wed, 2003-03-26 at 15:04, -ray wrote:
> Guess it depends on what you mean by "clustering", and what you want out
> of it. Most common reasons are for high-availability, or
> high-performance, or a combination of both.
>
> On Wed, 26 Mar 2003, Scott Harney wrote:
>
[snip]
> This is how Redhat Advanced server works. You have multiple heartbeats
> (network/serial) and a quorum partition. Both servers watch the quorum
> partition, but the master periodically updates it with current status.
> When the slave sees the master has stopped updating the quorum and the
> heartbeats are down, it assumes the master is down and will initiate a
> takeover of the services (mounts the filesystems, takes over IP addresses,
> runs startup scripts).

This isn't clustering; it's failover, but that's what MSFT advertised
early on as clustering (Wolfpack, I believe it was).
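The failover logic -ray describes (watch a quorum partition plus heartbeats, and take over only when both go quiet) can be sketched roughly like this. The function names and timeout values are illustrative, not Red Hat's actual implementation:

```python
import time

HEARTBEAT_TIMEOUT = 10   # seconds without a network/serial heartbeat
QUORUM_TIMEOUT = 30      # seconds without a quorum-partition update

def should_take_over(last_heartbeat, last_quorum_update, now=None):
    """Slave-side decision: initiate takeover only when BOTH the
    heartbeats and the master's quorum updates have gone quiet."""
    now = time.time() if now is None else now
    heartbeats_down = (now - last_heartbeat) > HEARTBEAT_TIMEOUT
    quorum_stale = (now - last_quorum_update) > QUORUM_TIMEOUT
    return heartbeats_down and quorum_stale

def take_over():
    # The takeover steps from the post, as placeholders:
    mount_filesystems()    # mount the shared filesystems
    claim_ip_addresses()   # take over the service IP addresses
    run_startup_scripts()  # start the services

def mount_filesystems(): pass
def claim_ip_addresses(): pass
def run_startup_scripts(): pass
```

Requiring both signals to fail before takeover is what keeps a flaky heartbeat cable from triggering a spurious (and dangerous) double-mount.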

> As far as not being able to read/write from both nodes simultaneously, it
> is more a limitation of the filesystem (ext3 for example) not being
> "cluster" aware. We could mount an ext3 filesystem on the SAN from both
> nodes, but one FS didn't know about changes the other one was making. So
> the other node just thought the FS got corrupted (I/O errors)... same
> thing happens when one is RW, and one is RO. The RO node doesn't know
> about the updates the RW node makes...so it doesn't work. :) There are
> cluster-aware linux filesystems out there. IBM GPFS, Sistina GFS, Oracle
> OCFS. OCFS is GPL, and Redhat has plans to include it with the next
> Advanced Server release. Nice.
>
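[A toy model of the failure mode described above: each node keeps its own
private metadata cache, so neither learns about the other's writes. All
names here are illustrative; this is not real ext3 code.]

```python
# Two nodes mount the same non-cluster-aware FS from a shared disk.
# Writes go through to the disk, but each node reads its own stale
# cache -- so node B's view diverges and looks "corrupted" to it.

class SharedDisk:
    def __init__(self):
        self.blocks = {}

class Node:
    def __init__(self, disk):
        self.disk = disk
        self.cache = dict(disk.blocks)  # private metadata cache

    def write(self, block, data):
        self.cache[block] = data
        self.disk.blocks[block] = data  # write-through to the SAN

    def read(self, block):
        # Reads are served from the private cache, never revalidated.
        return self.cache.get(block)

disk = SharedDisk()
a, b = Node(disk), Node(disk)
a.write("inode:42", "v2")
print(disk.blocks["inode:42"])  # v2 on disk
print(b.read("inode:42"))       # None: B never saw the update
```

A cluster-aware filesystem closes exactly this gap: every node's cache is kept coherent through a shared lock manager.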
> As a side note... all this clustering stuff is nothing new. OpenVMS
> has been doing stuff like this for years and years. From an OS clustering
> standpoint, VMS is light years ahead of anything you'll see on a *nix
> box. Too bad i don't like anything else about VMS, haha.

Well, VMS *is* The One True OS, and is still being improved, even though
it hasn't been marketed in 10+ years...

The bottom line with regard to VMS-style (shared-disk) clustering is
that you need what the VMS world calls a Distributed Lock Manager (DLM).
In VMS, the Lock Manager (a tree of locks) is the keeper of all
knowledge regarding which process is using which bit of each object
(memory, disk/file, signals, etc.). It was extended into a Distributed
Lock Manager to support clustering: the lock state is shared amongst the
clustered nodes to coordinate disk access.
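The heart of the DLM is its set of lock modes and the compatibility
matrix between them. The six mode names and the matrix below are the
ones documented for the VMS/OpenVMS lock manager; the Python rendering
is just an illustration:

```python
# The six VMS DLM lock modes, weakest to strongest:
#   NL = null, CR = concurrent read, CW = concurrent write,
#   PR = protected read, PW = protected write, EX = exclusive.
COMPATIBLE = {
    "NL": {"NL", "CR", "CW", "PR", "PW", "EX"},
    "CR": {"NL", "CR", "CW", "PR", "PW"},
    "CW": {"NL", "CR", "CW"},
    "PR": {"NL", "CR", "PR"},
    "PW": {"NL", "CR"},
    "EX": {"NL"},
}

def can_grant(requested, held_modes):
    """A new lock is granted only if its mode is compatible with
    every lock currently held on the resource."""
    return all(requested in COMPATIBLE[held] for held in held_modes)
```

So any number of nodes can hold PR (shared read) on the same disk block,
but a PW request blocks until the readers release -- which is exactly
the coordination a shared-disk cluster needs.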

The last piece needed for VMS clustering is "concentrating" hardware
into which all of the nodes, plus the disks, are plugged. In the early
days, this was powered by a PDP-11. (You can also share disks that are
plugged directly into individual nodes, but that's really slow, since
all data must then pass across slow wires like Ethernet.)

Note that in VMS, cluster-awareness is built deep into the OS, so that
every app, from something boring written in BASIC, COBOL or RPG to big
middleware like relational DBMSs, is automatically cluster-aware.

Thus, as you can see, just plugging a bunch of boxen into a SAN doesn't
a cluster make.

I think that IBM's GPFS is an attempt to replicate the VMS DLM, and am
pretty sure that Oracle's OCFS was created by (or with ideas taken from)
engineers "bought" from DEC when Oracle acquired DEC's Rdb database
business back in 1994.

<Sigh> (Ex-)DEC engineers sure are sharp...

-- 
+---------------------------------------------------------------+
| Ron Johnson, Jr.        mailto:ron.l.johnson@cox.net          |
| Jefferson, LA  USA      http://members.cox.net/ron.l.johnson  |
|                                                               |
| Spit in one hand, and wish for peace in the other.            |
| Guess which is more effective...                              |
+---------------------------------------------------------------+
___________________
Nolug mailing list
nolug@nolug.org
Received on 03/26/03

This archive was generated by hypermail 2.2.0 : 12/19/08 EST