weird cluster problem

From: Harald Husemann (harald.husemann@materna.de)
Date: Wed Jul 07 2004 - 12:40:13 EDT


Hi folks,

I have a weird Sun Cluster problem drivin' me nuts here...
The system is a Sun Cluster 3.1 on Solaris 9, built from two E480s and
two 3310 storage arrays.
We're using the Sun Cluster generic data service (SUNW.scgds) on it to
control a piece of our own software runnin' on the cluster.
So far everything was fine, but now the cluster refuses to start the
GDS resource:

Jul 7 18:22:35 teufel Cluster.PMF.pmfd: [ID 615790 daemon.notice]
"basis-rg,basis-rs,0.svc" Failed to stay up.
Jul 7 18:22:35 teufel Cluster.PMF.pmfd: [ID 615790 daemon.notice]
"basis-rg,basis-rs,0.svc" Failed to stay up

(basis-rg is the RG, basis-rs the resource)
Hmmm... this looked to me like a problem in our software, so I removed
the start, stop, and probe scripts for the application and wrote new
ones with just an "exit 0" in them.
But no success, still the same failures. I also removed the resource
and recreated it, again without success.
I searched SunSolve, google, docs.sun.com, etc., but found nothin'...

So, this is my last chance, before rebuilding the entire cluster...

Any ideas would be welcome, and by the way: what does this "failed to
stay up" mean? Of course, the script exits with 0 and doesn't run
forever, so what does pmfd monitor in the case of a GDS service? I've
found the error ID at docs.sun.com, but there's no description,
solution, or similar on that page...

I hope someone has an idea how to get out of this. I will (of course)
summarize!

Have a nice hackin',

Harald