X-Windows "staggering" effect

From: LeBar, Russell J (Russell.J.LeBar@erac.com)
Date: Mon Jul 31 2006 - 19:47:07 EDT


Hello everyone. I have a weird problem that I am having a hard time
tracking down. Sometimes user input (both keyboard and mouse are
affected) is delayed (hence the staggering or paused description) for a
couple of seconds. Usually this is intermittent, but there was one day
when we could duplicate it easily. What we have found out so far is:

1. Affects Sun Ray users.

2. Affects X-Windows on the local display (i.e. :0.0).

3. Seems to be more noticeable for XtoX users.

4. Citrix users do not appear to be affected.

5. Appears to be something that is sleeping for 30 seconds and then
spending roughly 2-5 seconds doing something, then sleeping again and so
on and so forth. At least that's what was going on when we could
duplicate the issue.
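
To pin down the cycle in item 5 with numbers rather than impressions,
something like the following could be used. It is only a rough sketch:
it assumes someone hand-logs when each stall starts and ends (in
seconds), and the function and field names are made up for
illustration.

```python
from statistics import median


def summarize_stalls(stall_starts, stall_ends):
    """Summarize hand-logged stalls (all timestamps in seconds).

    stall_starts[i] / stall_ends[i] are hypothetical observations of when
    input froze and when it recovered. Returns the median quiet gap
    between stalls, the median stall duration, and the median
    start-to-start period.
    """
    if len(stall_starts) != len(stall_ends) or len(stall_starts) < 2:
        raise ValueError("need at least two matched start/end pairs")

    # Quiet time: end of one stall to the start of the next.
    gaps = [nxt - end for end, nxt in zip(stall_ends, stall_starts[1:])]
    # How long each stall lasted.
    durations = [end - start for start, end in zip(stall_starts, stall_ends)]
    # Full cycle: start of one stall to the start of the next.
    periods = [b - a for a, b in zip(stall_starts, stall_starts[1:])]

    return {
        "gap": median(gaps),
        "duration": median(durations),
        "period": median(periods),
    }


def matches_period(observed, candidate, tolerance=2.0):
    """True if observed is within `tolerance` seconds of candidate."""
    return abs(observed - candidate) <= tolerance
```

Note that a 30-second sleep followed by 2-5 seconds of work gives a
start-to-start period of roughly 32-35 seconds, so it is the "gap"
value, not the "period", that should be compared against 30.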

The server:

I have a SunFire V880 running Solaris 8 (final hardware release and
recommended patch cluster of about a month ago). We're running Sun Ray
Server Software 3.1 and MetaFrame Presentation Server for UNIX 1.2 (one
patch behind). We have / and /export on DiskSuite mirrored drives (along
with a swap partition on each, also mirrored). The system has six
900MHz processors and 12GB of RAM. Only about half of the physical
memory is in use (according to top). Sun Directory Server 5.2 is also
installed but we are not really using it right now.

We have two qfe cards in the system and interfaces are as follows. All
are forced to 100-Full via /etc/system.

qfe3: Restricted server-to-server network. Mostly NFS and backups
(Netbackup).

qfe4: Access to a very small (couple of devices) lab network

qfe5: Main interface. Routed Sun Ray traffic, Citrix ICA traffic, DHCP,
most X-Windows traffic from remote systems, etc.

qfe6: Network management. Mainly for access to network devices. Some
application traffic.

This server functions as an X-Windows display server and a gateway to
other servers and devices. So there is little I/O going on. We have
around 50 Sun Rays but not all are in use at once. We have a total of 30
Citrix user licenses but normally have fewer than 10 users on at one
time (actually, probably fewer than 5).

We've looked at vmstat and iostat but nothing obvious stood out. On the
one day we could duplicate things it looked like there may have been a
correlation with minor page faults, but we never could establish it.
Since cron's finest granularity is one minute, it cannot produce a
30-second cycle, so we can rule it out. We have people running Firefox
with pages that auto-refresh, but these pages are set to either
10-second or 60-second refreshes, which again does not correlate. Right
now I could use some good troubleshooting tricks, ideas on particular
things to look for, etc.
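
One trick that might help, given as a sketch and not a tested
procedure: take two process snapshots a few seconds apart, one just
before and one just after a stall, and see which processes accumulated
CPU time in between. The code below assumes the snapshots are the text
output of `ps -e -o pid,time,comm` saved to strings. The helper names
are hypothetical, and since the TIME column has only one-second
resolution, short bursts may not show up.

```python
def parse_cpu_time(text):
    """Convert a ps TIME field ([dd-][hh:]mm:ss) to whole seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in text.split(":")]
    while len(parts) < 3:
        parts.insert(0, 0)  # pad missing hours
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def parse_snapshot(ps_output):
    """Map pid -> (cpu_seconds, command) from saved ps output text."""
    table = {}
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # skip the header and short or malformed lines
        try:
            cpu = parse_cpu_time(fields[1])
        except ValueError:
            continue  # TIME field was not in the expected format
        table[int(fields[0])] = (cpu, fields[2].strip())
    return table


def busiest(before, after, top=5):
    """Return up to `top` (cpu_delta, pid, command) tuples, largest first.

    Only processes present in both snapshots that gained CPU time are
    listed; processes that started or exited in between are ignored.
    """
    deltas = []
    for pid, (cpu, comm) in after.items():
        if pid in before:
            delta = cpu - before[pid][0]
            if delta > 0:
                deltas.append((delta, pid, comm))
    deltas.sort(reverse=True)
    return deltas[:top]
```

Repeating this over several stalls and looking for a process that
shows up every time would narrow things down more than a single pair
of snapshots.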

Thanks in advance!! Will post summary!

--
Russ LeBar
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

