Solaris 2.6 : some syscalls Hanging (more)

From: DAUBIGNE Sebastien - BOR (SDaubigne@bordeaux-bersol.sema.slb.com)
Date: Fri May 16 2003 - 11:14:28 EDT


Additional information: this is not related to /dev/kstat (thank you all the
same, Haywood Steven).
It is related to the poll() syscall.
The mpstat, vmstat and iostat commands all issue poll(0,0,<INTERVAL>) syscalls
(interval in milliseconds) to sleep between samples, instead of the alarm() +
sigsuspend() combination used by the sar and sleep commands.
For instance, on a sane host:
# truss -aeflv all vmstat 5 3
(....)
12687/1: poll(0x00000000, 0, 5000) (sleeping...)
(......... 5 seconds sleep .............)
12687/1: poll(0x00000000, 0, 5000) = 0
(...)

This stopped working when our host went bad: the same poll(0x00000000, 0, 5000)
call waited indefinitely instead of returning after 5 seconds.
I've reproduced the bug with the following small C program:
#include <poll.h>
#include <stdio.h>

int main(void)
{
  /* No descriptors: poll() should simply time out after 5000 ms. */
  if (poll(NULL, 0, 5000) < 0)
    perror("poll");
  return 0;
}

On a sane host, it waits 5 seconds. On our bad host, it waits indefinitely.
Other applications hang on semop() calls.
Looks strange.

Any suggestions?

---
Sebastien DAUBIGNE
sdaubigne@bordeaux-bersol.sema.slb.com
<mailto:sdaubigne@bordeaux-bersol.sema.slb.com>  - (+33)5.57.26.56.36
SchlumbergerSema - SGS/DWH/Pessac
	-----Original Message-----
	From:	DAUBIGNE Sebastien - BOR (SDaubigne@bordeaux-bersol.sema.slb.com)
	Date:	Friday, May 16, 2003 14:36
	To:	sunmanagers@sunmanagers.org; sunhelp@sunhelp.org
	Subject:	Solaris 2.6 : some syscalls Hanging
	Solaris 2.6, kernel 105181-28.
	We've got some processes "hanging" (i.e. blocked in syscalls):
	vmstat, iostat and mpstat are hanging on the poll() syscall: they open
	"/dev/kstat", issue some ioctl() calls on it, and wait for data with poll().
	But poll() never returns.
	sar is working fine (note that it doesn't use poll()).
	We also have some Oracle background and shadow processes hanging on semop().
	Other processes are working fine (those that don't call poll() or semop()).
	It seems some kernel syscalls (at least poll() and semop()) are waiting
	indefinitely: it looks like some kind of deadlock.
	System activity is low (sar shows 70% CPU free, lots of memory free, no page
	scan). The only thing I can't see is mutex contention, as mpstat is hanging.
	Any ideas?
	---
	Sebastien DAUBIGNE
	sdaubigne@bordeaux-bersol.sema.slb.com
	<mailto:sdaubigne@bordeaux-bersol.sema.slb.com>  - (+33)5.57.26.56.36
	SchlumbergerSema - SGS/DWH/Pessac
	_______________________________________________
	sunmanagers mailing list
	sunmanagers@sunmanagers.org
	http://www.sunmanagers.org/mailman/listinfo/sunmanagers

