Subject: Re: Tunnel collapses with "Bad file descriptor"
From: Michael Fischer
Date: Sun, 18 Apr 2004 18:27:01 -0400
Dear Mark,

I've had problems in the past with my cipe daemon dying at this same spot in the code, but the error I got was "Interrupted system call". That error can occur normally during a blocking read, and the cipe code does not handle it properly. (See ciped.c in the source, around the line containing the text "kxchg: read(r)".)

That code is trying to get some random bits from /dev/urandom. main() opens /dev/urandom and passes its file descriptor to mainloop() as the third argument; mainloop() in turn passes it to kxchg(), which generates the error message you observed. The return value of the original open() call is checked for validity, so it isn't at all clear why the read() call in kxchg() finds the descriptor bad. Perhaps /dev/urandom is somehow getting closed, or the descriptor is getting corrupted, or an incorrect error message is getting logged and you're really encountering the same problem that I was.
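
To illustrate what handling an interrupted read could look like, here is a minimal sketch. It is not the actual cipe code: the helper name read_full() is hypothetical, and I am assuming the caller simply wants a fixed number of random bytes from an already-open descriptor. It retries when read() fails with EINTR and keeps going after short reads.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of cipe: read exactly len bytes from fd.
 * Retries when read() is interrupted by a signal (EINTR) and continues
 * after short reads. Returns 0 on success, -1 on error with errno set. */
static int read_full(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry the read */
            return -1;      /* genuine error, e.g. EBADF */
        }
        if (n == 0) {
            errno = EIO;    /* unexpected end of file */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

A caller would open /dev/urandom once with open("/dev/urandom", O_RDONLY) and then call read_full(fd, buf, sizeof buf) wherever random bits are needed. With a wrapper like this, EINTR never reaches the caller, so a failure that still reports EBADF ("Bad file descriptor") would point at the descriptor itself having been closed or overwritten.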

You can find my old posting about the "Interrupted system call" error on the cipe-l archives at
Good luck tracking this one down!


Mark wrote:


I have a cipe (v.1.4.5) tunnel running between a Redhat9 and a Fedora core 1
machine. The tunnel gets established successfully and works fine until at
some point - sometimes after a few hours, sometimes after a few days - it
collapses with this error message:

Apr 14 19:22:50 lvd1 ciped-cb[2658]: kxchg: read(r): Bad file descriptor
Apr 14 19:22:50 lvd1 ciped-cb[2658]: Interface stats 22552096 201358 4 0 0 1 0 0 0 0 0 0 0 0 0 0
Apr 14 19:22:50 lvd1 ciped-cb[2658]: KX stats: rreq=0, req=335, ind=336, indb=0, ack=328, ackb=0, unknown=0
Apr 14 19:22:50 lvd1 ciped-cb[2658]: cipcb1: daemon exiting

Any idea where this might come from?
The machines are sitting locally within a LAN - it's just a test setup for



