
To: <cipe-l,AT,inka,DOT,de>
Subject: RE: Tunnel collapses with "Bad file descriptor"
From: "Mark" <msalists,AT,gmx,DOT,net>
Date: Thu, 13 May 2004 13:16:46 -0700
Importance: Normal
In-reply-to: <408300B5.4060307@cs.yale.edu>

I just noticed that the two tunnel ends were running different versions of
cipe.
One had 1.4.5-18 (Fedora/RedHat), the other had 1.4.5-16 (Fedora/RedHat).
Could this have been the reason for the problems?

Thanks,

MARK

-----Original Message-----
From: owner-cipe-l,AT,inka,DOT,de [mailto:owner-cipe-l,AT,inka,DOT,de]
On Behalf Of Michael Fischer
Sent: Sunday, April 18, 2004 3:27 PM
To: Mark
Cc: cipe-l,AT,inka,DOT,de
Subject: Re: Tunnel collapses with "Bad file descriptor"

Dear Mark,

I've had problems in the past with my cipe daemon dying at this same 
spot in the code, but I got the error "Interrupted system call", which 
can happen normally during a blocking read and is not handled properly 
by the cipe code.  (See ciped.c in the source around the line that 
contains the text "kxchg: read(r)".)  What that code is trying to do is 
to get some random bits from /dev/urandom.  /dev/urandom is opened by 
main().  Its file descriptor is passed to mainloop() as the third 
argument and is in turn passed to kxchg(), which generates the observed 
error message.  The return from the original open() call is checked for 
validity, so it isn't at all clear why the read() call in kxchg() finds 
the descriptor bad.  Perhaps /dev/urandom is somehow getting closed, or 
the descriptor is getting corrupted, or an incorrect error message is 
getting logged and you're really encountering the same problem that I was.
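
To illustrate the "Interrupted system call" case, here is a minimal sketch
of a read wrapper that restarts after EINTR and checks whether a descriptor
is still open. It is not taken from ciped.c; the names read_full() and
fd_is_open() are hypothetical, and how kxchg() actually consumes the random
bytes is not shown here.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of cipe): returns 1 if fd refers to an
 * open descriptor, 0 otherwise. fcntl(F_GETFD) fails with EBADF when
 * the descriptor is closed or invalid. */
static int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1;
}

/* Hypothetical helper (not part of cipe): read exactly len bytes,
 * restarting when read() is interrupted by a signal (EINTR) and
 * continuing after short reads.
 * Returns 0 on success, -1 on error or premature end of file; in the
 * error case errno is left as set by read(). */
static int read_full(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data: retry */
            return -1;          /* real error, e.g. EBADF */
        }
        if (n == 0)
            return -1;          /* unexpected end of file */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

With a wrapper like this, an interrupted read is retried instead of being
treated as fatal, while a genuinely bad descriptor still fails with EBADF.
Calling fd_is_open() on the /dev/urandom descriptor just before the read
could help tell a closed descriptor apart from a misleading log message.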

You can find my old posting about the "Interrupted system call" error on 
the cipe-l archives at
http://sites.inka.de/bigred/archive/cipe-l/2003-12/msg00003.html
Good luck at tracking this one down!

--Mike

Mark wrote:

>Hi,
>
>I have a cipe (v.1.4.5) tunnel running between a Redhat9 and a Fedora 
>core 1 machine. The tunnel gets established successfully and works fine 
>until at some point - sometimes after a few hours, sometimes after a 
>few days - it collapses with this error message:
>
>Apr 14 19:22:50 lvd1 ciped-cb[2658]: kxchg: read(r): Bad file descriptor
>Apr 14 19:22:50 lvd1 ciped-cb[2658]: Interface stats 22552096  201358    4    0    0     1          0         0        0        0    0    0    0     0       0          0
>Apr 14 19:22:50 lvd1 ciped-cb[2658]: KX stats: rreq=0, req=335, ind=336, indb=0, ack=328, ackb=0, unknown=0
>Apr 14 19:22:50 lvd1 ciped-cb[2658]: cipcb1: daemon exiting
>
>Any idea where this might come from?
>The machines are sitting locally within a LAN - it's just a test setup 
>for now...
>
>Thanks,
>
>MARK
>
>
>--
>Message sent by the cipe-l,AT,inka,DOT,de mailing list.
>Unsubscribe: mail majordomo,AT,inka,DOT,de, "unsubscribe cipe-l" in body
>Other commands available with "help" in body to the same address.
>CIPE info and list archive: <URL:http://sites.inka.de/~bigred/devel/cipe.html>
>


