
Subject: RE: Service exiting
From: Jan Olderdissen <jolderdissen,AT,ixiacom,DOT,com>
Date: Wed, 13 Mar 2002 02:28:58 +0100


I've tracked down the problem to memory locations being peppered with the
value 0x10. In the case of CipeTapIO::RequestAsyncReceive(), the ebp
register is trashed with the value 0x10. I'm assuming this happens during
one of the prior function calls. When RequestAsyncReceive attempts to
return, the epilog copies ebp back into esp. When the subsequent return
executes, the processor causes an exception. I assume the trashed state of
esp then causes a secondary exception during standard exception handling
which makes the process exit without comment. When running as a debugged
process under MSVC, the debugger does catch the problem nicely though.
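
To make the failure mode above concrete, here is a toy model of a conventional 32-bit frame-pointer epilogue (mov esp, ebp / pop ebp / ret). It only computes which stack addresses the epilogue would read and touches no real memory. The names Regs, EpilogueReads and simulate_epilogue are made up for illustration, and the assumption that RequestAsyncReceive uses exactly this epilogue is mine, not something verified against the generated code.

```cpp
#include <cstdint>

// Toy model of the two registers involved; not real CPU state.
struct Regs {
    std::uint32_t esp;
    std::uint32_t ebp;
};

// Stack addresses that a standard frame epilogue would read.
struct EpilogueReads {
    std::uint32_t saved_ebp_addr;    // address read by `pop ebp`
    std::uint32_t return_addr_addr;  // address read by `ret`
};

// Models `mov esp, ebp` / `pop ebp` / `ret` as pure address arithmetic.
// ebp itself is left unchanged because the model has no memory to pop from.
EpilogueReads simulate_epilogue(Regs& r) {
    EpilogueReads reads{};
    r.esp = r.ebp;                   // mov esp, ebp
    reads.saved_ebp_addr = r.esp;    // pop ebp reads [esp] ...
    r.esp += 4;                      // ... and advances esp by 4
    reads.return_addr_addr = r.esp;  // ret reads [esp] ...
    r.esp += 4;                      // ... and advances esp by 4
    return reads;
}
```

With ebp set to 0x10, the model reports reads at 0x10 and 0x14, addresses that are normally not mapped in a user-mode process, which would be consistent with the exception described above.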

Similar things happen in CipeTapIO::CompleteAsyncReceive(). There, the local
variable l_SocketCount is trashed with the value 0x10 in the first iteration
of the loop. Subsequently, the second iteration of the loop causes an

Sadly, I've yet to discover who or what is trashing the stack. As soon as I
add code to verify ebp integrity the problem goes away. The other error case
doesn't happen often enough to be useful. Do you have any inkling whatsoever
where this 0x10 might be coming from?
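
One way such an integrity check could look is a pair of guard words around the suspect local, sketched below. GuardedCount and kGuard are hypothetical names and the pattern value is arbitrary; this is not the check that was actually added. As noted above, instrumentation like this changes the stack layout and may itself hide the problem.

```cpp
#include <cstdint>

// Arbitrary, easily recognised pattern; any value other than 0x10 would do.
constexpr std::uint32_t kGuard = 0xDEADBEEFu;

// A counter (standing in for something like l_SocketCount) bracketed by
// guard words, so that a stray write next to it can be noticed.
struct GuardedCount {
    volatile std::uint32_t front = kGuard;  // volatile: always re-read from memory
    std::uint32_t value = 0;
    volatile std::uint32_t back = kGuard;

    // True while neither guard word has been overwritten.
    bool intact() const { return front == kGuard && back == kGuard; }
};
```

Calling intact() at the top of each loop iteration and again before the function returns would narrow down when the overwrite happens. A write that lands exactly on value and nowhere else would still go unnoticed by this sketch.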


-----Original Message-----
From: Damion Wilson [mailto:dwilson,AT,ibl,DOT,bm]
Sent: Tuesday, March 12, 2002 16:46
To: Jan Olderdissen
Cc: cipe-l,AT,inka,DOT,de
Subject: Re: Service exiting

It may even be one of the error statuses passed out by CompleteIRP in 
cipdrvr.c when it's upset about a packet size or something. I may have to 
reevaluate returning STATUS_UNSUCCESSFUL to the read IRP, substituting
a successful read of 0 bytes instead.
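
For illustration only, this is roughly how the reading side might tell the two completion styles apart if the driver switched to zero-byte successes. ReadStatus, ReadCompletion, NextStep and handle_completion are hypothetical stand-ins, not the types used in cipdrvr.c or the service.

```cpp
#include <cstddef>

// Hypothetical stand-ins for a read completion; not the driver's real types.
enum class ReadStatus { Success, Unsuccessful };

struct ReadCompletion {
    ReadStatus status;
    std::size_t bytes;  // number of bytes delivered to the reader
};

// What the reader does next with a completed read.
enum class NextStep { DeliverPacket, ReissueRead, ReportError };

// A failed read is reported as an error; a successful read of 0 bytes is
// treated as "nothing to deliver" and the read is simply issued again.
NextStep handle_completion(const ReadCompletion& c) {
    if (c.status != ReadStatus::Success) return NextStep::ReportError;
    if (c.bytes == 0) return NextStep::ReissueRead;
    return NextStep::DeliverPacket;
}
```

Under this model a rejected packet no longer takes the reader down its error path; whether that has any bearing on the stack corruption is an open question.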

Another thought. This might be occurring when there's just enough traffic
for the packets to queue up in AdapterTransmit (cipdrvr.c), so that the driver
answers read IRPs with packets stored for a while. Since this code was even 
working properly (re queued packets) in my original CIPE-Win32 (different 
design), I don't know if that's a red herring or not. Again, this behaviour 
does not appear to exhibit itself on NT4.

(Sigh) Well, we're almost there, aren't we?


On Tuesday 12 March 2002 02:18 am, you wrote:
> Hi Damion,
> I'm finally able to devote company time to CIPE. Today I've made some
> inroads into the issue with the service exiting at higher bandwidth. What
> appears to be happening is that under certain conditions,
> CipeTapIO::RequestAsyncReceive() fails to exit and returns to the command
> prompt instead. I'm suspecting some kind of asynchronous interaction with
> the ReadFile() call that sometimes messes up the stack. I can make the
> error much more likely to occur by adding DbgPrint ("a") as the last line
> of the function. If I do that, my CIPE test system becomes completely
> unstable when I do any kind of volume transfer. Sometimes, both sides exit
> simultaneously!
> My current pet theory is that under the failure condition, ReadFile will
> fail but the driver will get a time slice before the function exits.
> Something is messed up at this point and the function's return fails. I
> will research this more, of course. It would be very interesting to find
> out whether you can make the error occur when you add that DbgPrint()
> statement as discussed above.
> I've also found a bizarre workaround that I'll have to research some more.
> It is totally incomprehensible to me why it should make a difference. But
> it does make the system stable. More on that later.
> Jan
