RE: Service exiting
Jan Olderdissen <jolderdissen,AT,ixiacom,DOT,com>
Wed, 13 Mar 2002 19:26:53 +0100
Yes, I definitely have no blue screens under any conditions with the flags
removed.
I'm not sure what drivers are supposed to do on hibernation. But a regular
shutdown will also crash my W2k machine with the flags present. I suspect
the driver gets unloaded by default during hibernation.
From: Damion Wilson [mailto:dwilson,AT,ibl,DOT,bm]
Sent: Wednesday, March 13, 2002 10:12 AM
To: Jan Olderdissen
Subject: Re: Service exiting
You're saying it definitely does NOT crash on hibernate with the flags
removed?
If so, there's something else at play here. I don't have a test machine that
does hibernate. In fact, I don't have code in the driver specifically to
handle hibernates. What is supposed to happen to each device when the driver
is told to hibernate/suspend ?
See, there's no problem in allocating these structures with the flags. The
IRQL is right and there's more than enough of this type of memory to do it,
so that's not the issue. The only thing that can happen to this memory is
freeing it at the wrong IRQL or with the wrong flags. Do the devices get
deallocated during hibernate and the driver unloaded ? (I need to read up on
this when I get home)
More to come...
On Wednesday 13 March 2002 01:49 pm, you wrote:
> please point me to the place in the documentation where non-cached,
> contiguous memory is required for the device structure. All of my kernel
> mode drivers never use these flags. A couple of them have been running in
> the field for years without any detrimental effects. And yes, I've already
> removed the flags and run without them for a while because my system will
> crash on hibernation attempt with them.
> -----Original Message-----
> From: Damion Wilson [mailto:dwilson,AT,ibl,DOT,bm]
> Sent: Wednesday, March 13, 2002 9:45 AM
> To: Jan Olderdissen
> Cc: cipe-l,AT,inka,DOT,de
> Subject: Re: Service exiting
> How did I reverse those ? It must have been a late night :-) I should
> probably go and cull some more of the extraneous DbgPrint calls.
> That's not completely correct. The device structure does indeed need
> those flags and is valid in using them as it happens in driver
> initialization. You're welcome to try it out without them. The only other
> instances were within the linked list management routines which now, do
> use them.
> I don't know what WinZip did but I didn't add those paths. They don't even
> exist in my development environment.
> On Wednesday 13 March 2002 11:51 am, you wrote:
> > You might consider also fixing the other two issues I had brought up a
> > while ago:
> > 1. Incorrect argument to DbgPrint
> > In CipeSocketIO::CipeSocketIO:
> > DbgPrint ("[%s] Can't bind to port [%d]\n", ntohs
> > (l_LocalInfo.sin_port), Name().c_str());
> > should be
> > DbgPrint ("[%s] Can't bind to port [%d]\n", Name().c_str(), ntohs
> > (l_LocalInfo.sin_port));
> > This currently causes the service to crash when the system is
> > misconfigured.
> > 2. non-cached, contiguous flags
> > In the driver, all instances of NDIS_MEMORY_NONCACHED |
> > NDIS_MEMORY_CONTIGUOUS flags should be replaced by 0.
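The crash described in item 1 comes from the variadic arguments not lining up with the conversion specifiers: %s receives an integer and tries to read it as a string pointer. A minimal sketch of the corrected ordering follows, using standard snprintf rather than DbgPrint so it can be compiled anywhere. The helper name FormatBindError is hypothetical, and the port is assumed to have already been converted to host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of CIPE): builds the bind-failure message.
// The arguments follow the order of the conversion specifiers:
// %s takes the adapter name first, %d takes the port second.
std::string FormatBindError(const std::string& name, unsigned short port) {
    char buf[128];
    int n = std::snprintf(buf, sizeof buf,
                          "[%s] Can't bind to port [%d]\n",
                          name.c_str(), static_cast<int>(port));
    // snprintf returns a negative value on error and the untruncated
    // length otherwise; this sketch treats truncation as a programming bug.
    assert(n >= 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf);
}
```

Swapping the two trailing arguments compiles without error on many toolchains, which is why this kind of bug tends to surface only on a rarely taken error path such as a failed bind.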
> > I noticed that the released zip package currently contains a bunch of
> > directories that weren't there before. This item has nuisance value
> > Jan
> > -----Original Message-----
> > From: Damion Wilson [mailto:dwilson,AT,ibl,DOT,bm]
> > Sent: Wednesday, March 13, 2002 7:38 AM
> > To: Jan Olderdissen
> > Subject: Re: Service exiting
> > Good man ! I wish someone would pay me to work on CIPE all day ! I will
> > try it out tonight and put out the bugfix release (with your credits added).
> > It's interesting that the call stack should be arranged just so for the
> > socket overlapped I/O operations to corrupt the TAP I/O method
> > This and other "gotchas" seem to imply that, regardless of how long they
> > took, Microsoft really didn't think out the asynchronous I/O stuff like
> > they should have. It also tends to explain why Unixen typically don't
> > implement asynchronous I/O (instead using explicit threading or multiple
> > processes w/shared memory).
> > DKW
> > On Wednesday 13 March 2002 12:58 am, you wrote:
> > > Damion,
> > >
> > > I think I got it! Check out the call to WSARecvFrom() in
> > > CipeSocketIO::RequestAsyncReceive(). The argument lpFromlen has to be
> > > persistent until the IO completes according to the documentation.
> > > However, it is an automatic variable! And, not surprisingly, the value
> > > is set to 0x10 before the call to WSARecvFrom. It stands to reason that
> > > it is set to 0x10 again upon completion of the asynchronous IO.
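The lifetime problem described here can be modelled without Winsock. The sketch below is only an illustration under stated assumptions: FakeAsyncSocket is a hypothetical stand-in for an overlapped receive, not the WSARecvFrom API, and Receiver and ReceiveContext are invented names. It shows the usual remedy, which is to keep every out-parameter in a per-operation context that outlives the function that starts the I/O.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Stand-in for an overlapped receive. It stores the pointer it was given and
// writes through it only later, when CompleteAll() simulates I/O completion.
// This models the pointer-lifetime contract only; it is not the Winsock API.
class FakeAsyncSocket {
public:
    void BeginReceive(int* fromLen) { pending_.push_back(fromLen); }
    void CompleteAll() {
        for (int* p : pending_) *p = 16;  // "actual" sender address length
        pending_.clear();
    }
private:
    std::vector<int*> pending_;
};

// Everything the pending operation may write to lives here, not on the stack.
struct ReceiveContext {
    char from[28];                 // room for the sender address
    int fromLen = sizeof(from);    // in: capacity, out: bytes written
};

// Hypothetical receiver. The context is a member, so it is still alive when
// the completion writes to fromLen after RequestAsyncReceive() has returned.
// Assumes at most one outstanding receive at a time.
class Receiver {
public:
    explicit Receiver(FakeAsyncSocket& s) : socket_(s) {}
    void RequestAsyncReceive() {
        context_ = std::make_unique<ReceiveContext>();
        socket_.BeginReceive(&context_->fromLen);
    }
    int LastFromLen() const { return context_ ? context_->fromLen : -1; }
private:
    FakeAsyncSocket& socket_;
    std::unique_ptr<ReceiveContext> context_;
};
```

Had fromLen been a local variable inside RequestAsyncReceive(), the later write in CompleteAll() would land in whatever stack frame happens to occupy that address at the time, which matches the kind of stack corruption reported earlier in the thread.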
> > >
> > > Jan
> > >
> > > -----Original Message-----
> > > From: Damion Wilson [mailto:dwilson,AT,ibl,DOT,bm]
> > > Sent: Tuesday, March 12, 2002 16:46
> > > To: Jan Olderdissen
> > > Cc: cipe-l,AT,inka,DOT,de
> > > Subject: Re: Service exiting
> > >
> > >
> > > It may even be one of the error statuses passed out by CompleteIRP in
> > > cipdrvr.c when it's upset about a packet size or something. I may have to
> > > reevaluate returning STATUS_UNSUCCESSFUL to the read IRP, substituting
> > > instead a successful read of 0 bytes.
> > >
> > > Another thought. This might be occurring when there's just enough for
> > > the packets to queue up in AdapterTransmit (cipdrvr.c) so that the
> > > answers read IRPs with packets stored for a while. Since this code was
> > > even working properly (re queued packets) in my original CIPE-Win32
> > > design), I don't know if that's a red herring or not. Again, this
> > > behaviour does not appear to exhibit itself on NT4.
> > >
> > > (Sigh) well we're almost there aren't we ?
> > >
> > > DKW
> > >
> > > On Tuesday 12 March 2002 02:18 am, you wrote:
> > > > Hi Damion,
> > > >
> > > > I'm finally able to devote company time to CIPE. Today I've made
> > > > inroads into the issue with the service exiting at higher bandwidth.
> > > > What appears to be happening is that under certain conditions,
> > > > CipeTapIO::RequestAsyncReceive() fails to exit and returns to the
> > > > command prompt instead. I'm suspecting some kind of asynchronous
> > > > interaction with the ReadFile() call that sometimes messes up the
> > > > stack. I can make the error much more likely to occur by adding
> > > > DbgPrint ("a") as the last line of the function. If I do that, my
> > > > CIPE test system becomes unstable when I do any kind of volume
> > > > transfer. Sometimes, both exit simultaneously!
> > > >
> > > > My current pet theory is that under the failure condition, ReadFile
> > > > will fail but the driver will get a time slice before the function
> > > > exits. Something is messed up at this point and the function's return
> > > > fails. I will research this more, of course. It would be very
> > > > interesting to find out whether you can make the error occur when you
> > > > add that DbgPrint() statement as discussed above.
> > > >
> > > > I've also found a bizarre workaround that I'll have to research some
> > > > more. It is totally incomprehensible to me why it should make a
> > > > difference. But it does make the system stable. More on that later.
> > > >
> > > > Jan