| Subject: | Re: BUG: crasher [IMPORTANT PATCH] |
| From: | Olaf Titz <olaf,AT,bigred,DOT,inka,DOT,de> |
| Date: | Sat, 19 Jan 2002 23:08:06 +0100 |
| In-reply-to: | <E16NgCt-0001da-00@bigred.inka.de> |

> This, IMHO, is a kernel bug. I mean, even if the cipe lkml
> oopses and continues to work, the kernel should eventually
> close the socket. The problem with the thing above is, that
> even though the device is not listed in the kernel space
> anymore we can still send packets to this IP and fill up
> the RX queue and waste kernel memory. This goes up to 132kb!

Not so much a bug as normal behaviour. An oops can leave the kernel in
an inconsistent state. In this case, the process context (of ciped) is
forcibly removed without doing all of the cleanup work usually
required. This cleanup work includes closing all file descriptors,
which in turn causes the cipe module to release the socket. So the
socket remains in use and can't be reclaimed, since all access paths
to it are gone.

This is common with kernel Oopses. Since an Oops can occur _anywhere_
in the kernel (think corrupted memory management data structures) and
it is not possible in the general case to determine the cleanup work
needed, the only reliable way to get the kernel into a consistent
state after an Oops is to reboot.

Of course, the _cause_ of an Oops is always a severe kernel (or in
this case, module) bug.

Olaf