<< | Thread Index | >> ]    [ << | Date Index | >> ]

Subject: Re: [PATCH 2.2.17] _SMP_ doesn't export tasklist_lock needed in cipe.h
From: Olaf Titz <olaf,AT,bigred,DOT,inka,DOT,de>
Date: Wed, 29 Nov 2000 23:19:49 +0100
In-reply-to: <3A22A136.35147233@tac.ch>

> DON'T try to run cipe-1.4.x on a SMP machine. It crashes really
> badly. I try to reconstruct it but this will take some time.

For which .x and (more importantly) for which Linux version? I'd very
much like to know. I don't know much about the whole SMP stuff and
don't have an SMP system to test it.

About the tasklist lock: I have changed tasklist_LOCK() to NOP in CIPE
1.4.5 and 1.5 for Linux 2.2. I'm not really sure if this is correct
but if it isn't correct, the result should be a race condition with
fork() and exit(), which could cause CIPE to accidentally re-assign a
used device, leading to EBUSY in ciped, at worst. Surely not a
reproducable crasher.

If it does crash because of this problem, then the fix for the Linux
kernel is the right one, as modules which need to access the task list
_must_ have access to that lock, then.

> the ../include/asm/spinlock.h is causing the trouble:

Are you _sure_ you're reading this right? (Not my code, btw.)
That not working would basically mean Linux under SMP were unusable at
all.

> /*
>  * On x86, we implement read-write locks as a 32-bit counter
>  * with the high bit (sign) being the "write" bit.
>  *
>  * The inline assembly is non-obvious. Think about it.
>  */
> #define read_lock(rw)   \
>         asm volatile("\n1:\t" \
>                      "lock ; incl %0\n\t" \
>                      "js 2f\n" \
>                      ".section .text.lock,\"ax\"\n" \
>                      "2:\tlock ; decl %0\n" \
>                      "3:\trep; nop\n\t" \
>                      "cmpl $0,%0\n\t" \
>                      "js 3b\n\t" \
>                      "jmp 1b\n" \
>                      ".previous" \
>                      :"=m" (__dummy_lock(&(rw)->lock)))
> #define read_unlock(rw) \
>         asm volatile("lock ; decl %0" \
>                 :"=m" (__dummy_lock(&(rw)->lock)))
>
> Are you sure about this, Olaf?

As stated in the header file, it is non-obvious. ;-)
The .section and .previous commands cause code to be moved around, and
the actual generated machine code looks like this:

  ; Start of read_lock();
1:      lock; incl %0
        js 2f           ; Jump to spinning code if write-locked
  ; End of read_lock(); C code continues here

  ; Elsewhere in the code segment:
2:      lock; decl %0
3:      rep; nop
        cmpl $0, %0
        js 3b           ; Respin if still locked
        jmp 1b          ; Unlocked, retry to acquire lock

which looks much less weird. The reason for moving the code around is to
make the common case (i.e. unlocked and getting the lock succeeds) as
short as possible and get the non-common case out of the cache lines.

> > This trivial patch includes it. However I ask the author to cross-
> > check its correctness since I'm not sure if he really wanted it
> > that way because I can load the module but I can't get a tunnel
> > interface configured and running with ciped-cb -o myconfig. But
> > this may also be due to my lack of knowledge about how cipe works.

What are the error messages, if any? (First, the argument to -o has to
be an absolute path.)

> Is Olaf reading this mailinglist or should I contact him privately?

I'm reading the mailinglist and trying to respond to anything I
consider important (and possible bugs in my code do certainly fall
into that category ;-)

Olaf





<< | Thread Index | >> ]    [ << | Date Index | >> ]