1. 25 Jun, 2008 2 commits
  2. 24 Jun, 2008 21 commits
  3. 23 Jun, 2008 5 commits
    • futexes: fix fault handling in futex_lock_pi · 1b7558e4
      Thomas Gleixner committed
      This patch addresses a very sporadic pi-futex related failure in
      highly threaded java apps on large SMP systems.
      
      David Holmes reported that the pi_state consistency check in
      lookup_pi_state triggered with his test application. This means that
      the kernel internal pi_state and the user space futex variable are out
      of sync. At first we assumed that this was user space data corruption,
      but deeper investigation revealed that the problem happened because the
      pi-futex code does not handle a fault in the futex_lock_pi path when
      the user space variable needs to be fixed up.
      
      The fault happens when a fork mapped the anon memory which contains
      the futex readonly for COW or the page got swapped out exactly between
      the unlock of the futex and the return of either the new futex owner
      or the task which was the expected owner but failed to acquire the
      kernel internal rtmutex. The current futex_lock_pi() code drops out
      with an inconsistent state if it faults and returns -EFAULT to user
      space. User space has no way to fix up that state.
      
      When we wrote this code we thought that we could not drop the hash
      bucket lock at this point to handle the fault.
      
      After analysing the code again, that assumption turned out to be wrong,
      because only two tasks are involved which might modify the pi_state and
      the user space variable:
      
       - the task which acquired the rtmutex
       - the pending owner of the pi_state which did not get the rtmutex
      
      Both tasks drop into the fixup_pi_state() function before returning to
      user space. The first task which acquired the hash bucket lock faults
      in the fixup of the user space variable, drops the spinlock and calls
      futex_handle_fault() to fault in the page. Now the second task can
      acquire the hash bucket lock and try to fix up the user space
      variable as well. It either faults too, or it succeeds because the
      first task already faulted the page in.
      
      One caveat is to avoid a double fixup. After returning from the fault
      handling we reacquire the hash bucket lock and check whether the
      pi_state owner has been modified already.
      Reported-by: David Holmes <david.holmes@sun.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: David Holmes <david.holmes@sun.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      
       kernel/futex.c |   93 ++++++++++++++++++++++++++++++++++++++++++++-------------
       1 file changed, 73 insertions(+), 20 deletions(-)
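The drop-the-lock, fault-in, relock, re-check sequence described in this commit message can be sketched in plain user space C. This is a minimal single-threaded model, not the code from kernel/futex.c: `pi_state_sim`, `bucket_lock()`, `try_write_uval()`, `handle_fault_sim()` and `fixup_owner_sim()` are all hypothetical names, and the "lock" is only a flag used to check the locking rules.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel state; not the real futex.c types. */
struct pi_state_sim {
	int bucket_locked;  /* models the hash bucket lock (1 = held) */
	int owner_tid;      /* kernel-side view of the owner */
	uint32_t uval;      /* models the user space futex word */
	int page_present;   /* 0 = a write to uval "faults" */
};

static void bucket_lock(struct pi_state_sim *s)
{
	assert(!s->bucket_locked);
	s->bucket_locked = 1;
}

static void bucket_unlock(struct pi_state_sim *s)
{
	assert(s->bucket_locked);
	s->bucket_locked = 0;
}

/* Simulated user space write: fails with -EFAULT while the page is absent. */
static int try_write_uval(struct pi_state_sim *s, uint32_t newval)
{
	if (!s->page_present)
		return -EFAULT;
	s->uval = newval;
	return 0;
}

/* Simulated fault handler: must run without the bucket lock held. */
static int handle_fault_sim(struct pi_state_sim *s)
{
	assert(!s->bucket_locked);
	s->page_present = 1;
	return 0;
}

/*
 * Called with the bucket lock held and returns with it held.
 * Makes new_tid the owner in both the kernel-side state and uval.
 */
static int fixup_owner_sim(struct pi_state_sim *s, int new_tid)
{
	int old_tid = s->owner_tid;
	int ret;

	assert(s->bucket_locked);
	while (try_write_uval(s, (uint32_t)new_tid) != 0) {
		/* Fault: drop the lock, fault the page in, retake the lock. */
		bucket_unlock(s);
		ret = handle_fault_sim(s);
		bucket_lock(s);

		/* Avoid a double fixup: the other task may have done it already. */
		if (s->owner_tid != old_tid)
			return 0;
		if (ret)
			return ret;
	}
	s->owner_tid = new_tid;
	return 0;
}
```

The point of the re-check is that the owner recorded before the lock was dropped is compared again after relocking, so whichever of the two tasks gets there second sees the change and does nothing.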
    • ALSA: sb - Fix wrong assertions · 3e14b50d
      Takashi Iwai committed
      snd_assert() in save_mixer() and restore_mixer() in sb_mixer.c is
      just wrong.  The debug code wasn't tested at all, obviously...
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
    • ALSA: aw2 - Fix Oops at initialization · 44e05177
      Takashi Iwai committed
      The irq handler may be called before the proper initialization of hardware.
      Call snd_aw2_saa7146_setup() before the irq handler registration.
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
    • Merge branch 'linus' into sched/urgent · 198bb971
      Ingo Molnar committed
    • Fix performance regression on lmbench select benchmark · 55d85384
      Linus Torvalds committed
      Christian Borntraeger reported that reinstating cond_resched() with
      CONFIG_PREEMPT caused a performance regression on lmbench:
      
      	For example select file 500:
      	23 microseconds
      	32 microseconds
      
      and that's really because we totally unnecessarily do the cond_resched()
      in the innermost loop of select(), which is just silly.
      
      This moves it out from the innermost loop (which only ever loops over the
      bits in a single "unsigned long" anyway), which makes the performance
      regression go away.
      Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
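The loop restructuring described here can be illustrated with a small user space sketch. It is not the select() code itself: `maybe_yield()` is a hypothetical stand-in for cond_resched() that only counts its calls, and `scan_bitmap()` is an invented helper that walks a bitmap one "unsigned long" at a time.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG_SIM (sizeof(unsigned long) * CHAR_BIT)

/* Counts calls to the hypothetical yield hook. */
static unsigned long yield_calls;

/* Stand-in for cond_resched(): here it only records that it was called. */
static void maybe_yield(void)
{
	yield_calls++;
}

/* Counts the set bits in nwords words, yielding once per word. */
static size_t scan_bitmap(const unsigned long *words, size_t nwords)
{
	size_t ready = 0;

	for (size_t i = 0; i < nwords; i++) {
		unsigned long w = words[i];

		/* Innermost loop: only the bits of a single word, no yield here. */
		for (size_t bit = 0; bit < BITS_PER_LONG_SIM; bit++) {
			if (w & (1UL << bit))
				ready++;
		}

		/* The yield point sits in the outer loop instead. */
		maybe_yield();
	}
	return ready;
}
```

With the yield point in the outer loop, the hook runs once per word rather than once per bit, which is the reduction the commit message is after.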
  4. 22 Jun, 2008 4 commits
  5. 21 Jun, 2008 8 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · a1921443
      Linus Torvalds committed
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
        netns: Don't receive new packets in a dead network namespace.
        sctp: Make sure N * sizeof(union sctp_addr) does not overflow.
        pppoe: warning fix
        ipv6: Drop packets for loopback address from outside of the box.
        ipv6: Remove options header when setsockopt's optlen is 0
        mac80211: detect driver tx bugs
    • netns: Don't receive new packets in a dead network namespace. · b9f75f45
      Eric W. Biederman committed
      Alexey Dobriyan <adobriyan@gmail.com> writes:
      > Subject: ICMP sockets destruction vs ICMP packets oops
      
      > After icmp_sk_exit() nuked ICMP sockets, we get an interrupt.
      > icmp_reply() wants ICMP socket.
      >
      > Steps to reproduce:
      >
      > 	launch shell in new netns
      > 	move real NIC to netns
      > 	setup routing
      > 	ping -i 0
      > 	exit from shell
      >
      > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
      > IP: [<ffffffff803fce17>] icmp_sk+0x17/0x30
      > PGD 17f3cd067 PUD 17f3ce067 PMD 0 
      > Oops: 0000 [1] PREEMPT SMP DEBUG_PAGEALLOC
      > CPU 0 
      > Modules linked in: usblp usbcore
      > Pid: 0, comm: swapper Not tainted 2.6.26-rc6-netns-ct #4
      > RIP: 0010:[<ffffffff803fce17>]  [<ffffffff803fce17>] icmp_sk+0x17/0x30
      > RSP: 0018:ffffffff8057fc30  EFLAGS: 00010286
      > RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff81017c7db900
      > RDX: 0000000000000034 RSI: ffff81017c7db900 RDI: ffff81017dc41800
      > RBP: ffffffff8057fc40 R08: 0000000000000001 R09: 000000000000a815
      > R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8057fd28
      > R13: ffffffff8057fd00 R14: ffff81017c7db938 R15: ffff81017dc41800
      > FS:  0000000000000000(0000) GS:ffffffff80525000(0000) knlGS:0000000000000000
      > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      > CR2: 0000000000000000 CR3: 000000017fcda000 CR4: 00000000000006e0
      > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      > Process swapper (pid: 0, threadinfo ffffffff8053a000, task ffffffff804fa4a0)
      > Stack:  0000000000000000 ffff81017c7db900 ffffffff8057fcf0 ffffffff803fcfe4
      >  ffffffff804faa38 0000000000000246 0000000000005a40 0000000000000246
      >  000000000001ffff ffff81017dd68dc0 0000000000005a40 0000000055342436
      > Call Trace:
      >  <IRQ>  [<ffffffff803fcfe4>] icmp_reply+0x44/0x1e0
      >  [<ffffffff803d3a0a>] ? ip_route_input+0x23a/0x1360
      >  [<ffffffff803fd645>] icmp_echo+0x65/0x70
      >  [<ffffffff803fd300>] icmp_rcv+0x180/0x1b0
      >  [<ffffffff803d6d84>] ip_local_deliver+0xf4/0x1f0
      >  [<ffffffff803d71bb>] ip_rcv+0x33b/0x650
      >  [<ffffffff803bb16a>] netif_receive_skb+0x27a/0x340
      >  [<ffffffff803be57d>] process_backlog+0x9d/0x100
      >  [<ffffffff803bdd4d>] net_rx_action+0x18d/0x250
      >  [<ffffffff80237be5>] __do_softirq+0x75/0x100
      >  [<ffffffff8020c97c>] call_softirq+0x1c/0x30
      >  [<ffffffff8020f085>] do_softirq+0x65/0xa0
      >  [<ffffffff80237af7>] irq_exit+0x97/0xa0
      >  [<ffffffff8020f198>] do_IRQ+0xa8/0x130
      >  [<ffffffff80212ee0>] ? mwait_idle+0x0/0x60
      >  [<ffffffff8020bc46>] ret_from_intr+0x0/0xf
      >  <EOI>  [<ffffffff80212f2c>] ? mwait_idle+0x4c/0x60
      >  [<ffffffff80212f23>] ? mwait_idle+0x43/0x60
      >  [<ffffffff8020a217>] ? cpu_idle+0x57/0xa0
      >  [<ffffffff8040f380>] ? rest_init+0x70/0x80
      > Code: 10 5b 41 5c 41 5d 41 5e c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 53
      > 48 83 ec 08 48 8b 9f 78 01 00 00 e8 2b c7 f1 ff 89 c0 <48> 8b 04 c3 48 83 c4 08
      > 5b c9 c3 66 66 66 66 66 2e 0f 1f 84 00
      > RIP  [<ffffffff803fce17>] icmp_sk+0x17/0x30
      >  RSP <ffffffff8057fc30>
      > CR2: 0000000000000000
      > ---[ end trace ea161157b76b33e8 ]---
      > Kernel panic - not syncing: Aiee, killing interrupt handler!
      
      Receiving packets while we are cleaning up a network namespace is a
      racy proposition. It is possible when the packet arrives that we have
      removed some but not all of the state we need to fully process it.  We
      have the choice of either playing whack-a-mole with the cleanup routines
      or simply dropping packets when we don't have a network namespace to
      handle them.
      
      Since the check looks inexpensive in netif_receive_skb let's just
      drop the incoming packets.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
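The "drop packets when the namespace is gone" idea can be sketched as a liveness check at the top of the receive path. This is an assumption-laden user space model, not the actual change to netif_receive_skb: `net_sim`, `dev_sim`, `pkt_sim`, `net_alive_sim()` and `receive_pkt_sim()` are hypothetical names, and a plain integer stands in for the namespace reference count.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: count drops to 0 once the namespace is being torn down. */
struct net_sim { int count; };
struct dev_sim { struct net_sim *net; };
struct pkt_sim { struct dev_sim *dev; };

enum rx_result { RX_DROP = 0, RX_DELIVERED = 1 };

/* A namespace is considered alive while it exists and is still referenced. */
static bool net_alive_sim(const struct net_sim *net)
{
	return net != NULL && net->count > 0;
}

/* Drop early if the packet's namespace is dead; otherwise count a delivery. */
static enum rx_result receive_pkt_sim(const struct pkt_sim *pkt,
				      unsigned long *delivered)
{
	if (!net_alive_sim(pkt->dev->net))
		return RX_DROP;

	(*delivered)++;
	return RX_DELIVERED;
}
```

A single check at the entry point means the protocol handlers further down, such as the ICMP reply path in the oops above, never run against half-removed state.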
    • sctp: Make sure N * sizeof(union sctp_addr) does not overflow. · 735ce972
      David S. Miller committed
      As noticed by Gabriel Campana, the kmalloc() length arg
      passed in by sctp_getsockopt_local_addrs_old() can overflow
      if ->addr_num is large enough.
      
      Therefore, enforce an appropriate limit.
      Signed-off-by: David S. Miller <davem@davemloft.net>
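A bounds check of this kind might look like the sketch below. The commit message does not say which limit was chosen, so the bound used here (INT_MAX divided by the element size) is an assumption, and `addr_sim` and `addrs_alloc_len()` are hypothetical names standing in for union sctp_addr and the length computation.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for union sctp_addr; only its size matters here. */
union addr_sim {
	unsigned char v4[16];
	unsigned char v6[28];
};

/*
 * Validates a caller-supplied element count before it is multiplied.
 * Returns 0 and stores the byte length, or -EINVAL if addr_num is
 * non-positive or large enough that the product could exceed INT_MAX.
 */
static int addrs_alloc_len(int addr_num, size_t *len_out)
{
	if (addr_num <= 0 ||
	    (size_t)addr_num >= INT_MAX / sizeof(union addr_sim))
		return -EINVAL;

	*len_out = (size_t)addr_num * sizeof(union addr_sim);
	return 0;
}
```

Dividing the limit by the element size, rather than multiplying first and checking afterwards, keeps the check itself from overflowing.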
    • pppoe: warning fix · 2645a3c3
      Stephen Hemminger committed
      Fix warning:
      drivers/net/pppoe.c: In function 'pppoe_recvmsg':
      drivers/net/pppoe.c:945: warning: comparison of distinct pointer types lacks a cast
      because skb->len is unsigned int and total_len is size_t
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
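One common way to silence this kind of warning is to cast both operands to a single named type before comparing them. The sketch below assumes that approach; the commit message does not show the actual change. `MIN_T` and `clamp_len()` are hypothetical names, and unlike a statement-expression macro this simple form evaluates its arguments more than once, so it should only be given side-effect-free expressions.

```c
#include <stddef.h>

/*
 * Type-explicit minimum: both operands are converted to 'type' before the
 * comparison, so an unsigned int and a size_t can be compared without
 * mixing types. Arguments are evaluated more than once.
 */
#define MIN_T(type, a, b) \
	((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Limits the number of bytes to copy to the length of the buffer. */
static size_t clamp_len(size_t total_len, unsigned int skb_len)
{
	return MIN_T(size_t, total_len, skb_len);
}
```

Naming the comparison type at the call site makes the widening from unsigned int to size_t explicit instead of leaving it to the implicit conversions.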
    • Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 · b732d968
      Linus Torvalds committed
      * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
        [IA64] SN2: security hole in sn2_ptc_proc_write
    • alpha: resurrect Cypress IDE quirk · a744e016
      Ivan Kokshaysky committed
      The quirk was removed in the hope that the generic legacy IDE quirk in
      drivers/pci/probe.c would be sufficient for Cypress IDE.
      It isn't, as this controller has a non-standard BAR layout:
      secondary channel registers are in the BAR0-1 of the second
      PCI function - not in the BAR2-3 of the same function, as the
      generic quirk routine assumes.
      Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • alpha: fix compile failures with gcc-4.3 (bug #10438) · d559d4a2
      Ivan Kokshaysky committed
      The vast majority of these build failures are gcc-4.3 warnings
      about static functions and objects being referenced from
      non-static (read: "extern inline") functions, in conjunction
      with our -Werror.
      
      We cannot just convert "extern inline" to "static inline",
      as people keep suggesting all the time, because "extern inline"
      logic is crucial for generic kernel build.
      So
      - just make sure that all callees of critical "extern inline"
        functions are also "extern inline";
      - use "static inline", wherever it's possible.
      
      traps.c: work around gcc-4.3 being too smart about array
      bounds-checking.
      
      TODO: add "gnu_inline" attribute to all our "extern inline"
      functions to ensure desired behaviour with future compilers.
      Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • alpha: link failure fix · ede42692
      Ivan Kokshaysky committed
      With a built-in scsi disk driver, the final link fails with the following
      error:
      `.exit.text' referenced in section `.rodata' of drivers/built-in.o:
      defined in discarded section `.exit.text' of drivers/built-in.o
      
      This happens with -Os (CONFIG_CC_OPTIMIZE_FOR_SIZE=y) with all gcc-4
      versions, and also with -O2 and gcc-4.3.
      
      The problem is in sd.c:sd_major() being inlined into __exit function
      exit_sd(), and the compiler generating a jump table in .rodata section
      for the 'switch' statement in sd_major(). So we have references to
      a discarded section.
      
      Fixed with a big hammer in the form of -fno-jump-tables.
      
      Note that jump tables vs. discarded sections is a generic problem,
      other architectures are just lucky not to suffer from it. But with
      a slightly more complex switch/case statement it can be reproduced
      on x86 as well. So maybe at some point we should consider
      -fno-jump-tables as a generic compile option...
      Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>