1. 19 Oct, 2010 9 commits
    • sunrpc: remove the big kernel lock · a6f8dbc6
      Authored by Arnd Bergmann
      The sunrpc cache_ioctl function does not need the big kernel lock
      because it uses its own queue_lock already.
      
      rpc_pipe_ioctl apparently should be using i_lock like the other
      operations on the pipe file descriptor do.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • init/main.c: remove BKL notations · 1fa4f3b5
      Authored by Namhyung Kim
      According to commit 5e3d20a6
      (init: Remove the BKL from startup code) these sparse notations
      should be removed also.
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • blktrace: remove the big kernel lock · 01b284f9
      Authored by Arnd Bergmann
      According to Jens, this code does not need the BKL at all,
      it is sufficiently serialized by bd_mutex.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Jens Axboe <jaxboe@fusionio.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
    • rtmutex-tester: make it build without BKL · 0fc86c7b
      Authored by Arnd Bergmann
      The big kernel lock is going away, so make sure
      that if it is disabled by Kconfig, we do not
      try to validate it, which would result in
      compile errors.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
    • dvb-core: kill the big kernel lock · 72024f1e
      Authored by Arnd Bergmann
      The dvb core only uses the big kernel lock in the open
      and ioctl functions, which means it can be replaced with
      a dvb specific mutex. Fortunately, all the ioctl functions
      go through dvb_usercopy, so we can move the serialization
      in there.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: linux-media@vger.kernel.org
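
The idea in the dvb-core message, putting the serialization into the one function that every ioctl passes through, can be sketched in user space roughly as follows. This is only an analogy: `usercopy_demo`, `subsys_mutex`, `ioctl_handler` and `bump_counter` are hypothetical names, and a pthread mutex stands in for the kernel mutex that replaces the BKL.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical handler type, loosely modelled on an ioctl callback. */
typedef int (*ioctl_handler)(unsigned int cmd, void *arg);

/* One subsystem-wide mutex, standing in for the former big kernel lock. */
static pthread_mutex_t subsys_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Every ioctl is dispatched through this wrapper, so the locking lives
 * in exactly one place and individual handlers stay lock-free.
 */
static int usercopy_demo(unsigned int cmd, void *arg, ioctl_handler fn)
{
    int ret;

    pthread_mutex_lock(&subsys_mutex);
    ret = fn(cmd, arg);
    pthread_mutex_unlock(&subsys_mutex);
    return ret;
}

/* Example handler that is not safe to run concurrently on its own. */
static int counter;

static int bump_counter(unsigned int cmd, void *arg)
{
    (void)arg;
    counter += (int)cmd;
    return counter;
}
```

Calling `usercopy_demo(2, NULL, bump_counter)` runs the handler with the mutex held and passes its return value through. The design choice is that a single choke point is easier to audit than a lock call in every handler.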
    • dvb/bt8xx: kill the big kernel lock · adfedd21
      Authored by Arnd Bergmann
      The bt8xx driver only uses the big kernel lock in its dst_ca_ioctl
      function and never to serialize against other code, so we can
      trivially replace it with a private mutex.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-media@vger.kernel.org
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
    • tlclk: remove big kernel lock · efbec1cd
      Authored by Arnd Bergmann
      This driver already has a global mutex, so let's just
      use that in the open function instead of the BKL.
      It may not even be needed there, but this patch should
      have the smallest impact.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Mark Gross <mark.gross@intel.com>
    • fix rawctl compat ioctls breakage on amd64 and itanic · c4a04727
      Authored by Al Viro
      RAW_SETBIND and RAW_GETBIND 32bit versions are fscked in interesting ways.
      
      1) fs/compat_ioctl.c has COMPATIBLE_IOCTL(RAW_SETBIND) followed by
      HANDLE_IOCTL(RAW_SETBIND, raw_ioctl).  The latter is ignored.
      
      2) on amd64 (and itanic) the damn thing is broken - we have int + u64 + u64
      and layouts on i386 and amd64 are _not_ the same.  raw_ioctl() would
      work there, but it's never called due to (1).  As it is, i386 /sbin/raw
      definitely doesn't work on amd64 boxen.
      
      3) switching to raw_ioctl() as is would *not* work on e.g. sparc64 and ppc64,
      which would be rather sad, seeing that normal userland there is 32bit.
      The thing is, slapping __packed on the struct in question does not DTRT -
      it eliminates *all* padding.  The real solution is to use compat_u64.
      
      4) of course, all that stuff has no business being outside of raw.c in the
      first place - there should be ->compat_ioctl() for /dev/rawctl instead of
      messing with compat_ioctl.c.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [arnd@arndb.de: port to 2.6.36]
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
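
The layout problem in point 2 and the difference between `__packed` and `compat_u64` in point 3 can be illustrated with a small user-space sketch. The struct names and `compat_u64_demo` are hypothetical stand-ins; the sketch assumes GCC or Clang on a 64-bit target where `uint64_t` is naturally 8-byte aligned, and it models the x86 flavour of the compat type (a 64-bit integer with 4-byte alignment, as i386 lays it out).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an x86-style compat_u64: 64 bits wide, 4-byte aligned. */
typedef uint64_t __attribute__((aligned(4))) compat_u64_demo;

/* Native 64-bit layout: 4 bytes of padding follow the int. */
struct req_native {
    int      minor;
    uint64_t block_major;
    uint64_t block_minor;
};

/* Packed: removes all padding and drops the struct alignment to 1. */
struct req_packed {
    int      minor;
    uint64_t block_major;
    uint64_t block_minor;
} __attribute__((packed));

/* Compat-style: only the alignment of the 64-bit members changes. */
struct req_compat {
    int             minor;
    compat_u64_demo block_major;
    compat_u64_demo block_minor;
};
```

Under those assumptions `offsetof(struct req_native, block_major)` is 8 while the compat variant places it at 4, which is the mismatch between amd64 and i386 described above. The packed variant happens to produce the same offsets here, but it does so by forcing byte alignment everywhere. On an architecture whose 32-bit ABI aligns 64-bit integers to 8 bytes, the compat type would be an ordinary 64-bit integer and keep the padding, whereas packing would still strip it.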
    • uml: kill big kernel lock · 9a181c58
      Authored by Arnd Bergmann
      Three uml device drivers still use the big kernel lock,
      but all of them can be safely converted to using
      a per-driver mutex instead. Most likely this is not
      even necessary, so after further review these can
      and should be removed as well.
      
      The exec system call no longer requires the BKL either,
      so remove it from there, too.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: user-mode-linux-devel@lists.sourceforge.net
  2. 17 Oct, 2010 1 commit
    • parisc: remove big kernel lock · fa0d4c26
      Authored by Arnd Bergmann
      The parisc version of the perf code is sufficiently
      protected by its own spinlock, no need to use the BKL.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: linux-parisc@vger.kernel.org
  3. 26 Sep, 2010 4 commits
    • cris: autoconvert trivial BKL users · 0890b588
      Authored by Arnd Bergmann
      All uses of the big kernel lock in the cris architecture
      are for ioctl and open functions of character device drivers,
      which can be trivially converted to a per-driver mutex.
      
      Most of these are probably unnecessary, so it may make sense
      to audit them and eventually remove the extra mutex introduced
      by this patch.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: linux-cris-kernel@axis.com
    • alpha: kill big kernel lock · 80eb4a6f
      Authored by Arnd Bergmann
      All uses of the BKL on alpha are totally bogus, nothing
      is really protected by this. Remove the remaining users
      so we don't have to mark alpha as 'depends on BKL'.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: linux-alpha@vger.kernel.org
    • isapnp: BKL removal · 6117d213
      Authored by Arnd Bergmann
      Remove BKL use from isapnp_proc_bus_lseek(), as was done for
      proc_bus_pci_lseek() a long time ago and recently for Zorro
      by Geert Uytterhoeven.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Jaroslav Kysela <perex@perex.cz>
    • s390/block: kill the big kernel lock · cfdb00a7
      Authored by Arnd Bergmann
      The dasd and dcssblk drivers gained the big
      kernel lock in the recent pushdown from the
      block layer, but they don't really need it,
      so remove the calls without a replacement.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux-s390@vger.kernel.org
  4. 16 Sep, 2010 1 commit
    • hpet: kill BKL, add compat_ioctl · 54066a57
      Authored by Arnd Bergmann
      hpet uses the big kernel lock in its ioctl and open
      functions. Replace this with a private mutex to be
      sure. Since we're already touching the ioctl function,
      add the compat_ioctl version as well -- all commands
      except HPET_INFO are compatible and that one is easy
      to add.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Clemens Ladisch <clemens@ladisch.de>
      Cc: Bob Picco <bob.picco@hp.com>
  5. 13 Sep, 2010 1 commit
  6. 12 Sep, 2010 6 commits
  7. 11 Sep, 2010 8 commits
  8. 10 Sep, 2010 10 commits
    • xfs: log IO completion workqueue is a high priority queue · 51749e47
      Authored by Dave Chinner
      The workqueue implementation in 2.6.36-rcX has changed, resulting
      in the workqueues no longer having dedicated threads for work
      processing. This has caused severe livelocks under heavy parallel
      create workloads because the log IO completions have been getting
      held up behind metadata IO completions.  Hence log commits would
      stall, memory allocation would stall because pages could not be
      cleaned, and lock contention on the AIL during inode IO completion
      processing was being seen to slow everything down even further.
      
      By making the log IO completion workqueue a high priority workqueue,
      log IO completions are queued ahead of all data/metadata IO completions
      and processed before them. Hence the log never
      gets stalled, and operations needed to clean memory can continue as
      quickly as possible. This avoids the livelock conditions and allows
      the system to keep running under heavy load as per normal.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
    • execve: make responsive to SIGKILL with large arguments · 9aea5a65
      Authored by Roland McGrath
      An execve with a very large total of argument/environment strings
      can take a really long time in the execve system call.  It runs
      uninterruptibly to count and copy all the strings.  This change
      makes it abort the exec quickly if sent a SIGKILL.
      
      Note that this is the conservative change, to interrupt only for
      SIGKILL, by using fatal_signal_pending().  It would be perfectly
      correct semantics to let any signal interrupt the string-copying in
      execve, i.e. use signal_pending() instead of fatal_signal_pending().
      We'll save that change for later, since it could have user-visible
      consequences, such as having a timer set too quickly make it so that
      an execve can never complete, though it always happened to work before.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
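
The pattern described above, checking for a pending fatal signal inside a long copy loop and bailing out early, might look roughly like this in user space. It is an analogy only: `abort_requested` is a hypothetical stand-in for the kernel's `fatal_signal_pending()` test, `copy_strings_demo` and `request_abort` are made-up names, and the error codes are chosen for illustration rather than taken from the kernel change.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Set asynchronously (for example from a signal handler) to request an abort. */
static volatile sig_atomic_t abort_requested;

/* Signal-handler-shaped helper that raises the abort flag. */
static void request_abort(int signo)
{
    (void)signo;
    abort_requested = 1;
}

/*
 * Copy n NUL-terminated strings back to back into dst.
 * Returns the number of bytes copied, -EINTR if an abort was requested
 * part-way through, or -E2BIG if dst is too small.
 */
static long copy_strings_demo(char *dst, size_t cap,
                              const char *const *src, size_t n)
{
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(src[i]) + 1;

        /* Check once per string so a huge argument list stays killable. */
        if (abort_requested)
            return -EINTR;
        if (len > cap - used)
            return -E2BIG;
        memcpy(dst + used, src[i], len);
        used += len;
    }
    return (long)used;
}
```

Checking only a dedicated "fatal" flag mirrors the conservative choice in the message: ordinary signals do not interrupt the copy, so existing behaviour is unchanged for everything except a kill request.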
    • execve: improve interactivity with large arguments · 7993bc1f
      Authored by Roland McGrath
      This adds a preemption point during the copying of the argument and
      environment strings for execve, in copy_strings().  There is already
      a preemption point in the count() loop, so this doesn't add any new
      points in the abstract sense.
      
      When the total argument+environment strings are very large, the time
      spent copying them can be much more than a normal user time slice.
      So this change improves the interactivity of the rest of the system
      when one process is doing an execve with very large arguments.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • setup_arg_pages: diagnose excessive argument size · 1b528181
      Authored by Roland McGrath
      The CONFIG_STACK_GROWSDOWN variant of setup_arg_pages() does not
      check the size of the argument/environment area on the stack.
      When it is unworkably large, shift_arg_pages() hits its BUG_ON.
      This is exploitable with a very large RLIMIT_STACK limit, to
      create a crash pretty easily.
      
      Check that the initial stack is not too large to make it possible
      to map in any executable.  We're not checking that the actual
      executable (or interpreter, for binfmt_elf) will fit.  So those
      mappings might clobber part of the initial stack mapping.  But
      that is just userland lossage that userland made happen, not a
      kernel problem.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
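
The kind of up-front check the message describes, rejecting an oversized initial stack with an error instead of running into an assertion later, can be sketched as below. The function name, parameters and the exact bound are assumptions made for illustration; the sketch does not reproduce the kernel's actual condition.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical check: the initial stack occupies stack_size bytes ending
 * at stack_top, and nothing may be mapped below min_addr.
 * Returns 0 if some room is left below the stack, -ENOMEM otherwise.
 */
static int check_initial_stack(uintptr_t stack_top, uintptr_t stack_size,
                               uintptr_t min_addr)
{
    /* The first test also guards the subtraction against underflow. */
    if (stack_top < min_addr || stack_size >= stack_top - min_addr)
        return -ENOMEM;
    return 0;
}
```

With this shape the caller gets a clean failure from exec for an unworkably large argument area, and, as the message notes, whether the executable itself then fits is left to user space.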
    • Merge branch 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm · be6200aa
      Authored by Linus Torvalds
      * 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: x86: Perform hardware_enable in CPU_STARTING callback
        KVM: i8259: fix migration
        KVM: fix i8259 oops when no vcpus are online
        KVM: x86 emulator: fix regression with cmpxchg8b on i386 hosts
    • Merge branch 'perf-fixes-for-linus' of... · f2955b49
      Authored by Linus Torvalds
      Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        tracing: t_start: reset FTRACE_ITER_HASH in case of seek/pread
        perf symbols: Fix multiple initialization of symbol system
        perf: Fix CPU hotplug
        perf, trace: Fix module leak
        tracing/kprobe: Fix handling of C-unlike argument names
        tracing/kprobes: Fix handling of argument names
        perf probe: Fix handling of arguments names
        perf probe: Fix return probe support
        tracing/kprobe: Fix a memory leak in error case
        tracing: Do not allow llseek to set_ftrace_filter
    • KEYS: Fix bug in keyctl_session_to_parent() if parent has no session keyring · 3d96406c
      Authored by David Howells
      Fix a bug in keyctl_session_to_parent() whereby it tries to check the ownership
      of the parent process's session keyring whether or not the parent has a session
      keyring [CVE-2010-2960].
      
      This results in the following oops:
      
        BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
        IP: [<ffffffff811ae4dd>] keyctl_session_to_parent+0x251/0x443
        ...
        Call Trace:
         [<ffffffff811ae2f3>] ? keyctl_session_to_parent+0x67/0x443
         [<ffffffff8109d286>] ? __do_fault+0x24b/0x3d0
         [<ffffffff811af98c>] sys_keyctl+0xb4/0xb8
         [<ffffffff81001eab>] system_call_fastpath+0x16/0x1b
      
      if the parent process has no session keyring.
      
      If the system is using pam_keyinit then it is mostly protected against this as all
      processes derived from a login will have inherited the session keyring created
      by pam_keyinit during the login procedure.
      
      To test this, pam_keyinit calls need to be commented out in /etc/pam.d/.
      Reported-by: Tavis Ormandy <taviso@cmpxchg8b.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Tavis Ormandy <taviso@cmpxchg8b.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • KEYS: Fix RCU no-lock warning in keyctl_session_to_parent() · 9d1ac65a
      Authored by David Howells
      There's an unprotected access to the parent process's credentials in the middle
      of keyctl_session_to_parent().  This results in the following RCU warning:
      
        ===================================================
        [ INFO: suspicious rcu_dereference_check() usage. ]
        ---------------------------------------------------
        security/keys/keyctl.c:1291 invoked rcu_dereference_check() without protection!
      
        other info that might help us debug this:
      
        rcu_scheduler_active = 1, debug_locks = 0
        1 lock held by keyctl-session-/2137:
         #0:  (tasklist_lock){.+.+..}, at: [<ffffffff811ae2ec>] keyctl_session_to_parent+0x60/0x236
      
        stack backtrace:
        Pid: 2137, comm: keyctl-session- Not tainted 2.6.36-rc2-cachefs+ #1
        Call Trace:
         [<ffffffff8105606a>] lockdep_rcu_dereference+0xaa/0xb3
         [<ffffffff811ae379>] keyctl_session_to_parent+0xed/0x236
         [<ffffffff811af77e>] sys_keyctl+0xb4/0xb6
         [<ffffffff81001eab>] system_call_fastpath+0x16/0x1b
      
      The code should take the RCU read lock to make sure the parent's credentials
      don't go away, even though it's holding a spinlock and has IRQs disabled.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block · ff3cb3fe
      Authored by Linus Torvalds
      * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
        block: Range check cpu in blk_cpu_to_group
        scatterlist: prevent invalid free when alloc fails
        writeback: Fix lost wake-up shutting down writeback thread
        writeback: do not lose wakeup events when forking bdi threads
        cciss: fix reporting of max queue depth since init
        block: switch s390 tape_block and mg_disk to elevator_change()
        block: add function call to switch the IO scheduler from a driver
        fs/bio-integrity.c: return -ENOMEM on kmalloc failure
        bio-integrity.c: remove dependency on __GFP_NOFAIL
        BLOCK: fix bio.bi_rw handling
        block: put dev->kobj in blk_register_queue fail path
        cciss: handle allocation failure
        cfq-iosched: Documentation help for new tunables
        cfq-iosched: blktrace print per slice sector stats
        cfq-iosched: Implement tunable group_idle
        cfq-iosched: Do group share accounting in IOPS when slice_idle=0
        cfq-iosched: Do not idle if slice_idle=0
        cciss: disable doorbell reset on reset_devices
        blkio: Fix return code for mkdir calls
    • Merge branch 'at91-fixes-for-linus' of git://github.com/at91linux/linux-2.6-at91 · 6ccaa317
      Authored by Linus Torvalds
      * 'at91-fixes-for-linus' of git://github.com/at91linux/linux-2.6-at91:
        AT91: at91sam9261ek: remove C99 comments but keep information
        AT91: at91sam9261ek board: remove warnings related to use of SPI or SD/MMC
        AT91: dm9000 initialization update
        AT91: SAM9G45 - add a separate clock entry for every single TC block
        AT91: clock: peripheral clocks can have other parent than mck
        AT91: change dma resource index