1. 13 5月, 2011 8 次提交
    • I
      xen: tidy up whitespace in drivers/xen/Makefile · bdf51674
      Ian Campbell 提交于
      Various merges over time have led to rather a mish-mash of indentation.
      Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
      Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: xen-devel@lists.xensource.com
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      bdf51674
    • L
      Merge branch 'stable/bug-fixes-for-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen · 0c5e1577
      Linus Torvalds 提交于
      * 'stable/bug-fixes-for-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
        x86/mm: Fix section mismatch derived from native_pagetable_reserve()
        x86,xen: introduce x86_init.mapping.pagetable_reserve
        Revert "xen/mmu: Add workaround "x86-64, mm: Put early page table high""
      0c5e1577
    • L
      Revert "drm/i915: Only enable the plane after setting the fb base (pre-ILK)" · 982b2035
      Linus Torvalds 提交于
      This reverts commit 49183b28.
      
      Quoth Franz Melchior:
      
        "This patch introduces a bug on my infamous "Acer Travelmate
         5735Z-452G32Mnss": when KMS takes over, the frame buffer contents get
         completely garbled up on screen, with colored stripes and unreadable
         text (photo on request).  Only when X11 is started, the screen gets
         restored again.  Closing and re-opening the lid partly cures the
         mess, too: it makes the font readable, though horizontally stretched."
      Acked-by: NKeith Packard <keithp@keithp.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      982b2035
    • L
      Merge branch 'fbmem' · df43938b
      Linus Torvalds 提交于
      * fbmem:
        fbmem: make read/write/ioctl use the frame buffer at open time
        fbcon: add lifetime refcount to opened frame buffers
      df43938b
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 49f019c1
      Linus Torvalds 提交于
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: ads7846 - remove unused variable from struct ads7845_ser_req
        Input: ads7846 - make transfer buffers DMA safe
      49f019c1
    • S
      x86/mm: Fix section mismatch derived from native_pagetable_reserve() · 53f8023f
      Sedat Dilek 提交于
      With CONFIG_DEBUG_SECTION_MISMATCH=y I see these warnings in next-20110415:
      
        LD      vmlinux.o
        MODPOST vmlinux.o
      WARNING: vmlinux.o(.text+0x1ba48): Section mismatch in reference from the function native_pagetable_reserve() to the function .init.text:memblock_x86_reserve_range()
      The function native_pagetable_reserve() references
      the function __init memblock_x86_reserve_range().
      This is often because native_pagetable_reserve lacks a __init
      annotation or the annotation of memblock_x86_reserve_range is wrong.
      
      This patch fixes the issue.
      Thanks to pipacs from PaX project for help on IRC.
      Acked-by: N"H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NSedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      53f8023f
    • S
      x86,xen: introduce x86_init.mapping.pagetable_reserve · 279b706b
      Stefano Stabellini 提交于
      Introduce a new x86_init hook called pagetable_reserve that at the end
      of init_memory_mapping is used to reserve a range of memory addresses for
      the kernel pagetable pages we used and free the other ones.
      
      On native it just calls memblock_x86_reserve_range while on xen it also
      takes care of setting the spare memory previously allocated
      for kernel pagetable pages from RO to RW, so that it can be used for
      other purposes.
      
      A detailed explanation of the reason why this hook is needed follows.
      
      As a consequence of the commit:
      
      commit 4b239f45
      Author: Yinghai Lu <yinghai@kernel.org>
      Date:   Fri Dec 17 16:58:28 2010 -0800
      
          x86-64, mm: Put early page table high
      
      at some point init_memory_mapping is going to reach the pagetable pages
      area and map those pages too (mapping them as normal memory that falls
      in the range of addresses passed to init_memory_mapping as argument).
      Some of those pages are already pagetable pages (they are in the range
      pgt_buf_start-pgt_buf_end) therefore they are going to be mapped RO and
      everything is fine.
      Some of these pages are not pagetable pages yet (they fall in the range
      pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they
      are going to be mapped RW.  When these pages become pagetable pages and
      are hooked into the pagetable, xen will find that the guest has already
      a RW mapping of them somewhere and fail the operation.
      The reason Xen requires pagetables to be RO is that the hypervisor needs
      to verify that the pagetables are valid before using them. The validation
      operations are called "pinning" (more details in arch/x86/xen/mmu.c).
      
      In order to fix the issue we mark all the pages in the entire range
      pgt_buf_start-pgt_buf_top as RO, however when the pagetable allocation
      is completed only the range pgt_buf_start-pgt_buf_end is reserved by
      init_memory_mapping. Hence the kernel is going to crash as soon as one
      of the pages in the range pgt_buf_end-pgt_buf_top is reused (b/c those
      ranges are RO).
      
      For this reason we need a hook to reserve the kernel pagetable pages we
      used and free the other ones so that they can be reused for other
      purposes.
      On native it just means calling memblock_x86_reserve_range, on Xen it
      also means marking RW the pagetable pages that we allocated before but
      that haven't been used before.
      
      Another way to fix this is without using the hook is by adding a 'if
      (xen_pv_domain)' in the 'init_memory_mapping' code and calling the Xen
      counterpart, but that is just nasty.
      Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
      Acked-by: NYinghai Lu <yinghai@kernel.org>
      Acked-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      279b706b
    • K
      Revert "xen/mmu: Add workaround "x86-64, mm: Put early page table high"" · 92bdaef7
      Konrad Rzeszutek Wilk 提交于
      This reverts commit a3864783.
      
      It does not work with certain AMD machines.
      
      last_pfn = 0x100000 max_arch_pfn = 0x400000000
      initial memory mapped : 0 - 02c3a000
      Base memory trampoline at [ffff88000009b000] 9b000 size 20480
      init_memory_mapping: 0000000000000000-0000000100000000
       0000000000 - 0100000000 page 4k
      kernel direct mapping tables up to 100000000 @ ff7fb000-100000000
      init_memory_mapping: 0000000100000000-00000001e0800000
       0100000000 - 01e0800000 page 4k
      kernel direct mapping tables up to 1e0800000 @ 1df0f3000-1e0000000
      xen: setting RW the range fffdc000 - 100000000
      RAMDISK: 0203b000 - 02c3a000
      No NUMA configuration found
      Faking a node at 0000000000000000-00000001e0800000
      NUMA: Using 63 for the hash shift.
      Initmem setup node 0 0000000000000000-00000001e0800000
        NODE_DATA [00000001dfffb000 - 00000001dfffffff]
      BUG: unable to handle kernel NULL pointer dereference at           (null)
      IP: [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea
      PGD 0
      Oops: 0003 [#1] SMP
      last sysfs file:
      CPU 0
      Modules linked in:
      
      Pid: 0, comm: swapper Not tainted 2.6.39-0-virtual #6~smb1
      RIP: e030:[<ffffffff81cf6a75>]  [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea
      RSP: e02b:ffffffff81c01e38  EFLAGS: 00010046
      RAX: 0000000000000000 RBX: 00000001e0800000 RCX: 0000000000001040
      RDX: 0000000000004100 RSI: 0000000000000000 RDI: ffff8801dfffb000
      RBP: ffffffff81c01e58 R08: 0000000000000020 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000bfe400
      FS:  0000000000000000(0000) GS:ffffffff81cca000(0000) knlGS:0000000000000000
      CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000000 CR3: 0000000001c03000 CR4: 0000000000000660
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process swapper (pid: 0, threadinfo ffffffff81c00000, task ffffffff81c0b020)
      Stack:
       0000000000000040 0000000000000001 0000000000000000 ffffffffffffffff
       ffffffff81c01e88 ffffffff81cf6c25 0000000000000000 0000000000000000
       ffffffff81cf687f 0000000000000000 ffffffff81c01ea8 ffffffff81cf6e45
      Call Trace:
       [<ffffffff81cf6c25>] numa_register_memblks.constprop.3+0x150/0x181
       [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c
       [<ffffffff81cf6e45>] numa_init.part.2+0x1c/0x7c
       [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c
       [<ffffffff81cf6f67>] numa_init+0x6c/0x70
       [<ffffffff81cf7057>] initmem_init+0x39/0x3b
       [<ffffffff81ce5865>] setup_arch+0x64e/0x769
       [<ffffffff815e43c1>] ? printk+0x51/0x53
       [<ffffffff81cdf92b>] start_kernel+0xd4/0x3f3
       [<ffffffff81cdf388>] x86_64_start_reservations+0x132/0x136
       [<ffffffff81ce2ed4>] xen_start_kernel+0x588/0x58f
      Code: 41 00 00 48 8b 3c c5 a0 24 cc 81 31 c0 40 f6 c7 01 74 05 aa 66 ba ff 40 40 f6 c7 02 74 05 66 ab 83 ea 02 89 d1 c1 e9 02 f6 c2 02 <f3> ab 74 02 66 ab 80 e2 01 74 01 aa 49 63 c4 48 c1 eb 0c 44 89
      RIP  [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea
       RSP <ffffffff81c01e38>
      CR2: 0000000000000000
      ---[ end trace a7919e7f17c0a725 ]---
      Kernel panic - not syncing: Attempted to kill the idle task!
      Pid: 0, comm: swapper Tainted: G      D     2.6.39-0-virtual #6~smb1
      Reported-by: NStefan Bader <stefan.bader@canonical.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      92bdaef7
  2. 12 5月, 2011 31 次提交
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · 6eaed0a4
      Linus Torvalds 提交于
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
        fuse: fix oops in revalidate when called with NULL nameidata
      6eaed0a4
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6 · 8043f4eb
      Linus Torvalds 提交于
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
        sparc32: Fixed unaligned memory copying in function __csum_partial_copy_sparc_generic
        sparc32: fix sparcstation 5 boot
        sparc32: fix section mismatch warnings in apc, pmc and time_32
      8043f4eb
    • L
      Merge branch 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm · 75c0b3b4
      Linus Torvalds 提交于
      * 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm:
        ARM: 6870/1: The mandatory barrier rmb() must be a dsb() in for device accesses
        ARM: 6892/1: handle ptrace requests to change PC during interrupted system calls
        ARM: 6890/1: memmap: only free allocated memmap entries when using SPARSEMEM
        ARM: zImage: the page table memory must be considered before relocation
        ARM: zImage: make sure not to relocate on top of the relocation code
        ARM: zImage: Fix bad SP address after relocating kernel
        ARM: zImage: make sure the stack is 64-bit aligned
        ARM: RiscPC: acornfb: fix section mismatches
        ARM: RiscPC: etherh: fix section mismatches
      75c0b3b4
    • L
      fbmem: make read/write/ioctl use the frame buffer at open time · c47747fd
      Linus Torvalds 提交于
      read/write/ioctl on a fbcon file descriptor has traditionally used the
      fbcon not when it was opened, but as it was at the time of the call.
      That makes no sense, but the lack of sense is much more obvious now that
      we properly ref-count the usage - it means that the ref-counting doesn't
      actually protect operations we do on the frame buffer.
      
      This changes it to look at the fb_info that we got at open time, but in
      order to avoid using a frame buffer long after it has been unregistered,
      we do verify that it is still current, and return -ENODEV if not.
      Acked-by: NTim Gardner <tim.gardner@canonical.com>
      Tested-by: NDaniel J Blueman <daniel.blueman@gmail.com>
      Tested-by: NAnca Emanuel <anca.emanuel@gmail.com>
      Cc: Bruno Prémont <bonbons@linux-vserver.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Andy Whitcroft <andy.whitcroft@canonical.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c47747fd
    • L
      fbcon: add lifetime refcount to opened frame buffers · 698b3682
      Linus Torvalds 提交于
      This just adds the refcount and the new registration lock logic.  It
      does not (for example) actually change the read/write/ioctl routines to
      actually use the frame buffer that was opened: those function still end
      up alway susing whatever the current frame buffer is at the time of the
      call.
      
      Without this, if something holds the frame buffer open over a
      framebuffer switch, the close() operation after the switch will access a
      fb_info that has been free'd by the unregistering of the old frame
      buffer.
      
      (The read/write/ioctl operations will normally not cause problems,
      because they will - illogically - pick up the new fbcon instead.  But a
      switch that happens just as one of those is going on might see problems
      too, the window is just much smaller: one individual op rather than the
      whole open-close sequence.)
      
      This use-after-free is apparently fairly easily triggered by the Ubuntu
      11.04 boot sequence.
      Acked-by: NTim Gardner <tim.gardner@canonical.com>
      Tested-by: NDaniel J Blueman <daniel.blueman@gmail.com>
      Tested-by: NAnca Emanuel <anca.emanuel@gmail.com>
      Cc: Bruno Prémont <bonbons@linux-vserver.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Andy Whitcroft <andy.whitcroft@canonical.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      698b3682
    • C
      ARM: 6870/1: The mandatory barrier rmb() must be a dsb() in for device accesses · a904f5f9
      Catalin Marinas 提交于
      Since mandatory barriers may be used (explicitly or implicitly via readl
      etc.) to ensure the ordering between Device and Normal memory accesses,
      a DMB is not enough. This patch converts it to a DSB.
      
      Cc: Colin Cross <ccross@android.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      a904f5f9
    • A
      ARM: 6892/1: handle ptrace requests to change PC during interrupted system calls · 2af68df0
      Arnd Bergmann 提交于
      GDB's interrupt.exp test cases currenly fail on ARM.  The problem is how do_signal
      handled restarting interrupted system calls:
      
      The entry.S assembler code determines that we come from a system call; and that
      information is passed as "syscall" parameter to do_signal.  That routine then
      calls get_signal_to_deliver [*] and if a signal is to be delivered, calls into
      handle_signal.  If a system call is to be restarted either after the signal
      handler returns, or if no handler is to be called in the first place, the PC
      is updated after the get_signal_to_deliver call, either in handle_signal (if
      we have a handler) or at the end of do_signal (otherwise).
      
      Now the problem is that during [*], the call to get_signal_to_deliver, a ptrace
      intercept may happen.  During this intercept, the debugger may change registers,
      including the PC.  This is done by GDB if it wants to execute an "inferior call",
      i.e. the execution of some code in the debugged program triggered by GDB.
      
      To this purpose, GDB will save all registers, allocate a stack frame, set up
      PC and arguments as appropriate for the call, and point the link register to
      a dummy breakpoint instruction.  Once the process is restarted, it will execute
      the call and then trap back to the debugger, at which point GDB will restore
      all registers and continue original execution.
      
      This generally works fine.  However, now consider what happens when GDB attempts
      to do exactly that while the process was interrupted during execution of a to-be-
      restarted system call:  do_signal is called with the syscall flag set; it calls
      get_signal_to_deliver, at which point the debugger takes over and changes the PC
      to point to a completely different place.  Now get_signal_to_deliver returns
      without a signal to deliver; but now do_signal decides it should be restarting
      a system call, and decrements the PC by 2 or 4 -- so it now points to 2 or 4
      bytes before the function GDB wants to call -- which leads to a subsequent crash.
      
      To fix this problem, two things need to be supported:
      - do_signal must be able to recognize that get_signal_to_deliver changed the PC
        to a different location, and skip the restart-syscall sequence
      - once the debugger has restored all registers at the end of the inferior call
        sequence, do_signal must recognize that *now* it needs to restart the pending
        system call, even though it was now entered from a breakpoint instead of an
        actual svc instruction
      
      This set of issues is solved on other platforms, usually by one of two
      mechanisms:
      
      - The status information "do_signal is handling a system call that may need
        restarting" is itself carried in some register that can be accessed via
        ptrace.  This is e.g. on Intel the "orig_eax" register; on Sparc the kernel
        defines a magic extra bit in the flags register for this purpose.
        This allows GDB to manage that state: reset it when doing an inferior call,
        and restore it after the call is finished.
      
      - On s390, do_signal transparently handles this problem without requiring
        GDB interaction, by performing system call restarting in the following
        way: first, adjust the PC as necessary for restarting the call.  Then,
        call get_signal_to_deliver; and finally just continue execution at the
        PC.  This way, if GDB does not change the PC, everything is as before.
        If GDB *does* change the PC, execution will simply continue there --
        and once GDB restores the PC it saved at that point, it will automatically
        point to the *restarted* system call.  (There is the minor twist how to
        handle system calls that do *not* need restarting -- do_signal will undo
        the PC change in this case, after get_signal_to_deliver has returned, and
        only if ptrace did not change the PC during that call.)
      
      Because there does not appear to be any obvious register to carry the
      syscall-restart information on ARM, we'd either have to introduce a new
      artificial ptrace register just for that purpose, or else handle the issue
      transparently like on s390.  The patch below implements the second option;
      using this patch makes the interrupt.exp test cases pass on ARM, with no
      regression in the GDB test suite otherwise.
      
      Cc: patches@linaro.org
      Signed-off-by: NUlrich Weigand <ulrich.weigand@linaro.org>
      Signed-off-by: NArnd Bergmann <arnd.bergmann@linaro.org>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      2af68df0
    • W
      ARM: 6890/1: memmap: only free allocated memmap entries when using SPARSEMEM · 9af386c8
      Will Deacon 提交于
      The SPARSEMEM code allocates memmap entries only for sections which are
      present (i.e. those which contain some valid memory). The membank checks
      in free_unused_memmap do not take this into account and can incorrectly
      attempt to free memory which is not allocated, resulting in a BUG() in
      the bootmem code.
      
      However, if memory is configured as follows:
      
          |<----section---->|<----hole---->|<----section---->|
          +--------+--------+--------------+--------+--------+
          | bank 0 | unused |              | bank 1 | unused |
          +--------+--------+--------------+--------+--------+
      
      where a bank only occupies part of a section, the memmap allocated for
      the remainder of the section *can* be freed.
      
      This patch modifies the checks in free_unused_memmap so that only valid
      memmap entries are considered for removal.
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      9af386c8
    • T
      sparc32: Fixed unaligned memory copying in function __csum_partial_copy_sparc_generic · b1054282
      Tkhai Kirill 提交于
      When we are in the label cc_dword_align, registers %o0 and %o1 have the same last 2 bits,
      but it's not guaranteed one of them is zero. So we can get unaligned memory access
      in label ccte. Example of parameters which lead to this:
      %o0=0x7ff183e9, %o1=0x8e709e7d, %g1=3
      
      With the parameters I had a memory corruption, when the additional 5 bytes were rewritten.
      This patch corrects the error.
      
      One comment to the patch. We don't care about the third bit in %o1, because cc_end_cruft
      stores word or less.
      Signed-off-by: NTkhai Kirill <tkhai@yandex.ru>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b1054282
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 3568bd97
      Linus Torvalds 提交于
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
        ceph: do not use i_wrbuffer_ref as refcount for Fb cap
        ceph: fix list_add in ceph_put_snap_realm
        ceph: print debug message before put mds session
      3568bd97
    • L
      Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 · fad63209
      Linus Torvalds 提交于
      * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
        drm/radeon/nouveau: fix build regression on alpha due to Xen changes.
        drm/radeon/kms: fix cayman acceleration
        drm/radeon: fix cayman struct accessors.
      fad63209
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6 · b5121290
      Linus Torvalds 提交于
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
        mfd: Fix for the TWL4030 PM sleep/wakeup sequence
        mfd: Fix asic3 build error
        mfd: Fixed gpio polarity of omap-usb gpio USB-phy reset
      b5121290
    • L
      Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6 · 409ab140
      Linus Torvalds 提交于
      * 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
        [S390] fix alloc_pgste check in init_new_context
        [S390] oprofile: fix min/max interval query checks
        [S390] replace diag10() with diag10_range() function
        [S390] disassembler: handle b280/spp instruction
        [S390] kernel: Initialize register 14 when starting new CPU
        [S390] dasd: prevent IO error during reserve/release loop
        [S390] sclp/memory hotplug: fix initial usecount of increments
      409ab140
    • L
      Revert "Bluetooth: fix shutdown on SCO sockets" · ce845377
      Linus Torvalds 提交于
      This reverts commit f21ca5ff.
      
      Quoth Gustavo F. Padovan:
        "Commit f21ca5ff can cause a NULL
         dereference if we call shutdown in a bluetooth SCO socket and doesn't
         wait the shutdown completion to call close().  Please revert it.  I
         may have a fix for it soon, but we don't have time anymore, so revert
         is the way to go.  ;)"
      Requested-by: NGustavo F. Padovan <padovan@profusion.mobi>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ce845377
    • L
      Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 · 0e6f76c7
      Linus Torvalds 提交于
      * 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
        PM / Hibernate: Fix ioctl SNAPSHOT_S2RAM
        PM / Hibernate: Make snapshot_release() restore GFP mask
        PM: Fix warning in pm_restrict_gfp_mask() during SNAPSHOT_S2RAM ioctl
      0e6f76c7
    • M
      mm: tracing: add missing GFP flags to tracing · 1d929b7a
      Mel Gorman 提交于
      include/linux/gfp.h and include/trace/events/gfpflags.h are out of sync.
      When tracing is enabled, certain flags are not recognised and the text
      output is less useful as a result.  Add the missing flags.
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1d929b7a
    • H
      tmpfs: fix spurious ENOSPC when racing with unswap · 59a16ead
      Hugh Dickins 提交于
      Testing the shmem_swaplist replacements for igrab() revealed another bug:
      writes to /dev/loop0 on a tmpfs file which fills its filesystem were
      sometimes failing with "Buffer I/O error"s.
      
      These came from ENOSPC failures of shmem_getpage(), when racing with
      swapoff: the same could happen when racing with another shmem_getpage(),
      pulling the page in from swap in between our find_lock_page() and our
      taking the info->lock (though not in the single-threaded loop case).
      
      This is unacceptable, and surprising that I've not noticed it before:
      it dates back many years, but (presumably) was made a lot easier to
      reproduce in 2.6.36, which sited a page preallocation in the race window.
      
      Fix it by rechecking the page cache before settling on an ENOSPC error.
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      59a16ead
    • H
      tmpfs: fix race between umount and swapoff · 778dd893
      Hugh Dickins 提交于
      The use of igrab() in swapoff's shmem_unuse_inode() is just as vulnerable
      to umount as that in shmem_writepage().
      
      Fix this instance by extending the protection of shmem_swaplist_mutex
      right across shmem_unuse_inode(): while it's on the list, the inode cannot
      be evicted (and the filesystem cannot be unmounted) without
      shmem_evict_inode() taking that mutex to remove it from the list.
      
      But since shmem_writepage() might take that mutex, we should avoid making
      memory allocations or memcg charges while holding it: prepare them at the
      outer level in shmem_unuse().  When mem_cgroup_cache_charge() was
      originally placed, we didn't know until that point that the page from swap
      was actually a shmem page; but nowadays it's noted in the swap_map, so
      we're safe to charge upfront.  For the radix_tree, do as is done in
      shmem_getpage(): preload upfront, but don't pin to the cpu; so we make a
      habit of refreshing the node pool, but might dip into GFP_NOWAIT reserves
      on occasion if subsequently preempted.
      
      With the allocation and charge moved out from shmem_unuse_inode(),
      we can also hold index map and info->lock over from finding the entry.
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      778dd893
    • H
      tmpfs: fix race between umount and writepage · b1dea800
      Hugh Dickins 提交于
      Konstanin Khlebnikov reports that a dangerous race between umount and
      shmem_writepage can be reproduced by this script:
      
        for i in {1..300} ; do
      	mkdir $i
      	while true ; do
      		mount -t tmpfs none $i
      		dd if=/dev/zero of=$i/test bs=1M count=$(($RANDOM % 100))
      		umount $i
      	done &
        done
      
      on a 6xCPU node with 8Gb RAM: kernel very unstable after this accident. =)
      
      Kernel log:
      
        VFS: Busy inodes after unmount of tmpfs.
                       Self-destruct in 5 seconds.  Have a nice day...
      
        WARNING: at lib/list_debug.c:53 __list_del_entry+0x8d/0x98()
        list_del corruption. prev->next should be ffff880222fdaac8, but was (null)
        Pid: 11222, comm: mount.tmpfs Not tainted 2.6.39-rc2+ #4
        Call Trace:
         warn_slowpath_common+0x80/0x98
         warn_slowpath_fmt+0x41/0x43
         __list_del_entry+0x8d/0x98
         evict+0x50/0x113
         iput+0x138/0x141
        ...
        BUG: unable to handle kernel paging request at ffffffffffffffff
        IP: shmem_free_blocks+0x18/0x4c
        Pid: 10422, comm: dd Tainted: G        W   2.6.39-rc2+ #4
        Call Trace:
         shmem_recalc_inode+0x61/0x66
         shmem_writepage+0xba/0x1dc
         pageout+0x13c/0x24c
         shrink_page_list+0x28e/0x4be
         shrink_inactive_list+0x21f/0x382
        ...
      
      shmem_writepage() calls igrab() on the inode for the page which came from
      page reclaim, to add it later into shmem_swaplist for swapoff operation.
      
      This igrab() can race with super-block deactivating process:
      
        shrink_inactive_list()          deactivate_super()
        pageout()                       tmpfs_fs_type->kill_sb()
        shmem_writepage()               kill_litter_super()
                                        generic_shutdown_super()
                                         evict_inodes()
         igrab()
                                          atomic_read(&inode->i_count)
                                           skip-inode
         iput()
                                         if (!list_empty(&sb->s_inodes))
                                                printk("VFS: Busy inodes after...
      
      This igrap-iput pair was added in commit 1b1b32f2 "tmpfs: fix
      shmem_swaplist races" based on incorrect assumptions: igrab() protects the
      inode from concurrent eviction by deletion, but it does nothing to protect
      it from concurrent unmounting, which goes ahead despite the raised
      i_count.
      
      So this use of igrab() was wrong all along, but the race made much worse
      in 2.6.37 when commit 63997e98 "split invalidate_inodes()" replaced
      two attempts at invalidate_inodes() by a single evict_inodes().
      
      Konstantin posted a plausible patch, raising sb->s_active too: I'm unsure
      whether it was correct or not; but burnt once by igrab(), I am sure that
      we don't want to rely more deeply upon externals here.
      
      Fix it by adding the inode to shmem_swaplist earlier, while the page lock
      on page in page cache still secures the inode against eviction, without
      artifically raising i_count.  It was originally added later because
      shmem_unuse_inode() is liable to remove an inode from the list while it's
      unswapped; but we can guard against that by taking spinlock before
      dropping mutex.
      Reported-by: NKonstantin Khlebnikov <khlebnikov@openvz.org>
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Tested-by: NKonstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b1dea800
    • A
      memcg: allocate memory cgroup structures in local nodes · 21a3c964
      Andi Kleen 提交于
      Commit dde79e00 ("page_cgroup: reduce allocation overhead for
      page_cgroup array for CONFIG_SPARSEMEM") added a regression that the
      memory cgroup data structures all end up in node 0 because the first
      attempt at allocating them would not pass in a node hint.  Since the
      initialization runs on CPU #0 it would all end up node 0.  This is a
      problem on large memory systems, where node 0 would lose a lot of
      memory.
      
      Change the alloc_pages_exact() to alloc_pages_exact_nid().  This will
      still fall back to other nodes if not enough memory is available.
      
       [ RED-PEN: right now it would fall back first before trying
         vmalloc_node.  Probably not the best strategy ...  But I left it like
         that for now. ]
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Reported-by: Doug Nelson
      Cc: David Rientjes <rientjes@google.com>
      Reviewed-by: NMichal Hocko <mhocko@suse.cz>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Acked-by: NBalbir Singh <balbir@linux.vnet.ibm.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      21a3c964
    • A
      mm: add alloc_pages_exact_nid() · ee85c2e1
      Andi Kleen 提交于
      Add a alloc_pages_exact_nid() that allocates on a specific node.
      
      The naming is quite broken, but fixing that would need a larger renaming
      action.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: tweak comment]
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Cc: David Rientjes <rientjes@google.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ee85c2e1
    • H
      MAINTAINERS: fix sorting · 71a6d0af
      Harry Wei 提交于
      Take alphabetical orders for MAINTAINERS file.
      Signed-off-by: NHarry Wei <harryxiyou@gmail.com>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      71a6d0af
    • Y
      mm: use alloc_bootmem_node_nopanic() on really needed path · 8f389a99
      Yinghai Lu 提交于
      Stefan found nobootmem does not work on his system that has only 8M of
      RAM.  This causes an early panic:
      
        BIOS-provided physical RAM map:
         BIOS-88: 0000000000000000 - 000000000009f000 (usable)
         BIOS-88: 0000000000100000 - 0000000000840000 (usable)
        bootconsole [earlyser0] enabled
        Notice: NX (Execute Disable) protection missing in CPU or disabled in BIOS!
        DMI not present or invalid.
        last_pfn = 0x840 max_arch_pfn = 0x100000
        init_memory_mapping: 0000000000000000-0000000000840000
        8MB LOWMEM available.
          mapped low ram: 0 - 00840000
          low ram: 0 - 00840000
        Zone PFN ranges:
          DMA      0x00000001 -> 0x00001000
          Normal   empty
        Movable zone start PFN for each node
        early_node_map[2] active PFN ranges
            0: 0x00000001 -> 0x0000009f
            0: 0x00000100 -> 0x00000840
        BUG: Int 6: CR2 (null)
             EDI c034663c  ESI (null)  EBP c0329f38  ESP c0329ef4
             EBX c0346380  EDX 00000006  ECX ffffffff  EAX fffffff4
             err (null)  EIP c0353191   CS c0320060  flg 00010082
        Stack: (null) c030c533 000007cd (null) c030c533 00000001 (null) (null)
               00000003 0000083f 00000018 00000002 00000002 c0329f6c c03534d6 (null)
               (null) 00000100 00000840 (null) c0329f64 00000001 00001000 (null)
        Pid: 0, comm: swapper Not tainted 2.6.36 #5
        Call Trace:
         [<c02e3707>] ? 0xc02e3707
         [<c035e6e5>] 0xc035e6e5
         [<c0353191>] ? 0xc0353191
         [<c03534d6>] 0xc03534d6
         [<c034f1cd>] 0xc034f1cd
         [<c034a824>] 0xc034a824
         [<c03513cb>] ? 0xc03513cb
         [<c0349432>] 0xc0349432
         [<c0349066>] 0xc0349066
      
      It turns out that we should ignore the low limit of 16M.
      
      Use alloc_bootmem_node_nopanic() in this case.
      
      [akpm@linux-foundation.org: less mess]
      Signed-off-by: NYinghai LU <yinghai@kernel.org>
      Reported-by: NStefan Hellermann <stefan@the2masters.de>
      Tested-by: NStefan Hellermann <stefan@the2masters.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@kernel.org>		[2.6.34+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8f389a99
    • M
      mm: check PageUnevictable in lru_deactivate_fn() · bad49d9c
      Minchan Kim 提交于
      The lru_deactivate_fn should not move page which in on unevictable lru
      into inactive list.  Otherwise, we can meet BUG when we use
      isolate_lru_pages as __isolate_lru_page could return -EINVAL.
      Reported-by: NYing Han <yinghan@google.com>
      Tested-by: NYing Han <yinghan@google.com>
      Signed-off-by: NMinchan Kim <minchan.kim@gmail.com>
      Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Reviewed-by: Rik van Riel<riel@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bad49d9c
    • B
      drivers/rtc/rtc-s3c.c: fixup wake support for rtc · 52cd4e5c
      Ben Dooks 提交于
      The driver is not balancing set_irq and disable_irq_wake() calls, so
      ensure that it keeps track of whether the wake is enabled.
      
      The fixes the following error on S3C6410 devices:
      
        WARNING: at kernel/irq/manage.c:382 set_irq_wake+0x84/0xec()
        Unbalanced IRQ 92 wake disable
      Signed-off-by: NBen Dooks <ben-linux@fluff.org>
      Signed-off-by: NMark Brown <broonie@opensource.wolfsonmicro.com>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      52cd4e5c
    • R
      PM / Hibernate: Fix ioctl SNAPSHOT_S2RAM · 36cb7035
      Rafael J. Wysocki 提交于
      The SNAPSHOT_S2RAM ioctl used for implementing the feature allowing
      one to suspend to RAM after creating a hibernation image is currently
      broken, because it doesn't clear the "ready" flag in the struct
      snapshot_data object handled by it.  As a result, the
      SNAPSHOT_UNFREEZE doesn't work correctly after SNAPSHOT_S2RAM has
      returned and the user space hibernate task cannot thaw the other
      processes as appropriate.  Make SNAPSHOT_S2RAM clear data->ready
      to fix this problem.
      Tested-by: NAlexandre Felipe Muller de Souza <alexandrefm@mandriva.com.br>
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Cc: stable@kernel.org
      36cb7035
    • R
      PM / Hibernate: Make snapshot_release() restore GFP mask · 9744997a
      Rafael J. Wysocki 提交于
      If the process using the hibernate user space interface closes
      /dev/snapshot after creating a hibernation image without thawing
      tasks, snapshot_release() should call pm_restore_gfp_mask() to
      restore the GFP mask used before the creation of the image.  Make
      that happen.
      Tested-by: NAlexandre Felipe Muller de Souza <alexandrefm@mandriva.com.br>
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Cc: stable@kernel.org
      9744997a
    • R
      PM: Fix warning in pm_restrict_gfp_mask() during SNAPSHOT_S2RAM ioctl · 87186475
      Rafael J. Wysocki 提交于
      A warning is printed by pm_restrict_gfp_mask() while the
      SNAPSHOT_S2RAM ioctl is being executed after creating a hibernation
      image, because pm_restrict_gfp_mask() has been called once already
      before the image creation and suspend_devices_and_enter() calls it
      once again.  This happens after commit 452aa699
      (mm/pm: force GFP_NOIO during suspend/hibernation and resume).
      
      To avoid this issue, move pm_restrict_gfp_mask() and
      pm_restore_gfp_mask() from suspend_devices_and_enter() to its caller
      in kernel/power/suspend.c.
      Reported-by: NAlexandre Felipe Muller de Souza <alexandrefm@mandriva.com.br>
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Cc: stable@kernel.org
      87186475
    • H
      ceph: do not use i_wrbuffer_ref as refcount for Fb cap · d3d0720d
      Henry C Chang 提交于
      We increments i_wrbuffer_ref when taking the Fb cap. This breaks
      the dirty page accounting and causes looping in
      __ceph_do_pending_vmtruncate, and ceph client hangs.
      
      This bug can be reproduced occasionally by running blogbench.
      
      Add a new field i_wb_ref to inode and dedicate it to Fb reference
      counting.
      Signed-off-by: NHenry C Chang <henry.cy.chang@gmail.com>
      Signed-off-by: NSage Weil <sage@newdream.net>
      d3d0720d
    • H
      ceph: fix list_add in ceph_put_snap_realm · a26a185d
      Henry C Chang 提交于
      Signed-off-by: NHenry C Chang <henry.cy.chang@gmail.com>
      Signed-off-by: NSage Weil <sage@newdream.net>
      a26a185d
    • H
      ceph: print debug message before put mds session · 7d8e18a6
      Henry C Chang 提交于
      The mds session, s, could be freed during ceph_put_mds_session.
      Move dout before ceph_put_mds_session.
      Signed-off-by: NHenry C Chang <henry.cy.chang@gmail.com>
      Signed-off-by: NSage Weil <sage@newdream.net>
      7d8e18a6
  3. 11 5月, 2011 1 次提交