1. 17 Dec, 2010 1 commit
  2. 07 Dec, 2010 1 commit
    • PM / Hibernate: Fix memory corruption related to swap · c9e664f1
      Committed by Rafael J. Wysocki
      There is a problem that swap pages allocated before the creation of
      a hibernation image can be released and used for storing the contents
      of different memory pages while the image is being saved.  Since the
      kernel stored in the image doesn't know of that, it causes memory
      corruption to occur after resume from hibernation, especially on
      systems with relatively small RAM that need to swap often.
      
      This issue can be addressed by keeping the GFP_IOFS bits clear
      in gfp_allowed_mask during the entire hibernation, including the
      saving of the image, until the system is finally turned off or
      the hibernation is aborted.  Unfortunately, for this purpose
      it's necessary to rework the way in which the hibernate and
      suspend code manipulates gfp_allowed_mask.
      
      This change is based on an earlier patch from Hugh Dickins.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Reported-by: Ondrej Zary <linux@rainbow-software.org>
      Acked-by: Hugh Dickins <hughd@google.com>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: stable@kernel.org
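The mask handling described in the hibernation fix above can be modelled in user space. This is only a sketch: the bit values, the `gfp_state` struct, and the `demo_*` function names are assumptions made for illustration, not the kernel's definitions. It shows the I/O and filesystem bits being cleared for the whole hibernation and put back only if hibernation is aborted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit values only; the kernel defines its own GFP flags. */
#define DEMO_GFP_IO   0x40u
#define DEMO_GFP_FS   0x80u
#define DEMO_GFP_IOFS (DEMO_GFP_IO | DEMO_GFP_FS)

/* Hypothetical stand-in for the global gfp_allowed_mask and its saved copy. */
struct gfp_state {
    uint32_t allowed_mask; /* models gfp_allowed_mask */
    uint32_t saved_mask;   /* 0 means nothing is saved */
};

/* Clear the I/O and FS bits before the image is created, remembering the old mask. */
static void demo_restrict_gfp_mask(struct gfp_state *s)
{
    assert(s != NULL);
    s->saved_mask = s->allowed_mask;
    s->allowed_mask &= ~DEMO_GFP_IOFS;
}

/* Restore the mask, for example when hibernation is aborted. */
static void demo_restore_gfp_mask(struct gfp_state *s)
{
    assert(s != NULL);
    if (s->saved_mask) {
        s->allowed_mask = s->saved_mask;
        s->saved_mask = 0;
    }
}

/* Every allocation request is filtered through the allowed mask. */
static uint32_t demo_effective_gfp(const struct gfp_state *s, uint32_t requested)
{
    assert(s != NULL);
    return requested & s->allowed_mask;
}
```

While the mask is restricted, a request carrying `DEMO_GFP_IOFS` has those bits removed by `demo_effective_gfp()`, so in this model an allocation cannot trigger swap or filesystem I/O while the image is being saved.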
  3. 03 Dec, 2010 2 commits
    • mem-hotplug: introduce {un}lock_memory_hotplug() · 20d6c96b
      Committed by KOSAKI Motohiro
      Presently hwpoison is using lock_system_sleep() to prevent a race with
      memory hotplug.  However lock_system_sleep() is a no-op if
      CONFIG_HIBERNATION=n.  Therefore we need a new lock.
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Suggested-by: Hugh Dickins <hughd@google.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • vmalloc: eagerly clear ptes on vunmap · 64141da5
      Committed by Jeremy Fitzhardinge
      On stock 2.6.37-rc4, running:
      
        # mount lilith:/export /mnt/lilith
        # find  /mnt/lilith/ -type f -print0 | xargs -0 file
      
      crashes the machine fairly quickly under Xen.  Often it results in oops
      messages, but the couple of times I tried just now, it just hung quietly
      and made Xen print some rude messages:
      
          (XEN) mm.c:2389:d80 Bad type (saw 7400000000000001 != exp
          3000000000000000) for mfn 1d7058 (pfn 18fa7)
          (XEN) mm.c:964:d80 Attempt to create linear p.t. with write perms
          (XEN) mm.c:2389:d80 Bad type (saw 7400000000000010 != exp
          1000000000000000) for mfn 1d2e04 (pfn 1d1fb)
          (XEN) mm.c:2965:d80 Error while pinning mfn 1d2e04
      
      Which means the domain tried to map a pagetable page RW, which would
      allow it to map arbitrary memory, so Xen stopped it.  This is because
      vm_unmap_ram() left some pages mapped in the vmalloc area after NFS had
      finished with them, and those pages got recycled as pagetable pages
      while still having these RW aliases.
      
      Removing those mappings immediately removes the Xen-visible aliases, and
      so it has no problem with those pages being reused as pagetable pages.
      Deferring the TLB flush doesn't upset Xen because it can flush the TLB
      itself as needed to maintain its invariants.
      
      When unmapping a region in the vmalloc space, clear the ptes
      immediately.  There's no point in deferring this because there's no
      amortization benefit.
      
      The TLBs are left dirty, and they are flushed lazily to amortize the
      cost of the IPIs.
      
      The specific motivation for this patch is an oops-causing regression
      since 2.6.36 when using NFS under Xen, triggered by the NFS client's use
      of vm_map_ram() introduced in 56e4ebf8 ("NFS: readdir with vmapped
      pages").  XFS also uses vm_map_ram() and could cause similar problems.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Bryan Schumaker <bjschuma@netapp.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Alex Elder <aelder@sgi.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 01 Dec, 2010 2 commits
    • exec: copy-and-paste the fixes into compat_do_execve() paths · 114279be
      Committed by Oleg Nesterov
      Note: this patch targets 2.6.37 and tries to be as simple as possible.
      That is why it adds more copy-and-paste horror into fs/compat.c and
      uglifies fs/exec.c; this will be cleaned up later.
      
      compat_copy_strings() plays with bprm->vma/mm directly and thus has
      two problems: it lacks the RLIMIT_STACK check and argv/envp memory
      is not visible to oom killer.
      
      Export acct_arg_size() and get_arg_page(), change compat_copy_strings()
      to use get_arg_page(), change compat_do_execve() to do acct_arg_size(0)
      as do_execve() does.
      
      Add the fatal_signal_pending/cond_resched checks into compat_count() and
      compat_copy_strings(), this matches the code in fs/exec.c and certainly
      makes sense.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • exec: make argv/envp memory visible to oom-killer · 3c77f845
      Committed by Oleg Nesterov
      Brad Spengler published a local memory-allocation DoS that
      evades the OOM-killer (though not the virtual memory RLIMIT):
      http://www.grsecurity.net/~spender/64bit_dos.c
      
      execve()->copy_strings() can allocate a lot of memory, but
      this is not visible to oom-killer, nobody can see the nascent
      bprm->mm and take it into account.
      
      With this patch get_arg_page() increments current's MM_ANONPAGES
      counter every time we allocate a new page for argv/envp. When
      do_execve() succeeds or fails, we change this counter back.
      
      Technically this is not 100% correct, we can't know if the new
      page is swapped out and turn MM_ANONPAGES into MM_SWAPENTS, but
      I don't think this really matters and everything becomes correct
      once exec changes ->mm or fails.
      Reported-by: Brad Spengler <spender@grsecurity.net>
      Reviewed-and-discussed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 30 Nov, 2010 1 commit
    • TTY: open/hangup race fixup · acfa747b
      Committed by Jiri Slaby
      Like in the "TTY: don't allow reopen when ldisc is changing" patch,
      this one fixes a TTY WARNING as described in the option 1) there:
      1) __tty_hangup from tty_ldisc_hangup to tty_ldisc_enable. During this
      section tty_lock is held. However tty_lock is temporarily dropped in
      the middle of the function by tty_ldisc_hangup.
      
      The fix is to introduce a new flag which we set during the unlocked
      window and check it in tty_reopen too. The flag is TTY_HUPPING and is
      cleared after TTY_HUPPED is set.
      
      While at it, remove duplicate TTY_HUPPED set_bit. The one after
      calling ops->hangup seems to be more correct. But anyway, we hold
      tty_lock, so there should be no difference.
      
      Also document that the function does that kind of crap.
      
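      Nicely reproducible with two forked children:
      #include <fcntl.h>
      #include <signal.h>
      #include <stdlib.h>
      #include <sys/ioctl.h>
      #include <unistd.h>

      /* Loop forever: open the tty, make it the controlling tty, hang it up. */
      static void do_work(const char *tty)
      {
      	if (signal(SIGHUP, SIG_IGN) == SIG_ERR) exit(1);
      	setsid();
      	while (1) {
      		int fd = open(tty, O_RDWR|O_NOCTTY);
      		if (fd < 0) continue;
      		if (ioctl(fd, TIOCSCTTY)) continue;
      		if (vhangup()) continue;
      		close(fd);
      	}
      	exit(0);
      }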
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Reported-by: <Valdis.Kletnieks@vt.edu>
      Reported-by: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: stable <stable@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  6. 29 Nov, 2010 2 commits
    • Un-inline get_pipe_info() helper function · 72083646
      Committed by Linus Torvalds
      This avoids some include-file hell, and the function isn't really
      important enough to be inlined anyway.
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Export 'get_pipe_info()' to other users · c66fb347
      Committed by Linus Torvalds
      And in particular, use it in 'pipe_fcntl()'.
      
      The other pipe functions do not need to use the 'careful' version, since
      they are only ever called for things that are already known to be pipes.
      
      The normal read/write/ioctl functions are called through the file
      operations structures, so if a file isn't a pipe, they'd never get
      called.  But pipe_fcntl() is special, and called directly from the
      generic fcntl code, and needs to use the same careful function that the
      splice code is using.
      
      Cc: Jens Axboe <jaxboe@fusionio.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dave Jones <davej@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 26 Nov, 2010 3 commits
  8. 25 Nov, 2010 2 commits
    • memcg: fix false positive VM_BUG on non-SMP · 112bc2e1
      Committed by Kirill A. Shutemov
      Fix this:
      
        kernel BUG at mm/memcontrol.c:2155!
        invalid opcode: 0000 [#1]
        last sysfs file:
      
        Pid: 18, comm: sh Not tainted 2.6.37-rc3 #3 /Bochs
        EIP: 0060:[<c10731b2>] EFLAGS: 00000246 CPU: 0
        EIP is at mem_cgroup_move_account+0xe2/0xf0
        EAX: 00000004 EBX: c6f931d4 ECX: c681c300 EDX: c681c000
        ESI: c681c300 EDI: ffffffea EBP: c681c000 ESP: c46f3e30
         DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
        Process sh (pid: 18, ti=c46f2000 task=c6826e60 task.ti=c46f2000)
        Stack:
         00000155 c681c000 0805f000 c46ee180 c46f3e5c c7058820 c1074d37 00000000
         08060000 c46db9a0 c46ec080 c7058820 0805f000 08060000 c46f3e98 c1074c50
         c106c75e c46f3e98 c46ec080 08060000 0805ffff c46db9a0 c46f3e98 c46e0340
        Call Trace:
         [<c1074d37>] ? mem_cgroup_move_charge_pte_range+0xe7/0x130
         [<c1074c50>] ? mem_cgroup_move_charge_pte_range+0x0/0x130
         [<c106c75e>] ? walk_page_range+0xee/0x1d0
         [<c10725d6>] ? mem_cgroup_move_task+0x66/0x90
         [<c1074c50>] ? mem_cgroup_move_charge_pte_range+0x0/0x130
         [<c1072570>] ? mem_cgroup_move_task+0x0/0x90
         [<c1042616>] ? cgroup_attach_task+0x136/0x200
         [<c1042878>] ? cgroup_tasks_write+0x48/0xc0
         [<c1041e9e>] ? cgroup_file_write+0xde/0x220
         [<c101398d>] ? do_page_fault+0x17d/0x3f0
         [<c108a79d>] ? alloc_fd+0x2d/0xd0
         [<c1041dc0>] ? cgroup_file_write+0x0/0x220
         [<c1077ba2>] ? vfs_write+0x92/0xc0
         [<c1077c81>] ? sys_write+0x41/0x70
         [<c1140e3d>] ? syscall_call+0x7/0xb
        Code: 03 00 74 09 8b 44 24 04 e8 1c f1 ff ff 89 73 04 8d 86 b0 00 00 00 b9 01 00 00 00 89 da 31 ff e8 65 f5 ff ff e9 4d ff ff ff 0f 0b <0f> 0b 0f 0b 0f 0b 90 8d b4 26 00 00 00 00 83 ec 10 8b 0d f4 e3
        EIP: [<c10731b2>] mem_cgroup_move_account+0xe2/0xf0 SS:ESP 0068:c46f3e30
        ---[ end trace 7daa1582159b6532 ]---
      
      lock_page_cgroup and unlock_page_cgroup are implemented using
      bit_spinlock.  bit_spinlock doesn't touch the bit on a non-SMP
      machine, so we can't use the bit to check whether the lock was taken.
      
      Let's introduce is_page_cgroup_locked based on bit_spin_is_locked instead
      of PageCgroupLocked to fix it.
      
      [akpm@linux-foundation.org: s/is_page_cgroup_locked/page_is_cgroup_locked/]
      Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
      Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • include/linux/fs.h: fix userspace build · 3a3a1af3
      Committed by Loïc Minier
      dpkg uses fiemap but didn't particularly need to include stdint.h so far.
      Since 367a51a3 ("fs: Add FITRIM ioctl"), build of linux/fs.h failed in
      dpkg with:
      
        In file included from ../../src/filesdb.c:27:0:
        /usr/include/linux/fs.h:37:2: error: expected specifier-qualifier-list before 'uint64_t'
      
      Use exportable type __u64 to avoid the dependency on stdint.h.
      
      b31d42a5 ("Fix compile brekage with !CONFIG_BLOCK") fixed only the
      kernel build by including linux/types.h, but this also fixed "make
      headers_check", so don't revert it.
      Signed-off-by: Loïc Minier <loic.minier@linaro.org>
      Tested-by: Arnd Bergmann <arnd.bergmann@linaro.org>
      Cc: Lukas Czerner <lczerner@redhat.com>
      Cc: Dmitry Monakhov <dmonakhov@openvz.org>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 24 Nov, 2010 1 commit
  10. 23 Nov, 2010 3 commits
  11. 22 Nov, 2010 1 commit
    • usb: musb: do not use dma for control transfers · 07a8cdd2
      Committed by Anand Gadiyar
      The Inventra DMA engine used with the MUSB controller in many
      SoCs cannot use DMA for control transfers on EP0, but can use
      DMA for all other transfers.
      
      The USB core maps urbs for DMA if hcd->self.uses_dma is true.
      (hcd->self.uses_dma is true for MUSB as well).
      
      Split the uses_dma flag into two - one that says if the
      controller needs to use PIO for control transfers, and
      another which says if the controller uses DMA (for all
      other transfers).
      
      Also, populate this flag for all MUSB by default.
      
      (Tested on OMAP3 and OMAP4 boards, with EHCI and MUSB HCDs
      simultaneously in use).
      Signed-off-by: Maulik Mankad <x0082077@ti.com>
      Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
      Cc: Oliver Neukum <oliver@neukum.org>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Praveena NADAHALLY <praveen.nadahally@stericsson.com>
      Cc: Ajay Kumar Gupta <ajay.gupta@ti.com>
      Signed-off-by: Felipe Balbi <balbi@ti.com>
  12. 20 Nov, 2010 2 commits
  13. 19 Nov, 2010 1 commit
    • hardirq.h: needs sched.h if using BKL · ed1d77b1
      Committed by Linus Torvalds
      This really isn't the right thing to do, and strictly speaking we should
      have the BKL depth count in the thread info right next to the preempt
      count.  The two really do go together.
      
      However, since that would involve a patch to all architectures, and the
      BKL is finally going away, it's simply not worth the effort to do the
      RightThing(tm).  Just re-instate the <linux/sched.h> include that we
      used to get accidentally from the smp_lock.h one.
      
      This is all fallout from the same old "BKL: remove extraneous #include
      <smp_lock.h>" commit.
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Tested-by: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 18 Nov, 2010 3 commits
  15. 17 Nov, 2010 3 commits
    • fbcmap: integer overflow bug · 1e7c7804
      Committed by Dan Carpenter
      There is an integer overflow in fb_set_user_cmap() because cmap->len * 2
      can wrap.  It's basically harmless.  Your terminal will be messed up
      until you type reset.
      
      This patch does three things to fix the bug.
      
      First, it checks the return value of fb_copy_cmap() in fb_alloc_cmap().
      That is enough to address the overflow.
      
      Second, it checks for the integer overflow in fb_set_user_cmap().
      
      Lastly, I wanted to cap "cmap->len" in fb_set_user_cmap() much lower
      because it is used to determine the size of the allocation.  Unfortunately
      no one knows what the limit should be.  Instead, this patch makes
      the allocation happen with GFP_KERNEL instead of GFP_ATOMIC
      and lets kmalloc() decide what values of cmap->len are reasonable.
      To do this, the patch introduces a function called fb_alloc_cmap_gfp()
      which is like fb_alloc_cmap() except that it takes a GFP flag.
      Signed-off-by: Dan Carpenter <error27@gmail.com>
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
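The wrap of `cmap->len * 2` described in the fbcmap commit above can be illustrated with a small overflow-checked size computation. This is a sketch under assumptions: `demo_cmap_channel_bytes` is a hypothetical helper written for this example, and it is not the check the kernel patch adds. It assumes each colour entry is a 16-bit value and that the size is computed in 32-bit arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the byte size of an array of `len` 16-bit colour entries,
 * refusing lengths whose size would wrap in 32-bit arithmetic.
 * Returns 0 on success and -1 if the multiplication would overflow.
 * Hypothetical helper for illustration only.
 */
static int demo_cmap_channel_bytes(uint32_t len, uint32_t *bytes)
{
    assert(bytes != NULL);

    /* Reject before multiplying: len * 2 must fit in 32 bits. */
    if (len > UINT32_MAX / sizeof(uint16_t))
        return -1;

    *bytes = len * (uint32_t)sizeof(uint16_t);
    return 0;
}
```

Without the guard, a length of `0x80000000` multiplied by 2 wraps to 0 in 32-bit unsigned arithmetic, so a size computed that way would be far smaller than the number of entries later copied.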
    • SCSI host lock push-down · f281233d
      Committed by Jeff Garzik
      Move the mid-layer's ->queuecommand() invocation from being locked
      with the host lock to being unlocked, to speed up the
      critical path for drivers that don't need this lock taken anyway.
      
      The patch below presents a simple SCSI host lock push-down as an
      equivalent transformation.  No locking or other behavior should change
      with this patch.  All existing bugs and locking orders are preserved.
      
      Additionally, add one parameter to queuecommand,
      	struct Scsi_Host *
      and remove one parameter from queuecommand,
      	void (*done)(struct scsi_cmnd *)
      
      Scsi_Host* is a convenient pointer that most host drivers need anyway,
      and 'done' is redundant to struct scsi_cmnd->scsi_done.
      
      Minimal code disturbance was attempted with this change.  Most drivers
      needed only two one-line modifications for their host lock push-down.
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      Acked-by: James Bottomley <James.Bottomley@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
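The push-down described in the SCSI commit above can be sketched as a single-threaded user-space model. Everything here is an assumption made for illustration: the `demo_*` types and functions are hypothetical, and the "lock" is a plain counter rather than a real spinlock. The point is only the shape of the transformation: the caller no longer takes the host lock, the entry point receives the host pointer, and completion goes through the command's own `scsi_done` pointer instead of a separate `done` argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host: the lock is modelled as a depth counter. */
struct demo_host {
    int lock_depth;
    int locked_calls;
};

/* Hypothetical command carrying its own completion callback. */
struct demo_cmnd {
    void (*scsi_done)(struct demo_cmnd *cmd);
    int completed;
};

static void demo_host_lock(struct demo_host *h)   { h->lock_depth++; }
static void demo_host_unlock(struct demo_host *h) { h->lock_depth--; }

/* Existing driver body that still expects the host lock to be held. */
static int demo_queuecommand_locked(struct demo_host *h, struct demo_cmnd *cmd)
{
    assert(h->lock_depth == 1);
    h->locked_calls++;
    cmd->scsi_done(cmd);
    return 0;
}

/* New-style entry point: called unlocked, takes the lock itself. */
static int demo_queuecommand(struct demo_host *h, struct demo_cmnd *cmd)
{
    int rc;

    assert(h != NULL && cmd != NULL);
    demo_host_lock(h);
    rc = demo_queuecommand_locked(h, cmd);
    demo_host_unlock(h);
    return rc;
}

/* Example completion callback. */
static void demo_done(struct demo_cmnd *cmd)
{
    cmd->completed = 1;
}
```

In this model a driver that does not need the lock could later drop the wrapper and do its work directly in `demo_queuecommand()`, which is the speed-up the commit message is aiming for.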
    • nfs: trivial: remove unused nfs_wait_event macro · 5685b971
      Committed by Jeff Layton
      Nothing uses this macro anymore.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  16. 16 Nov, 2010 6 commits
  17. 15 Nov, 2010 4 commits
  18. 13 Nov, 2010 1 commit
  19. 12 Nov, 2010 1 commit
    • backlight: add low threshold to pwm backlight · fef7764f
      Committed by Arun Murthy
      The intensity of the backlight can be varied over a range from
      max_brightness down to zero.  However, most, if not all, PWM-based
      backlight devices start flickering at low brightness values.  Also, for
      each device there exists a brightness value below which the backlight
      appears to be turned off even though the value is not zero.
      
      If the range of brightness for a device is from zero to max_brightness,
      a graph of brightness vs. intensity for a PWM-based backlight device
      should be linear.
      
      intensity
      	  |   /
      	  |  /
      	  | /
      	  |/
      	  ---------
      	 0	max_brightness
      
      But in practice, on measuring the above, we note that the intensity of
      the backlight goes to zero (OFF) when the value is not zero but close
      to zero (some x%), so the graph looks like
      
      intensity
      	  |    /
      	  |   /
      	  |  /
      	  |  |
      	  ------------
      	 0   x	 max_brightness
      
      To overcome this drawback, given this x% (the low threshold below which
      the backlight is off and changes have no effect), the brightness value
      is offset by the low threshold value (retaining the linearity of the
      graph).  Now the graph becomes
      
      intensity
      	  |     /
      	  |    /
      	  |   /
      	  |  /
      	  -------------
      	   0	  max_brightness
      
      With this, every unit increment in the brightness from zero produces
      a change in the intensity of the backlight.  Devices with this
      behaviour can set the low threshold brightness (lth_brightness) and pass
      it as platform data; others can leave it as zero.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Arun Murthy <arun.murthy@stericsson.com>
      Acked-by: Linus Walleij <linus.walleij@stericsson.com>
      Acked-by: Richard Purdie <rpurdie@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
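The offset described in the backlight commit above amounts to a linear remapping of the brightness range onto the part of the PWM range above the low threshold. The following is a sketch only: `demo_pwm_duty` and its parameters are hypothetical names, the threshold is assumed to be expressed in the same units as the PWM period, and treating a brightness of zero as "off" is an assumption of this example rather than something the commit message states.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map a brightness in [0, max_brightness] to a PWM duty value.
 * Non-zero brightness lands in (lth, period], so every step above zero
 * changes the visible intensity. Brightness 0 is treated as "off"
 * (an assumption of this sketch).
 */
static uint32_t demo_pwm_duty(uint32_t brightness, uint32_t max_brightness,
                              uint32_t period, uint32_t lth)
{
    assert(max_brightness > 0);
    assert(lth <= period);
    assert(brightness <= max_brightness);

    if (brightness == 0)
        return 0;

    /* 64-bit intermediate avoids overflow in the multiplication. */
    return lth + (uint32_t)(((uint64_t)brightness * (period - lth))
                            / max_brightness);
}
```

For example, with a period of 1000, a threshold of 200 and a maximum brightness of 255, a brightness of 1 maps to 203 rather than 3, so it sits just above the region where the backlight would appear to be off. With a threshold of zero the mapping reduces to the plain linear one.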