1. 05 6月, 2014 28 次提交
  2. 24 5月, 2014 5 次提交
    • N
      mm/memory-failure.c: fix memory leak by race between poison and unpoison · 3e030ecc
      Naoya Horiguchi 提交于
      When a memory error happens on an in-use page or (free and in-use)
      hugepage, the victim page is isolated with its refcount set to one.
      
      When you try to unpoison it later, unpoison_memory() calls put_page()
      for it twice in order to bring the page back to free page pool (buddy or
      free hugepage list).  However, if another memory error occurs on the
      page which we are unpoisoning, memory_failure() returns without
      releasing the refcount which was incremented in the same call at first,
      which results in memory leak and unconsistent num_poisoned_pages
      statistics.  This patch fixes it.
      Signed-off-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: <stable@vger.kernel.org>    [2.6.32+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3e030ecc
    • M
      memcg: fix swapcache charge from kernel thread context · 6f6acb00
      Michal Hocko 提交于
      Commit 284f39af ("mm: memcg: push !mm handling out to page cache
      charge function") explicitly checks for page cache charges without any
      mm context (from kernel thread context[1]).
      
      This seemed to be the only possible case where memory could be charged
      without mm context so commit 03583f1a ("memcg: remove unnecessary
      !mm check from try_get_mem_cgroup_from_mm()") removed the mm check from
      get_mem_cgroup_from_mm().  This however caused another NULL ptr
      dereference during early boot when loopback kernel thread splices to
      tmpfs as reported by Stephan Kulow:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000360
        IP: get_mem_cgroup_from_mm.isra.42+0x2b/0x60
        Oops: 0000 [#1] SMP
        Modules linked in: btrfs dm_multipath dm_mod scsi_dh multipath raid10 raid456 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid1 raid0 md_mod parport_pc parport nls_utf8 isofs usb_storage iscsi_ibft iscsi_boot_sysfs arc4 ecb fan thermal nfs lockd fscache nls_iso8859_1 nls_cp437 sg st hid_generic usbhid af_packet sunrpc sr_mod cdrom ata_generic uhci_hcd virtio_net virtio_blk ehci_hcd usbcore ata_piix floppy processor button usb_common virtio_pci virtio_ring virtio edd squashfs loop ppa]
        CPU: 0 PID: 97 Comm: loop1 Not tainted 3.15.0-rc5-5-default #1
        Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        Call Trace:
          __mem_cgroup_try_charge_swapin+0x40/0xe0
          mem_cgroup_charge_file+0x8b/0xd0
          shmem_getpage_gfp+0x66b/0x7b0
          shmem_file_splice_read+0x18f/0x430
          splice_direct_to_actor+0xa2/0x1c0
          do_lo_receive+0x5a/0x60 [loop]
          loop_thread+0x298/0x720 [loop]
          kthread+0xc6/0xe0
          ret_from_fork+0x7c/0xb0
      
      Also Branimir Maksimovic reported the following oops which is tiggered
      for the swapcache charge path from the accounting code for kernel threads:
      
        CPU: 1 PID: 160 Comm: kworker/u8:5 Tainted: P           OE 3.15.0-rc5-core2-custom #159
        Hardware name: System manufacturer System Product Name/MAXIMUSV GENE, BIOS 1903 08/19/2013
        task: ffff880404e349b0 ti: ffff88040486a000 task.ti: ffff88040486a000
        RIP: get_mem_cgroup_from_mm.isra.42+0x2b/0x60
        Call Trace:
          __mem_cgroup_try_charge_swapin+0x45/0xf0
          mem_cgroup_charge_file+0x9c/0xe0
          shmem_getpage_gfp+0x62c/0x770
          shmem_write_begin+0x38/0x40
          generic_perform_write+0xc5/0x1c0
          __generic_file_aio_write+0x1d1/0x3f0
          generic_file_aio_write+0x4f/0xc0
          do_sync_write+0x5a/0x90
          do_acct_process+0x4b1/0x550
          acct_process+0x6d/0xa0
          do_exit+0x827/0xa70
          kthread+0xc3/0xf0
      
      This patch fixes the issue by reintroducing mm check into
      get_mem_cgroup_from_mm.  We could do the same trick in
      __mem_cgroup_try_charge_swapin as we do for the regular page cache path
      but it is not worth troubles.  The check is not that expensive and it is
      better to have get_mem_cgroup_from_mm more robust.
      
      [1] - http://marc.info/?l=linux-mm&m=139463617808941&w=2
      
      Fixes: 03583f1a ("memcg: remove unnecessary !mm check from try_get_mem_cgroup_from_mm()")
      Reported-and-tested-by: NStephan Kulow <coolo@suse.com>
      Reported-by: NBranimir Maksimovic <branimir.maksimovic@gmail.com>
      Signed-off-by: NMichal Hocko <mhocko@suse.cz>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6f6acb00
    • J
      mm: madvise: fix MADV_WILLNEED on shmem swapouts · 55231e5c
      Johannes Weiner 提交于
      MADV_WILLNEED currently does not read swapped out shmem pages back in.
      
      Commit 0cd6144a ("mm + fs: prepare for non-page entries in page
      cache radix trees") made find_get_page() filter exceptional radix tree
      entries but failed to convert all find_get_page() callers that WANT
      exceptional entries over to find_get_entry().  One of them is shmem swap
      readahead in madvise, which now skips over any swap-out records.
      
      Convert it to find_get_entry().
      
      Fixes: 0cd6144a ("mm + fs: prepare for non-page entries in page cache radix trees")
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Reported-by: NHugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      55231e5c
    • J
      mm/filemap.c: avoid always dirtying mapping->flags on O_DIRECT · 7fcbbaf1
      Jens Axboe 提交于
      In some testing I ran today (some fio jobs that spread over two nodes),
      we end up spending 40% of the time in filemap_check_errors().  That
      smells fishy.  Looking further, this is basically what happens:
      
      blkdev_aio_read()
          generic_file_aio_read()
              filemap_write_and_wait_range()
                  if (!mapping->nr_pages)
                      filemap_check_errors()
      
      and filemap_check_errors() always attempts two test_and_clear_bit() on
      the mapping flags, thus dirtying it for every single invocation.  The
      patch below tests each of these bits before clearing them, avoiding this
      issue.  In my test case (4-socket box), performance went from 1.7M IOPS
      to 4.0M IOPS.
      Signed-off-by: NJens Axboe <axboe@fb.com>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7fcbbaf1
    • C
      hwpoison, hugetlb: lock_page/unlock_page does not match for handling a free hugepage · b985194c
      Chen Yucong 提交于
      For handling a free hugepage in memory failure, the race will happen if
      another thread hwpoisoned this hugepage concurrently.  So we need to
      check PageHWPoison instead of !PageHWPoison.
      
      If hwpoison_filter(p) returns true or a race happens, then we need to
      unlock_page(hpage).
      Signed-off-by: NChen Yucong <slaoub@gmail.com>
      Reviewed-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Tested-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Reviewed-by: NAndi Kleen <ak@linux.intel.com>
      Cc: <stable@vger.kernel.org>	[2.6.36+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b985194c
  3. 20 5月, 2014 3 次提交
  4. 15 5月, 2014 1 次提交
    • H
      parisc,metag: Do not hardcode maximum userspace stack size · 042d27ac
      Helge Deller 提交于
      This patch affects only architectures where the stack grows upwards
      (currently parisc and metag only). On those do not hardcode the maximum
      initial stack size to 1GB for 32-bit processes, but make it configurable
      via a config option.
      
      The main problem with the hardcoded stack size is, that we have two
      memory regions which grow upwards: stack and heap. To keep most of the
      memory available for heap in a flexmap memory layout, it makes no sense
      to hard allocate up to 1GB of the memory for stack which can't be used
      as heap then.
      
      This patch makes the stack size for 32-bit processes configurable and
      uses 80MB as default value which has been in use during the last few
      years on parisc and which hasn't showed any problems yet.
      Signed-off-by: NHelge Deller <deller@gmx.de>
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: linux-parisc@vger.kernel.org
      Cc: linux-metag@vger.kernel.org
      Cc: John David Anglin <dave.anglin@bell.net>
      042d27ac
  5. 11 5月, 2014 2 次提交
  6. 07 5月, 2014 1 次提交
新手
引导
客服 返回
顶部