1. 04 Jul, 2014 16 commits
    • ptrace,x86: force IRET path after a ptrace_stop() · b9cd18de
      Tejun Heo committed
      The 'sysret' fastpath does not correctly restore even all regular
      registers, much less any segment registers or rflags values.  That is
      very much part of why it's faster than 'iret'.
      
      Normally that isn't a problem, because the normal ptrace() interface
      catches the process using the signal handler infrastructure, which
      always returns with an iret.
      
      However, some paths can get caught using ptrace_event() instead of the
      signal path, and for those we need to make sure that we aren't going to
      return to user space using 'sysret'.  Otherwise the modifications that
      may have been done to the register set by the tracer wouldn't
      necessarily take effect.
      
      Fix it by forcing IRET path by setting TIF_NOTIFY_RESUME from
      arch_ptrace_stop_needed() which is invoked from ptrace_stop().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Andy Lutomirski <luto@amacapital.net>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'akpm' (patches from Andrew Morton) · 5170a3b2
      Linus Torvalds committed
      Merge fixes from Andrew Morton:
       "14 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        shmem: fix init_page_accessed use to stop !PageLRU bug
        kernel/printk/printk.c: revert "printk: enable interrupts before calling console_trylock_for_printk()"
        tools/testing/selftests/ipc/msgque.c: improve error handling when not running as root
        fs/seq_file: fallback to vmalloc allocation
        /proc/stat: convert to single_open_size()
        hwpoison: fix the handling path of the victimized page frame that belong to non-LRU
        mm:vmscan: update the trace-vmscan-postprocess.pl for event vmscan/mm_vmscan_lru_isolate
        msync: fix incorrect fstart calculation
        zram: revalidate disk after capacity change
        tools: memory-hotplug fix unexpected operator error
        tools: cpu-hotplug fix unexpected operator error
        autofs4: fix false positive compile error
        slub: fix off by one in number of slab tests
        mm: page_alloc: fix CMA area initialisation when pageblock > MAX_ORDER
    • shmem: fix init_page_accessed use to stop !PageLRU bug · 66d2f4d2
      Hugh Dickins committed
      Under shmem swapping load, I sometimes hit the VM_BUG_ON_PAGE(!PageLRU)
      in isolate_lru_pages() at mm/vmscan.c:1281!
      
      Commit 2457aec6 ("mm: non-atomically mark page accessed during page
      cache allocation where possible") looks like interrupted work-in-progress.
      
      mm/filemap.c's call to init_page_accessed() is fine, but not mm/shmem.c's
      - shmem_write_begin() is clearly wrong to use it after shmem_getpage(),
      when the page is always visible in radix_tree, and often already on LRU.
      
      Revert change to shmem_write_begin(), and use init_page_accessed() or
      mark_page_accessed() appropriately for SGP_WRITE in shmem_getpage_gfp().
      
      SGP_WRITE also covers shmem_symlink(), which did not mark_page_accessed()
      before; but since many other filesystems use [__]page_symlink(), which did
      and does mark the page accessed, consider this as rectifying an oversight.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Prabhakar Lad <prabhakar.csengg@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/printk/printk.c: revert "printk: enable interrupts before calling... · d18bbc21
      Andrew Morton committed
      kernel/printk/printk.c: revert "printk: enable interrupts before calling console_trylock_for_printk()"
      
      Revert commit 939f04be ("printk: enable interrupts before calling
      console_trylock_for_printk()").
      
      Andreas reported:
      
      : None of the post 3.15 kernel boot for me. They all hang at the GRUB
      : screen telling me it loaded and started the kernel, but the kernel
      : itself stops before it prints anything (or even replaces the GRUB
      : background graphics).
      
      939f04be is a modest latency reduction.  Revert it until we understand
      the reason for these failures.
      Reported-by: Andreas Bombe <aeb@debian.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tools/testing/selftests/ipc/msgque.c: improve error handling when not running as root · e84f1ab3
      Shuah Khan committed
      The test fails in the middle when it is not run as root while accessing
      /proc/sys/kernel/msg_next_id.  Changed it to check for root at the
      beginning of the test and exit if not root.
      Signed-off-by: Shuah Khan <shuah.kh@samsung.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Cc: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
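The up-front root check described in the msgque.c commit above could look roughly like the following. This is a minimal sketch, not the code from the commit: the helper names `require_root` and `require_root_now` and the message text are hypothetical.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return 0 when uid is root, -1 otherwise.
 * Taking the uid as a parameter keeps the check easy to exercise.
 */
static int require_root(uid_t uid)
{
    if (uid != 0) {
        fprintf(stderr, "this test must be run as root\n");
        return -1;
    }
    return 0;
}

/* Convenience wrapper that checks the real uid of the calling process. */
static int require_root_now(void)
{
    return require_root(getuid());
}
```

A test program would call `require_root_now()` as its first step and exit when it returns -1, so that the failure is reported before anything under /proc/sys/kernel is touched.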
    • fs/seq_file: fallback to vmalloc allocation · 058504ed
      Heiko Carstens committed
      There are a couple of seq_files which use the single_open() interface.
      This interface requires that the whole output must fit into a single
      buffer.
      
      E.g.  for /proc/stat allocation failures have been observed because an
      order-4 memory allocation failed due to memory fragmentation.  In such
      situations reading /proc/stat is not possible anymore.
      
      Therefore change the seq_file code to fall back to vmalloc allocations,
      which will usually result in a couple of order-0 allocations and hence
      also work if memory is fragmented.
      
      For reference a call trace where reading from /proc/stat failed:
      
        sadc: page allocation failure: order:4, mode:0x1040d0
        CPU: 1 PID: 192063 Comm: sadc Not tainted 3.10.0-123.el7.s390x #1
        [...]
        Call Trace:
          show_stack+0x6c/0xe8
          warn_alloc_failed+0xd6/0x138
          __alloc_pages_nodemask+0x9da/0xb68
          __get_free_pages+0x2e/0x58
          kmalloc_order_trace+0x44/0xc0
          stat_open+0x5a/0xd8
          proc_reg_open+0x8a/0x140
          do_dentry_open+0x1bc/0x2c8
          finish_open+0x46/0x60
          do_last+0x382/0x10d0
          path_openat+0xc8/0x4f8
          do_filp_open+0x46/0xa8
          do_sys_open+0x114/0x1f0
          sysc_tracego+0x14/0x1a
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Tested-by: David Rientjes <rientjes@google.com>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Thorsten Diehl <thorsten.diehl@de.ibm.com>
      Cc: Andrea Righi <andrea@betterlinux.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Stefan Bader <stefan.bader@canonical.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
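The fallback idea in the seq_file commit above can be illustrated in user space. This is a minimal sketch under stated assumptions, not the kernel code: the two allocators are passed in as function pointers standing in for a physically contiguous allocator and a page-by-page one, and `SKETCH_PAGE_SIZE`, `buf_alloc_with_fallback`, `always_fail` and `plain_malloc` are hypothetical names. Limiting the fallback to requests larger than one page is also an assumption of this sketch.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size, for illustration only */

typedef void *(*alloc_fn)(size_t);

/*
 * Try the contiguous allocator first. If it fails and the request spans
 * more than one page, retry with the fallback allocator, which can be
 * satisfied from scattered single pages.
 */
static void *buf_alloc_with_fallback(size_t size, alloc_fn contiguous,
                                     alloc_fn fallback)
{
    void *buf = contiguous(size);

    if (!buf && size > SKETCH_PAGE_SIZE)
        buf = fallback(size);
    return buf;
}

/* Stand-in for a contiguous allocation that fails due to fragmentation. */
static void *always_fail(size_t size)
{
    (void)size;
    return NULL;
}

/* Stand-in for the fallback allocator. */
static void *plain_malloc(size_t size)
{
    return malloc(size);
}
```

With `always_fail` as the first allocator, a multi-page request still succeeds through `plain_malloc`, while a sub-page request returns NULL because no fallback is attempted.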
    • /proc/stat: convert to single_open_size() · f74373a5
      Heiko Carstens committed
      These two patches are supposed to "fix" failed order-4 memory
      allocations which have been observed when reading /proc/stat.  The
      problem has been observed on s390 as well as on x86.
      
      To address the problem, change the seq_file memory allocations to
      fall back to vmalloc, so that allocations also work if memory is
      fragmented.

      This approach seems to be simpler and less intrusive than changing
      /proc/stat to use an iterator.  Also it "fixes" other users as well,
      which use seq_file's single_open() interface.
      
      This patch (of 2):
      
      Use seq_file's single_open_size() to preallocate a buffer that is large
      enough to hold the whole output, instead of open coding it.  Also
      calculate the requested size using the number of online cpus instead of
      possible cpus, since the size of the output only depends on the number
      of online cpus.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Thorsten Diehl <thorsten.diehl@de.ibm.com>
      Cc: Andrea Righi <andrea@betterlinux.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Stefan Bader <stefan.bader@canonical.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hwpoison: fix the handling path of the victimized page frame that belong to non-LRU · 0bc1f8b0
      Chen Yucong committed
      Until now, the kernel has applied the same policy to victimized page
      frames that belong to kernel space (reserved or slab pages) and to
      non-LRU pages (unknown page state).  In other words, the result of
      handling either kind of victimized page frame is (IGNORED | FAILED),
      and the return value of memory_failure() is -EBUSY.

      This patch stops memory_failure() from returning early when
      (!PageLRU(p)) is true, and it also ensures that action_result() can
      report more precise information ("reserved kernel", "kernel slab", and
      "unknown page state") instead of "non LRU", especially for memory
      errors that are detected by memory scrubbing.
      
      Andi said:
      
      : While running the mcelog test suite on 3.14 I hit the following VM_BUG_ON:
      :
      : soft_offline: 0x56d4: unknown non LRU page type 3ffff800008000
      : page:ffffea000015b400 count:3 mapcount:2097169 mapping:          (null) index:0xffff8800056d7000
      : page flags: 0x3ffff800004081(locked|slab|head)
      : ------------[ cut here ]------------
      : kernel BUG at mm/rmap.c:1495!
      :
      : I think what happened is that a LRU page turned into a slab page in
      : parallel with offlining.  memory_failure initially tests for this case,
      : but doesn't retest later after the page has been locked.
      :
      : ...
      :
      : I ran this patch in a loop over night with some stress plus
      : the mcelog test suite running in a loop. I cannot guarantee it hit it,
      : but it should have given it a good beating.
      :
      : The kernel survived with no messages, although the mcelog test suite
      : got killed at some point because it couldn't fork anymore. Probably
      : some unrelated problem.
      :
      : So the patch is ok for me for .16.
      Signed-off-by: Chen Yucong <slaoub@gmail.com>
      Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Reported-by: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm:vmscan: update the trace-vmscan-postprocess.pl for event vmscan/mm_vmscan_lru_isolate · b27ebf77
      Chen Yucong committed
      When using trace-vmscan-postprocess.pl to check the file/anon
      scanning rate, the script cannot run.  Instead, the following
      message is reported:
      
        WARNING: Format not as expected for event vmscan/mm_vmscan_lru_isolate
        'file' != 'contig_taken' Fewer fields than expected in format at
        ./trace-vmscan-postprocess.pl line 171, <FORMAT> line 76.
      
      In trace-vmscan-postprocess.pl, contig_taken, contig_dirty, and
      contig_failed are associated with nr_lumpy_taken, nr_lumpy_dirty, and
      nr_lumpy_failed respectively, for lumpy reclaim.  Commit c53919ad
      ("mm: vmscan: remove lumpy reclaim") by Mel had already removed lumpy
      reclaim, but the corresponding update to trace-vmscan-postprocess.pl
      was missed.
      Signed-off-by: Chen Yucong <slaoub@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • msync: fix incorrect fstart calculation · 496a8e68
      Namjae Jeon committed
      Fix a regression caused by 7fc34a62 ("mm/msync.c: sync only the
      requested range in msync()").
      
      xfstests generic/075 failed on ext4 in data=journal mode because the
      intended range was not synced, due to a wrong fstart calculation.
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
      Reported-by: Eric Whitney <enwlinux@gmail.com>
      Tested-by: Eric Whitney <enwlinux@gmail.com>
      Acked-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Reviewed-by: Lukas Czerner <lczerner@redhat.com>
      Tested-by: Lukas Czerner <lczerner@redhat.com>
      Reviewed-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
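The msync commit above does not show the corrected expression. As a rough illustration of what a file start offset for a mapped address has to account for, the sketch below assumes the usual relationship between a mapping and its file: the offset of the address within the mapping, plus the file offset at which the mapping begins. The struct `vma_sketch`, the constant `SKETCH_PAGE_SHIFT` and the function `file_offset_of` are hypothetical, user-space stand-ins.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4 KiB pages, for illustration only */

/* Hypothetical, simplified stand-in for a memory mapping descriptor. */
struct vma_sketch {
    uint64_t vm_start; /* first virtual address of the mapping */
    uint64_t vm_pgoff; /* file offset of the mapping, in pages */
};

/*
 * Translate a virtual address inside the mapping into a byte offset in
 * the backing file. Dropping the subtraction of vm_start would yield an
 * offset far outside the intended range.
 */
static uint64_t file_offset_of(const struct vma_sketch *vma, uint64_t addr)
{
    return (addr - vma->vm_start) + (vma->vm_pgoff << SKETCH_PAGE_SHIFT);
}
```

For a mapping that starts three pages into the file, an address two pages past `vm_start` maps to byte offset 0x5000 in the file.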
    • zram: revalidate disk after capacity change · 2e32baea
      Minchan Kim committed
      Alexander reported that mkswap on /dev/zram0 fails if another process
      has the block device file open.

      The steps are as follows:
      
      0. Reset the unused zram device.
      1. Use a program that opens /dev/zram0 with O_RDWR and sleeps
         until killed.
      2. While that program sleeps, echo the correct value to
         /sys/block/zram0/disksize.
      3. Verify (e.g. in /proc/partitions) that the disk size is applied
         correctly. It is.
      4. While that program still sleeps, attempt to mkswap /dev/zram0.
         This fails: mkswap: error: swap area needs to be at least 40 KiB
      
      When I investigated, the size that mkswap obtained through
      ioctl(fd, BLKGETSIZE64, xxx) was zero, although zram0 had the right
      size after step 2.

      The reason is that zram didn't revalidate the disk after changing the
      capacity, so the size in the blockdev's inode is not up to date until
      every open file on the device is closed.

      This patch should fix the bug.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Reported-by: Alexander E. Patrakov <patrakov@gmail.com>
      Tested-by: Alexander E. Patrakov <patrakov@gmail.com>
      Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Acked-by: Jerome Marchand <jmarchan@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tools: memory-hotplug fix unexpected operator error · e98f7762
      Shuah Khan committed
      on-off-test uses "$UID != 0" to test for root, but $UID is a construct
      specific to bash.  Using /bin/sh that isn't bash results in the
      following error (due to the "$UID" part expanding to nothing):
      
        ./on-off-test.sh: 9: [: !=: unexpected operator
      
      Change Makefile to use bash instead.
      Signed-off-by: Shuah Khan <shuah.kh@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tools: cpu-hotplug fix unexpected operator error · 1bd702e6
      Shuah Khan committed
      on-off-test uses "$UID != 0" to test for root, but $UID is a construct
      specific to bash.  Using /bin/sh that isn't bash results in the
      following error (due to the "$UID" part expanding to nothing):
      
        ./on-off-test.sh: 9: [: !=: unexpected operator
      
      Change Makefile to use bash instead.
      Signed-off-by: Shuah Khan <shuah.kh@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • autofs4: fix false positive compile error · 571ff473
      Ian Kent committed
      On strict build environments we can see:
      
        fs/autofs4/inode.c: In function 'autofs4_fill_super':
        fs/autofs4/inode.c:312: error: 'pgrp' may be used uninitialized in this function
        make[2]: *** [fs/autofs4/inode.o] Error 1
        make[1]: *** [fs/autofs4] Error 2
        make: *** [fs] Error 2
        make: *** Waiting for unfinished jobs....
      
      This is due to pgrp_set being used to indicate that pgrp has been set,
      rather than initializing pgrp itself.
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • slub: fix off by one in number of slab tests · 8a5b20ae
      Joonsoo Kim committed
      min_partial is the minimum number of slabs cached on the node partial
      list.  So, if nr_partial is less than min_partial, we keep a newly
      empty slab on the node partial list rather than freeing it.  But if
      nr_partial is equal to or greater than min_partial, we have enough
      partial slabs and should free the newly empty slab.  The current
      implementation missed the equal case, so even with min_partial set to
      0, at least one slab could be cached.  This is a critical problem for
      the kmemcg destroying logic, because it doesn't work properly if any
      slabs are cached.  This patch fixes the problem.
      
      Fixes 91cb69620284 ("slub: make dead memcg caches discard free slabs
      immediately").
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
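The off-by-one described in the slub commit above comes down to a single comparison. The sketch below contrasts the two forms; both function names are hypothetical, and the bodies are derived only from the commit description, not copied from the kernel source.

```c
#include <stdbool.h>

/*
 * Behavior described as buggy: a newly empty slab is freed only when
 * nr_partial is strictly greater than min_partial, so the equal case
 * keeps one slab cached even when min_partial is 0.
 */
static bool should_free_empty_slab_old(unsigned long nr_partial,
                                       unsigned long min_partial)
{
    return nr_partial > min_partial;
}

/*
 * Behavior described as fixed: the equal case also counts as having
 * enough partial slabs, so the newly empty slab is freed.
 */
static bool should_free_empty_slab(unsigned long nr_partial,
                                   unsigned long min_partial)
{
    return nr_partial >= min_partial;
}
```

With `min_partial` set to 0 and an empty partial list, the old comparison keeps the slab while the corrected one frees it, which is the case the kmemcg teardown path depends on.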
    • mm: page_alloc: fix CMA area initialisation when pageblock > MAX_ORDER · dc78327c
      Michal Nazarewicz committed
      With a kernel configured with ARM64_64K_PAGES && !TRANSPARENT_HUGEPAGE,
      the following is triggered at early boot:
      
        SMP: Total of 8 processors activated.
        devtmpfs: initialized
        Unable to handle kernel NULL pointer dereference at virtual address 00000008
        pgd = fffffe0000050000
        [00000008] *pgd=00000043fba00003, *pmd=00000043fba00003, *pte=00e0000078010407
        Internal error: Oops: 96000006 [#1] SMP
        Modules linked in:
        CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.15.0-rc864k+ #44
        task: fffffe03bc040000 ti: fffffe03bc080000 task.ti: fffffe03bc080000
        PC is at __list_add+0x10/0xd4
        LR is at free_one_page+0x270/0x638
        ...
        Call trace:
          __list_add+0x10/0xd4
          free_one_page+0x26c/0x638
          __free_pages_ok.part.52+0x84/0xbc
          __free_pages+0x74/0xbc
          init_cma_reserved_pageblock+0xe8/0x104
          cma_init_reserved_areas+0x190/0x1e4
          do_one_initcall+0xc4/0x154
          kernel_init_freeable+0x204/0x2a8
          kernel_init+0xc/0xd4
      
      This happens because init_cma_reserved_pageblock() calls
      __free_one_page() with pageblock_order as page order but it is bigger
      than MAX_ORDER.  This in turn causes accesses past zone->free_list[].
      
      Fix the problem by changing init_cma_reserved_pageblock() such that it
      splits pageblock into individual MAX_ORDER pages if pageblock is bigger
      than a MAX_ORDER page.
      
      In cases where !CONFIG_HUGETLB_PAGE_SIZE_VARIABLE, which is all
      architectures except for ia64, powerpc and tile at the moment, the
      “pageblock_order > MAX_ORDER” condition will be optimised out since both
      sides of the operator are constants.  In cases where pageblock size is
      variable, the performance degradation should not be significant anyway
      since init_cma_reserved_pageblock() is called only at boot time at most
      MAX_CMA_AREAS times which by default is eight.
      Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
      Reported-by: Mark Salter <msalter@redhat.com>
      Tested-by: Mark Salter <msalter@redhat.com>
      Tested-by: Christopher Covington <cov@codeaurora.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: <stable@vger.kernel.org>	[3.5+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
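The splitting strategy in the CMA commit above can be modeled with plain arithmetic. This is a sketch under assumptions, not the kernel function: it assumes free lists are indexed by orders 0 through MAX_ORDER - 1, so the largest block that can be freed in one call has order MAX_ORDER - 1. The value chosen for `SKETCH_MAX_ORDER`, the struct `free_plan` and the function `plan_pageblock_free` are hypothetical.

```c
#define SKETCH_MAX_ORDER 11 /* assumed value, for illustration only */

/* How a pageblock would be handed back: `count` frees of order `order`. */
struct free_plan {
    unsigned int order;
    unsigned long count;
};

/*
 * If the pageblock is too large for a single free, split it into
 * equally sized chunks of the largest valid order. Otherwise free it
 * in one piece.
 */
static struct free_plan plan_pageblock_free(unsigned int pageblock_order)
{
    struct free_plan plan;

    if (pageblock_order >= SKETCH_MAX_ORDER) {
        plan.order = SKETCH_MAX_ORDER - 1;
        plan.count = 1UL << (pageblock_order - plan.order);
    } else {
        plan.order = pageblock_order;
        plan.count = 1;
    }
    return plan;
}
```

Under these assumed values, a pageblock of order 13 is released as eight chunks of order 10, while a pageblock of order 9 is released in a single call, so no free ever uses an order beyond the end of the free list array.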
  2. 03 Jul, 2014 4 commits
  3. 02 Jul, 2014 5 commits
  4. 01 Jul, 2014 8 commits
  5. 30 Jun, 2014 7 commits