1. 20 12月, 2012 3 次提交
  2. 29 11月, 2012 4 次提交
  3. 28 11月, 2012 2 次提交
  4. 20 11月, 2012 1 次提交
    • F
      vtime: Warn if irqs aren't disabled on system time accounting APIs · 1b2852b1
      Frederic Weisbecker 提交于
      System time accounting APIs such as vtime_account_system() and
      vtime_account_idle() need to be irqsafe. Current callers include
      irq entry, exit and kvm, all of which have been checked against that
      requirement. Now it's better to grow that with an automatic check
      in case we have further callers or we missed something.
      Suggested-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      1b2852b1
  5. 19 11月, 2012 4 次提交
    • F
      vtime: Consolidate a bit the ctx switch code · e3942ba0
      Frederic Weisbecker 提交于
      On ia64 and powerpc, vtime context switch only consists
      in flushing system and user pending time, plus a few
      arch housekeeping.
      
      Consolidate that into a generic implementation. s390 is
      a special case because pending user and system time accounting
      there is hard to dissociate. So it's keeping its own implementation.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      e3942ba0
    • F
      vtime: Explicitly account pending user time on process tick · bcebdf84
      Frederic Weisbecker 提交于
      All vtime implementations just flush the user time on process
      tick. Consolidate that in generic code by calling a user time
      accounting helper. This avoids an indirect call in ia64 and
      prepare to also consolidate vtime context switch code.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      bcebdf84
    • F
      vtime: Remove the underscore prefix invasion · fd25b4c2
      Frederic Weisbecker 提交于
      Prepending irq-unsafe vtime APIs with underscores was actually
      a bad idea as the result is a big mess in the API namespace that
      is even waiting to be further extended. Also these helpers
      are always called from irq safe callers except kvm. Just
      provide a vtime_account_system_irqsafe() for this specific
      case so that we can remove the underscore prefix on other
      vtime functions.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      fd25b4c2
    • M
      d7558148
  6. 17 11月, 2012 1 次提交
    • A
      revert "mm: fix-up zone present pages" · 5576646f
      Andrew Morton 提交于
      Revert commit 7f1290f2 ("mm: fix-up zone present pages")
      
      That patch tried to fix a issue when calculating zone->present_pages,
      but it caused a regression on 32bit systems with HIGHMEM.  With that
      change, reset_zone_present_pages() resets all zone->present_pages to
      zero, and fixup_zone_present_pages() is called to recalculate
      zone->present_pages when the boot allocator frees core memory pages into
      buddy allocator.  Because highmem pages are not freed by bootmem
      allocator, all highmem zones' present_pages becomes zero.
      
      Various options for improving the situation are being discussed but for
      now, let's return to the 3.6 code.
      
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Petr Tesarik <ptesarik@suse.cz>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Tested-by: NChris Clayton <chris2553@googlemail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5576646f
  7. 16 11月, 2012 1 次提交
    • J
      TTY: call tty_port_destroy in the rest of drivers · 191c5f10
      Jiri Slaby 提交于
      After commit "TTY: move tty buffers to tty_port", the tty buffers are
      not freed in some drivers. This is because tty_port_destructor is not
      called whenever a tty_port is freed. This was an assumption I counted
      with but was unfortunately untrue. So fix the drivers to fulfil this
      assumption.
      
      To be sure, the TTY buffers (and later some stuff) are gone along with
      the tty_port, we have to call tty_port_destroy at tear-down places.
      This is mostly where the structure containing a tty_port is freed.
      This patch does exactly that -- put tty_port_destroy at those places.
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      191c5f10
  8. 15 11月, 2012 2 次提交
  9. 14 11月, 2012 1 次提交
  10. 08 11月, 2012 1 次提交
    • T
      [IA64] Resolve name space collision for cache_show() · 4fafc8c2
      Tony Luck 提交于
      We have a local static function named rather generically
      "cache_show()". Changes in progress in the slab code want
      to use this same name globally - so they are adding their
      declaration to <linux/slab.h> which then causes the compiler
      to choke with:
      
      arch/ia64/kernel/topology.c:278: error: conflicting types for 'cache_show'
      
      Fix by adding an "ia64_" prefix to our local function.
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      4fafc8c2
  11. 01 11月, 2012 1 次提交
    • P
      sk-filter: Add ability to get socket filter program (v2) · a8fc9277
      Pavel Emelyanov 提交于
      The SO_ATTACH_FILTER option is set only. I propose to add the get
      ability by using SO_ATTACH_FILTER in getsockopt. To be less
      irritating to eyes the SO_GET_FILTER alias to it is declared. This
      ability is required by checkpoint-restore project to be able to
      save full state of a socket.
      
      There are two issues with getting filter back.
      
      First, kernel modifies the sock_filter->code on filter load, thus in
      order to return the filter element back to user we have to decode it
      into user-visible constants. Fortunately the modification in question
      is interconvertible.
      
      Second, the BPF_S_ALU_DIV_K code modifies the command argument k to
      speed up the run-time division by doing kernel_k = reciprocal(user_k).
      Bad news is that different user_k may result in same kernel_k, so we
      can't get the original user_k back. Good news is that we don't have
      to do it. What we need to is calculate a user2_k so, that
      
        reciprocal(user2_k) == reciprocal(user_k) == kernel_k
      
      i.e. if it's re-loaded back the compiled again value will be exactly
      the same as it was. That said, the user2_k can be calculated like this
      
        user2_k = reciprocal(kernel_k)
      
      with an exception, that if kernel_k == 0, then user2_k == 1.
      
      The optlen argument is treated like this -- when zero, kernel returns
      the amount of sock_fprog elements in filter, otherwise it should be
      large enough for the sock_fprog array.
      
      changes since v1:
      * Declared SO_GET_FILTER in all arch headers
      * Added decode of vlan-tag codes
      Signed-off-by: NPavel Emelyanov <xemul@parallels.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a8fc9277
  12. 30 10月, 2012 1 次提交
    • F
      vtime: Make vtime_account_system() irqsafe · 11113334
      Frederic Weisbecker 提交于
      vtime_account_system() currently has only one caller with
      vtime_account() which is irq safe.
      
      Now we are going to call it from other places like kvm where
      irqs are not always disabled by the time we account the cputime.
      
      So let's make it irqsafe. The arch implementation part is now
      prefixed with "__".
      
      vtime_account_idle() arch implementation is prefixed accordingly
      to stay consistent.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      11113334
  13. 27 10月, 2012 1 次提交
  14. 25 10月, 2012 1 次提交
  15. 20 10月, 2012 2 次提交
  16. 19 10月, 2012 1 次提交
    • D
      UAPI: Make arch/ia64/include/asm/kvm_para.h generic · 7e3cdf24
      David Howells 提交于
      Make arch/ia64/include/asm/kvm_para.h generic since it's the same as the
      asm-generic version.  This deals with the problem when ia64 is compiled with
      allmodconfig you get:
      
      In file included from include/uapi/linux/kvm_para.h:26:0,
                       from include/linux/kvm_para.h:4,
                       from kernel/watchdog.c:28:
      arch/ia64/include/asm/kvm_para.h:24:28: error: redefinition of 'kvm_arch_para_features'
      include/asm-generic/kvm_para.h:16:28: note: previous definition of 'kvm_arch_para_features' was here
      arch/ia64/include/asm/kvm_para.h:29:20: error: redefinition of 'kvm_check_and_clear_guest_paused'
      include/asm-generic/kvm_para.h:11:20: note: previous definition of 'kvm_check_and_clear_guest_paused' was here
      
      This does not happen for ia64 defconfig.
      Reported-by: NYuanhan Liu <yuanhan.liu@intel.com>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      cc: Yuanhan Liu <yuanhan.liu@intel.com>
      cc: fengguang.wu@intel.com
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      7e3cdf24
  17. 17 10月, 2012 1 次提交
    • D
      UAPI: Put a comment into uapi/asm-generic/kvm_para.h and use it from arches · 0420c87e
      David Howells 提交于
      Make uapi/asm-generic/kvm_para.h non-empty by addition of a comment to stop
      the patch program from deleting it when it creates it.
      
      Then delete empty arch-specific uapi/asm/kvm_para.h files and tell the Kbuild
      files to use the generic instead.
      
      Should this perhaps instead be a #warning or #error that the facility is
      unsupported on this arch?
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      cc: Arnd Bergmann <arnd@arndb.de>
      cc: Avi Kivity <avi@redhat.com>
      cc: Marcelo Tosatti <mtosatti@redhat.com>
      cc: kvm@vger.kernel.org
      0420c87e
  18. 15 10月, 2012 3 次提交
  19. 13 10月, 2012 1 次提交
    • J
      vfs: define struct filename and have getname() return it · 91a27b2a
      Jeff Layton 提交于
      getname() is intended to copy pathname strings from userspace into a
      kernel buffer. The result is just a string in kernel space. It would
      however be quite helpful to be able to attach some ancillary info to
      the string.
      
      For instance, we could attach some audit-related info to reduce the
      amount of audit-related processing needed. When auditing is enabled,
      we could also call getname() on the string more than once and not
      need to recopy it from userspace.
      
      This patchset converts the getname()/putname() interfaces to return
      a struct instead of a string. For now, the struct just tracks the
      string in kernel space and the original userland pointer for it.
      
      Later, we'll add other information to the struct as it becomes
      convenient.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      91a27b2a
  20. 09 10月, 2012 5 次提交
    • D
    • J
      mm: fix-up zone present pages · 7f1290f2
      Jianguo Wu 提交于
      I think zone->present_pages indicates pages that buddy system can management,
      it should be:
      
      	zone->present_pages = spanned pages - absent pages - bootmem pages,
      
      but is now:
      	zone->present_pages = spanned pages - absent pages - memmap pages.
      
      spanned pages: total size, including holes.
      absent pages: holes.
      bootmem pages: pages used in system boot, managed by bootmem allocator.
      memmap pages: pages used by page structs.
      
      This may cause zone->present_pages less than it should be.  For example,
      numa node 1 has ZONE_NORMAL and ZONE_MOVABLE, it's memmap and other
      bootmem will be allocated from ZONE_MOVABLE, so ZONE_NORMAL's
      present_pages should be spanned pages - absent pages, but now it also
      minus memmap pages(free_area_init_core), which are actually allocated from
      ZONE_MOVABLE.  When offlining all memory of a zone, this will cause
      zone->present_pages less than 0, because present_pages is unsigned long
      type, it is actually a very large integer, it indirectly caused
      zone->watermark[WMARK_MIN] becomes a large
      integer(setup_per_zone_wmarks()), than cause totalreserve_pages become a
      large integer(calculate_totalreserve_pages()), and finally cause memory
      allocating failure when fork process(__vm_enough_memory()).
      
      [root@localhost ~]# dmesg
      -bash: fork: Cannot allocate memory
      
      I think the bug described in
      
        http://marc.info/?l=linux-mm&m=134502182714186&w=2
      
      is also caused by wrong zone present pages.
      
      This patch intends to fix-up zone->present_pages when memory are freed to
      buddy system on x86_64 and IA64 platforms.
      Signed-off-by: NJianguo Wu <wujianguo@huawei.com>
      Signed-off-by: NJiang Liu <jiang.liu@huawei.com>
      Reported-by: NPetr Tesarik <ptesarik@suse.cz>
      Tested-by: NPetr Tesarik <ptesarik@suse.cz>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7f1290f2
    • S
      readahead: fault retry breaks mmap file read random detection · 45cac65b
      Shaohua Li 提交于
      .fault now can retry.  The retry can break state machine of .fault.  In
      filemap_fault, if page is miss, ra->mmap_miss is increased.  In the second
      try, since the page is in page cache now, ra->mmap_miss is decreased.  And
      these are done in one fault, so we can't detect random mmap file access.
      
      Add a new flag to indicate .fault is tried once.  In the second try, skip
      ra->mmap_miss decreasing.  The filemap_fault state machine is ok with it.
      
      I only tested x86, didn't test other archs, but looks the change for other
      archs is obvious, but who knows :)
      Signed-off-by: NShaohua Li <shaohua.li@fusionio.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      45cac65b
    • W
      mm: hugetlb: add arch hook for clearing page flags before entering pool · 5d3a551c
      Will Deacon 提交于
      The core page allocator ensures that page flags are zeroed when freeing
      pages via free_pages_check.  A number of architectures (ARM, PPC, MIPS)
      rely on this property to treat new pages as dirty with respect to the data
      cache and perform the appropriate flushing before mapping the pages into
      userspace.
      
      This can lead to cache synchronisation problems when using hugepages,
      since the allocator keeps its own pool of pages above the usual page
      allocator and does not reset the page flags when freeing a page into the
      pool.
      
      This patch adds a new architecture hook, arch_clear_hugepage_flags, so
      that architectures which rely on the page flags being in a particular
      state for fresh allocations can adjust the flags accordingly when a page
      is freed into the pool.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: NMichal Hocko <mhocko@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5d3a551c
    • K
      mm: kill vma flag VM_RESERVED and mm->reserved_vm counter · 314e51b9
      Konstantin Khlebnikov 提交于
      A long time ago, in v2.4, VM_RESERVED kept swapout process off VMA,
      currently it lost original meaning but still has some effects:
      
       | effect                 | alternative flags
      -+------------------------+---------------------------------------------
      1| account as reserved_vm | VM_IO
      2| skip in core dump      | VM_IO, VM_DONTDUMP
      3| do not merge or expand | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
      4| do not mlock           | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
      
      This patch removes reserved_vm counter from mm_struct.  Seems like nobody
      cares about it, it does not exported into userspace directly, it only
      reduces total_vm showed in proc.
      
      Thus VM_RESERVED can be replaced with VM_IO or pair VM_DONTEXPAND | VM_DONTDUMP.
      
      remap_pfn_range() and io_remap_pfn_range() set VM_IO|VM_DONTEXPAND|VM_DONTDUMP.
      remap_vmalloc_range() set VM_DONTEXPAND | VM_DONTDUMP.
      
      [akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup]
      Signed-off-by: NKonstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      314e51b9
  21. 06 10月, 2012 1 次提交
  22. 04 10月, 2012 1 次提交
  23. 03 10月, 2012 1 次提交