1. 13 Sep, 2013 1 commit
    • arch: mm: remove obsolete init OOM protection · 94bce453
      Johannes Weiner committed
      The memcg code can trap tasks in the context of the failing allocation
      until an OOM situation is resolved.  They can hold all kinds of locks
      (fs, mm) at this point, which makes it prone to deadlocking.
      
      This series converts memcg OOM handling into a two step process that is
      started in the charge context, but any waiting is done after the fault
      stack is fully unwound.
      
      Patches 1-4 prepare architecture handlers to support the new memcg
      requirements, but in doing so they also remove old cruft and unify
      out-of-memory behavior across architectures.
      
      Patch 5 disables the memcg OOM handling for syscalls, readahead, kernel
      faults, because they can gracefully unwind the stack with -ENOMEM.  OOM
      handling is restricted to user triggered faults that have no other
      option.
      
      Patch 6 reworks memcg's hierarchical OOM locking to make it a little
      more obvious what is going on in there: reduce locked regions, rename
      locking functions, reorder and document.
      
      Patch 7 implements the two-part OOM handling such that tasks are never
      trapped with the full charge stack in an OOM situation.
      
      This patch:
      
      Back before smart OOM killing, when faulting tasks were killed directly on
      allocation failures, the arch-specific fault handlers needed special
      protection for the init process.
      
      Now that all fault handlers call into the generic OOM killer (see commit
      609838cf: "mm: invoke oom-killer from remaining unconverted page
      fault handlers"), which already provides init protection, the
      arch-specific leftovers can be removed.
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: azurIt <azurit@pobox.sk>
      Acked-by: Vineet Gupta <vgupta@synopsys.com>	[arch/arc bits]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 12 Sep, 2013 1 commit
    • mm: migrate: check movability of hugepage in unmap_and_move_huge_page() · 83467efb
      Naoya Horiguchi committed
      Currently hugepage migration works well only for pmd-based hugepages
      (mainly due to lack of testing), so we had better not enable migration of
      other levels of hugepages until we are ready for it.
      
      Some users of hugepage migration (mbind, move_pages, and migrate_pages) do
      page table walk and check pud/pmd_huge() there, so they are safe.  But the
      other users (softoffline and memory hotremove) don't do this, so without
      this patch they can try to migrate unexpected types of hugepages.
      
      To prevent this, we introduce hugepage_migration_support() as an
      architecture-dependent check of whether hugepages are implemented on a pmd
      basis or not.  On some architectures multiple sizes of hugepages are
      available, so hugepage_migration_support() also checks the hugepage size.
      Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
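      As a rough illustration of the check described in the commit above,
      the following C sketch models hugepage_migration_support() with
      stand-in types.  The struct, every sketch_-prefixed name, and the
      2 MiB pmd size are assumptions made for this example; they are not
      the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption for illustration: one pmd entry maps 2 MiB (shift of 21). */
#define SKETCH_PMD_SHIFT 21u

/* Hypothetical stand-in for the kernel's per-size hugepage state. */
struct sketch_hstate {
    unsigned int page_shift;   /* log2 of the huge page size in bytes */
};

/* Assumed arch hook: true when hugepages are implemented at pmd level. */
static bool sketch_arch_pmd_huge_support(void)
{
    return true;
}

/* Migration is allowed only for pmd-based hugepages of pmd size. */
static bool sketch_hugepage_migration_support(const struct sketch_hstate *h)
{
    assert(h != NULL);
    return sketch_arch_pmd_huge_support() &&
           h->page_shift == SKETCH_PMD_SHIFT;
}
```

      With these assumptions a 2 MiB hugepage passes the check, while a
      1 GiB (pud-sized) hugepage is rejected before migration is tried.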
  3. 04 Sep, 2013 5 commits
    • tile: make __write_once a synonym for __read_mostly · ce61cdc2
      Chris Metcalf committed
      This was really only useful for TILE64 when we mapped the
      kernel data with small pages. Now we use a huge page and we
      really don't want to map different parts of the kernel
      data in different ways.
      
      We retain the __write_once name in case we want to bring
      it back to life at some point in the future.
      
      Note that this change uncovered a latent bug where the
      "smp_topology" variable happened to always be aligned mod 8
      so we could store two "int" values at once, but when we
      eliminated __write_once it ended up only aligned mod 4.
      Fix with an explicit annotation.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
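      The alignment problem mentioned in the commit above can be pictured
      with a small C sketch.  It assumes a 4-byte int and the GCC/Clang
      aligned attribute; the union and the sketch_ names are hypothetical
      and are not taken from the tile sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a two-int variable written with one store. */
union sketch_topology {
    int      v[2];   /* the two "int" values */
    uint64_t both;   /* both values viewed as one 8-byte word */
} __attribute__((aligned(8)));   /* explicit annotation (GCC/Clang syntax) */

_Static_assert(sizeof(int) == 4, "sketch assumes a 4-byte int");

static union sketch_topology sketch_smp_topology;

/* Store both ints at once; this relies on 8-byte alignment. */
static void sketch_set_topology(int a, int b)
{
    union sketch_topology tmp;

    tmp.v[0] = a;
    tmp.v[1] = b;
    /* Without the annotation the object might only be aligned mod 4. */
    assert((uintptr_t)&sketch_smp_topology % 8 == 0);
    sketch_smp_topology.both = tmp.both;
}
```

      The point of the annotation is that the alignment no longer depends
      on which section the variable happens to be placed in.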
    • tile: remove support for TILE64 · d7c96611
      Chris Metcalf committed
      This chip is no longer being actively developed for (it was superseded
      by the TILEPro64 in 2008), and in any case the existing compiler and
      toolchain in the community do not support it.  It's unlikely that the
      kernel works with TILE64 at this point as the configuration has not been
      tested in years.  The support is also awkward as it requires maintaining
      a significant number of ifdefs.  So, just remove it altogether.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • tile: add virt_to_kpte() API and clean up and document behavior · 640710a3
      Chris Metcalf committed
      We use virt_to_pte(NULL, va) a lot, which isn't very obvious.
      I added virt_to_kpte(va) as a more obvious wrapper function
      that also validates the va as being a kernel address.
      
      And, I fixed the semantics of virt_to_pte() so that we handle
      the pud and pmd the same way, and we now document the fact that
      we handle the final pte level differently.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • tile: parameterize VA and PA space more cleanly · acbde1db
      Chris Metcalf committed
      The existing code relied on the hardware definition (<arch/chip.h>)
      to specify how much VA and PA space was available.  It's convenient
      to allow customizing this for some configurations, so provide symbols
      MAX_PA_WIDTH and MAX_VA_WIDTH in <asm/page.h> that can be modified
      if desired.
      
      Additionally, move away from the MEM_XX_INTRPT nomenclature to
      define the start of various regions within the VA space.  In fact
      the cleaner symbol is, for example, MEM_SV_START, to indicate the
      start of the area used for supervisor code; the actual address of the
      interrupt vectors is not as important, and can be changed if desired.
      As part of this change, convert from "intrpt1" nomenclature (which
      built in the old privilege-level 1 model) to a simple "intrpt".
      
      Also strip out some tilepro-specific code supporting modifying the
      PL the kernel could run at, since we don't actually support using
      different PLs in tilepro, only tilegx.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • tile: don't assume user privilege is zero · 051168df
      Chris Metcalf committed
      Technically, user privilege is anything less than kernel
      privilege.  We modify the existing user_mode() macro to have
      this semantic (and use it in a couple of places it wasn't being
      used before), and add an IS_KERNEL_EX1() macro to the assembly
      code as well.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
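      The change in semantics described in the commit above can be
      sketched in C as follows.  The register struct, the kernel privilege
      level of 2, and the sketch_ names are assumptions for illustration;
      the real user_mode() macro operates on the tile pt_regs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed kernel privilege level, chosen only for this example. */
#define SKETCH_KERNEL_PL 2u

/* Hypothetical stand-in for the saved register state at a trap. */
struct sketch_regs {
    unsigned int pl;   /* privilege level the CPU was running at */
};

/* Old assumption: user mode means privilege level zero. */
static bool sketch_user_mode_old(const struct sketch_regs *regs)
{
    assert(regs != NULL);
    return regs->pl == 0;
}

/* New semantic: anything below kernel privilege counts as user. */
static bool sketch_user_mode(const struct sketch_regs *regs)
{
    assert(regs != NULL);
    return regs->pl < SKETCH_KERNEL_PL;
}
```

      The two versions agree at level 0 and at the kernel level; they
      differ only for intermediate levels, which the old test wrongly
      treated as kernel mode.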
  4. 30 Aug, 2013 4 commits
  5. 14 Aug, 2013 8 commits
    • tile: provide traceability for hypervisor calls · 9ae09838
      Chris Metcalf committed
      This change adds infrastructure (CONFIG_TILE_HVGLUE_TRACE) that
      provides C code wrappers for the calls the kernel makes to the Tilera
      hypervisor.  This allows standard kernel infrastructure like FTRACE to
      be able to instrument hypervisor calls.
      
      To allow direct calls to the true API, we export their names with a
      leading underscore as well.  This is important for the few contexts
      where we need to make hypervisor calls without touching the stack.
      
      As part of this change, we also switch from creating the symbols
      with linker magic to creating them with assembler magic.  This lets
      us provide a symbol type and generally make them appear more as symbols
      and less as just random values in the ELF namespace.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • tile: avoid struct vm_struct leak · fad052dc
      Chris Metcalf committed
      If ioremap_prot() fails in ioremap_page_range() due to kernel memory
      exhaustion, we previously would leak a struct vm_struct.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
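      The kind of leak fixed by the commit above can be modeled in plain
      C.  This is a userspace sketch under stated assumptions: malloc()
      stands in for the area allocator, the mapping step fails on request,
      and all sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct vm_struct. */
struct sketch_area {
    unsigned long size;
};

static int sketch_live_areas;   /* descriptors allocated and not yet freed */

static struct sketch_area *sketch_get_area(unsigned long size)
{
    struct sketch_area *area = malloc(sizeof(*area));

    if (area) {
        area->size = size;
        sketch_live_areas++;
    }
    return area;
}

static void sketch_free_area(struct sketch_area *area)
{
    assert(area != NULL);
    free(area);
    sketch_live_areas--;
}

/* Stand-in for the page-table mapping step; fails when asked to. */
static int sketch_map_range(const struct sketch_area *area, int simulate_failure)
{
    (void)area;
    return simulate_failure ? -1 : 0;
}

static struct sketch_area *sketch_ioremap(unsigned long size, int simulate_failure)
{
    struct sketch_area *area = sketch_get_area(size);

    if (!area)
        return NULL;
    if (sketch_map_range(area, simulate_failure) != 0) {
        sketch_free_area(area);   /* omitting this release is the leak */
        return NULL;
    }
    return area;
}
```

      A failed mapping leaves the live-descriptor count at zero, which is
      the property the fix restores.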
    • tile: implement gettimeofday() via vDSO · 4a556f4f
      Chris Metcalf committed
      This change creates the framework for vDSO calls, makes the existing
      rt_sigreturn() mechanism use it, and adds a fast gettimeofday().
      Now that we need to expose the vDSO address to userspace, we add
      AT_SYSINFO_EHDR to the set of aux entries provided to userspace.
      (You can disable any extra vDSO support by booting with vdso=0,
      but the rt_sigreturn vDSO page will still be provided.)
      
      Note that glibc has supported the tile vDSO since release 2.17.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • tile: support simulator notification for ET_DYN objects · 0c1d1917
      Chris Metcalf committed
      The tile code notifies the simulator of new ET_EXEC objects starting
      to execute so that tracing code can properly annotate the objects.
      However, we didn't support ET_DYN executables like ld.so, so we
      didn't properly load symbols, etc.  This change enables that support;
      we use a variant of the SIM_CONTROL_DLOPEN simulator notification
      that newer simulators will recognize and use to set the base address
      for the next SIM_CONTROL_OS_EXEC notification.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • tile: support CONFIG_PREEMPT · bc1a298f
      Chris Metcalf committed
      This change adds support for CONFIG_PREEMPT (full kernel preemption).
      In addition to the core support, this change includes a number
      of places where we fix up uses of smp_processor_id() and per-cpu
      variables.  I also eliminate the PAGE_HOME_HERE and PAGE_HOME_UNKNOWN
      values for page homing, as it turns out they weren't being used.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • tile: remove calls to arch_flush_lazy_mmu_mode() · 1182b69c
      Chris Metcalf committed
      Since it's a no-op on tile anyway, there's no reason to be calling
      it in tile-specific code.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • tile: fix some issues in hugepage support · a0bd12d7
      Chris Metcalf committed
      First, in huge_pte_offset(), we were erroneously checking
      pgd_present(), which is always true, rather than pud_present(),
      which is the thing that tells us if there is a top-level (L0) PTE.
      Fixing this means we properly look up huge page entries only when
      the Present bit is actually set in the PTE.
      
      Second, use the standard pte_alloc_map() instead of the hand-rolled
      pte_alloc_hugetlb() routine that basically was written to avoid
      worrying about CONFIG_HIGHPTE.  However, we no longer plan to support
      HIGHPTE, so a separate routine was just unnecessary code duplication.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • tile: fast-path unaligned memory access for tilegx · 2f9ac29e
      Chris Metcalf committed
      This change enables unaligned userspace memory access via a kernel
      fast path on tilegx.  The kernel tracks user PC/instruction pairs
      per-thread using a direct-mapped cache in userspace.  The cache
      maps those PC/instruction pairs to JIT'ed instruction sequences that
      load or store using byte-wide load/store instructions and then
      synthesize 2-, 4- or 8-byte load or store results.  Once an
      instruction has been seen to generate an unaligned access,
      subsequent hits on that instruction typically require overhead
      of only around 50 cycles if the cache and TLB are hot.
      
      We support the prctl() PR_GET_UNALIGN / PR_SET_UNALIGN system call to
      enable or disable unaligned fixups on a per-process basis.
      
      To do this we pull some of the tilepro unaligned support out of the
      single_step.c file; tilepro uses instruction disassembly for both
      single-step and unaligned access support.  Since tilegx actually has
      hardware singlestep support, though, it's cleaner to keep the tilegx
      unaligned access code in a separate file.  While we're at it,
      properly rename the tilepro-specific types, etc., to have tilepro
      suffixes instead of generic tile suffixes.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
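      The idea of synthesizing a wider access from byte-wide loads and
      stores, as described in the commit above, can be shown in portable
      C.  This sketch assumes little-endian byte order and uses
      hypothetical sketch_ helper names; the kernel emits JIT'ed
      instruction sequences rather than calling C functions like these.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build a 4-byte value from four byte loads (little-endian assumed). */
static uint32_t sketch_load_u32_bytewise(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] |
           (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 |
           (uint32_t)p[3] << 24;
}

/* Write a 4-byte value as four byte stores (little-endian assumed). */
static void sketch_store_u32_bytewise(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (unsigned char)(v & 0xffu);
    p[1] = (unsigned char)((v >> 8) & 0xffu);
    p[2] = (unsigned char)((v >> 16) & 0xffu);
    p[3] = (unsigned char)((v >> 24) & 0xffu);
}
```

      Because each access touches a single byte, the pointer may sit at
      any offset, for example one byte past an aligned boundary.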
  6. 13 Aug, 2013 1 commit
  7. 11 Jul, 2013 1 commit
  8. 10 Jul, 2013 1 commit
    • mm: invoke oom-killer from remaining unconverted page fault handlers · 609838cf
      Johannes Weiner committed
      A few remaining architectures directly kill the page faulting task in an
      out of memory situation.  This is usually not a good idea since that
      task might not even use a significant amount of memory and so may not be
      the optimal victim to resolve the situation.
      
      Since 2.6.29's 1c0fe6e3 ("mm: invoke oom-killer from page fault") there
      is a hook that architecture page fault handlers are supposed to call to
      invoke the OOM killer and let it pick the right task to kill.  Convert
      the remaining architectures over to this hook.
      
      To have the previous behavior of simply taking out the faulting task the
      vm.oom_kill_allocating_task sysctl can be set to 1.
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Acked-by: Vineet Gupta <vgupta@synopsys.com>   [arch/arc bits]
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chen Liqin <liqin.chen@sunplusct.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 04 Jul, 2013 4 commits
    • mm/tile: prepare for removing num_physpages and simplify mem_init() · 3f29c331
      Jiang Liu committed
      Prepare for removing num_physpages and simplify mem_init().
      Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tile: normalize global variables exported by vmlinux.lds · 40a3b8df
      Jiang Liu committed
      Normalize global variables exported by vmlinux.lds to conform to the
      usage guidelines from include/asm-generic/sections.h.
      
      1) Use _text to mark the start of the kernel image including the head
      text, and _stext to mark the start of the .text section.
      2) Export mandatory global variables __init_begin and __init_end.
      Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: concentrate modification of totalram_pages into the mm core · 0c988534
      Jiang Liu committed
      Concentrate code to modify totalram_pages into the mm core, so the arch
      memory initialization code doesn't need to take care of it.  With these
      changes applied, only the following functions from the mm core modify the
      global variable totalram_pages: free_bootmem_late(), free_all_bootmem(),
      free_all_bootmem_node(), adjust_managed_page_count().
      
      With this patch applied, it will be much easier for us to keep
      totalram_pages and zone->managed_pages consistent.
      Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: <sworddragon2@aol.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/tile: use common help functions to free reserved pages · abd1b6d6
      Jiang Liu committed
      Use common helper functions to free reserved pages.
      Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: <sworddragon2@aol.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 30 Apr, 2013 1 commit
    • mm, vmalloc: change iterating a vmlist to find_vm_area() · ef932473
      Joonsoo Kim committed
      This patchset removes vm_struct list management after initializing
      vmalloc.  Adding an entry to vmlist and removing one from it take
      linear time, so the list is inefficient.  If we maintain this list, the
      overall time complexity of adding and removing an area in vmalloc space
      is O(N), although we use an rbtree for finding a vacant place, whose
      time complexity is just O(logN).
      
      Also, vmlist and vmlist_lock are used in many places outside of vmalloc.c.
      It is preferable to hide this raw data structure and provide
      well-defined functions for those users, because that prevents
      mistakes when manipulating these structures and makes the vmalloc layer
      easier to maintain.
      
      For kexec and makedumpfile, I export vmap_area_list instead of vmlist.
      This comes from Atsushi's recommendation.  For more information, please
      refer to the link below.  https://lkml.org/lkml/2012/12/6/184
      
      This patch:
      
      The purpose of iterating a vmlist is to find the vm area with a specific
      virtual address.  find_vm_area() is provided for this purpose and is more
      efficient, because it uses an rbtree.  So use it instead.
      Signed-off-by: Joonsoo Kim <js1304@gmail.com>
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
      Cc: Dave Anderson <anderson@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
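      The difference between walking a list and doing an ordered lookup,
      as argued in the commit above, can be sketched in C.  Note the
      substitution: a binary search over a sorted array is used here as a
      simple stand-in for the kernel's rbtree, and the struct and sketch_
      names are hypothetical.  Both lookups match on the start address only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct vm_struct. */
struct sketch_vm_area {
    uintptr_t addr;                 /* start address of the area */
    size_t size;                    /* length in bytes */
    struct sketch_vm_area *next;    /* vmlist-style linkage */
};

/* O(N): walk the list until the start address matches. */
static const struct sketch_vm_area *
sketch_find_linear(const struct sketch_vm_area *head, uintptr_t addr)
{
    for (; head != NULL; head = head->next)
        if (head->addr == addr)
            return head;
    return NULL;
}

/* O(log N): binary search over areas sorted by start address. */
static const struct sketch_vm_area *
sketch_find_sorted(const struct sketch_vm_area *areas, size_t n, uintptr_t addr)
{
    size_t lo = 0, hi = n;

    assert(areas != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (areas[mid].addr == addr)
            return &areas[mid];
        if (areas[mid].addr < addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}
```

      Both functions return the same area for a given start address; only
      the number of comparisons differs as the number of areas grows.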
  11. 24 Feb, 2013 3 commits
    • swap: add per-partition lock for swapfile · ec8acf20
      Shaohua Li committed
      swap_lock is heavily contended when I test swap to 3 fast SSDs (it is even
      slightly slower than swap to 2 such SSDs).  The main contention comes
      from swap_info_get().  This patch tries to close the gap by adding a new
      per-partition lock.
      
      Global data like nr_swapfiles, total_swap_pages, least_priority and
      swap_list are still protected by swap_lock.
      
      nr_swap_pages is now an atomic, so it can be changed without swap_lock.  In
      theory, it's possible that get_swap_page() finds no swap pages when there
      actually are free swap pages, but that does not sound like a big problem.
      
      Accessing partition-specific data (like scan_swap_map and so on) is only
      protected by swap_info_struct.lock.
      
      Changing swap_info_struct.flags requires holding both swap_lock and
      swap_info_struct.lock, because scan_swap_map() will check it.  Reading the
      flags is OK with either of the locks held.
      
      If both swap_lock and swap_info_struct.lock must be held, we always take
      the former first to avoid deadlock.
      
      swap_entry_free() can change swap_list.  To delete that code, we add a
      new highest_priority_index.  Whenever get_swap_page() is called, we
      check it.  If it's valid, we use it.
      
      It's a pity get_swap_page() still holds swap_lock.  But in practice,
      swap_lock isn't heavily contended in my test with this patch (or I can
      say there are other much heavier bottlenecks like TLB flush).  And
      BTW, it looks like get_swap_page() doesn't really need the lock.  We never
      free swap_info[] and we check the SWAP_WRITEOK flag.  The only risk without
      the lock is that we could swap out to some low-priority swap, but we can
      quickly recover after several rounds of swap, so it does not sound like a
      big deal to me.  But I'd prefer to fix this if it's a real problem.
      
      "swap: make each swap partition have one address_space" improved the
      swapout speed from 1.7G/s to 2G/s.  This patch further improves the
      speed to 2.3G/s, so around 15% improvement.  It's a multi-process test,
      so TLB flush isn't the biggest bottleneck before the patches.
      
      [arnd@arndb.de: fix it for nommu]
      [hughd@google.com: add missing unlock]
      [minchan@kernel.org: get rid of lockdep whinge on sys_swapon]
      Signed-off-by: Shaohua Li <shli@fusionio.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
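      The locking rules stated in the commit above (global lock first,
      then the per-partition lock; flag reads under either lock) can be
      modeled with a toy in C.  This is a single-threaded model that only
      records whether a lock is held so the ordering can be checked; it is
      not a real spinlock, and every sketch_ name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock: records ownership only, provides no real mutual exclusion. */
struct sketch_lock {
    bool held;
};

/* Stand-in for the global swap_lock. */
static struct sketch_lock sketch_swap_lock;

/* Stand-in for swap_info_struct with its own per-partition lock. */
struct sketch_swap_info {
    struct sketch_lock lock;
    unsigned long flags;
};

static void sketch_acquire(struct sketch_lock *l)
{
    assert(l != NULL && !l->held);
    l->held = true;
}

static void sketch_release(struct sketch_lock *l)
{
    assert(l != NULL && l->held);
    l->held = false;
}

/* Writers take both locks, always the global one first. */
static void sketch_set_flags(struct sketch_swap_info *si, unsigned long flags)
{
    sketch_acquire(&sketch_swap_lock);
    sketch_acquire(&si->lock);
    si->flags = flags;
    sketch_release(&si->lock);
    sketch_release(&sketch_swap_lock);
}

/* Readers need only one of the two locks; here the per-partition one. */
static unsigned long sketch_read_flags(struct sketch_swap_info *si)
{
    unsigned long flags;

    sketch_acquire(&si->lock);
    flags = si->flags;
    sketch_release(&si->lock);
    return flags;
}
```

      Because every writer holds both locks, a reader holding either one
      is guaranteed not to race with a flag update, and the fixed
      acquisition order rules out an ABBA deadlock between two writers.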
    • memory-hotplug: introduce new arch_remove_memory() for removing page table · 24d335ca
      Wen Congyang committed
      For removing memory, we need to remove page tables, but how to do so
      depends on the architecture.  So this patch introduces arch_remove_memory()
      for removing page tables.  For now it only calls __remove_pages().
      
      Note: __remove_pages() is not implemented for some architectures
            (I don't know how to implement it for s390).
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Wu Jianguo <wujianguo@huawei.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: remove flags argument to mmap_region · c22c0d63
      Michel Lespinasse committed
      After the MAP_POPULATE handling has been moved to mmap_region() call
      sites, the only remaining use of the flags argument is to pass the
      MAP_NORESERVE flag.  This can be just as easily handled by
      do_mmap_pgoff(), so do that and remove the mmap_region() flags
      parameter.
      
      [akpm@linux-foundation.org: remove double parens]
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Acked-by: Rik van Riel <riel@redhat.com>
      Tested-by: Andy Lutomirski <luto@amacapital.net>
      Cc: Greg Ungerer <gregungerer@westnet.com.au>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 09 Feb, 2013 1 commit
  13. 12 Dec, 2012 1 commit
  14. 24 Oct, 2012 1 commit
    • arch/tile: eliminate pt_regs trampolines for syscalls · 6b14e419
      Chris Metcalf committed
      Using the new current_pt_regs() model, we can remove some trampolines
      from assembly code and call directly to the C syscall implementations.
      rt_sigreturn() and clone() still need some assembly wrapping, but no
      longer are passed a pt_regs pointer.  sigaltstack() and the
      tilepro-specific cmpxchg_badaddr() syscalls are now just straight C.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
  15. 09 Oct, 2012 2 commits
  16. 23 Jul, 2012 1 commit
  17. 19 Jul, 2012 3 commits
    • arch/tile: enable ZONE_DMA for tilegx · eef015c8
      Chris Metcalf committed
      This is required for PCI root complex legacy support and USB OHCI root
      complex support.  With this change tilegx now supports allocating memory
      whose PA fits in 32 bits.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • tilegx pci: support I/O to arbitrarily-cached pages · bbaa22c3
      Chris Metcalf committed
      The tilegx PCI root complex support (currently only in linux-next)
      is limited to pages that are homed or cached in the default manner,
      i.e. "hash-for-home".  This change supports delivery of I/O data to
      pages that are cached in other ways (locally on a particular core,
      uncached, user-managed incoherent, etc.).
      
      A large part of the change is supporting flushing pages from cache
      on particular homes so that we can transition the data that we are
      delivering to or from the device appropriately.  The new homecache_finv*
      routines handle this.
      
      Some changes to page_table_range_init() were also required to make
      the fixmap code work correctly on tilegx; it hadn't been used there
      before.
      
      We also remove some stub mark_caches_evicted_*() routines that
      were just no-ops anyway.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
    • arch/tile: tilegx PCI root complex support · 12962267
      Chris Metcalf committed
      This change implements PCIe root complex support for tilegx using
      the kernel support layer for accessing the TRIO hardware shim.
      
      Reviewed-by: Bjorn Helgaas <bhelgaas@google.com> [changes in 07487f3]
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
  18. 26 May, 2012 1 commit