1. 19 Dec, 2012 28 commits
    •
      memcg: kmem controller infrastructure · 7ae1e1d0
      Glauber Costa committed
      Introduce infrastructure for tracking kernel memory pages to a given
      memcg.  This will happen whenever the caller includes the
      __GFP_KMEMCG flag and the task belongs to a memcg other than the root.

      In memcontrol.h those functions are wrapped in inline accessors.  The idea
      is to patch those with static branches later on, so we don't incur any
      overhead when no mem cgroups with limited kmem are being used.
      
      Users of this functionality shall interact with the memcg core code
      through the following functions:
      
      memcg_kmem_newpage_charge: will return true if the group can handle the
                                 allocation. At this point, struct page is not
                                 yet allocated.
      
      memcg_kmem_commit_charge: will either revert the charge, if struct page
                                allocation failed, or embed memcg information
                                into page_cgroup.
      
      memcg_kmem_uncharge_page: called at free time, will revert the charge.
      Signed-off-by: Glauber Costa <glommer@parallels.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
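The three-step protocol described in commit 7ae1e1d0 (charge before the page exists, then commit or revert, then uncharge at free time) can be illustrated with a small userspace model. This is only a sketch: `memcg_model`, `page_model`, and the `model_*` functions are hypothetical names invented for illustration, and the real kernel functions have different signatures and operate on `struct page` and `page_cgroup`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not the kernel's types. */
struct memcg_model {
    size_t kmem_usage;  /* pages currently charged to the group */
    size_t kmem_limit;  /* maximum pages the group may hold */
};

struct page_model {
    struct memcg_model *owner;  /* stands in for the page_cgroup link */
    size_t nr_pages;
};

/* Step 1: reserve the charge before any page has been allocated. */
static bool model_newpage_charge(struct memcg_model *cg, size_t nr_pages)
{
    if (cg->kmem_usage + nr_pages > cg->kmem_limit)
        return false;           /* the group cannot handle the allocation */
    cg->kmem_usage += nr_pages;
    return true;
}

/* Step 2: bind the charge to the page, or revert it if allocation failed. */
static void model_commit_charge(struct memcg_model *cg,
                                struct page_model *page, size_t nr_pages)
{
    if (page == NULL) {         /* page allocation failed: undo step 1 */
        cg->kmem_usage -= nr_pages;
        return;
    }
    page->owner = cg;
    page->nr_pages = nr_pages;
}

/* Step 3: at free time, give the charge back to the owning group. */
static void model_uncharge_page(struct page_model *page)
{
    if (page->owner == NULL)
        return;                 /* page was never accounted */
    page->owner->kmem_usage -= page->nr_pages;
    page->owner = NULL;
}
```

Splitting the charge from the commit lets the caller refuse an allocation before doing any work, while still being able to roll back cleanly if the allocator itself fails afterwards.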
    •
      mm: add a __GFP_KMEMCG flag · 7a64bf05
      Glauber Costa committed
      This flag is used to indicate to the callees that this allocation is a
      kernel allocation in process context, and should be accounted to current's
      memcg.
      Signed-off-by: Glauber Costa <glommer@parallels.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Rik van Riel <riel@redhat.com>
      Acked-by: Mel Gorman <mel@csn.ul.ie>
      Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    •
      memcg: kmem accounting basic infrastructure · 510fc4e1
      Glauber Costa committed
      Add the basic infrastructure for the accounting of kernel memory.  To
      control that, the following files are created:
      
       * memory.kmem.usage_in_bytes
       * memory.kmem.limit_in_bytes
       * memory.kmem.failcnt
       * memory.kmem.max_usage_in_bytes
      
      They have the same meaning as their user memory counterparts.  They
      reflect the state of the "kmem" res_counter.

      Per cgroup kmem memory accounting is not enabled until a limit is set for
      the group.  Once the limit is set the accounting cannot be disabled for
      that group.  This means that after the patch is applied, no behavioral
      changes exist for whoever is still using memcg to control their memory
      usage, until memory.kmem.limit_in_bytes is set for the first time.

      We always account to both user and kernel resource_counters.  This
      effectively means that an independent kernel limit is in place when the
      limit is set to a lower value than the user memory.  An equal or higher
      value means that the user limit will always hit first, meaning that kmem
      is effectively unlimited.

      People who want to track kernel memory but not limit it can set this
      limit to a very high number (such as RESOURCE_MAX - 1 page, which no one
      will ever hit, or a value equal to the user memory limit).
      
      [akpm@linux-foundation.org: MEMCG_KMEM only works with slab and slub]
      Signed-off-by: Glauber Costa <glommer@parallels.com>
      Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
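The interaction between the two limits in commit 510fc4e1 is easier to see in a toy model. The sketch below assumes a simplified counter with only a usage and a limit; `counter_model`, `group_model`, and the functions are hypothetical and do not reflect the kernel's `res_counter` API. Kernel allocations are charged to both counters, so the kmem limit only matters when it is lower than the user limit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a resource counter; assumes usage <= limit. */
struct counter_model {
    uint64_t usage;
    uint64_t limit;
};

struct group_model {
    struct counter_model user;  /* user memory counter */
    struct counter_model kmem;  /* kernel memory counter */
};

static bool counter_charge(struct counter_model *c, uint64_t bytes)
{
    if (bytes > c->limit - c->usage)
        return false;           /* charging would exceed the limit */
    c->usage += bytes;
    return true;
}

/* A kernel allocation is accounted to both counters, or to neither. */
static bool group_charge_kmem(struct group_model *g, uint64_t bytes)
{
    if (!counter_charge(&g->kmem, bytes))
        return false;
    if (!counter_charge(&g->user, bytes)) {
        g->kmem.usage -= bytes; /* roll back the kmem charge */
        return false;
    }
    return true;
}

/* The kmem limit acts independently only when it is the lower of the two. */
static bool kmem_limit_is_effective(const struct group_model *g)
{
    return g->kmem.limit < g->user.limit;
}
```

With a kmem limit equal to or above the user limit, every failure in this model comes from the user counter, which matches the commit's remark that kmem is then effectively unlimited while still being tracked.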
    •
      memcg: change defines to an enum · 86ae53e1
      Glauber Costa committed
      This is just a cleanup patch for clarity of expression.  In earlier
      submissions, people asked for it to be in a separate patch, so here it is.
      Signed-off-by: Glauber Costa <glommer@parallels.com>
      Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    •
      memcg: reclaim when more than one page needed · 4c9c5359
      Suleiman Souhlal committed
      mem_cgroup_do_charge() was written before kmem accounting, and expects
      three cases: being called for 1 page, being called for a stock of 32
      pages, or being called for a hugepage.  If we call it for 2 or 3 pages (and
      both the stack and several slabs used in process creation are such, at
      least with the debug options I had), it assumes it is being called for
      stock and just retries without reclaiming.
      
      Fix that by passing down a minsize argument in addition to the csize.
      
      And what to do about that (csize == PAGE_SIZE && ret) retry?  If it's
      needed at all (and presumably is since it's there, perhaps to handle
      races), then it should be extended to more than PAGE_SIZE, yet how far?
      And should there be a retry count limit, of what?  For now retry up to
      COSTLY_ORDER (as page_alloc.c does) and make sure not to do it if
      __GFP_NORETRY.
      
      v4: fixed nr pages calculation pointed out by Christoph Lameter.
      Signed-off-by: Suleiman Souhlal <suleiman@google.com>
      Signed-off-by: Glauber Costa <glommer@parallels.com>
      Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
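The decision flow that commit 4c9c5359 describes (distinguish an inflated batch request from the minimum the caller really needs, reclaim only for the latter, and retry only up to a costly order unless retries are forbidden) might look roughly like the following model. It is a loose sketch under stated assumptions: `charge_ctx`, `model_reclaim`, and `MODEL_COSTLY_ORDER` are hypothetical, the value 3 is an assumed stand-in for the page allocator's costly order, and the real `mem_cgroup_do_charge()` handles more cases than shown here.

```c
#include <stdbool.h>

enum charge_result { CHARGE_OK, CHARGE_RETRY, CHARGE_NOMEM };

/* Assumed stand-in for the page allocator's costly-order threshold. */
#define MODEL_COSTLY_ORDER 3

/* Hypothetical charging context; assumes usage <= limit on entry. */
struct charge_ctx {
    unsigned long usage;        /* pages charged */
    unsigned long limit;        /* page limit */
    unsigned long reclaimable;  /* pages that reclaim could free */
    bool noretry;               /* models __GFP_NORETRY */
};

/* Pretend to reclaim up to `want` pages; returns how many were freed. */
static unsigned long model_reclaim(struct charge_ctx *c, unsigned long want)
{
    unsigned long freed = want < c->reclaimable ? want : c->reclaimable;

    if (freed > c->usage)
        freed = c->usage;
    c->reclaimable -= freed;
    c->usage -= freed;
    return freed;
}

static enum charge_result model_do_charge(struct charge_ctx *c,
                                          unsigned long nr_pages,
                                          unsigned long min_pages)
{
    unsigned long freed;

    if (c->usage + nr_pages <= c->limit) {
        c->usage += nr_pages;
        return CHARGE_OK;
    }
    /* The request was inflated for batching: retry with min_pages,
     * and never reclaim on behalf of the optional extra pages. */
    if (nr_pages > min_pages)
        return CHARGE_RETRY;
    if (c->noretry)
        return CHARGE_NOMEM;

    freed = model_reclaim(c, nr_pages);
    if (c->limit - c->usage >= nr_pages)
        return CHARGE_RETRY;    /* reclaim made enough room */
    /* Small requests get another try if reclaim made any progress. */
    if (nr_pages <= (1UL << MODEL_COSTLY_ORDER) && freed > 0)
        return CHARGE_RETRY;
    return CHARGE_NOMEM;
}
```

The key point is the comparison against `min_pages` rather than against a single page: a 2-page request with `min_pages == 2` now reaches the reclaim path instead of being mistaken for a batch.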
    •
      memcg: make it possible to use the stock for more than one page · a0956d54
      Suleiman Souhlal committed
      We currently have a percpu stock cache scheme that charges one page at a
      time from memcg->res, the user counter.  When the kernel memory controller
      comes into play, we'll need to charge more than that.
      
      This is because kernel memory allocations will also draw from the user
      counter, and can be bigger than a single page, as is the case with the
      stack (usually 2 pages) or some higher order slabs.
      
      [glommer@parallels.com: added a changelog ]
      Signed-off-by: Suleiman Souhlal <suleiman@google.com>
      Signed-off-by: Glauber Costa <glommer@parallels.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
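The change in commit a0956d54 amounts to letting a caller take several pages from the cached stock at once. A minimal sketch of that idea follows, assuming a single stock rather than a per-CPU one; `stock_model` and `model_consume_stock` are hypothetical names, and the batch size of 32 is taken from the "stock of 32 pages" mentioned in the neighbouring commit message.

```c
#include <stdbool.h>

/* Batch size, following the "stock of 32 pages" in the commit log. */
#define MODEL_CHARGE_BATCH 32U

/* Hypothetical model of one pre-charged stock (per CPU in the kernel). */
struct stock_model {
    const void *cached;     /* group the stock was charged for */
    unsigned int nr_pages;  /* pre-charged pages still available */
};

/* Take nr_pages from the stock if it belongs to memcg and is big enough. */
static bool model_consume_stock(struct stock_model *s, const void *memcg,
                                unsigned int nr_pages)
{
    if (nr_pages > MODEL_CHARGE_BATCH)
        return false;   /* larger than any stock could ever hold */
    if (s->cached != memcg || s->nr_pages < nr_pages)
        return false;   /* wrong group, or not enough pages cached */
    s->nr_pages -= nr_pages;
    return true;
}
```

A 2-page stack allocation can then be served from the stock in one call, and falls back to the slower counter path only when the stock is too small or belongs to another group.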
    •
      memory-hotplug: document and enable CONFIG_MOVABLE_NODE · c2974058
      Tang Chen committed
      Add help info for CONFIG_MOVABLE_NODE and permit its selection.
      
      This option allows the user to online all memory of a node as movable
      memory, so that the whole node can be hotplugged.  Users who don't use
      the hotplug feature are also fine with this option on since they won't
      online memory as movable.
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      [akpm@linux-foundation.org: tweak help text]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    •
      mm/page_alloc.c: remove duplicate check · 0bb2c763
      Gavin Shan committed
      While allocating pages using buddy allocator, the compound page is
      probably split up to free pages.  Under these circumstances, the compound
      page should be destroyed by destroy_compound_page().  However, there is a
      duplicate check to judge if the page is compound.
      
      Remove the duplicate check since the compound_order() returns 0 when the
      page doesn't have PG_head set in destroy_compound_page().  That is to say,
      destroy_compound_page() needn't check PageHead().
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    •
      drivers/message/fusion/mptscsih.c: missing break · 3012d60b
      Alan Cox committed
      This happens to do the right thing in all cases on fibre channel but not on
      other media types.
      Signed-off-by: Alan Cox <alan@linux.intel.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
      Cc: Kashyap Desai <kashyap.desai@lsi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    •
      h8300: select generic atomic64_t support · d95bfe46
      Fengguang Wu committed
      Rationale from Eric:
      
      So I just looked a little deeper and it appears architectures that do
      not support atomic64_t are broken.
      
      The generic atomic64 support came in 2009 to support the perf subsystem
      with the expectation that all architectures would implement atomic64
      support.
      
      Furthermore upon inspection of the kernel atomic64_t is used in a fair
      number of places beyond the performance counters:
      
      block/blk-cgroup.c
      drivers/acpi/apei/
      drivers/block/rbd.c
      drivers/crypto/nx/nx.h
      drivers/gpu/drm/radeon/radeon.h
      drivers/infiniband/hw/ipath/
      drivers/infiniband/hw/qib/
      drivers/staging/octeon/
      fs/xfs/
      include/linux/perf_event.h
      include/net/netfilter/nf_conntrack_acct.h
      kernel/events/
      kernel/trace/
      net/mac80211/key.h
      net/rds/
      
      The block control group, infiniband, xfs, crypto, 802.11, netfilter.
      Nothing quite so fundamental as fs/namespace.c but definitely in
      multiplatform-code that should work, and is already broken on those
      architectures.
      
      Looking at the implementation of atomic64_add_return in lib/atomic64.c the
      code looks as efficient as these kinds of things get.
      
      Which leads me to the conclusion that we need atomic64 support on all
      architectures.
      Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
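For readers unfamiliar with what "generic atomic64 support" means in commit d95bfe46, the idea is to emulate 64-bit atomic operations with a lock on architectures that lack native instructions for them. The userspace sketch below shows the principle with a single C11 spinlock. The names are hypothetical, and lib/atomic64.c itself uses kernel spinlocks with interrupts disabled and, as far as I recall, hashes the variable's address onto a small array of locks rather than using a single one.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace counterpart of a lock-protected atomic64_t. */
typedef struct {
    int64_t counter;
} model_atomic64_t;

/* One global lock for simplicity; an assumption made for this sketch. */
static atomic_flag model_atomic64_lock = ATOMIC_FLAG_INIT;

static void model_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&model_atomic64_lock,
                                             memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void model_unlock(void)
{
    atomic_flag_clear_explicit(&model_atomic64_lock, memory_order_release);
}

/* Add a to *v and return the new value, all under the lock. */
static int64_t model_atomic64_add_return(int64_t a, model_atomic64_t *v)
{
    int64_t val;

    model_lock();
    val = (v->counter += a);
    model_unlock();
    return val;
}

/* Read the full 64-bit value without tearing. */
static int64_t model_atomic64_read(model_atomic64_t *v)
{
    int64_t val;

    model_lock();
    val = v->counter;
    model_unlock();
    return val;
}
```

Even the read takes the lock, because on a 32-bit machine a plain 64-bit load may be split into two accesses and observe a half-updated value.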
    •
      Coccinelle: add api/d_find_alias.cocci · af56e3f0
      Cyril Roelandt committed
      Ensure that calls to d_find_alias() have a corresponding dput().
      Signed-off-by: Cyril Roelandt <tipecaml@gmail.com>
      Cc: Julia Lawall <Julia.Lawall@lip6.fr>
      Cc: Gilles Muller <Gilles.Muller@lip6.fr>
      Cc: Nicolas Palix <nicolas.palix@imag.fr>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    •
      irq: tsk->comm is an array · 19af395d
      Alan Cox committed
      The array check is useless so remove it.
      
      [akpm@linux-foundation.org: remove comment, per David]
      Signed-off-by: Alan Cox <alan@linux.intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    •
      ceph: fix dentry reference leak in ceph_encode_fh() · f6af75da
      Cyril Roelandt committed
      dput() was not called in the error path.
      Signed-off-by: Cyril Roelandt <tipecaml@gmail.com>
      Cc: Sage Weil <sage@inktank.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
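The rule behind both the Coccinelle check (af56e3f0) and the ceph fix (f6af75da) is that a reference taken by d_find_alias() must be dropped with dput() on every exit path, including error paths. The pattern can be sketched in userspace with a plain reference count; `dentry_model`, `model_find_alias`, `model_dput`, and `model_encode_fh` are hypothetical and are not the ceph code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted dentry. */
struct dentry_model {
    int refcount;
};

/* Like d_find_alias(): returns the object with an extra reference held. */
static struct dentry_model *model_find_alias(struct dentry_model *d)
{
    if (d == NULL)
        return NULL;
    d->refcount++;
    return d;
}

/* Like dput(): drops one reference. */
static void model_dput(struct dentry_model *d)
{
    if (d != NULL)
        d->refcount--;
}

/* Returns 0 on success, a negative value on error. The single exit
 * label guarantees the reference is dropped on the error path too. */
static int model_encode_fh(struct dentry_model *src, int buf_len, int needed)
{
    struct dentry_model *d = model_find_alias(src);
    int ret = 0;

    if (d == NULL)
        return -1;      /* nothing was acquired, nothing to release */
    if (buf_len < needed) {
        ret = -2;       /* error path: must not return directly */
        goto out;
    }
    /* ... the file handle would be encoded here ... */
out:
    model_dput(d);
    return ret;
}
```

Routing every exit after the acquisition through one `out:` label is the usual kernel idiom for this, since an early `return` in the error branch is exactly how such leaks appear.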
    •
      arch/x86/platform/iris/iris.c: register a platform device and a platform driver · 88d67ee3
      Shérab committed
      This makes the iris driver use the platform API, so it is properly exposed
      in /sys.
      
      [akpm@linux-foundation.org: remove commented-out code, add missing space to printk, clean up code layout]
      Signed-off-by: Shérab <Sebastien.Hinderer@ens-lyon.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Matthew Garrett <mjg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    •
      CRIS: fix I/O macros · c24bf9b4
      Corey Minyard committed
      The inb/outb macros for CRIS are broken from a number of points of view,
      missing () around parameters and they have an unprotected if statement
      in them.  This was breaking the compile of IPMI on CRIS and thus I was
      being annoyed by build regressions, so I fixed them.
      
      Plus I don't think they would have worked at all, since the data values
      were missing "&" and the outsl had a "3" instead of a "4" for the size.
      From what I can tell, this stuff is not used at all, so this can't be
      any more broken than it was before, anyway.
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
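Two of the macro problems named in commit c24bf9b4, missing parentheses around parameters and an unprotected `if` statement, are general C pitfalls. The sketch below shows the usual remedy on a made-up port array; `model_port`, `model_port_valid`, `MODEL_OUTB`, and `model_write` are hypothetical and unrelated to the real CRIS macros.

```c
#include <stdint.h>

/* Hypothetical 16-entry "port space" used only for this illustration. */
static uint8_t model_port[16];

static int model_port_valid(unsigned int port)
{
    return port < 16;
}

/*
 * A fragile version would look like:
 *   #define BAD_OUTB(v, p) if (model_port_valid(p)) model_port[p] = v
 * Its bare `if` captures a following `else`, and its unparenthesized
 * arguments can be regrouped by operator precedence.
 *
 * The safer form parenthesizes every argument and wraps the body in
 * do { } while (0) so the macro behaves like a single statement.
 */
#define MODEL_OUTB(val, port)                       \
    do {                                            \
        if (model_port_valid((port)))               \
            model_port[(port)] = (uint8_t)(val);    \
    } while (0)

/* With BAD_OUTB, the `else` below would bind to the macro's hidden `if`. */
static int model_write(int enable, unsigned int port, uint8_t val)
{
    if (enable)
        MODEL_OUTB(val, port + 1);
    else
        return 0;   /* writing disabled */
    return 1;
}
```

Calling `model_write(1, 2, 7)` stores 7 at index 3 and returns 1, while `model_write(0, 2, 9)` returns 0 and leaves the array untouched, which is the behaviour the `if`/`else` in the caller is meant to express.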
    •
      backlight: locomolcd: fix checkpatch error and warning · 9f67675a
      Jingoo Han committed
      This patch fixes the checkpatch error and warning as below:
      
        WARNING: space prohibited between function name and open parenthesis '('
        ERROR: trailing statements should be on next line
      
      Also, long comments are fixed for the preferred style and unnecessary
      lines are removed.
      Signed-off-by: Jingoo Han <jg1.han@samsung.com>
      Cc: Richard Purdie <rpurdie@rpsys.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    •
      Merge branch 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux · ae664dba
      Linus Torvalds committed
      Pull SLAB changes from Pekka Enberg:
       "This contains preparational work from Christoph Lameter and Glauber
        Costa for SLAB memcg and cleanups and improvements from Ezequiel
        Garcia and Joonsoo Kim.
      
        Please note that the SLOB cleanup commit from Arnd Bergmann already
        appears in your tree but I had also merged it myself which is why it
        shows up in the shortlog."
      
      * 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
        mm/sl[aou]b: Common alignment code
        slab: Use the new create_boot_cache function to simplify bootstrap
        slub: Use statically allocated kmem_cache boot structure for bootstrap
        mm, sl[au]b: create common functions for boot slab creation
        slab: Simplify bootstrap
        slub: Use correct cpu_slab on dead cpu
        mm: fix slab.c kernel-doc warnings
        mm/slob: use min_t() to compare ARCH_SLAB_MINALIGN
        slab: Ignore internal flags in cache creation
        mm/slob: Use free_page instead of put_page for page-size kmalloc allocations
        mm/sl[aou]b: Move common kmem_cache_size() to slab.h
        mm/slob: Use object_size field in kmem_cache_size()
        mm/slob: Drop usage of page->private for storing page-sized allocations
        slub: Commonize slab_cache field in struct page
        sl[au]b: Process slabinfo_show in common code
        mm/sl[au]b: Move print_slabinfo_header to slab_common.c
        mm/sl[au]b: Move slabinfo processing to slab_common.c
        slub: remove one code path and reduce lock contention in __slab_free()
    •
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace · a2faf2fc
      Linus Torvalds committed
      Pull (again) user namespace infrastructure changes from Eric Biederman:
       "Those bugs, those darn embarrassing bugs just don't want to get
        fixed.

        Linus I just updated my mirror of your kernel.org tree and it appears
        you successfully pulled everything except the last 4 commits that fix
        those embarrassing bugs.
      
        When you get a chance can you please repull my branch"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
        userns: Fix typo in description of the limitation of userns_install
        userns: Add a more complete capability subset test to commit_creds
        userns: Require CAP_SYS_ADMIN for most uses of setns.
        Fix cap_capable to only allow owners in the parent user namespace to have caps.
    •
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lliubbo/blackfin · 4351654e
      Linus Torvalds committed
      Pull blackfin update from Bob Liu.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lliubbo/blackfin:
        blackfin: SEC: clean up SEC interrupt initialization
        blackfin: kgdb: call generic_exec_single() directly
        blackfin: anomaly: add anomaly 16000030 for bf5xx
        Blackfin: dpmc: use module_platform_driver macro
        Blackfin: remove unused is_in_rom()
        Blackfin: remove unnecessary prototype for kobjsize()
        Blackfin: twi: Add missing __iomem annotation
        Blackfin: Annotate strnlen_user and strlen_user 'src' parameter with __user
        Blackfin: Annotate clear_user 'to' parameter with __user
        Blackfin: Add missing __user annotations to put_user
        Blackfin: Annotate strncpy_from_user src parameter with __user
        blackfin: Use Kbuild infrastructure for kvm_para.h
        UAPI: (Scripted) Disintegrate arch/blackfin/include/asm
    •
      Merge tag 'disintegrate-alpha-20121217' of git://git.infradead.org/users/dhowells/linux-headers · 3d9de190
      Linus Torvalds committed
      Pull UAPI disintegration for Alpha from David Howells:
       "I've been asked to send the Alpha UAPI disintegration to you directly.
        The acks I have been given have been added into the patch."
      
      * tag 'disintegrate-alpha-20121217' of git://git.infradead.org/users/dhowells/linux-headers:
        UAPI: (Scripted) Disintegrate arch/alpha/include/asm
    •
      Merge tag 'for-3.8' of git://openrisc.net/~jonas/linux · 9a8a5702
      Linus Torvalds committed
      Pull OpenRISC update from Jonas Bonn:
       "Trivial cleanups for OpenRISC."
      
      * tag 'for-3.8' of git://openrisc.net/~jonas/linux:
        openrisc: use kbuild.h instead of defining macros in asm-offset.c
        openrisc: Use Kbuild infrastructure for kvm_para.h
    •
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 7b077868
      Linus Torvalds committed
      Pull s390 update #2 from Martin Schwidefsky:
       "The main patch is the function measurement blocks extension for PCI to
        do performance statistics and help with debugging.  The other patch is
        a small cleanup in ccwdev.h."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/ccwdev: Include asm/schid.h.
        s390/pci: performance statistics and debug infrastructure
    •
      Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · 16e024f3
      Linus Torvalds committed
      Pull powerpc update from Benjamin Herrenschmidt:
       "The main highlight is probably some base POWER8 support.  There's more
        to come such as transactional memory support but that will wait for
        the next one.
      
        Overall it's pretty quiet, or rather I've been pretty poor at picking
        things up from patchwork and reviewing them this time around and Kumar
        no better on the FSL side it seems..."
      
      * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (73 commits)
        powerpc+of: Rename and fix OF reconfig notifier error inject module
        powerpc: mpc5200: Add a3m071 board support
        powerpc/512x: don't compile any platform DIU code if the DIU is not enabled
        powerpc/mpc52xx: use module_platform_driver macro
        powerpc+of: Export of_reconfig_notifier_[register,unregister]
        powerpc/dma/raidengine: add raidengine device
        powerpc/iommu/fsl: Add PAMU bypass enable register to ccsr_guts struct
        powerpc/mpc85xx: Change spin table to cached memory
        powerpc/fsl-pci: Add PCI controller ATMU PM support
        powerpc/86xx: fsl_pcibios_fixup_bus requires CONFIG_PCI
        drivers/virt: the Freescale hypervisor driver doesn't need to check MSR[GS]
        powerpc/85xx: p1022ds: Use NULL instead of 0 for pointers
        powerpc: Disable relocation on exceptions when kexecing
        powerpc: Enable relocation on during exceptions at boot
        powerpc: Move get_longbusy_msecs into hvcall.h and remove duplicate function
        powerpc: Add wrappers to enable/disable relocation on exceptions
        powerpc: Add set_mode hcall
        powerpc: Setup relocation on exceptions for bare metal systems
        powerpc: Move initial mfspr LPCR out of __init_LPCR
        powerpc: Add relocation on exception vector handlers
        ...
    •
      x86, paravirt: fix build error when thp is disabled · c36e0501
      David Rientjes committed
      With CONFIG_PARAVIRT=y and CONFIG_TRANSPARENT_HUGEPAGE=n, the build breaks
      because set_pmd_at() is undeclared:
      
        mm/memory.c: In function 'do_pmd_numa_page':
        mm/memory.c:3520: error: implicit declaration of function 'set_pmd_at'
        mm/mprotect.c: In function 'change_pmd_protnuma':
        mm/mprotect.c:120: error: implicit declaration of function 'set_pmd_at'
      
      This is because paravirt defines set_pmd_at() only when
      CONFIG_TRANSPARENT_HUGEPAGE=y and such a restriction is unneeded.  The
      fix is to define it for all CONFIG_PARAVIRT configurations.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    •
      Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd · ea77d73c
      Linus Torvalds committed
      Pull exofs changes from Boaz Harrosh:
       "These are just 3 patches, the last two are bug fixes on the error
        paths in exofs.
      
        The important patch is the one to osd_uld which adds sysfs info to osd
        devices for use by user-mode clustering discovery software.  I've
        been sitting on this patch since before February this year.  It is
        important for some of the big installation cluster systems, who have
        been compiling their own kernel just for that patch."
      
      Ugh.  The osd_uld patch already went through the SCSI tree, so this was
      kind of pointless.  But at least it has the two small error-path fixes..
      
      * 'for-linus' of git://git.open-osd.org/linux-open-osd:
        exofs: don't leak io_state and pages on read error
        exofs: clean up the correct page collection on write error
        osduld: Add osdname & systemid sysfs at scsi_osd class
    •
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · a22180d2
      Linus Torvalds committed
      Pull btrfs update from Chris Mason:
       "A big set of fixes and features.
      
        In terms of line count, most of the code comes from Stefan, who added
        the ability to replace a single drive in place.  This is different
        from how btrfs normally replaces drives, and is much much much faster.
      
        Josef is plowing through our synchronous write performance.  This pull
        request does not include the DIO_OWN_WAITING patch that was discussed
        on the list, but it has a number of other improvements to cut down our
        latencies and CPU time during fsync/O_DIRECT writes.
      
        Miao Xie has a big series of fixes and is spreading out ordered
        operations over more CPUs.  This improves performance and reduces
        contention.
      
        I've put in fixes for error handling around hash collisions.  These
        are going back to individual stable kernels as I test against them.
      
        Otherwise we have a lot of fixes and cleanups, thanks everyone!
        raid5/6 is being rebased against the device replacement code.  I'll
        have it posted this Friday along with a nice series of benchmarks."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (115 commits)
        Btrfs: fix a bug of per-file nocow
        Btrfs: fix hash overflow handling
        Btrfs: don't take inode delalloc mutex if we're a free space inode
        Btrfs: fix autodefrag and umount lockup
        Btrfs: fix permissions of empty files not affected by umask
        Btrfs: put raid properties into global table
        Btrfs: fix BUG() in scrub when first superblock reading gives EIO
        Btrfs: do not call file_update_time in aio_write
        Btrfs: only unlock and relock if we have to
        Btrfs: use tokens where we can in the tree log
        Btrfs: optimize leaf_space_used
        Btrfs: don't memset new tokens
        Btrfs: only clear dirty on the buffer if it is marked as dirty
        Btrfs: move checks in set_page_dirty under DEBUG
        Btrfs: log changed inodes based on the extent map tree
        Btrfs: add path->really_keep_locks
        Btrfs: do not mark ems as prealloc if we are writing to them
        Btrfs: keep track of the extents original block length
        Btrfs: inline csums if we're fsyncing
        Btrfs: don't bother copying if we're only logging the inode
        ...
    •
      Merge tag 'nfs-for-3.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 2d4dce00
      Linus Torvalds committed
      Pull NFS client updates from Trond Myklebust:
       "Features include:
      
         - Full audit of BUG_ON asserts in the NFS, SUNRPC and lockd client
           code.  Remove altogether where possible, and replace with
           WARN_ON_ONCE and appropriate error returns where not.
         - NFSv4.1 client adds session dynamic slot table management.  There
           is matching server side code that has been submitted to Bruce for
           consideration.
      
           Together, this code allows the server to dynamically manage the
           amount of memory it allocates to the duplicate request cache for
           each client.  It will constantly resize those caches to reserve
           more memory for clients that are hot while shrinking caches for
           those that are quiescent.
      
        In addition, there are assorted bugfixes for the generic NFS write
        code, fixes to deal with the drop_nlink() warnings, and yet another
        fix for NFSv4 getacl."
      
      * tag 'nfs-for-3.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (106 commits)
        SUNRPC: continue run over clients list on PipeFS event instead of break
        NFS: Don't use SetPageError in the NFS writeback code
        SUNRPC: variable 'svsk' is unused in function bc_send_request
        SUNRPC: Handle ECONNREFUSED in xs_local_setup_socket
        NFSv4.1: Deal effectively with interrupted RPC calls.
        NFSv4.1: Move the RPC timestamp out of the slot.
        NFSv4.1: Try to deal with NFS4ERR_SEQ_MISORDERED.
        NFS: nfs_lookup_revalidate should not trust an inode with i_nlink == 0
        NFS: Fix calls to drop_nlink()
        NFS: Ensure that we always drop inodes that have been marked as stale
        nfs: Remove unused list nfs4_clientid_list
        nfs: Remove duplicate function declaration in internal.h
        NFS: avoid NULL dereference in nfs_destroy_server
        SUNRPC handle EKEYEXPIRED in call_refreshresult
        SUNRPC set gss gc_expiry to full lifetime
        nfs: fix page dirtying in NFS DIO read codepath
        nfs: don't zero out the rest of the page if we hit the EOF on a DIO READ
        NFSv4.1: Be conservative about the client highest slotid
        NFSv4.1: Handle NFS4ERR_BADSLOT errors correctly
        nfs: don't extend writes to cover entire page if pagecache is invalid
        ...
      2d4dce00
    • L
      Merge tag 'md-3.8' of git://neil.brown.name/md · ea88eeac
      Linus Torvalds committed
      Pull md update from Neil Brown:
       "Mostly just little fixes.  Probably biggest part is AVX accelerated
        RAID6 calculations."
      
      * tag 'md-3.8' of git://neil.brown.name/md:
        md/raid5: add blktrace calls
        md/raid5: use async_tx_quiesce() instead of open-coding it.
        md: Use ->curr_resync as last completed request when cleanly aborting resync.
        lib/raid6: build proper files on corresponding arch
        lib/raid6: Add AVX2 optimized gen_syndrome functions
        lib/raid6: Add AVX2 optimized recovery functions
        md: Update checkpoint of resync/recovery based on time.
        md:Add place to update ->recovery_cp.
        md.c: re-indent various 'switch' statements.
        md: close race between removing and adding a device.
        md: removed unused variable in calc_sb_1_csm.
      ea88eeac
  2. 18 Dec, 2012 12 commits
    • P
      Merge branch 'slab/next' into slab/for-linus · 08afe22c
      Pekka Enberg committed
      Fix up a trivial merge conflict with commit baaf1dd4 ("mm/slob: use
      min_t() to compare ARCH_SLAB_MINALIGN") that did not go through the slab
      tree.
      
      Conflicts:
      	mm/slob.c
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
      08afe22c
    • P
      Merge branch 'slab/procfs' into slab/for-linus · a304f836
      Pekka Enberg committed
      a304f836
    • L
      Merge branch 'akpm' (Andrew's patch-bomb) · 848b8141
      Linus Torvalds committed
      Merge misc patches from Andrew Morton:
       "Incoming:
      
         - lots of misc stuff
      
         - backlight tree updates
      
         - lib/ updates
      
         - Oleg's percpu-rwsem changes
      
         - checkpatch
      
         - rtc
      
         - aoe
      
         - more checkpoint/restart support
      
        I still have a pile of MM stuff pending - Pekka should be merging
        later today after which that is good to go.  A number of other things
        are twiddling thumbs awaiting maintainer merges."
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (180 commits)
        scatterlist: don't BUG when we can trivially return a proper error.
        docs: update documentation about /proc/<pid>/fdinfo/<fd> fanotify output
        fs, fanotify: add @mflags field to fanotify output
        docs: add documentation about /proc/<pid>/fdinfo/<fd> output
        fs, notify: add procfs fdinfo helper
        fs, exportfs: add exportfs_encode_inode_fh() helper
        fs, exportfs: escape nil dereference if no s_export_op present
        fs, epoll: add procfs fdinfo helper
        fs, eventfd: add procfs fdinfo helper
        procfs: add ability to plug in auxiliary fdinfo providers
        tools/testing/selftests/kcmp/kcmp_test.c: print reason for failure in kcmp_test
        breakpoint selftests: print failure status instead of cause make error
        kcmp selftests: print fail status instead of cause make error
        kcmp selftests: make run_tests fix
        mem-hotplug selftests: print failure status instead of cause make error
        cpu-hotplug selftests: print failure status instead of cause make error
        mqueue selftests: print failure status instead of cause make error
        vm selftests: print failure status instead of cause make error
        ubifs: use prandom_bytes
        mtd: nandsim: use prandom_bytes
        ...
      848b8141
    • E
      efi: Fix the build with user namespaces enabled. · 99295618
      Eric W. Biederman committed
      When compiling efivars.c the build fails with:
      
         CC      drivers/firmware/efivars.o
        drivers/firmware/efivars.c: In function ‘efivarfs_get_inode’:
        drivers/firmware/efivars.c:886:31: error: incompatible types when assigning to type ‘kgid_t’ from type ‘int’
        make[2]: *** [drivers/firmware/efivars.o] Error 1
        make[1]: *** [drivers/firmware/efivars.o] Error 2
      
      Fix the build error by removing the duplicate initialization of i_uid and
      i_gid; inode_init_always has already initialized them to 0.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      99295618
    • S
      mm,numa: fix update_mmu_cache_pmd call · ce4a9cc5
      Stephen Rothwell committed
      This build error is currently hidden by the fact that the x86
      implementation of 'update_mmu_cache_pmd()' is a macro that doesn't use
      its last argument, but commit b32967ff ("mm: numa: Add THP migration
      for the NUMA working set scanning fault case") introduced a call with
      the wrong third argument.
      
      In the akpm tree, it causes this build error:
      
        mm/migrate.c: In function 'migrate_misplaced_transhuge_page_put':
        mm/migrate.c:1666:2: error: incompatible type for argument 3 of 'update_mmu_cache_pmd'
        arch/x86/include/asm/pgtable.h:792:20: note: expected 'struct pmd_t *' but argument is of type 'pmd_t'
      
      Fix it.
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ce4a9cc5
    • N
      scatterlist: don't BUG when we can trivially return a proper error. · 6fd59a83
      Nick Bowler committed
      There is absolutely no reason to crash the kernel when we have a
      perfectly good return value already available to use for conveying
      failure status.
      
      Let's return an error code instead of crashing the kernel: that sounds
      like a much better plan.
      
      [akpm@linux-foundation.org: s/E2BIG/EINVAL/]
      Signed-off-by: Nick Bowler <nbowler@elliptictech.com>
      Cc: Maxim Levitsky <maximlevitsky@gmail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6fd59a83
    • C
      docs: update documentation about /proc/<pid>/fdinfo/<fd> fanotify output · e71ec593
      Cyrill Gorcunov committed
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrey Vagin <avagin@openvz.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: James Bottomley <jbottomley@parallels.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Matthew Helsley <matt.helsley@gmail.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e71ec593
    • C
      fs, fanotify: add @mflags field to fanotify output · e6dbcafb
      Cyrill Gorcunov committed
      The kernel keeps the FAN_MARK_IGNORED_SURV_MODIFY bit separately from
      fsnotify_mark::mask|ignored_mask, so put it in the @mflags (mark flags)
      field.  This lets a user-space reader detect whether the bit was used
      when the mark was created.
      
       | pos:	0
       | flags:	04002
       | fanotify flags:10 event-flags:0
       | fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003
       | fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrey Vagin <avagin@openvz.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: James Bottomley <jbottomley@parallels.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Matthew Helsley <matt.helsley@gmail.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e6dbcafb
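
A minimal sketch of how a user-space reader might test the @mflags field described in this commit. It assumes the field is printed in hexadecimal, as the sample output suggests, and that FAN_MARK_IGNORED_SURV_MODIFY has the value 0x40 from <linux/fanotify.h>. The helper name `survives_modify` is hypothetical and not part of any kernel or library API.

```python
# Assumption: value of FAN_MARK_IGNORED_SURV_MODIFY as defined in <linux/fanotify.h>.
FAN_MARK_IGNORED_SURV_MODIFY = 0x40


def survives_modify(mark_line):
    """Return True if a 'fanotify ...' mark line has the SURV_MODIFY bit in mflags.

    Hypothetical helper: assumes space-separated key:value pairs with hex values.
    """
    # Skip the leading 'fanotify' keyword, then split each pair at the first colon.
    fields = dict(pair.split(":", 1) for pair in mark_line.split()[1:])
    # Lines without an mflags field are treated as having no mark flags set.
    mflags = int(fields.get("mflags", "0"), 16)
    return bool(mflags & FAN_MARK_IGNORED_SURV_MODIFY)


# The two mark lines from the sample output in the commit message.
mnt_mark = "fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003"
inode_mark = "fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000"
```

With the sample lines, the mount mark reports the bit as set and the inode mark does not.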
    • C
      docs: add documentation about /proc/<pid>/fdinfo/<fd> output · f1d8c162
      Cyrill Gorcunov committed
      [akpm@linux-foundation.org: tweak documentation]
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrey Vagin <avagin@openvz.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: James Bottomley <jbottomley@parallels.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Matthew Helsley <matt.helsley@gmail.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f1d8c162
    • C
      fs, notify: add procfs fdinfo helper · be77196b
      Cyrill Gorcunov committed
      This allows us to print out fsnotify details such as watchee inode, device,
      mask and optionally a file handle.
      
      For inotify objects, if the kernel is compiled with exportfs support, the
      output will be
      
       | pos:	0
       | flags:	02000000
       | inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d
       | inotify wd:2 ino:a111 sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:11a1000020542153
       | inotify wd:1 ino:6b149 sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:49b1060023552153
      
      If the kernel is compiled without exportfs support, the file handle
      is not provided, only the inode and device.
      
       | pos:	0
       | flags:	02000000
       | inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0
       | inotify wd:2 ino:a111 sdev:800013 mask:800afce ignored_mask:0
       | inotify wd:1 ino:6b149 sdev:800013 mask:800afce ignored_mask:0
      
      For fanotify the output is like
      
       | pos:	0
       | flags:	04002
       | fanotify flags:10 event-flags:0
       | fanotify mnt_id:12 mask:3b ignored_mask:0
       | fanotify ino:50205 sdev:800013 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:05020500fb1d47e7
      
      To minimize the impact on general fsnotify code, the new functionality
      is gathered in the fs/notify/fdinfo.c file.
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: Pavel Emelyanov <xemul@parallels.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrey Vagin <avagin@openvz.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: James Bottomley <jbottomley@parallels.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Matthew Helsley <matt.helsley@gmail.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      be77196b
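
The fdinfo format shown in this commit can be consumed with a few lines of user-space code. The following is a sketch only. It assumes every value on an inotify or fanotify line is hexadecimal (consistent with the sample output), that f_handle is an opaque hex-encoded byte string, and that real /proc output lacks the " | " quoting prefix used in the commit message. The function names `parse_mark_line` and `parse_fdinfo` are hypothetical.

```python
def parse_mark_line(line):
    """Split one 'inotify ...' or 'fanotify ...' line into (kind, fields)."""
    kind, *pairs = line.split()
    fields = {}
    for pair in pairs:
        key, _, value = pair.partition(":")
        if key == "f_handle":
            # Opaque file handle bytes; only present with exportfs support.
            fields[key] = bytes.fromhex(value)
        else:
            # Assumption: all other values are printed in hexadecimal.
            fields[key] = int(value, 16)
    return kind, fields


def parse_fdinfo(text):
    """Return the mark lines of an fdinfo dump, skipping 'pos:' and 'flags:'."""
    marks = []
    for line in text.splitlines():
        if line.startswith(("inotify ", "fanotify ")):
            marks.append(parse_mark_line(line))
    return marks


# Sample input built from lines in the commit message: one mark with a file
# handle and one without.
sample = (
    "pos:\t0\n"
    "flags:\t02000000\n"
    "inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 "
    "fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d\n"
    "inotify wd:2 ino:a111 sdev:800013 mask:800afce ignored_mask:0\n"
)
marks = parse_fdinfo(sample)
```

On a live system the same function could be fed the contents of /proc/<pid>/fdinfo/<fd>. A reader that needs the file handle should check for the f_handle key first, since it is absent on kernels built without exportfs support.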
    • C
      fs, exportfs: add exportfs_encode_inode_fh() helper · 711c7bf9
      Cyrill Gorcunov committed
      We will need this helper in the next patch to provide a file handle for
      inotify marks in /proc/pid/fdinfo output.
      
      The patch provides a way to use inodes directly when a dentry is not
      available (as in the inotify subsystem).
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: Pavel Emelyanov <xemul@parallels.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrey Vagin <avagin@openvz.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: James Bottomley <jbottomley@parallels.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Matthew Helsley <matt.helsley@gmail.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      711c7bf9
    • C
      fs, exportfs: escape nil dereference if no s_export_op present · ab49bdec
      Cyrill Gorcunov committed
      This routine will be used to generate a file handle in fdinfo output for
      the inotify subsystem, where the general export_encode_fh should be used
      if no s_export_op is present.  Thus add a test for s_export_op inside
      exportfs_encode_fh itself.
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: Pavel Emelyanov <xemul@parallels.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrey Vagin <avagin@openvz.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: James Bottomley <jbottomley@parallels.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Matthew Helsley <matt.helsley@gmail.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ab49bdec