1. 13 Oct, 2012 5 commits
    • J
      audit: make audit_inode take struct filename · adb5c247
      Jeff Layton committed
      Keep a pointer to the audit_names "slot" in struct filename.
      
      Have all of the audit_inode callers pass a struct filename pointer to
      audit_inode instead of a string pointer. If the aname field is already
      populated, then we can skip walking the list altogether and just use it
      directly.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      adb5c247
    • J
      vfs: make path_openat take a struct filename pointer · 669abf4e
      Jeff Layton committed
      ...and fix up the callers. For do_file_open_root, just declare a
      struct filename on the stack and fill out the .name field. For
      do_filp_open, make it also take a struct filename pointer, and fix up its
      callers to call it appropriately.
      
      For filp_open, add a variant that takes a struct filename pointer and turn
      filp_open into a wrapper around it.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      669abf4e
    • J
      audit: allow audit code to satisfy getname requests from its names_list · 7ac86265
      Jeff Layton committed
      Currently, if we call getname() on a userland string more than once,
      we'll get multiple copies of the string and multiple audit_names
      records.
      
      Add a function that will allow the audit_names code to satisfy getname
      requests using info from the audit_names list, avoiding a new allocation
      and audit_names records.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      7ac86265
    • J
      vfs: define struct filename and have getname() return it · 91a27b2a
      Jeff Layton committed
      getname() is intended to copy pathname strings from userspace into a
      kernel buffer. The result is just a string in kernel space. It would
      however be quite helpful to be able to attach some ancillary info to
      the string.
      
      For instance, we could attach some audit-related info to reduce the
      amount of audit-related processing needed. When auditing is enabled,
      we could also call getname() on the string more than once and not
      need to recopy it from userspace.
      
      This patchset converts the getname()/putname() interfaces to return
      a struct instead of a string. For now, the struct just tracks the
      string in kernel space and the original userland pointer for it.
      
      Later, we'll add other information to the struct as it becomes
      convenient.
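
A minimal user-space sketch of the idea described above, assuming a struct that carries only the two fields the message mentions (the copied string and the original pointer). The names `filename_sketch`, `getname_sketch` and `putname_sketch` are hypothetical stand-ins and are not the kernel's definitions; a plain `memcpy` stands in for the copy from userspace.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the struct described in the commit message. */
struct filename_sketch {
	const char *name;	/* private copy of the pathname string */
	const char *uptr;	/* original (caller-side) pointer it came from */
};

/* Copy the string and remember where it came from. Returns NULL on failure. */
static struct filename_sketch *getname_sketch(const char *user_str)
{
	struct filename_sketch *fn;
	size_t n = strlen(user_str) + 1;
	char *copy;

	fn = malloc(sizeof(*fn));
	if (!fn)
		return NULL;
	copy = malloc(n);
	if (!copy) {
		free(fn);
		return NULL;
	}
	memcpy(copy, user_str, n);
	fn->name = copy;
	fn->uptr = user_str;
	return fn;
}

/* Release both the copied string and the wrapper struct. */
static void putname_sketch(struct filename_sketch *fn)
{
	if (!fn)
		return;
	free((void *)fn->name);
	free(fn);
}
```

Returning a struct rather than a bare string is what leaves room for attaching further per-name data later without changing every caller again.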
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      91a27b2a
    • A
      infrastructure for saner ret_from_kernel_thread semantics · a74fb73c
      Al Viro committed
      * allow kernel_execve() to leave the actual return to userland to the
      caller (selected by CONFIG_GENERIC_KERNEL_EXECVE).  Callers
      updated accordingly.
      * an architecture that does select GENERIC_KERNEL_EXECVE in its
      Kconfig should have its ret_from_kernel_thread() do this:
      	call schedule_tail
      	call the callback left for it by copy_thread(); if it ever
      returns, that's because it has just done successful kernel_execve()
      	jump to return from syscall
      IOW, its only difference from ret_from_fork() is that it does call the
      callback.
      * such an architecture should also get rid of ret_from_kernel_execve()
      and __ARCH_WANT_KERNEL_EXECVE
      
      This is the last part of infrastructure patches in that area - from
      that point on work on different architectures can live independently.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      a74fb73c
  2. 12 Oct, 2012 14 commits
  3. 10 Oct, 2012 12 commits
    • R
      MODSIGN: Make mrproper should remove generated files. · d5b71936
      Rusty Russell committed
      It doesn't, because the clean targets don't include kernel/Makefile, and
      because two files were missing from the list.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      d5b71936
    • D
      MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs · e7d113bc
      David Howells committed
      Place an indication that the certificate should use utf8 strings into the
      x509.genkey template generated by kernel/Makefile.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      e7d113bc
    • D
      MODSIGN: Use the same digest for the autogen key sig as for the module sig · 5e8cb1e4
      David Howells committed
      Use the same digest type for the autogenerated key signature as for the module
      signature so that the hash algorithm is guaranteed to be present in the kernel.
      
      Without this, the X.509 certificate loader may reject the X.509 certificate so
      generated because it was self-signed and the signature will be checked against
      itself - but this won't work if the digest algorithm must be loaded as a
      module.
      
      The symptom is that the key fails to load with the following message emitted
      into the kernel log:
      
      	MODSIGN: Problem loading in-kernel X.509 certificate (-65)
      
      the error in brackets being -ENOPKG.  What you should see is something like:
      
      	MODSIGN: Loaded cert 'Magarathea: Glacier signing key: 9588321144239a119d3406d4c4cf1fbae1836fa0'
      
      Note that this doesn't apply to certificates that are not self-signed as we
      don't check those currently as they require the parent CA certificate to be
      available.
      Reported-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      5e8cb1e4
    • D
      MODSIGN: Implement module signature checking · 48ba2462
      David Howells committed
      Check the signature on the module against the keys compiled into the kernel or
      available in a hardware key store.
      
      Currently, only RSA keys are supported - though that's easy enough to change,
      and the signature is expected to contain raw components (so not a PGP or
      PKCS#7 formatted blob).
      
      The signature blob is expected to consist of the following pieces in order:
      
       (1) The binary identifier for the key.  This is expected to match the
           SubjectKeyIdentifier from an X.509 certificate.  Only X.509 type
           identifiers are currently supported.
      
       (2) The signature data, consisting of a series of MPIs, each in the
           format of a 2-byte BE word size followed by the content data.
      
       (3) A 12 byte information block of the form:
      
      	struct module_signature {
      		enum pkey_algo		algo : 8;
      		enum pkey_hash_algo	hash : 8;
      		enum pkey_id_type	id_type : 8;
      		u8			__pad;
      		__be32			id_length;
      		__be32			sig_length;
      	};
      
           The three enums are defined in crypto/public_key.h.
      
           'algo' contains the public-key algorithm identifier (0->DSA, 1->RSA).
      
           'hash' contains the digest algorithm identifier (0->MD4, 1->MD5, 2->SHA1,
            etc.).
      
           'id_type' contains the public-key identifier type (0->PGP, 1->X.509).
      
           '__pad' should be 0.
      
           'id_length' should contain the binary identifier length in BE form.
      
           'sig_length' should contain the signature data length in BE form.
      
           The lengths are in BE order rather than CPU order to make dealing with
           cross-compilation easier.
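
The layout above can be sketched as a small user-space parser. This is only an illustration under stated assumptions: `modsig_info`, `read_be32` and `modsig_parse` are hypothetical names, the trailer is decoded byte by byte rather than through the bitfield struct, and none of this is the kernel's actual verification code.

```c
#include <stddef.h>
#include <stdint.h>

#define MODSIG_INFO_LEN 12	/* size of the trailing information block */

/* Decoded view of the 12-byte information block, in host byte order. */
struct modsig_info {
	uint8_t  algo;		/* public-key algorithm (0 = DSA, 1 = RSA) */
	uint8_t  hash;		/* digest algorithm identifier */
	uint8_t  id_type;	/* key identifier type (0 = PGP, 1 = X.509) */
	uint32_t id_length;	/* length of the binary key identifier */
	uint32_t sig_length;	/* length of the signature data */
};

/* Read a 32-bit big-endian value independently of host endianness. */
static uint32_t read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical helper: decode the information block at the end of buf and
 * locate the identifier and signature data that precede it.
 * Returns 0 on success, -1 if the blob is malformed.
 */
static int modsig_parse(const uint8_t *buf, size_t len,
			struct modsig_info *out,
			const uint8_t **id, const uint8_t **sig)
{
	const uint8_t *info;
	size_t body;

	if (len < MODSIG_INFO_LEN)
		return -1;
	info = buf + len - MODSIG_INFO_LEN;
	body = len - MODSIG_INFO_LEN;

	if (info[3] != 0)	/* '__pad' should be 0 */
		return -1;

	out->algo = info[0];
	out->hash = info[1];
	out->id_type = info[2];
	out->id_length = read_be32(info + 4);
	out->sig_length = read_be32(info + 8);

	/* Both pieces must fit in front of the information block. */
	if (out->id_length > body || out->sig_length > body - out->id_length)
		return -1;

	*sig = info - out->sig_length;	/* piece (2) sits just before (3) */
	*id = *sig - out->id_length;	/* piece (1) sits just before (2) */
	return 0;
}
```

Because the lengths are decoded from explicit big-endian bytes, the same blob parses identically on little-endian and big-endian hosts, which is the cross-compilation point the message makes.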
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (minor Kconfig fix)
      48ba2462
    • D
      MODSIGN: Provide module signing public keys to the kernel · 631cc66e
      David Howells committed
      Include a PGP keyring containing the public keys required to perform module
      verification in the kernel image during build and create a special keyring
      during boot which is then populated with keys of crypto type holding the public
      keys found in the PGP keyring.
      
      These can be seen by root:
      
      [root@andromeda ~]# cat /proc/keys
      07ad4ee0 I-----     1 perm 3f010000     0     0 crypto    modsign.0: RSA 87b9b3bd []
      15c7f8c3 I-----     1 perm 1f030000     0     0 keyring   .module_sign: 1/4
      ...
      
      It is probably worth permitting root to invalidate these keys, resulting in
      their removal and preventing further modules from being loaded with that key.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      631cc66e
    • D
      MODSIGN: Automatically generate module signing keys if missing · d441108c
      David Howells committed
      Automatically generate keys for module signing if they're absent so that
      allyesconfig doesn't break.  The builder should consider generating their own
      key and certificate, however, so that the keys are appropriately named.
      
      The private key for the module signer should be placed in signing_key.priv
      (unencrypted!) and the public key in an X.509 certificate as signing_key.x509.
      
      If a transient key is desired for signing the modules, a config file for
      'openssl req' can be placed in x509.genkey, looking something like the
      following:
      
      	[ req ]
      	default_bits = 4096
      	distinguished_name = req_distinguished_name
      	prompt = no
      	x509_extensions = myexts
      
      	[ req_distinguished_name ]
      	O = Magarathea
      	CN = Glacier signing key
      	emailAddress = slartibartfast@magrathea.h2g2
      
      	[ myexts ]
      	basicConstraints=critical,CA:FALSE
      	keyUsage=digitalSignature
      	subjectKeyIdentifier=hash
      	authorityKeyIdentifier=hash
      
      The build process will use this to configure:
      
      	openssl req -new -nodes -utf8 -sha1 -days 36500 -batch \
      		-x509 -config x509.genkey \
      		-outform DER -out signing_key.x509 \
      		-keyout signing_key.priv
      
      to generate the key.
      
      Note that it is required that the X.509 certificate have a subjectKeyIdentifier
      and an authorityKeyIdentifier.  Without those, the certificate will be
      rejected.  These can be used to check the validity of a certificate.
      
      Note that 'make distclean' will remove signing_key.{priv,x509} and x509.genkey,
      whether or not they were generated automatically.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      d441108c
    • D
      MODSIGN: Add FIPS policy · 1d0059f3
      David Howells committed
      If we're in FIPS mode, we should panic if we fail to verify the signature on a
      module or we're asked to load an unsigned module in signature enforcing mode.
      Possibly FIPS mode should automatically enable enforcing mode.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      1d0059f3
    • R
      module: signature checking hook · 106a4ee2
      Rusty Russell committed
      We do a very simple search for a particular string appended to the module
      (which is cache-hot and about to be SHA'd anyway).  There's both a config
      option and a boot parameter which control whether we accept or fail with
      unsigned modules and modules that are signed with an unknown key.
      
      If module signing is enabled, the kernel will be tainted if a module is
      loaded that is unsigned or has a signature for which we don't have the
      key.
      
      (Useful feedback and tweaks by David Howells <dhowells@redhat.com>)
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      106a4ee2
    • L
      irqdomain: augment add_simple() to allocate descs · 2854d167
      Linus Walleij committed
      Currently we rely on all IRQ chip instances to dynamically
      allocate their IRQ descriptors unless they use the linear
      IRQ domain. So for irqdomain_add_legacy() and
      irqdomain_add_simple() the caller needs to make sure that
      descriptors are allocated.
      
      Let's slightly augment the yet unused irqdomain_add_simple()
      to also allocate descriptors as a means to simplify usage
      and avoid code duplication throughout the kernel.
      
      We warn if descriptors cannot be allocated, e.g. if a
      platform has the bad habit of hogging descriptors at boot
      time.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Lee Jones <lee.jones@linaro.org>
      Reviewed-by: Rob Herring <rob.herring@calxeda.com>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      2854d167
    • S
      fs: handle failed audit_log_start properly · d1c7d97a
      Sasha Levin committed
      audit_log_start() may return NULL; this is unchecked by the caller in
      audit_log_link_denied() and could cause a NULL pointer dereference.
      
      Introduced by commit a51d9eaa ("fs: add link restriction audit reporting").
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      d1c7d97a
    • D
      timekeeping: Cast raw_interval to u64 to avoid shift overflow · 5b3900cd
      Dan Carpenter committed
      We fixed a bunch of integer overflows in timekeeping code during the 3.6
      cycle.  I did an audit based on that and found this potential overflow.
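
The class of bug named in the title can be illustrated with a small sketch. It assumes the shifted value has a 32-bit type, which the commit message does not state; `shift_narrow` and `shift_wide` are hypothetical names, not kernel functions.

```c
#include <stdint.h>

/*
 * Problematic form: the shift is evaluated in 32 bits, so any bits pushed
 * past bit 31 are discarded before the result is widened to 64 bits.
 * (shift must be less than 32 here.)
 */
static uint64_t shift_narrow(uint32_t interval, unsigned int shift)
{
	return interval << shift;
}

/*
 * Fixed form: widen first, then shift, so the high bits are preserved.
 */
static uint64_t shift_wide(uint32_t interval, unsigned int shift)
{
	return (uint64_t)interval << shift;
}
```

With `interval = 0x80000000` and `shift = 1`, the narrow form yields 0 while the widened form yields `0x100000000`.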
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: John Stultz <johnstul@us.ibm.com>
      Link: http://lkml.kernel.org/r/20121009071823.GA19159@elgon.mountain
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      5b3900cd
    • H
      timers: Fix endless looping between cascade() and internal_add_timer() · 26cff4e2
      Hildner, Christian committed
      Adding two (or more) timers with large values for "expires" (they have
      to reside within tv5 in the same list) leads to endless looping
      between cascade() and internal_add_timer() in case CONFIG_BASE_SMALL
      is one and jiffies are crossing the value 1 << 18. The bug was
      introduced between 2.6.11 and 2.6.12 (and survived for quite some
      time).
      
      This patch ensures that when cascade() is called timers within tv5 are
      not added endlessly to their own list again, instead they are added to
      the next lower tv level tv4 (as expected).
      Signed-off-by: Christian Hildner <christian.hildner@siemens.com>
      Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
      Link: http://lkml.kernel.org/r/98673C87CB31274881CFFE0B65ECC87B0F5FC1963E@DEFTHW99EA4MSX.ww902.siemens.net
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      26cff4e2
  4. 09 Oct, 2012 9 commits
    • H
      mm: wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end · 6bdb913f
      Haggai Eran committed
      In order to allow sleeping during invalidate_page mmu notifier calls, we
      need to avoid calling when holding the PT lock.  In addition to its direct
      calls, invalidate_page can also be called as a substitute for a change_pte
      call, in case the notifier client hasn't implemented change_pte.
      
      This patch drops the invalidate_page call from change_pte, and instead
      wraps all calls to change_pte with invalidate_range_start and
      invalidate_range_end calls.
      
      Note that change_pte still cannot sleep after this patch, and that clients
      implementing change_pte should not take action on it in case the number of
      outstanding invalidate_range_start calls is larger than one, otherwise
      they might miss a later invalidation.
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Cc: Andrea Arcangeli <andrea@qumranet.com>
      Cc: Sagi Grimberg <sagig@mellanox.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Cc: Haggai Eran <haggaie@mellanox.com>
      Cc: Shachar Raindel <raindel@mellanox.com>
      Cc: Liran Liss <liranl@mellanox.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6bdb913f
    • M
      mm: interval tree updates · 9826a516
      Michel Lespinasse committed
      Update the generic interval tree code that was introduced in "mm: replace
      vma prio_tree with an interval tree".
      
      Changes:
      
      - fixed 'endpoing' typo noticed by Andrew Morton
      
      - replaced include/linux/interval_tree_tmpl.h, which was used as a
        template (including it automatically defined the interval tree
        functions) with include/linux/interval_tree_generic.h, which only
        defines a preprocessor macro INTERVAL_TREE_DEFINE(), which itself
        defines the interval tree functions when invoked. Now that is a very
        long macro which is unfortunate, but it does make the usage sites
        (lib/interval_tree.c and mm/interval_tree.c) a bit nicer than previously.
      
      - make use of RB_DECLARE_CALLBACKS() in the INTERVAL_TREE_DEFINE() macro,
        instead of duplicating that code in the interval tree template.
      
      - replaced vma_interval_tree_add(), which was actually handling the
        nonlinear and interval tree cases, with vma_interval_tree_insert_after()
        which handles only the interval tree case and has an API that is more
        consistent with the other interval tree handling functions.
        The nonlinear case is now handled explicitly in kernel/fork.c dup_mmap().
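
The "macro that defines functions when invoked" pattern described in the list above can be sketched as follows. This is a deliberately tiny illustration: `RANGE_OVERLAP_DEFINE`, `struct span` and the accessor macros are hypothetical, and the real INTERVAL_TREE_DEFINE() takes different parameters and generates full tree operations rather than a single overlap test.

```c
#include <stdbool.h>

/*
 * Hypothetical generator macro: given a node type and two accessors for the
 * closed interval [start, last], emit a PREFIX_overlaps() function.
 */
#define RANGE_OVERLAP_DEFINE(TYPE, START, LAST, PREFIX)			\
static bool PREFIX##_overlaps(const TYPE *n,				\
			      unsigned long start, unsigned long last)	\
{									\
	return START(n) <= last && start <= LAST(n);			\
}

/* A node type that does not store its last endpoint explicitly. */
struct span {
	unsigned long begin;
	unsigned long len;	/* must be non-zero */
};

#define SPAN_START(n)	((n)->begin)
#define SPAN_LAST(n)	((n)->begin + (n)->len - 1)

/* Instantiate the generated function for struct span. */
RANGE_OVERLAP_DEFINE(struct span, SPAN_START, SPAN_LAST, span)
```

Passing the endpoint accessors as macro arguments is what lets one generic body serve node types that compute their endpoints instead of storing them.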
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Daniel Santos <daniel.santos@pobox.com>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9826a516
    • M
      mm: replace vma prio_tree with an interval tree · 6b2dbba8
      Michel Lespinasse committed
      Implement an interval tree as a replacement for the VMA prio_tree.  The
      algorithms are similar to lib/interval_tree.c; however that code can't be
      directly reused as the interval endpoints are not explicitly stored in the
      VMA.  So instead, the common algorithm is moved into a template and the
      details (node type, how to get interval endpoints from the node, etc) are
      filled in using the C preprocessor.
      
      Once the interval tree functions are available, using them as a
      replacement to the VMA prio tree is a relatively simple, mechanical job.
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6b2dbba8
    • D
      oom: remove deprecated oom_adj · 01dc52eb
      Davidlohr Bueso committed
      The deprecated /proc/<pid>/oom_adj is scheduled for removal this month.
      Signed-off-by: Davidlohr Bueso <dave@gnu.org>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      01dc52eb
    • K
      mm: kill vma flag VM_RESERVED and mm->reserved_vm counter · 314e51b9
      Konstantin Khlebnikov committed
      A long time ago, in v2.4, VM_RESERVED kept the swapout process off a VMA.
      It has since lost its original meaning but still has some effects:
      
       | effect                 | alternative flags
      -+------------------------+---------------------------------------------
      1| account as reserved_vm | VM_IO
      2| skip in core dump      | VM_IO, VM_DONTDUMP
      3| do not merge or expand | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
      4| do not mlock           | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
      
      This patch removes the reserved_vm counter from mm_struct.  Nobody seems
      to care about it: it is not exported to userspace directly, it only
      reduces the total_vm shown in proc.
      
      Thus VM_RESERVED can be replaced with VM_IO or the pair VM_DONTEXPAND | VM_DONTDUMP.
      
      remap_pfn_range() and io_remap_pfn_range() set VM_IO|VM_DONTEXPAND|VM_DONTDUMP.
      remap_vmalloc_range() sets VM_DONTEXPAND | VM_DONTDUMP.
      
      [akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup]
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      314e51b9
    • K
      mm: kill vma flag VM_EXECUTABLE and mm->num_exe_file_vmas · e9714acf
      Konstantin Khlebnikov committed
      Currently the kernel sets mm->exe_file during sys_execve() and then tracks
      number of vmas with VM_EXECUTABLE flag in mm->num_exe_file_vmas, as soon
      as this counter drops to zero kernel resets mm->exe_file to NULL.  Plus it
      resets mm->exe_file at last mmput() when mm->mm_users drops to zero.
      
      A VMA with the VM_EXECUTABLE flag appears after mapping a file with the
      MAP_EXECUTABLE flag; such vmas can appear only at sys_execve() or after vma
      splitting, because sys_mmap ignores this flag.  Usually binfmt module sets
      mm->exe_file and mmaps executable vmas with this file, they hold
      mm->exe_file while task is running.
      
      comment from v2.6.25-6245-g925d1c40 ("procfs task exe symlink"),
      where all this stuff was introduced:
      
      > The kernel implements readlink of /proc/pid/exe by getting the file from
      > the first executable VMA.  Then the path to the file is reconstructed and
      > reported as the result.
      >
      > Because of the VMA walk the code is slightly different on nommu systems.
      > This patch avoids separate /proc/pid/exe code on nommu systems.  Instead of
      > walking the VMAs to find the first executable file-backed VMA we store a
      > reference to the exec'd file in the mm_struct.
      >
      > That reference would prevent the filesystem holding the executable file
      > from being unmounted even after unmapping the VMAs.  So we track the number
      > of VM_EXECUTABLE VMAs and drop the new reference when the last one is
      > unmapped.  This avoids pinning the mounted filesystem.
      
      exe_file's vma accounting is hooked into every file mmap/unmap and vma
      split/merge just to fix a hypothetical case: an mm that has already
      unmapped all its executable files, but is still alive, pinning a
      filesystem against unmounting.
      
      Seems like currently nobody depends on this behaviour.  We can try to
      remove this logic and keep mm->exe_file until final mmput().
      
      mm->exe_file is still protected with mm->mmap_sem, because we want to
      change it via new sys_prctl(PR_SET_MM_EXE_FILE).  Also via this syscall
      task can change its mm->exe_file and unpin mountpoint explicitly.
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e9714acf
    • K
      mm: use mm->exe_file instead of first VM_EXECUTABLE vma->vm_file · 2dd8ad81
      Konstantin Khlebnikov committed
      Some security modules and oprofile still use VM_EXECUTABLE for retrieving
      a task's executable file.  After this patch they will use mm->exe_file
      directly.  mm->exe_file is protected with mm->mmap_sem, so locking stays
      the same.
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>			[arch/tile]
      Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>	[tomoyo]
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Acked-by: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2dd8ad81
    • S
      CPU hotplug, debug: detect imbalance between get_online_cpus() and put_online_cpus() · 075663d1
      Srivatsa S. Bhat committed
      The synchronization between CPU hotplug readers and writers is achieved
      by means of refcounting, safeguarded by the cpu_hotplug.lock.
      
      get_online_cpus() increments the refcount, whereas put_online_cpus()
      decrements it.  If we ever hit an imbalance between the two, we end up
      compromising the guarantees of the hotplug synchronization i.e, for
      example, an extra call to put_online_cpus() can end up allowing a
      hotplug reader to execute concurrently with a hotplug writer.
      
      So, add a WARN_ON() in put_online_cpus() to detect such cases where the
      refcount can go negative, and also attempt to fix it up, so that we can
      continue to run.
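
The check described above can be sketched in user space as follows. This is an assumption-laden illustration: `hotplug_state`, `get_online` and `put_online` are hypothetical names, a `fprintf` plus a counter stands in for WARN_ON(), and the locking around the refcount is omitted entirely.

```c
#include <stdio.h>

/* Hypothetical stand-in for the hotplug reader refcount state. */
struct hotplug_state {
	int refcount;	/* number of active readers */
	int warnings;	/* how many imbalances were detected */
};

/* Reader entry: take a reference. */
static void get_online(struct hotplug_state *s)
{
	s->refcount++;
}

/*
 * Reader exit: drop a reference. An unmatched call would push the count
 * negative, so warn, leave the count at zero, and keep running.
 */
static void put_online(struct hotplug_state *s)
{
	if (s->refcount <= 0) {
		fprintf(stderr, "unbalanced put_online(): refcount=%d\n",
			s->refcount);
		s->warnings++;
		s->refcount = 0;	/* fix up so later readers still work */
		return;
	}
	s->refcount--;
}
```

Clamping at zero is one possible fix-up; the point is only that the imbalance is reported once per bad call instead of silently corrupting the reader/writer exclusion.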
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      075663d1
    • C
      Kconfig: clean up the "#if defined(arch)" list for exception-trace sysctl entry · 7ac57a89
      Catalin Marinas committed
      Introduce SYSCTL_EXCEPTION_TRACE config option and select it in the
      architectures requiring support for the "exception-trace" debug_table
      entry in kernel/sysctl.c.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7ac57a89