1. 09 9月, 2014 1 次提交
  2. 02 9月, 2014 1 次提交
  3. 30 7月, 2014 1 次提交
  4. 28 7月, 2014 1 次提交
  5. 26 7月, 2014 2 次提交
  6. 24 7月, 2014 1 次提交
    • E
      CAPABILITIES: remove undefined caps from all processes · 7d8b6c63
      Eric Paris 提交于
      This is effectively a revert of 7b9a7ec5
      plus fixing it a different way...
      
      We found, when trying to run an application from an application which
      had dropped privs that the kernel does security checks on undefined
      capability bits.  This was ESPECIALLY difficult to debug as those
      undefined bits are hidden from /proc/$PID/status.
      
      Consider a root application which drops all capabilities from ALL 4
      capability sets.  We assume, since the application is going to set
      eff/perm/inh from an array that it will clear not only the defined caps
      less than CAP_LAST_CAP, but also the higher 28ish bits which are
      undefined future capabilities.
      
      The BSET gets cleared differently.  Instead it is cleared one bit at a
      time.  The problem here is that in security/commoncap.c::cap_task_prctl()
      we actually check the validity of a capability being read.  So any task
      which attempts to 'read all things set in bset' followed by 'unset all
      things set in bset' will not even attempt to unset the undefined bits
      higher than CAP_LAST_CAP.
      
      So the 'parent' will look something like:
      CapInh:	0000000000000000
      CapPrm:	0000000000000000
      CapEff:	0000000000000000
      CapBnd:	ffffffc000000000
      
      All of this 'should' be fine.  Given that these are undefined bits that
      aren't supposed to have anything to do with permissions.  But they do...
      
      So lets now consider a task which cleared the eff/perm/inh completely
      and cleared all of the valid caps in the bset (but not the invalid caps
      it couldn't read out of the kernel).  We know that this is exactly what
      the libcap-ng library does and what the go capabilities library does.
      They both leave you in that above situation if you try to clear all of
      you capapabilities from all 4 sets.  If that root task calls execve()
      the child task will pick up all caps not blocked by the bset.  The bset
      however does not block bits higher than CAP_LAST_CAP.  So now the child
      task has bits in eff which are not in the parent.  These are
      'meaningless' undefined bits, but still bits which the parent doesn't
      have.
      
      The problem is now in cred_cap_issubset() (or any operation which does a
      subset test) as the child, while a subset for valid cap bits, is not a
      subset for invalid cap bits!  So now we set durring commit creds that
      the child is not dumpable.  Given it is 'more priv' than its parent.  It
      also means the parent cannot ptrace the child and other stupidity.
      
      The solution here:
      1) stop hiding capability bits in status
      	This makes debugging easier!
      
      2) stop giving any task undefined capability bits.  it's simple, it you
      don't put those invalid bits in CAP_FULL_SET you won't get them in init
      and you won't get them in any other task either.
      	This fixes the cap_issubset() tests and resulting fallout (which
      	made the init task in a docker container untraceable among other
      	things)
      
      3) mask out undefined bits when sys_capset() is called as it might use
      ~0, ~0 to denote 'all capabilities' for backward/forward compatibility.
      	This lets 'capsh --caps="all=eip" -- -c /bin/bash' run.
      
      4) mask out undefined bit when we read a file capability off of disk as
      again likely all bits are set in the xattr for forward/backward
      compatibility.
      	This lets 'setcap all+pe /bin/bash; /bin/bash' run
      Signed-off-by: NEric Paris <eparis@redhat.com>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      Cc: Andrew Vagin <avagin@openvz.org>
      Cc: Andrew G. Morgan <morgan@kernel.org>
      Cc: Serge E. Hallyn <serge.hallyn@canonical.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Steve Grubb <sgrubb@redhat.com>
      Cc: Dan Walsh <dwalsh@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJames Morris <james.l.morris@oracle.com>
      7d8b6c63
  7. 23 7月, 2014 2 次提交
  8. 19 7月, 2014 5 次提交
    • K
      seccomp: implement SECCOMP_FILTER_FLAG_TSYNC · c2e1f2e3
      Kees Cook 提交于
      Applying restrictive seccomp filter programs to large or diverse
      codebases often requires handling threads which may be started early in
      the process lifetime (e.g., by code that is linked in). While it is
      possible to apply permissive programs prior to process start up, it is
      difficult to further restrict the kernel ABI to those threads after that
      point.
      
      This change adds a new seccomp syscall flag to SECCOMP_SET_MODE_FILTER for
      synchronizing thread group seccomp filters at filter installation time.
      
      When calling seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC,
      filter) an attempt will be made to synchronize all threads in current's
      threadgroup to its new seccomp filter program. This is possible iff all
      threads are using a filter that is an ancestor to the filter current is
      attempting to synchronize to. NULL filters (where the task is running as
      SECCOMP_MODE_NONE) are also treated as ancestors allowing threads to be
      transitioned into SECCOMP_MODE_FILTER. If prctrl(PR_SET_NO_NEW_PRIVS,
      ...) has been set on the calling thread, no_new_privs will be set for
      all synchronized threads too. On success, 0 is returned. On failure,
      the pid of one of the failing threads will be returned and no filters
      will have been applied.
      
      The race conditions against another thread are:
      - requesting TSYNC (already handled by sighand lock)
      - performing a clone (already handled by sighand lock)
      - changing its filter (already handled by sighand lock)
      - calling exec (handled by cred_guard_mutex)
      The clone case is assisted by the fact that new threads will have their
      seccomp state duplicated from their parent before appearing on the tasklist.
      
      Holding cred_guard_mutex means that seccomp filters cannot be assigned
      while in the middle of another thread's exec (potentially bypassing
      no_new_privs or similar). The call to de_thread() may kill threads waiting
      for the mutex.
      
      Changes across threads to the filter pointer includes a barrier.
      
      Based on patches by Will Drewry.
      Suggested-by: NJulien Tinnes <jln@chromium.org>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NOleg Nesterov <oleg@redhat.com>
      Reviewed-by: NAndy Lutomirski <luto@amacapital.net>
      c2e1f2e3
    • K
      seccomp: introduce writer locking · dbd95212
      Kees Cook 提交于
      Normally, task_struct.seccomp.filter is only ever read or modified by
      the task that owns it (current). This property aids in fast access
      during system call filtering as read access is lockless.
      
      Updating the pointer from another task, however, opens up race
      conditions. To allow cross-thread filter pointer updates, writes to the
      seccomp fields are now protected by the sighand spinlock (which is shared
      by all threads in the thread group). Read access remains lockless because
      pointer updates themselves are atomic.  However, writes (or cloning)
      often entail additional checking (like maximum instruction counts)
      which require locking to perform safely.
      
      In the case of cloning threads, the child is invisible to the system
      until it enters the task list. To make sure a child can't be cloned from
      a thread and left in a prior state, seccomp duplication is additionally
      moved under the sighand lock. Then parent and child are certain have
      the same seccomp state when they exit the lock.
      
      Based on patches by Will Drewry and David Drysdale.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NOleg Nesterov <oleg@redhat.com>
      Reviewed-by: NAndy Lutomirski <luto@amacapital.net>
      dbd95212
    • K
      sched: move no_new_privs into new atomic flags · 1d4457f9
      Kees Cook 提交于
      Since seccomp transitions between threads requires updates to the
      no_new_privs flag to be atomic, the flag must be part of an atomic flag
      set. This moves the nnp flag into a separate task field, and introduces
      accessors.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NOleg Nesterov <oleg@redhat.com>
      Reviewed-by: NAndy Lutomirski <luto@amacapital.net>
      1d4457f9
    • K
      seccomp: add "seccomp" syscall · 48dc92b9
      Kees Cook 提交于
      This adds the new "seccomp" syscall with both an "operation" and "flags"
      parameter for future expansion. The third argument is a pointer value,
      used with the SECCOMP_SET_MODE_FILTER operation. Currently, flags must
      be 0. This is functionally equivalent to prctl(PR_SET_SECCOMP, ...).
      
      In addition to the TSYNC flag later in this patch series, there is a
      non-zero chance that this syscall could be used for configuring a fixed
      argument area for seccomp-tracer-aware processes to pass syscall arguments
      in the future. Hence, the use of "seccomp" not simply "seccomp_add_filter"
      for this syscall. Additionally, this syscall uses operation, flags,
      and user pointer for arguments because strictly passing arguments via
      a user pointer would mean seccomp itself would be unable to trivially
      filter the seccomp syscall itself.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NOleg Nesterov <oleg@redhat.com>
      Reviewed-by: NAndy Lutomirski <luto@amacapital.net>
      48dc92b9
    • D
      KEYS: Provide a generic instantiation function · 6a09d17b
      David Howells 提交于
      Provide a generic instantiation function for key types that use the preparse
      hook.  This makes it easier to prereserve key quota before keyrings get locked
      to retain the new key.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NSteve Dickson <steved@redhat.com>
      Acked-by: NJeff Layton <jlayton@primarydata.com>
      Reviewed-by: NSage Weil <sage@redhat.com>
      6a09d17b
  9. 18 7月, 2014 1 次提交
    • D
      KEYS: Allow special keys (eg. DNS results) to be invalidated by CAP_SYS_ADMIN · 0c7774ab
      David Howells 提交于
      Special kernel keys, such as those used to hold DNS results for AFS, CIFS and
      NFS and those used to hold idmapper results for NFS, used to be
      'invalidateable' with key_revoke().  However, since the default permissions for
      keys were reduced:
      
      	Commit: 96b5c8fe
      	KEYS: Reduce initial permissions on keys
      
      it has become impossible to do this.
      
      Add a key flag (KEY_FLAG_ROOT_CAN_INVAL) that will permit a key to be
      invalidated by root.  This should not be used for system keyrings as the
      garbage collector will try and remove any invalidate key.  For system keyrings,
      KEY_FLAG_ROOT_CAN_CLEAR can be used instead.
      
      After this, from userspace, keyctl_invalidate() and "keyctl invalidate" can be
      used by any possessor of CAP_SYS_ADMIN (typically root) to invalidate DNS and
      idmapper keys.  Invalidated keys are immediately garbage collected and will be
      immediately rerequested if needed again.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Tested-by: NSteve Dickson <steved@redhat.com>
      0c7774ab
  10. 17 7月, 2014 1 次提交
    • D
      KEYS: validate certificate trust only with builtin keys · 32c4741c
      Dmitry Kasatkin 提交于
      Instead of allowing public keys, with certificates signed by any
      key on the system trusted keyring, to be added to a trusted keyring,
      this patch further restricts the certificates to those signed only by
      builtin keys on the system keyring.
      
      This patch defines a new option 'builtin' for the kernel parameter
      'keys_ownerid' to allow trust validation using builtin keys.
      
      Simplified Mimi's "KEYS: define an owner trusted keyring" patch
      
      Changelog v7:
      - rename builtin_keys to use_builtin_keys
      Signed-off-by: NDmitry Kasatkin <d.kasatkin@samsung.com>
      Signed-off-by: NMimi Zohar <zohar@linux.vnet.ibm.com>
      32c4741c
  11. 10 7月, 2014 1 次提交
    • P
      selinux: fix the default socket labeling in sock_graft() · 4da6daf4
      Paul Moore 提交于
      The sock_graft() hook has special handling for AF_INET, AF_INET, and
      AF_UNIX sockets as those address families have special hooks which
      label the sock before it is attached its associated socket.
      Unfortunately, the sock_graft() hook was missing a default approach
      to labeling sockets which meant that any other address family which
      made use of connections or the accept() syscall would find the
      returned socket to be in an "unlabeled" state.  This was recently
      demonstrated by the kcrypto/AF_ALG subsystem and the newly released
      cryptsetup package (cryptsetup v1.6.5 and later).
      
      This patch preserves the special handling in selinux_sock_graft(),
      but adds a default behavior - setting the sock's label equal to the
      associated socket - which resolves the problem with AF_ALG and
      presumably any other address family which makes use of accept().
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaul Moore <pmoore@redhat.com>
      Tested-by: NMilan Broz <gmazyland@gmail.com>
      4da6daf4
  12. 09 7月, 2014 3 次提交
  13. 08 7月, 2014 1 次提交
  14. 04 7月, 2014 1 次提交
    • T
      ptrace,x86: force IRET path after a ptrace_stop() · b9cd18de
      Tejun Heo 提交于
      The 'sysret' fastpath does not correctly restore even all regular
      registers, much less any segment registers or reflags values.  That is
      very much part of why it's faster than 'iret'.
      
      Normally that isn't a problem, because the normal ptrace() interface
      catches the process using the signal handler infrastructure, which
      always returns with an iret.
      
      However, some paths can get caught using ptrace_event() instead of the
      signal path, and for those we need to make sure that we aren't going to
      return to user space using 'sysret'.  Otherwise the modifications that
      may have been done to the register set by the tracer wouldn't
      necessarily take effect.
      
      Fix it by forcing IRET path by setting TIF_NOTIFY_RESUME from
      arch_ptrace_stop_needed() which is invoked from ptrace_stop().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NAndy Lutomirski <luto@amacapital.net>
      Acked-by: NOleg Nesterov <oleg@redhat.com>
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b9cd18de
  15. 03 7月, 2014 1 次提交
    • T
      kernfs: kernfs_notify() must be useable from non-sleepable contexts · ecca47ce
      Tejun Heo 提交于
      d911d987 ("kernfs: make kernfs_notify() trigger inotify events
      too") added fsnotify triggering to kernfs_notify() which requires a
      sleepable context.  There are already existing users of
      kernfs_notify() which invoke it from an atomic context and in general
      it's silly to require a sleepable context for triggering a
      notification.
      
      The following is an invalid context bug triggerd by md invoking
      sysfs_notify() from IO completion path.
      
       BUG: sleeping function called from invalid context at kernel/locking/mutex.c:586
       in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
       2 locks held by swapper/1/0:
        #0:  (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>] virtblk_done+0x42/0xe0 [virtio_blk]
        #1:  (&(&bitmap->counts.lock)->rlock){-.....}, at: [<ffffffff81633718>] bitmap_endwrite+0x68/0x240
       irq event stamp: 33518
       hardirqs last  enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230
       hardirqs last disabled at (33516): [<ffffffff818122ed>] common_interrupt+0x6d/0x72
       softirqs last  enabled at (33518): [<ffffffff810a1272>] _local_bh_enable+0x22/0x50
       softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80
       CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1
       Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        0000000000000000 f90db13964f4ee05 ffff88007d403b80 ffffffff81807b4c
        0000000000000000 ffff88007d403ba8 ffffffff810d4f14 0000000000000000
        0000000000441800 ffff880078fa1780 ffff88007d403c38 ffffffff8180caf2
       Call Trace:
        <IRQ>  [<ffffffff81807b4c>] dump_stack+0x4d/0x66
        [<ffffffff810d4f14>] __might_sleep+0x184/0x240
        [<ffffffff8180caf2>] mutex_lock_nested+0x42/0x440
        [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
        [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240
        [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1]
        [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1]
        [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1]
        [<ffffffff813acb8b>] bio_endio+0x6b/0xa0
        [<ffffffff813b46c4>] blk_update_request+0x94/0x420
        [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70
        [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk]
        [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120
        [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30
        [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk]
        [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring]
        [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340
        [<ffffffff8111647d>] handle_irq_event+0x3d/0x60
        [<ffffffff81119436>] handle_edge_irq+0x66/0x130
        [<ffffffff8101c3e4>] handle_irq+0x84/0x150
        [<ffffffff818146ad>] do_IRQ+0x4d/0xe0
        [<ffffffff818122f2>] common_interrupt+0x72/0x72
        <EOI>  [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10
        [<ffffffff81025454>] default_idle+0x24/0x230
        [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20
        [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0
        [<ffffffff8104df1b>] start_secondary+0x25b/0x300
      
      This patch fixes it by punting the notification delivery through a
      work item.  This ends up adding an extra pointer to kernfs_elem_attr
      enlarging kernfs_node by a pointer, which is not ideal but not a very
      big deal either.  If this turns out to be an actual issue, we can move
      kernfs_elem_attr->size to kernfs_node->iattr later.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NJosh Boyer <jwboyer@fedoraproject.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Reviewed-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ecca47ce
  16. 02 7月, 2014 1 次提交
    • Z
      core: fix typo in percpu read_mostly section · 330d2822
      Zhengyu He 提交于
      This fixes a typo that named the read_mostly section of percpu as
      readmostly. It works fine with SMP because the linker script specifies
      .data..percpu..readmostly. However, UP kernel builds don't have percpu
      sections defined and the non-percpu version of the section is called
      data..read_mostly, so .data..readmostly will float around and may break
      things unexpectedly.
      
      Looking at the original change that introduced data..percpu..readmostly
      (commit c957ef2c), it looks like this
      was the original intention.
      
      Tested: Built UP kernel and confirmed the sections got merged.
      
      - Before the patch:
      $ objdump -h vmlinux.o  | grep '\.data\.\.read.*mostly'
      38 .data..read_mostly 00004418  0000000000000000  0000000000000000  00431ac0  2**6
      50 .data..readmostly 00000014  0000000000000000  0000000000000000  00444000  2**3
      
      - After the patch:
      $ objdump -h vmlinux.o  | grep '\.data\.\.read.*mostly'
      38 .data..read_mostly 00004438  0000000000000000  0000000000000000  00431ac0  2**6
      Signed-off-by: NZhengyu He <hzy@google.com>
      Signed-off-by: NFilipe Brandenburger <filbranden@google.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      330d2822
  17. 01 7月, 2014 1 次提交
  18. 30 6月, 2014 1 次提交
  19. 28 6月, 2014 1 次提交
  20. 27 6月, 2014 1 次提交
    • A
      Fix 32-bit regression in block device read(2) · 0b86dbf6
      Al Viro 提交于
      blkdev_read_iter() wants to cap the iov_iter by the amount of data
      remaining to the end of device.  That's what iov_iter_truncate() is for
      (trim iter->count if it's above the given limit).  So far, so good, but
      the argument of iov_iter_truncate() is size_t, so on 32bit boxen (in
      case of a large device) we end up with that upper limit truncated down
      to 32 bits *before* comparing it with iter->count.
      
      Easily fixed by making iov_iter_truncate() take 64bit argument - it does
      the right thing after such change (we only reach the assignment in there
      when the current value of iter->count is greater than the limit, i.e.
      for anything that would get truncated we don't reach the assignment at
      all) and that argument is not the new value of iter->count - it's an
      upper limit for such.
      
      The overhead of passing u64 is not an issue - the thing is inlined, so
      callers passing size_t won't pay any penalty.
      Reported-and-tested-by: NTheodore Tso <tytso@mit.edu>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Tested-by: NAlan Cox <gnomes@lxorguk.ukuu.org.uk>
      Tested-by: NBruno Wolff III <bruno@wolff.to>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0b86dbf6
  21. 25 6月, 2014 2 次提交
  22. 24 6月, 2014 3 次提交
    • A
      kernel/watchdog.c: print traces for all cpus on lockup detection · ed235875
      Aaron Tomlin 提交于
      A 'softlockup' is defined as a bug that causes the kernel to loop in
      kernel mode for more than a predefined period to time, without giving
      other tasks a chance to run.
      
      Currently, upon detection of this condition by the per-cpu watchdog
      task, debug information (including a stack trace) is sent to the system
      log.
      
      On some occasions, we have observed that the "victim" rather than the
      actual "culprit" (i.e.  the owner/holder of the contended resource) is
      reported to the user.  Often this information has proven to be
      insufficient to assist debugging efforts.
      
      To avoid loss of useful debug information, for architectures which
      support NMI, this patch makes it possible to improve soft lockup
      reporting.  This is accomplished by issuing an NMI to each cpu to obtain
      a stack trace.
      
      If NMI is not supported we just revert back to the old method.  A sysctl
      and boot-time parameter is available to toggle this feature.
      
      [dzickus@redhat.com: add CONFIG_SMP in certain areas]
      [akpm@linux-foundation.org: additional CONFIG_SMP=n optimisations]
      [mq@suse.cz: fix warning]
      Signed-off-by: NAaron Tomlin <atomlin@redhat.com>
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Mateusz Guzik <mguzik@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: NJan Moskyto Matejka <mq@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ed235875
    • A
      nmi: provide the option to issue an NMI back trace to every cpu but current · f3aca3d0
      Aaron Tomlin 提交于
      Sometimes it is preferred not to use the trigger_all_cpu_backtrace()
      routine when one wants to avoid capturing a back trace for current.  For
      instance if one was previously captured recently.
      
      This patch provides a new routine namely
      trigger_allbutself_cpu_backtrace() which offers the flexibility to issue
      an NMI to every cpu but current and capture a back trace accordingly.
      
      Patch x86 and sparc to support new routine.
      
      [dzickus@redhat.com: add stub in #else clause]
      [dzickus@redhat.com: don't print message in single processor case, wrap with get/put_cpu based on Oleg's suggestion]
      [sfr@canb.auug.org.au: undo C99ism]
      Signed-off-by: NAaron Tomlin <atomlin@redhat.com>
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Cc: Mateusz Guzik <mguzik@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f3aca3d0
    • P
      kexec: save PG_head_mask in VMCOREINFO · b3acc56b
      Petr Tesarik 提交于
      To allow filtering of huge pages, makedumpfile must be able to identify
      them in the dump.  This can be done by checking the appropriate page
      flag, so communicate its value to makedumpfile through the VMCOREINFO
      interface.
      
      There's only one small catch.  Depending on how many page flags are
      available on a given architecture, this bit can be called PG_head or
      PG_compound.
      
      I sent a similar patch back in 2012, but Eric Biederman did not like
      using an #ifdef.  So, this time I'm adding a common symbol
      (PG_head_mask) instead.
      
      See https://lkml.org/lkml/2012/11/28/91 for the previous version.
      Signed-off-by: NPetr Tesarik <ptesarik@suse.cz>
      Acked-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b3acc56b
  23. 23 6月, 2014 1 次提交
  24. 22 6月, 2014 1 次提交
  25. 18 6月, 2014 2 次提交
  26. 17 6月, 2014 1 次提交
  27. 15 6月, 2014 2 次提交