1. 04 Dec, 2018 (1 commit)
    • Revert "exec: make de_thread() freezable" · a72173ec
      Committed by Rafael J. Wysocki
      Revert commit c2239788 "exec: make de_thread() freezable" as
      requested by Ingo Molnar:
      
      "So there's a new regression in v4.20-rc4, my desktop produces this
      lockdep splat:
      
      [ 1772.588771] WARNING: pkexec/4633 still has locks held!
      [ 1772.588773] 4.20.0-rc4-custom-00213-g93a49841322b #1 Not tainted
      [ 1772.588775] ------------------------------------
      [ 1772.588776] 1 lock held by pkexec/4633:
      [ 1772.588778]  #0: 00000000ed85fbf8 (&sig->cred_guard_mutex){+.+.}, at: prepare_bprm_creds+0x2a/0x70
      [ 1772.588786] stack backtrace:
      [ 1772.588789] CPU: 7 PID: 4633 Comm: pkexec Not tainted 4.20.0-rc4-custom-00213-g93a49841322b #1
      [ 1772.588792] Call Trace:
      [ 1772.588800]  dump_stack+0x85/0xcb
      [ 1772.588803]  flush_old_exec+0x116/0x890
      [ 1772.588807]  ? load_elf_phdrs+0x72/0xb0
      [ 1772.588809]  load_elf_binary+0x291/0x1620
      [ 1772.588815]  ? sched_clock+0x5/0x10
      [ 1772.588817]  ? search_binary_handler+0x6d/0x240
      [ 1772.588820]  search_binary_handler+0x80/0x240
      [ 1772.588823]  load_script+0x201/0x220
      [ 1772.588825]  search_binary_handler+0x80/0x240
      [ 1772.588828]  __do_execve_file.isra.32+0x7d2/0xa60
      [ 1772.588832]  ? strncpy_from_user+0x40/0x180
      [ 1772.588835]  __x64_sys_execve+0x34/0x40
      [ 1772.588838]  do_syscall_64+0x60/0x1c0
      
      The warning gets triggered by an ancient lockdep check in the freezer:
      
      (gdb) list *0xffffffff812ece06
      0xffffffff812ece06 is in flush_old_exec (./include/linux/freezer.h:57).
      52	 * DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION
      53	 * If try_to_freeze causes a lockdep warning it means the caller may deadlock
      54	 */
      55	static inline bool try_to_freeze_unsafe(void)
      56	{
      57		might_sleep();
      58		if (likely(!freezing(current)))
      59			return false;
      60		return __refrigerator(false);
      61	}
      
      I reviewed the ->cred_guard_mutex code, and the mutex is held across all
      of exec() - and we always did this.
      
      But there's this recent -rc4 commit:
      
      > Chanho Min (1):
      >       exec: make de_thread() freezable
      
        c2239788: exec: make de_thread() freezable
      
      I believe this commit is bogus, you cannot call try_to_freeze() from
      de_thread(), because it's holding the ->cred_guard_mutex."
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Tested-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      a72173ec
  2. 19 Nov, 2018 (1 commit)
    • exec: make de_thread() freezable · c2239788
      Committed by Chanho Min
      Suspend fails because the exec family of functions blocks the freezer.
      The cause is that de_thread() sleeps in TASK_UNINTERRUPTIBLE waiting for
      all sub-threads to die, and we deadlock if one of them is frozen. The
      same can occur while schedule() waits for the group thread leader to
      exit, if the leader is frozen.
      
      On our machine, this causes a freeze timeout, as shown below.
      
      Freezing of tasks failed after 20.010 seconds (1 tasks refusing to freeze, wq_busy=0):
      setcpushares-ls D ffffffc00008ed70     0  5817   1483 0x0040000d
       Call trace:
      [<ffffffc00008ed70>] __switch_to+0x88/0xa0
      [<ffffffc000d1c30c>] __schedule+0x1bc/0x720
      [<ffffffc000d1ca90>] schedule+0x40/0xa8
      [<ffffffc0001cd784>] flush_old_exec+0xdc/0x640
      [<ffffffc000220360>] load_elf_binary+0x2a8/0x1090
      [<ffffffc0001ccff4>] search_binary_handler+0x9c/0x240
      [<ffffffc00021c584>] load_script+0x20c/0x228
      [<ffffffc0001ccff4>] search_binary_handler+0x9c/0x240
      [<ffffffc0001ce8e0>] do_execveat_common.isra.14+0x4f8/0x6e8
      [<ffffffc0001cedd0>] compat_SyS_execve+0x38/0x48
      [<ffffffc00008de30>] el0_svc_naked+0x24/0x28
      
      To fix this, make de_thread() freezable. It looks safe and works fine.
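      
      A minimal sketch of the change's shape, assuming it uses the
      freezable_schedule() helper from include/linux/freezer.h (see the revert
      in entry 1 above for why this later proved unsafe under cred_guard_mutex):
      
      	/* de_thread(): let the uninterruptible wait participate in freezing */
      	while (sig->notify_count) {
      		__set_current_state(TASK_KILLABLE);
      		spin_unlock_irq(lock);
      		freezable_schedule();	/* was: schedule() */
      		if (unlikely(__fatal_signal_pending(tsk)))
      			goto killed;
      		spin_lock_irq(lock);
      	}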
      Suggested-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Chanho Min <chanho.min@lge.com>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      c2239788
  3. 11 Oct, 2018 (1 commit)
    • vfs: require i_size <= SIZE_MAX in kernel_read_file() · 691115c3
      Committed by Eric Biggers
      On 32-bit systems, the buffer allocated by kernel_read_file() is too
      small if the file size is > SIZE_MAX, due to truncation to size_t.
      
      Fortunately, since the 'count' argument to kernel_read() is also
      truncated to size_t, only the allocated space is filled; then, -EIO is
      returned since 'pos != i_size' after the read loop.
      
      But this is not obvious and seems incidental.  We should be more
      explicit about this case.  So, fail early if i_size > SIZE_MAX.
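      
      A minimal sketch of the early check described here (the surrounding code
      and exact error codes are assumptions based on this description):
      
      	loff_t i_size = i_size_read(file_inode(file));
      
      	if (i_size <= 0)
      		return -EINVAL;
      	/* Fail early rather than relying on the read loop's pos check. */
      	if (i_size > SIZE_MAX)
      		return -EFBIG;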
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
      691115c3
  4. 27 Jul, 2018 (1 commit)
    • mm: fix vma_is_anonymous() false-positives · bfd40eaf
      Committed by Kirill A. Shutemov
      vma_is_anonymous() relies on ->vm_ops being NULL to detect an anonymous
      VMA.  This is unreliable, as ->mmap may not set ->vm_ops.
      
      False-positive vma_is_anonymous() may lead to crashes:
      
      	next ffff8801ce5e7040 prev ffff8801d20eca50 mm ffff88019c1e13c0
      	prot 27 anon_vma ffff88019680cdd8 vm_ops 0000000000000000
      	pgoff 0 file ffff8801b2ec2d00 private_data 0000000000000000
      	flags: 0xff(read|write|exec|shared|mayread|maywrite|mayexec|mayshare)
      	------------[ cut here ]------------
      	kernel BUG at mm/memory.c:1422!
      	invalid opcode: 0000 [#1] SMP KASAN
      	CPU: 0 PID: 18486 Comm: syz-executor3 Not tainted 4.18.0-rc3+ #136
      	Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
      	01/01/2011
      	RIP: 0010:zap_pmd_range mm/memory.c:1421 [inline]
      	RIP: 0010:zap_pud_range mm/memory.c:1466 [inline]
      	RIP: 0010:zap_p4d_range mm/memory.c:1487 [inline]
      	RIP: 0010:unmap_page_range+0x1c18/0x2220 mm/memory.c:1508
      	Call Trace:
      	 unmap_single_vma+0x1a0/0x310 mm/memory.c:1553
      	 zap_page_range_single+0x3cc/0x580 mm/memory.c:1644
      	 unmap_mapping_range_vma mm/memory.c:2792 [inline]
      	 unmap_mapping_range_tree mm/memory.c:2813 [inline]
      	 unmap_mapping_pages+0x3a7/0x5b0 mm/memory.c:2845
      	 unmap_mapping_range+0x48/0x60 mm/memory.c:2880
      	 truncate_pagecache+0x54/0x90 mm/truncate.c:800
      	 truncate_setsize+0x70/0xb0 mm/truncate.c:826
      	 simple_setattr+0xe9/0x110 fs/libfs.c:409
      	 notify_change+0xf13/0x10f0 fs/attr.c:335
      	 do_truncate+0x1ac/0x2b0 fs/open.c:63
      	 do_sys_ftruncate+0x492/0x560 fs/open.c:205
      	 __do_sys_ftruncate fs/open.c:215 [inline]
      	 __se_sys_ftruncate fs/open.c:213 [inline]
      	 __x64_sys_ftruncate+0x59/0x80 fs/open.c:213
      	 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
      	 entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Reproducer:
      
      	#include <stdio.h>
      	#include <stddef.h>
      	#include <stdint.h>
      	#include <stdlib.h>
      	#include <string.h>
      	#include <sys/types.h>
      	#include <sys/stat.h>
      	#include <sys/ioctl.h>
      	#include <sys/mman.h>
      	#include <unistd.h>
      	#include <fcntl.h>
      
      	#define KCOV_INIT_TRACE			_IOR('c', 1, unsigned long)
      	#define KCOV_ENABLE			_IO('c', 100)
      	#define KCOV_DISABLE			_IO('c', 101)
      	#define COVER_SIZE			(1024<<10)
      
      	#define KCOV_TRACE_PC  0
      	#define KCOV_TRACE_CMP 1
      
      	int main(int argc, char **argv)
      	{
      		int fd;
      		unsigned long *cover;
      
      		system("mount -t debugfs none /sys/kernel/debug");
      		fd = open("/sys/kernel/debug/kcov", O_RDWR);
      		ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE);
      		cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long),
      				PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
      		munmap(cover, COVER_SIZE * sizeof(unsigned long));
      		cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long),
      				PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
      		memset(cover, 0, COVER_SIZE * sizeof(unsigned long));
      		ftruncate(fd, 3UL << 20);
      		return 0;
      	}
      
      This can be fixed by assigning anonymous VMAs own vm_ops and not relying
      on it being NULL.
      
      If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
      dummy_vm_ops.  This way we will have non-NULL ->vm_ops for all VMAs.
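      
      A sketch of the resulting helpers (shapes follow this description; the
      helper names other than dummy_vm_ops are assumptions):
      
      	static const struct vm_operations_struct dummy_vm_ops = {};
      
      	static inline void vma_init(struct vm_area_struct *vma,
      				    struct mm_struct *mm)
      	{
      		vma->vm_mm = mm;
      		vma->vm_ops = &dummy_vm_ops;	/* non-NULL by default */
      		INIT_LIST_HEAD(&vma->anon_vma_chain);
      	}
      
      	static inline void vma_set_anonymous(struct vm_area_struct *vma)
      	{
      		vma->vm_ops = NULL;	/* vma_is_anonymous() keys off this */
      	}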
      
      Link: http://lkml.kernel.org/r/20180724121139.62570-4-kirill.shutemov@linux.intel.com
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reported-by: syzbot+3f84280d52be9b7083cc@syzkaller.appspotmail.com
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bfd40eaf
  5. 22 Jul, 2018 (2 commits)
    • mm: make vm_area_alloc() initialize core fields · 490fc053
      Committed by Linus Torvalds
      Like vm_area_dup(), it initializes the anon_vma_chain head, and the
      basic mm pointer.
      
      The rest of the fields end up being different for different users,
      although the plan is to also initialize the 'vm_ops' field to a dummy
      entry.
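      
      A minimal sketch matching this description (the allocation is zeroed,
      then the core fields are set):
      
      	struct vm_area_struct *vm_area_alloc(struct mm_struct *mm)
      	{
      		struct vm_area_struct *vma;
      
      		vma = kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL);
      		if (vma) {
      			vma->vm_mm = mm;
      			INIT_LIST_HEAD(&vma->anon_vma_chain);
      		}
      		return vma;
      	}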
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      490fc053
    • mm: use helper functions for allocating and freeing vm_area structs · 3928d4f5
      Committed by Linus Torvalds
      The vm_area_struct is one of the most fundamental memory management
      objects, but its management is entirely open-coded everywhere,
      ranging from allocation and freeing (using kmem_cache_[z]alloc and
      kmem_cache_free) to initializing all the fields.
      
      We want to unify this in order to end up having some unified
      initialization of the vmas, and the first step to this is to at least
      have basic allocation functions.
      
      Right now those functions are literally just wrappers around the
      kmem_cache_*() calls.  This is a purely mechanical conversion:
      
          # new vma:
          kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL) -> vm_area_alloc()
      
          # copy old vma
          kmem_cache_alloc(vm_area_cachep, GFP_KERNEL) -> vm_area_dup(old)
      
          # free vma
          kmem_cache_free(vm_area_cachep, vma) -> vm_area_free(vma)
      
      to the point where the old vma passed in to the vm_area_dup() function
      isn't even used yet (because I've left all the old manual initialization
      alone).
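      
      A sketch of two of the purely mechanical wrappers described above; per
      the note, vm_area_dup() accepts the old vma but does not use it yet:
      
      	struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
      	{
      		/* orig is deliberately unused at this step */
      		return kmem_cache_alloc(vm_area_cachep, GFP_KERNEL);
      	}
      
      	void vm_area_free(struct vm_area_struct *vma)
      	{
      		kmem_cache_free(vm_area_cachep, vma);
      	}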
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3928d4f5
  6. 21 Jul, 2018 (1 commit)
    • pid: Implement PIDTYPE_TGID · 6883f81a
      Committed by Eric W. Biederman
      Everywhere except in the pid array we distinguish between a task's pid and
      a task's tgid (thread group id).  Even in the enumeration we want that
      distinction sometimes, so we have added __PIDTYPE_TGID.  With leader_pid
      we almost have an implementation of PIDTYPE_TGID in struct signal_struct.
      
      Add PIDTYPE_TGID as a first class member of the pid_type enumeration and
      into the pids array.  Then remove the __PIDTYPE_TGID special case and the
      leader_pid in signal_struct.
      
      The net size increase is just an extra pointer added to struct pid and
      an extra pair of pointers of an hlist_node added to task_struct.
      
      The effect on code maintenance is the removal of a number of special
      cases today and the potential to remove many more special cases as
      PIDTYPE_TGID gets used to its fullest.  The long term potential
      is allowing zombie thread group leaders to exit, which will remove
      a lot more special cases in the code.
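      
      For reference, a sketch of the enumeration after this change (order per
      the description; include/linux/pid.h remains authoritative):
      
      	enum pid_type {
      		PIDTYPE_PID,
      		PIDTYPE_TGID,	/* now a first-class member */
      		PIDTYPE_PGID,
      		PIDTYPE_SID,
      		PIDTYPE_MAX,
      	};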
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      6883f81a
  7. 06 Jun, 2018 (1 commit)
    • rseq: Introduce restartable sequences system call · d7822b1e
      Committed by Mathieu Desnoyers
      Expose a new system call allowing each thread to register one userspace
      memory area to be used as an ABI between kernel and user-space for two
      purposes: user-space restartable sequences and quick access to read the
      current CPU number value from user-space.
      
      * Restartable sequences (per-cpu atomics)
      
      Restartable sequences allow user-space to perform update operations on
      per-cpu data without requiring heavy-weight atomic operations.
      
      The restartable critical sections (percpu atomics) work has been started
      by Paul Turner and Andrew Hunter. It lets the kernel handle restart of
      critical sections. [1] [2] The re-implementation proposed here brings a
      few simplifications to the ABI which facilitates porting to other
      architectures and speeds up the user-space fast path.
      
      Here are benchmarks of various rseq use-cases.
      
      Test hardware:
      
      arm32: ARMv7 Processor rev 4 (v7l) "Cubietruck", 2-core
      x86-64: Intel E5-2630 v3@2.40GHz, 16-core, hyperthreading
      
      The following benchmarks were all performed on a single thread.
      
      * Per-CPU statistic counter increment
      
                      getcpu+atomic (ns/op)    rseq (ns/op)    speedup
      arm32:                344.0                 31.4          11.0
      x86-64:                15.3                  2.0           7.7
      
      * LTTng-UST: write event 32-bit header, 32-bit payload into tracer
                   per-cpu buffer
      
                      getcpu+atomic (ns/op)    rseq (ns/op)    speedup
      arm32:               2502.0                 2250.0         1.1
      x86-64:               117.4                   98.0         1.2
      
      * liburcu percpu: lock-unlock pair, dereference, read/compare word
      
                      getcpu+atomic (ns/op)    rseq (ns/op)    speedup
      arm32:                751.0                 128.5          5.8
      x86-64:                53.4                  28.6          1.9
      
      * jemalloc memory allocator adapted to use rseq
      
      Using rseq with per-cpu memory pools in jemalloc at Facebook (based on
      rseq 2016 implementation):
      
      The production workload response time shows a 1-2% gain in average
      latency, and the P99 overall latency drops by 2-3%.
      
      * Reading the current CPU number
      
      Speeding up reading the current CPU number on which the caller thread is
      running is done by keeping the current CPU number up to date within the
      cpu_id field of the memory area registered by the thread. This is done
      by making scheduler preemption set the TIF_NOTIFY_RESUME flag on the
      current thread. Upon return to user-space, a notify-resume handler
      updates the current CPU value within the registered user-space memory
      area. User-space can then read the current CPU number directly from
      memory.
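      
      A hedged user-space sketch of that fast path; the signature constant is
      application-chosen (an assumption here), and error handling is omitted:
      
      	#include <linux/rseq.h>
      	#include <sys/syscall.h>
      	#include <unistd.h>
      
      	#define MY_RSEQ_SIG	0x53053053	/* app-chosen signature */
      
      	static __thread struct rseq rseq_area __attribute__((aligned(32)));
      
      	static int rseq_register(void)
      	{
      		return syscall(__NR_rseq, &rseq_area, sizeof(rseq_area),
      			       0, MY_RSEQ_SIG);
      	}
      
      	static int rseq_current_cpu(void)
      	{
      		/* the kernel refreshes this on return to user-space */
      		return __atomic_load_n(&rseq_area.cpu_id, __ATOMIC_RELAXED);
      	}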
      
      Keeping the current cpu id in a memory area shared between kernel and
      user-space improves on the mechanisms currently available to read the
      current CPU number, with the following benefits over alternative
      approaches:
      
      - 35x speedup on ARM vs system call through glibc
      - 20x speedup on x86 compared to calling glibc, which calls vdso
        executing a "lsl" instruction,
      - 14x speedup on x86 compared to inlined "lsl" instruction,
      - Unlike vdso approaches, this cpu_id value can be read from an inline
        assembly, which makes it a useful building block for restartable
        sequences.
      - The approach of reading the cpu id through memory mapping shared
        between kernel and user-space is portable (e.g. ARM), which is not the
        case for the lsl-based x86 vdso.
      
      On x86, yet another possible approach would be to use the gs segment
      selector to point to user-space per-cpu data. This approach performs
      similarly to the cpu id cache, but it has two disadvantages: it is
      not portable, and it is incompatible with existing applications already
      using the gs segment selector for other purposes.
      
      Benchmarking various approaches for reading the current CPU number:
      
      ARMv7 Processor rev 4 (v7l)
      Machine model: Cubietruck
      - Baseline (empty loop):                                    8.4 ns
      - Read CPU from rseq cpu_id:                               16.7 ns
      - Read CPU from rseq cpu_id (lazy register):               19.8 ns
      - glibc 2.19-0ubuntu6.6 getcpu:                           301.8 ns
      - getcpu system call:                                     234.9 ns
      
      x86-64 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz:
      - Baseline (empty loop):                                    0.8 ns
      - Read CPU from rseq cpu_id:                                0.8 ns
      - Read CPU from rseq cpu_id (lazy register):                0.8 ns
      - Read using gs segment selector:                           0.8 ns
      - "lsl" inline assembly:                                   13.0 ns
      - glibc 2.19-0ubuntu6 getcpu:                              16.6 ns
      - getcpu system call:                                      53.9 ns
      
      - Speed (benchmark taken on v8 of patchset)
      
      Running 10 runs of hackbench -l 100000 seems to indicate, contrary to
      expectations, that enabling CONFIG_RSEQ slightly accelerates the
      scheduler:
      
      Configuration: 2 sockets * 8-core Intel(R) Xeon(R) CPU E5-2630 v3 @
      2.40GHz (directly on hardware, hyperthreading disabled in BIOS, energy
      saving disabled in BIOS, turboboost disabled in BIOS, cpuidle.off=1
      kernel parameter), with a Linux v4.6 defconfig+localyesconfig,
      restartable sequences series applied.
      
      * CONFIG_RSEQ=n
      
      avg.:      41.37 s
      std.dev.:   0.36 s
      
      * CONFIG_RSEQ=y
      
      avg.:      40.46 s
      std.dev.:   0.33 s
      
      - Size
      
      On x86-64, between CONFIG_RSEQ=n/y, the text size increase of vmlinux is
      567 bytes, and the data size increase of vmlinux is 5696 bytes.
      
      [1] https://lwn.net/Articles/650333/
      [2] http://www.linuxplumbersconf.org/2013/ocw/system/presentations/1695/original/LPC%20-%20PerCpu%20Atomics.pdf
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Joel Fernandes <joelaf@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Watson <davejwatson@fb.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Cc: Chris Lameter <cl@linux.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Andrew Hunter <ahh@google.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ben Maurer <bmaurer@fb.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: linux-api@vger.kernel.org
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20151027235635.16059.11630.stgit@pjt-glaptop.roam.corp.google.com
      Link: http://lkml.kernel.org/r/20150624222609.6116.86035.stgit@kitami.mtv.corp.google.com
      Link: https://lkml.kernel.org/r/20180602124408.8430-3-mathieu.desnoyers@efficios.com
      d7822b1e
  8. 24 May, 2018 (1 commit)
    • umh: introduce fork_usermode_blob() helper · 449325b5
      Committed by Alexei Starovoitov
      Introduce helper:
      int fork_usermode_blob(void *data, size_t len, struct umh_info *info);
      struct umh_info {
             struct file *pipe_to_umh;
             struct file *pipe_from_umh;
             pid_t pid;
      };
      
      that GPLed kernel modules (signed or unsigned) can use to execute part
      of their own data as a swappable user-mode process.
      
      The kernel will do:
      - allocate a unique file in tmpfs
      - populate that file with [data, data + len] bytes
      - user-mode-helper code will do_execve that file and, before the process
        starts, the kernel will create two unix pipes for bidirectional
        communication between kernel module and umh
      - close tmpfs file, effectively deleting it
      - the fork_usermode_blob will return zero on success and populate
        'struct umh_info' with two unix pipes and the pid of the user process
      
      As the first step in the development of the bpfilter project
      the fork_usermode_blob() helper is introduced to allow user mode code
      to be invoked from a kernel module. The idea is that user mode code plus
      normal kernel module code are built as part of the kernel build
      and installed as traditional kernel module into distro specified location,
      such that from a distribution point of view, there is
      no difference between regular kernel modules and kernel modules + umh code.
      Such modules can be signed, modprobed, rmmod'ed, etc. The use of this new
      helper by a kernel module doesn't make it special from the kernel and
      user space tooling point of view.
      
      Such an approach enables the kernel to delegate functionality traditionally
      done by kernel modules to user space processes (either root or !root) and
      reduces the security attack surface of the new code. Buggy umh code would crash
      the user process, but not the kernel. Another advantage is that umh code
      of the kernel module can be debugged and tested out of user space
      (e.g. opening the possibility to run clang sanitizers, fuzzers or
      user space test suites on the umh code).
      In case of the bpfilter project such architecture allows complex control plane
      to be done in the user space while bpf based data plane stays in the kernel.
      
      Since the umh can crash, be oom-killed by the kernel, or be killed by the
      admin, the kernel module that uses it (like bpfilter) needs to manage the
      lifetime of the umh on its own via the two unix pipes and the umh's pid.
      
      The exit code of such a kernel module should kill the umh it started,
      so that rmmod of the kernel module will clean up the corresponding umh.
      Just like if the kernel module does kmalloc() it should kfree() it
      in the exit code.
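      
      A hedged sketch of module usage; the embedded-image symbols and the
      SIGKILL-based teardown are illustrative assumptions, not part of this
      commit:
      
      	extern char my_umh_start[], my_umh_end[];	/* hypothetical blob */
      	static struct umh_info info;
      
      	static int __init my_mod_init(void)
      	{
      		return fork_usermode_blob(my_umh_start,
      					  my_umh_end - my_umh_start, &info);
      	}
      
      	static void __exit my_mod_exit(void)
      	{
      		/* the module owns the umh lifetime: stop it, drop the pipes */
      		kill_pid(find_vpid(info.pid), SIGKILL, 1);
      		fput(info.pipe_to_umh);
      		fput(info.pipe_from_umh);
      	}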
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      449325b5
  9. 12 Apr, 2018 (3 commits)
  10. 19 Mar, 2018 (1 commit)
  11. 04 Jan, 2018 (1 commit)
  12. 18 Dec, 2017 (1 commit)
  13. 15 Dec, 2017 (1 commit)
    • exec: avoid gcc-8 warning for get_task_comm · 3756f640
      Committed by Arnd Bergmann
      gcc-8 warns about using strncpy() with the source size as the limit:
      
        fs/exec.c:1223:32: error: argument to 'sizeof' in 'strncpy' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess]
      
      This is indeed slightly suspicious, as it protects us from source
      arguments without NUL-termination, but does not guarantee that the
      destination is terminated.
      
      This keeps the strncpy() to ensure we have properly padded target
      buffer, but ensures that we use the correct length, by passing the
      actual length of the destination buffer as well as adding a build-time
      check to ensure it is exactly TASK_COMM_LEN.
      
      There are only 23 call sites, all of which I reviewed to ensure this is
      currently the case.  We could get away with doing only the check or
      passing the right length, but it doesn't hurt to do both.
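      
      A sketch of the resulting shape (per this description; details may
      differ from the final patch):
      
      	char *__get_task_comm(char *buf, size_t buf_size,
      			      struct task_struct *tsk)
      	{
      		task_lock(tsk);
      		strncpy(buf, tsk->comm, buf_size);	/* pads the target */
      		task_unlock(tsk);
      		return buf;
      	}
      
      	#define get_task_comm(buf, tsk) ({			\
      		BUILD_BUG_ON(sizeof(buf) != TASK_COMM_LEN);	\
      		__get_task_comm(buf, sizeof(buf), tsk);		\
      	})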
      
      Link: http://lkml.kernel.org/r/20171205151724.1764896-1-arnd@arndb.de
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Suggested-by: Kees Cook <keescook@chromium.org>
      Acked-by: Kees Cook <keescook@chromium.org>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Serge Hallyn <serge@hallyn.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Aleksa Sarai <asarai@suse.de>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3756f640
  14. 30 Nov, 2017 (1 commit)
  15. 25 Oct, 2017 (1 commit)
    • locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE() · 6aa7de05
      Committed by Mark Rutland
      
      Please do not apply this to mainline directly, instead please re-run the
      coccinelle script shown below and apply its output.
      
      For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
      preference to ACCESS_ONCE(), and new code is expected to use one of the
      former. So far, there's been no reason to change most existing uses of
      ACCESS_ONCE(), as these aren't harmful, and changing them results in
      churn.
      
      However, for some features, the read/write distinction is critical to
      correct operation. To distinguish these cases, separate read/write
      accessors must be used. This patch migrates (most) remaining
      ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
      coccinelle script:
      
      ----
      // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
      // WRITE_ONCE()
      
      // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch
      
      virtual patch
      
      @ depends on patch @
      expression E1, E2;
      @@
      
      - ACCESS_ONCE(E1) = E2
      + WRITE_ONCE(E1, E2)
      
      @ depends on patch @
      expression E;
      @@
      
      - ACCESS_ONCE(E)
      + READ_ONCE(E)
      ----
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: davem@davemloft.net
      Cc: linux-arch@vger.kernel.org
      Cc: mpe@ellerman.id.au
      Cc: shuah@kernel.org
      Cc: snitzer@redhat.com
      Cc: thor.thayer@linux.intel.com
      Cc: tj@kernel.org
      Cc: viro@zeniv.linux.org.uk
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6aa7de05
  16. 20 Oct, 2017 (1 commit)
    • membarrier: Provide register expedited private command · a961e409
      Committed by Mathieu Desnoyers
      This introduces a "register private expedited" membarrier command which
      allows eventual removal of important memory barrier constraints on the
      scheduler fast-paths. It changes how the "private expedited" membarrier
      command (new to 4.14) is used from user-space.
      
      This new command allows processes to register their intent to use the
      private expedited command.  This affects how the expedited private
      command introduced in 4.14-rc is meant to be used, and should be merged
      before 4.14 final.
      
      Processes are now required to register before using
      MEMBARRIER_CMD_PRIVATE_EXPEDITED, otherwise that command returns EPERM.
      
      This fixes a problem that arose when designing requested extensions to
      sys_membarrier() to allow JITs to efficiently flush old code from
      instruction caches.  Several potential algorithms are much less painful
      if the user registers intent to use this functionality early on, for
      example, before the process spawns its second thread.  Registering at
      this time removes the need to interrupt each and every thread in that
      process at the first expedited sys_membarrier() system call.
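      
      A hedged user-space sketch of the two-step protocol (command names per
      this description; error handling abbreviated):
      
      	#include <err.h>
      	#include <linux/membarrier.h>
      	#include <sys/syscall.h>
      	#include <unistd.h>
      
      	static int membarrier(int cmd, int flags)
      	{
      		return syscall(__NR_membarrier, cmd, flags);
      	}
      
      	int main(void)
      	{
      		/* register once, e.g. before spawning the second thread ... */
      		if (membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0))
      			err(1, "membarrier register");
      		/* ... then the private command no longer returns EPERM */
      		membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0);
      		return 0;
      	}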
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a961e409
  17. 04 Oct, 2017 (1 commit)
  18. 15 Sep, 2017 (1 commit)
  19. 14 Sep, 2017 (1 commit)
    • mm: treewide: remove GFP_TEMPORARY allocation flag · 0ee931c4
      Committed by Michal Hocko
      GFP_TEMPORARY was introduced by commit e12ba74d ("Group short-lived
      and reclaimable kernel allocations") along with __GFP_RECLAIMABLE.  Its
      primary motivation was to allow users to indicate that an allocation is
      short lived and so the allocator can try to place such allocations close
      together and prevent long term fragmentation.  As much as this sounds
      like a reasonable semantic it becomes much less clear when to use the
      highlevel GFP_TEMPORARY allocation flag.  How long is temporary? Can the
      context holding that memory sleep? Can it take locks? It seems there is
      no good answer for those questions.
      
      The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
      __GFP_RECLAIMABLE, which in itself is tricky because basically none of
      the existing callers provides a way to reclaim the allocated memory.  So
      this is rather misleading and hard to evaluate for any benefits.
      
      I have checked some random users and none of them has added the flag
      with a specific justification.  I suspect most of them just copied from
      other existing users and others just thought it might be a good idea to
      use without any measuring.  This suggests that GFP_TEMPORARY just
      motivates for cargo cult usage without any reasoning.
      
      I believe that our gfp flags are quite complex already and especially
      those with highlevel semantic should be clearly defined to prevent from
      confusion and abuse.  Therefore I propose dropping GFP_TEMPORARY and
      replace all existing users to simply use GFP_KERNEL.  Please note that
      SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
      so they will be placed properly for memory fragmentation prevention.
      
      I can see reasons we might want some gfp flag to reflect short-term
      allocations, but I propose starting from a clear semantic definition and
      only then add users with proper justification.
      
      This was brought up before LSF this year by Matthew [1], and it
      turned out that GFP_TEMPORARY really doesn't have a clear semantic.  It
      seems to be a heuristic without any measured advantage for most (if not
      all) its current users.  The follow up discussion has revealed that
      opinions on what might be temporary allocation differ a lot between
      developers.  So rather than trying to tweak existing users into a
      semantic which they haven't expected I propose to simply remove the flag
      and start from scratch if we really need a semantic for short term
      allocations.
      
      [1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org
      
      [akpm@linux-foundation.org: fix typo]
      [akpm@linux-foundation.org: coding-style fixes]
      [sfr@canb.auug.org.au: drm/i915: fix up]
        Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
      Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0ee931c4
  20. 05 Sep, 2017 (2 commits)
  21. 02 Aug, 2017 (10 commits)
    • exec: Consolidate pdeath_signal clearing · fe8993b3
      Committed by Kees Cook
      Instead of an additional secureexec check for pdeath_signal, just move it
      up into the initial secureexec test. Neither perf nor arch code touches
      pdeath_signal, so the relocation shouldn't change anything.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      fe8993b3
    • exec: Use sane stack rlimit under secureexec · 64701dee
      Committed by Kees Cook
      For a secureexec, before memory layout selection has happened, reset the
      stack rlimit to something sane to avoid the caller having control over
      the resulting layouts.
      
      $ ulimit -s
      8192
      $ ulimit -s unlimited
      $ /bin/sh -c 'ulimit -s'
      unlimited
      $ sudo /bin/sh -c 'ulimit -s'
      8192
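      
      A sketch of the reset described above, as it might look in
      setup_new_exec() (shape inferred from this description):
      
      	if (bprm->secureexec) {
      		/* downgrade a caller-supplied stack limit to a sane one */
      		struct rlimit *rlim = &current->signal->rlim[RLIMIT_STACK];
      
      		if (rlim->rlim_cur > _STK_LIM)
      			rlim->rlim_cur = _STK_LIM;	/* 8MB default */
      	}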
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: James Morris <james.l.morris@oracle.com>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      64701dee
    • exec: Consolidate dumpability logic · 473d8963
      Committed by Kees Cook
      Since it's already valid to set dumpability in the early part of
      setup_new_exec(), we can consolidate the logic into a single place.
      The BINPRM_FLAGS_ENFORCE_NONDUMP is set during would_dump() calls
      before setup_new_exec(), so its test is safe to move as well.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      Reviewed-by: James Morris <james.l.morris@oracle.com>
      473d8963
    • exec: Use secureexec for clearing pdeath_signal · a70423df
      Committed by Kees Cook
      Like dumpability, clearing pdeath_signal happens both in setup_new_exec()
      and later in commit_creds(). The test in setup_new_exec() is different
      from all other privilege comparisons, though: it is checking the new cred
      (bprm) uid vs the old cred (current) euid. This appears to be a bug,
      introduced by commit a6f76f23 ("CRED: Make execve() take advantage of
      copy-on-write credentials"):
      
      -       if (bprm->e_uid != current_euid() ||
      -           bprm->e_gid != current_egid()) {
      -               set_dumpable(current->mm, suid_dumpable);
      +       if (bprm->cred->uid != current_euid() ||
      +           bprm->cred->gid != current_egid()) {
      
      It was bprm euid vs current euid (and egids), but the effective got
      dropped. Nothing in the exec flow changes bprm->cred->uid (nor gid).
      The call traces are:
      
      	prepare_bprm_creds()
      	    prepare_exec_creds()
      	        prepare_creds()
      	            memcpy(new_creds, old_creds, ...)
      	            security_prepare_creds() (unimplemented by commoncap)
      	...
      	prepare_binprm()
      	    bprm_fill_uid()
      	        resets euid/egid to current euid/egid
      	        sets euid/egid on bprm based on set*id file bits
      	    security_bprm_set_creds()
      		cap_bprm_set_creds()
      		        handle all caps-based manipulations
      
      so this test is effectively a test of current_uid() vs current_euid(),
      which is wrong, just like the prior dumpability tests were wrong.
      
      The commit log says "Clear pdeath_signal and set dumpable on
      certain circumstances that may not be covered by commit_creds()." This
      may refer to the earlier old euid vs new euid (and egid) test that
      got changed.
      
      Luckily, as with dumpability, this is all masked by commit_creds()
      which performs old/new euid and egid tests and clears pdeath_signal.
      
      And again, like dumpability, we should include LSM secureexec logic for
      pdeath_signal clearing. For example, Smack goes out of its way to clear
      pdeath_signal when it finds a secureexec condition.
      
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      Reviewed-by: James Morris <james.l.morris@oracle.com>
      a70423df
    • exec: Use secureexec for setting dumpability · e37fdb78
      Committed by Kees Cook
      The examination of "current" to decide dumpability is wrong. This was a
      check of an euid/uid (or egid/gid) mismatch in the existing process,
      not the newly created one. This appears to stretch back into even the
      "history.git" tree. Luckily, dumpability is later set in commit_creds().
      In earlier kernel versions before creds existed, similar checks also
      existed late in the exec flow, covering up the mistake as far back as I
      could find.
      
      Note that because the commit_creds() check examines differences of euid,
      uid, egid, gid, and capabilities between the old and new creds, it would
      look like the setup_new_exec() dumpability test could be entirely removed.
      However, the secureexec test may cover a different set of tests (specific
      to the LSMs) than what commit_creds() checks for. So, fix this test to
      use secureexec (the removed euid tests are redundant to the commoncap
      secureexec checks now).
      
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      Reviewed-by: James Morris <james.l.morris@oracle.com>
      e37fdb78
    • LSM: drop bprm_secureexec hook · 2af62280
      Committed by Kees Cook
      This removes the bprm_secureexec hook since the logic has been folded into
      the bprm_set_creds hook for all LSMs now.
      
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: John Johansen <john.johansen@canonical.com>
      Acked-by: James Morris <james.l.morris@oracle.com>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      2af62280
    • commoncap: Refactor to remove bprm_secureexec hook · 46d98eb4
      Committed by Kees Cook
      The commoncap implementation of the bprm_secureexec hook is the only LSM
      that depends on the final call to its bprm_set_creds hook (since it may
      be called for multiple files, it ignores bprm->called_set_creds). As a
      result, it cannot safely _clear_ bprm->secureexec since other LSMs may
      have set it.  Instead, remove the bprm_secureexec hook by introducing a
      new flag to bprm specific to commoncap: cap_elevated. This is similar to
      cap_effective, but that is used for a specific subset of elevated
      privileges, and exists solely to track state from bprm_set_creds to
      bprm_secureexec. As such, it will be removed in the next patch.
      
      Here, set the new bprm->cap_elevated flag when setuid/setgid has happened
      from bprm_fill_uid() or fscapabilities have been prepared. This temporarily
      moves the bprm_secureexec hook to a static inline. The helper will be
      removed in the next patch; this makes the step easier to review and bisect,
      since this does not introduce any changes to inputs nor outputs to the
      "elevated privileges" calculation.
      
      The new flag is merged with the bprm->secureexec flag in setup_new_exec()
      since this marks the end of any further prepare_binprm() calls.
      
      Cc: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: James Morris <james.l.morris@oracle.com>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      46d98eb4
    • binfmt: Introduce secureexec flag · c425e189
      Committed by Kees Cook
      The bprm_secureexec hook can be moved earlier. Right now, it is called
      during create_elf_tables(), via load_binary(), via search_binary_handler(),
      via exec_binprm(). Nearly all (see exception below) state used by
      bprm_secureexec is created during the bprm_set_creds hook, called from
      prepare_binprm().
      
      For all LSMs (except commoncaps described next), only the first execution
      of bprm_set_creds takes any effect (they all check bprm->called_set_creds
      which prepare_binprm() sets after the first call to the bprm_set_creds
      hook).  However, all these LSMs also only do anything with bprm_secureexec
      when they detected a secure state during their first run of bprm_set_creds.
      Therefore, it is functionally identical to move the detection into
      bprm_set_creds, since the results from secureexec here only need to be
      based on the first call to the LSM's bprm_set_creds hook.
      
      The single exception is that the commoncaps secureexec hook also examines
      euid/uid and egid/gid differences which are controlled by bprm_fill_uid(),
      via prepare_binprm(), which can be called multiple times (e.g.
      binfmt_script, binfmt_misc), and may clear the euid/egid for the final
      load (i.e. the script interpreter). However, while commoncaps specifically
      ignores bprm->cred_prepared, and runs its bprm_set_creds hook each time
      prepare_binprm() may get called, it needs to base the secureexec decision
      on the final call to bprm_set_creds. As a result, it will need special
      handling.
      
      To begin this refactoring, this adds the secureexec flag to the bprm
      struct, and calls the secureexec hook during setup_new_exec(). This is
      safe since all the cred work is finished (and past the point of no return).
      This explicit call will be removed in later patches once the hook has been
      removed.
      
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: John Johansen <john.johansen@canonical.com>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      Reviewed-by: James Morris <james.l.morris@oracle.com>
      c425e189
    • exec: Correct comments about "point of no return" · a9208e42
      Committed by Kees Cook
      In commit 221af7f8 ("Split 'flush_old_exec' into two functions"),
      the comment about the point of no return should have stayed in
      flush_old_exec() since it refers to the "bprm->mm = NULL;" line, but prior
      changes in commits c89681ed ("remove steal_locks()"), and
      fd8328be ("sanitize handling of shared descriptor tables in failing
      execve()") made it look like it meant the current->sas_ss_sp line instead.
      
      The comment was referring to the fact that once bprm->mm is NULL, all
      failures from a binfmt load_binary hook (e.g. load_elf_binary), will
      get SEGV raised against current. Move this comment and expand the
      explanation a bit, putting it above the assignment this time, and add
      details about the true nature of "point of no return" being the call
      to flush_old_exec() itself.
      
      This also removes an erroneous comment about when credentials are being
      installed. That has its own dedicated function, install_exec_creds(),
      which carries a similar (and correct) comment, so remove the bogus comment
      where installation is not actually happening.
      
      Cc: David Howells <dhowells@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      a9208e42
    • exec: Rename bprm->cred_prepared to called_set_creds · ddb4a144
      Committed by Kees Cook
      The cred_prepared bprm flag has a misleading name. It has nothing to do
      with the bprm_prepare_cred hook, and actually tracks if bprm_set_creds has
      been called. Rename this flag and improve its comment.
      
      Cc: David Howells <dhowells@redhat.com>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Casey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: John Johansen <john.johansen@canonical.com>
      Acked-by: James Morris <james.l.morris@oracle.com>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      ddb4a144
  22. 08 Jul, 2017 (1 commit)
  23. 24 Jun, 2017 (1 commit)
    • fs/exec.c: account for argv/envp pointers · 98da7d08
      Committed by Kees Cook
      When limiting the argv/envp strings during exec to 1/4 of the stack limit,
      the storage of the pointers to the strings was not included.  This means
      that an exec with huge numbers of tiny strings could eat 1/4 of the stack
      limit in strings and then additional space would be later used by the
      pointers to the strings.
      
      For example, on 32-bit with a 8MB stack rlimit, an exec with 1677721
      single-byte strings would consume less than 2MB of stack, the max (8MB /
      4) amount allowed, but the pointers to the strings would consume the
      remaining additional stack space (1677721 * 4 == 6710884).
      
      The result (1677721 + 6710884 == 8388605) would exhaust stack space
      entirely.  Controlling this stack exhaustion could result in
      pathological behavior in setuid binaries (CVE-2017-1000365).
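      
      A sketch of the accounting fix's shape inside get_arg_page() (inferred
      from this description; the upstream patch may differ in detail):
      
      	/* charge the pointer arrays against the same 1/4-of-stack cap */
      	ptr_size = (bprm->argc + bprm->envc) * sizeof(void *);
      	if (ptr_size > ULONG_MAX - size)
      		goto fail;	/* overflow */
      	size += ptr_size;
      
      	if (size > READ_ONCE(current->signal->rlim[RLIMIT_STACK].rlim_cur) / 4)
      		goto fail;	/* strings + pointers exceed the cap */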
      
      [akpm@linux-foundation.org: additional commenting from Kees]
      Fixes: b6a2fea3 ("mm: variable length argument support")
      Link: http://lkml.kernel.org/r/20170622001720.GA32173@beast
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Rik van Riel <riel@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Qualys Security Advisory <qsa@qualys.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      98da7d08
  24. 20 Mar, 2017 (1 commit)
    • x86/arch_prctl: Add ARCH_[GET|SET]_CPUID · e9ea1e7f
      Committed by Kyle Huey
      Intel supports faulting on the CPUID instruction beginning with Ivy Bridge.
      When enabled, the processor will fault on attempts to execute the CPUID
      instruction with CPL>0. Exposing this feature to userspace will allow a
      ptracer to trap and emulate the CPUID instruction.
      
      When supported, this feature is controlled by toggling bit 0 of
      MSR_MISC_FEATURES_ENABLES. It is documented in detail in Section 2.3.2 of
      https://bugzilla.kernel.org/attachment.cgi?id=243991
      
      Implement a new pair of arch_prctls, available on both x86-32 and x86-64.
      
      ARCH_GET_CPUID: Returns the current CPUID state, either 0 if CPUID faulting
          is enabled (and thus the CPUID instruction is not available) or 1 if
          CPUID faulting is not enabled.
      
      ARCH_SET_CPUID: Set the CPUID state to the second argument. If
          cpuid_enabled is 0 CPUID faulting will be activated, otherwise it will
          be deactivated. Returns ENODEV if CPUID faulting is not supported on
          this system.
      
      The state of the CPUID faulting flag is propagated across forks, but reset
      upon exec.
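      
      A hedged user-space sketch of the new interface (constants from
      asm/prctl.h after this change; error handling abbreviated):
      
      	#include <asm/prctl.h>
      	#include <sys/syscall.h>
      	#include <unistd.h>
      
      	int main(void)
      	{
      		/* 1 = CPUID executes normally, 0 = CPUID faulting enabled */
      		long state = syscall(SYS_arch_prctl, ARCH_GET_CPUID, 0);
      
      		/* enable faulting so a ptracer can trap and emulate CPUID */
      		if (state == 1 && syscall(SYS_arch_prctl, ARCH_SET_CPUID, 0))
      			return 1;	/* e.g. ENODEV on unsupported CPUs */
      		return 0;
      	}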
      Signed-off-by: Kyle Huey <khuey@kylehuey.com>
      Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
      Cc: kvm@vger.kernel.org
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: linux-kselftest@vger.kernel.org
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Robert O'Callahan <robert@ocallahan.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: user-mode-linux-devel@lists.sourceforge.net
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: user-mode-linux-user@lists.sourceforge.net
      Cc: David Matlack <dmatlack@google.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Link: http://lkml.kernel.org/r/20170320081628.18952-9-khuey@kylehuey.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      e9ea1e7f
  25. 02 Mar, 2017 (3 commits)