1. 15 3月, 2017 1 次提交
  2. 11 3月, 2017 1 次提交
    • T
      kexec, x86/purgatory: Unbreak it and clean it up · 40c50c1f
      Thomas Gleixner 提交于
      The purgatory code defines global variables which are referenced via a
      symbol lookup in the kexec code (core and arch).
      
      A recent commit addressing sparse warnings made these static and thereby
      broke kexec_file.
      
      Why did this happen? Simply because the whole machinery is undocumented and
      lacks any form of forward declarations. The variable names are unspecific
      and lack a prefix, so adding forward declarations creates shadow variables
      in the core code. Aside of that the code relies on magic constants and
      duplicate struct definitions with no way to ensure that these things stay
      in sync. The section placement of the purgatory variables happened by
      chance and not by design.
      
      Unbreak kexec and cleanup the mess:
      
       - Add proper forward declarations and document the usage
       - Use common struct definition
       - Use the proper common defines instead of magic constants
       - Add a purgatory_ prefix to have a proper name space
       - Use ARRAY_SIZE() instead of a homebrewn reimplementation
       - Add proper sections to the purgatory variables [ From Mike ]
      
      Fixes: 72042a8c ("x86/purgatory: Make functions and variables static")
      Reported-by: NMike Galbraith <&lt;efault@gmx.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Nicholas Mc Guire <der.herr@hofr.at>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: "Tobin C. Harding" <me@tobin.cc>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1703101315140.3681@nanosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      40c50c1f
  3. 10 3月, 2017 3 次提交
    • A
      userfaultfd: non-cooperative: rollback userfaultfd_exit · dd0db88d
      Andrea Arcangeli 提交于
      Patch series "userfaultfd non-cooperative further update for 4.11 merge
      window".
      
      Unfortunately I noticed one relevant bug in userfaultfd_exit while doing
      more testing.  I've been doing testing before and this was also tested
      by kbuild bot and exercised by the selftest, but this bug never
      reproduced before.
      
      I dropped userfaultfd_exit as result.  I dropped it because of
      implementation difficulty in receiving signals in __mmput and because I
      think -ENOSPC as result from the background UFFDIO_COPY should be enough
      already.
      
      Before I decided to remove userfaultfd_exit, I noticed userfaultfd_exit
      wasn't exercised by the selftest and when I tried to exercise it, after
      moving it to a more correct place in __mmput where it would make more
      sense and where the vma list is stable, it resulted in the
      event_wait_completion in D state.  So then I added the second patch to
      be sure even if we call userfaultfd_event_wait_completion too late
      during task exit(), we won't risk to generate tasks in D state.  The
      same check exists in handle_userfault() for the same reason, except it
      makes a difference there, while here is just a robustness check and it's
      run under WARN_ON_ONCE.
      
      While looking at the userfaultfd_event_wait_completion() function I
      looked back at its callers too while at it and I think it's not ok to
      stop executing dup_fctx on the fcs list because we relay on
      userfaultfd_event_wait_completion to execute
      userfaultfd_ctx_put(fctx->orig) which is paired against
      userfaultfd_ctx_get(fctx->orig) in dup_userfault just before
      list_add(fcs).  This change only takes care of fctx->orig but this area
      also needs further review looking for similar problems in fctx->new.
      
      The only patch that is urgent is the first because it's an use after
      free during a SMP race condition that affects all processes if
      CONFIG_USERFAULTFD=y.  Very hard to reproduce though and probably
      impossible without SLUB poisoning enabled.
      
      This patch (of 3):
      
      I once reproduced this oops with the userfaultfd selftest, it's not
      easily reproducible and it requires SLUB poisoning to reproduce.
      
          general protection fault: 0000 [#1] SMP
          Modules linked in:
          CPU: 2 PID: 18421 Comm: userfaultfd Tainted: G               ------------ T 3.10.0+ #15
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014
          task: ffff8801f83b9440 ti: ffff8801f833c000 task.ti: ffff8801f833c000
          RIP: 0010:[<ffffffff81451299>]  [<ffffffff81451299>] userfaultfd_exit+0x29/0xa0
          RSP: 0018:ffff8801f833fe80  EFLAGS: 00010202
          RAX: ffff8801f833ffd8 RBX: 6b6b6b6b6b6b6b6b RCX: ffff8801f83b9440
          RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800baf18600
          RBP: ffff8801f833fee8 R08: 0000000000000000 R09: 0000000000000001
          R10: 0000000000000000 R11: ffffffff8127ceb3 R12: 0000000000000000
          R13: ffff8800baf186b0 R14: ffff8801f83b99f8 R15: 00007faed746c700
          FS:  0000000000000000(0000) GS:ffff88023fc80000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
          CR2: 00007faf0966f028 CR3: 0000000001bc6000 CR4: 00000000000006e0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
          Call Trace:
            do_exit+0x297/0xd10
            SyS_exit+0x17/0x20
            tracesys+0xdd/0xe2
          Code: 00 00 66 66 66 66 90 55 48 89 e5 41 54 53 48 83 ec 58 48 8b 1f 48 85 db 75 11 eb 73 66 0f 1f 44 00 00 48 8b 5b 10 48 85 db 74 64 <4c> 8b a3 b8 00 00 00 4d 85 e4 74 eb 41 f6 84 24 2c 01 00 00 80
          RIP  [<ffffffff81451299>] userfaultfd_exit+0x29/0xa0
           RSP <ffff8801f833fe80>
          ---[ end trace 9fecd6dcb442846a ]---
      
      In the debugger I located the "mm" pointer in the stack and walking
      mm->mmap->vm_next through the end shows the vma->vm_next list is fully
      consistent and it is null terminated list as expected.  So this has to
      be an SMP race condition where userfaultfd_exit was running while the
      vma list was being modified by another CPU.
      
      When userfaultfd_exit() run one of the ->vm_next pointers pointed to
      SLAB_POISON (RBX is the vma pointer and is 0x6b6b..).
      
      The reason is that it's not running in __mmput but while there are still
      other threads running and it's not holding the mmap_sem (it can't as it
      has to wait the even to be received by the manager).  So this is an use
      after free that was happening for all processes.
      
      One more implementation problem aside from the race condition:
      userfaultfd_exit has really to check a flag in mm->flags before walking
      the vma or it's going to slowdown the exit() path for regular tasks.
      
      One more implementation problem: at that point signals can't be
      delivered so it would also create a task in D state if the manager
      doesn't read the event.
      
      The major design issue: it overall looks superfluous as the manager can
      check for -ENOSPC in the background transfer:
      
      	if (mmget_not_zero(ctx->mm)) {
      [..]
      	} else {
      		return -ENOSPC;
      	}
      
      It's safer to roll it back and re-introduce it later if at all.
      
      [rppt@linux.vnet.ibm.com: documentation fixup after removal of UFFD_EVENT_EXIT]
        Link: http://lkml.kernel.org/r/1488345437-4364-1-git-send-email-rppt@linux.vnet.ibm.com
      Link: http://lkml.kernel.org/r/20170224181957.19736-2-aarcange@redhat.comSigned-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: NMike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dd0db88d
    • M
      scripts/spelling.txt: add "overide" pattern and fix typo instances · 505d3085
      Masahiro Yamada 提交于
      Fix typos and add the following to the scripts/spelling.txt:
      
        overide||override
      
      While we are here, fix the doubled "address" in the touched line
      Documentation/devicetree/bindings/regulator/ti-abb-regulator.txt.
      
      Also, fix the comment block style in the touched hunks in
      drivers/media/dvb-frontends/drx39xyj/drx_driver.h.
      
      Link: http://lkml.kernel.org/r/1481573103-11329-21-git-send-email-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      505d3085
    • M
      scripts/spelling.txt: add "disble(d)" pattern and fix typo instances · 8a1115ff
      Masahiro Yamada 提交于
      Fix typos and add the following to the scripts/spelling.txt:
      
        disble||disable
        disbled||disabled
      
      I kept the TSL2563_INT_DISBLED in /drivers/iio/light/tsl2563.c
      untouched.  The macro is not referenced at all, but this commit is
      touching only comment blocks just in case.
      
      Link: http://lkml.kernel.org/r/1481573103-11329-20-git-send-email-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8a1115ff
  4. 09 3月, 2017 1 次提交
    • L
      sched/headers: fix up header file dependency on <linux/sched/signal.h> · bd0f9b35
      Linus Torvalds 提交于
      The scheduler header file split and cleanups ended up exposing a few
      nasty header file dependencies, and in particular it showed how we in
      <linux/wait.h> ended up depending on "signal_pending()", which now comes
      from <linux/sched/signal.h>.
      
      That's a very subtle and annoying dependency, which already caused a
      semantic merge conflict (see commit e58bc927 "Pull overlayfs updates
      from Miklos Szeredi", which added that fixup in the merge commit).
      
      It turns out that we can avoid this dependency _and_ improve code
      generation by moving the guts of the fairly nasty helper #define
      __wait_event_interruptible_locked() to out-of-line code.  The code that
      includes the signal_pending() check is all in the slow-path where we
      actually go to sleep waiting for the event anyway, so using a helper
      function is the right thing to do.
      
      Using a helper function is also what we already did for the non-locked
      versions, see the "__wait_event*()" macros and the "prepare_to_wait*()"
      set of helper functions.
      
      We might want to try to unify all these macro games, we have a _lot_ of
      subtly different wait-event loops.  But this is the minimal patch to fix
      the annoying header dependency.
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bd0f9b35
  5. 07 3月, 2017 2 次提交
  6. 06 3月, 2017 2 次提交
  7. 03 3月, 2017 8 次提交
  8. 02 3月, 2017 22 次提交