1. 10 Oct, 2017 1 commit
    • x86/unwind: Use MSB for frame pointer encoding on 32-bit · 5c99b692
      Josh Poimboeuf committed
      On x86-32, Tetsuo Handa and Fengguang Wu reported unwinder warnings
      like:
      
        WARNING: kernel stack regs at f60bb9c8 in swapper:1 has bad 'bp' value 0ba00000
      
      And also there were some stack dumps with a bunch of unreliable '?'
      symbols after an apic_timer_interrupt symbol, meaning the unwinder got
      confused when it tried to read the regs.
      
      The cause of those issues is that, with GCC 4.8 (and possibly older),
      there are cases where GCC misaligns the stack pointer in a leaf function
      for no apparent reason:
      
        c124a388 <acpi_rs_move_data>:
        c124a388:       55                      push   %ebp
        c124a389:       89 e5                   mov    %esp,%ebp
        c124a38b:       57                      push   %edi
        c124a38c:       56                      push   %esi
        c124a38d:       89 d6                   mov    %edx,%esi
        c124a38f:       53                      push   %ebx
        c124a390:       31 db                   xor    %ebx,%ebx
        c124a392:       83 ec 03                sub    $0x3,%esp
        ...
        c124a3e3:       83 c4 03                add    $0x3,%esp
        c124a3e6:       5b                      pop    %ebx
        c124a3e7:       5e                      pop    %esi
        c124a3e8:       5f                      pop    %edi
        c124a3e9:       5d                      pop    %ebp
        c124a3ea:       c3                      ret
      
      If an interrupt occurs in such a function, the regs on the stack will be
      unaligned, which breaks the frame pointer encoding assumption.  So on
      32-bit, use the MSB instead of the LSB to encode the regs.
      
      This isn't an issue on 64-bit, because interrupts align the stack before
      writing to it.
      Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: LKP <lkp@01.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/279a26996a482ca716605c7dbc7f2db9d8d91e81.1507597785.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
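The difference between the two encodings discussed in commit 5c99b692 can be illustrated with a small user-space sketch. This is not the kernel's code: the helper names are hypothetical, addresses are modeled as plain `uint32_t` values, and the MSB scheme assumes that kernel stack addresses always have the top bit set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ADDR_MSB 0x80000000u

/* LSB scheme: tag a pt_regs address by setting bit 0.
 * Only sound when the address is known to have bit 0 clear. */
static uint32_t encode_lsb(uint32_t regs)
{
    return regs | 1u;
}

static bool decode_lsb(uint32_t bp, uint32_t *regs)
{
    if (!(bp & 1u))
        return false;           /* ordinary frame pointer */
    *regs = bp & ~1u;           /* drops bit 0, even if it was real */
    return true;
}

/* MSB scheme: tag a pt_regs address by clearing the top bit.
 * Assumption: every valid kernel stack address has the top bit set. */
static uint32_t encode_msb(uint32_t regs)
{
    assert(regs & ADDR_MSB);
    return regs & ~ADDR_MSB;
}

static bool decode_msb(uint32_t bp, uint32_t *regs)
{
    if (bp & ADDR_MSB)
        return false;           /* ordinary frame pointer */
    *regs = bp | ADDR_MSB;      /* low bits survive untouched */
    return true;
}
```

With an odd (misaligned) address such as 0xf60bb9c9, the LSB round trip yields 0xf60bb9c8, while the MSB round trip returns the original value.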
  2. 29 Aug, 2017 2 commits
  3. 24 May, 2017 1 commit
    • Revert "x86/entry: Fix the end of the stack for newly forked tasks" · ebd57499
      Josh Poimboeuf committed
      Petr Mladek reported the following warning when loading the livepatch
      sample module:
      
        WARNING: CPU: 1 PID: 3699 at arch/x86/kernel/stacktrace.c:132 save_stack_trace_tsk_reliable+0x133/0x1a0
        ...
        Call Trace:
         __schedule+0x273/0x820
         schedule+0x36/0x80
         kthreadd+0x305/0x310
         ? kthread_create_on_cpu+0x80/0x80
         ? icmp_echo.part.32+0x50/0x50
         ret_from_fork+0x2c/0x40
      
      That warning means the end of the stack is no longer recognized as such
      for newly forked tasks.  The problem was introduced with the following
      commit:
      
        ff3f7e24 ("x86/entry: Fix the end of the stack for newly forked tasks")
      
      ... which was completely misguided.  It only partially fixed the
      reported issue, and it introduced another bug in the process.  None of
      the other entry code saves the frame pointer before calling into C code,
      so it doesn't make sense for ret_from_fork to do so either.
      
      Contrary to what I originally thought, the original issue wasn't related
      to newly forked tasks.  It was actually related to ftrace.  When entry
      code calls into a function which then calls into an ftrace handler, the
      stack frame looks different than normal.
      
      The original issue will be fixed in the unwinder, in a subsequent patch.
      Reported-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: live-patching@vger.kernel.org
      Fixes: ff3f7e24 ("x86/entry: Fix the end of the stack for newly forked tasks")
      Link: http://lkml.kernel.org/r/f350760f7e82f0750c8d1dd093456eb212751caa.1495553739.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 24 Mar, 2017 1 commit
  5. 01 Mar, 2017 1 commit
  6. 12 Jan, 2017 1 commit
    • x86/entry: Fix the end of the stack for newly forked tasks · ff3f7e24
      Josh Poimboeuf committed
      When unwinding a task, the end of the stack is always at the same offset
      right below the saved pt_regs, regardless of which syscall was used to
      enter the kernel.  That convention allows the unwinder to verify that a
      stack is sane.
      
      However, newly forked tasks don't always follow that convention, as
      reported by the following unwinder warning seen by Dave Jones:
      
        WARNING: kernel stack frame pointer at ffffc90001443f30 in kworker/u8:8:30468 has bad value           (null)
      
      The warning was due to the following call chain:
      
        (ftrace handler)
        call_usermodehelper_exec_async+0x5/0x140
        ret_from_fork+0x22/0x30
      
      The problem is that ret_from_fork() doesn't create a stack frame before
      calling other functions.  Fix that by carefully using the frame pointer
      macros.
      
      In addition to conforming to the end of stack convention, this also
      makes related stack traces more sensible by making it clear to the user
      that ret_from_fork() was involved.
      Reported-by: Dave Jones <davej@codemonkey.org.uk>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miroslav Benes <mbenes@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/8854cdaab980e9700a81e9ebf0d4238e4bbb68ef.1483978430.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. 09 Dec, 2016 1 commit
  8. 21 Oct, 2016 1 commit
    • x86/entry/unwind: Create stack frames for saved interrupt registers · 946c1911
      Josh Poimboeuf committed
      With frame pointers, when a task is interrupted, its stack is no longer
      completely reliable because the function could have been interrupted
      before it had a chance to save the previous frame pointer on the stack.
      So the caller of the interrupted function could get skipped by a stack
      trace.
      
      This is problematic for live patching, which needs to know whether a
      stack trace of a sleeping task can be relied upon.  There's currently no
      way to detect if a sleeping task was interrupted by a page fault
      exception or preemption before it went to sleep.
      
      Another issue is that when dumping the stack of an interrupted task, the
      unwinder has no way of knowing where the saved pt_regs registers are, so
      it can't print them.
      
      This solves those issues by encoding the pt_regs pointer in the frame
      pointer on entry from an interrupt or an exception.
      
      This patch also updates the unwinder to be able to decode it, because
      otherwise the unwinder would be broken by this change.
      
      Note that this causes a change in the behavior of the unwinder: each
      instance of a pt_regs on the stack is now considered a "frame".  So
      callers of unwind_get_return_address() will now get an occasional
      'regs->ip' address that would have previously been skipped over.
      Suggested-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/8b9f84a21e39d249049e0547b559ff8da0df0988.1476973742.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
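The behavior change noted in commit 946c1911, where each saved pt_regs counts as a frame and contributes its 'regs->ip', can be sketched as a toy frame-pointer walk. Everything here is a simplified assumption: `struct frame`, `struct regs_sketch`, `encode_regs()` and `walk()` are hypothetical names, the registers structure holds only two fields, and the tag is the low bit of the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A conventional frame: saved caller frame pointer, then return address. */
struct frame {
    uintptr_t next_bp;
    uintptr_t ret;
};

/* Hypothetical two-field stand-in for the saved interrupt registers. */
struct regs_sketch {
    uintptr_t bp;   /* frame pointer of the interrupted code */
    uintptr_t ip;   /* instruction pointer of the interrupted code */
};

/* Tag a registers pointer by setting bit 0 (the structure is word aligned). */
static uintptr_t encode_regs(const struct regs_sketch *r)
{
    return (uintptr_t)r | 1;
}

/* Walk the chain starting at bp and collect one address per frame.
 * A tagged value is decoded as saved registers and reported via its ip. */
static size_t walk(uintptr_t bp, uintptr_t *out, size_t max)
{
    size_t n = 0;

    while (bp != 0 && n < max) {
        if (bp & 1) {
            const struct regs_sketch *r =
                (const struct regs_sketch *)(bp & ~(uintptr_t)1);
            out[n++] = r->ip;
            bp = r->bp;
        } else {
            const struct frame *f = (const struct frame *)bp;
            out[n++] = f->ret;
            bp = f->next_bp;
        }
    }
    return n;
}
```

In this model, a handler frame whose saved frame pointer is a tagged registers pointer produces three addresses: the handler's return address, the interrupted ip, and the return address of the interrupted function's frame.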
  9. 20 Oct, 2016 3 commits
  10. 24 Aug, 2016 2 commits
  11. 08 Aug, 2016 1 commit
  12. 15 Jul, 2016 1 commit
  13. 05 May, 2016 2 commits
  14. 10 Mar, 2016 6 commits
    • x86/entry/32: Change INT80 to be an interrupt gate · a798f091
      Andy Lutomirski committed
      We want all of the syscall entries to run with interrupts off so that
      we can efficiently run context tracking before enabling interrupts.
      
      This will regress int $0x80 performance on 32-bit kernels by a
      couple of cycles.  This shouldn't matter much -- int $0x80 is not a
      fast path.
      
      This effectively reverts:
      
        657c1eea ("x86/entry/32: Fix entry_INT80_32() to expect interrupts to be on")
      
      ... and fixes the same issue differently.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/59b4f90c9ebfccd8c937305dbbbca680bc74b905.1457558566.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/entry: Improve system call entry comments · fda57b22
      Andy Lutomirski committed
      Ingo suggested that the comments should explain when the various
      entries are used.  This adds these explanations and improves other
      parts of the comments.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/9524ecef7a295347294300045d08354d6a57c6e7.1457578375.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup · 7536656f
      Andy Lutomirski committed
      Right after SYSENTER, we can get a #DB or NMI.  On x86_32, there's no IST,
      so the exception handler is invoked on the temporary SYSENTER stack.
      
      Because the SYSENTER stack is very small, we have a fixup to switch
      off the stack quickly when this happens.  The old fixup had several issues:
      
       1. It checked the interrupt frame's CS and EIP.  This wasn't
          obviously correct on Xen or if vm86 mode was in use [1].
      
       2. In the NMI handler, it did some frightening digging into the
          stack frame.  I'm not convinced this digging was correct.
      
       3. The fixup didn't switch stacks and then switch back.  Instead, it
          synthesized a brand new stack frame that would redirect the IRET
          back to the SYSENTER code.  That frame was highly questionable.
          For one thing, if NMI nested inside #DB, we would effectively
          abort the #DB prologue, which was probably safe but was
          frightening.  For another, the code used PUSHFL to write the
          FLAGS portion of the frame, which was simply bogus -- by the time
          PUSHFL was called, at least TF, NT, VM, and all of the arithmetic
          flags were clobbered.
      
      Simplify this considerably.  Instead of looking at the saved frame
      to see where we came from, check the hardware ESP register against
      the SYSENTER stack directly.  Malicious user code cannot spoof the
      kernel ESP register, and by moving the check after SAVE_ALL, we can
      use normal PER_CPU accesses to find all the relevant addresses.
      
      With this patch applied, the improved syscall_nt_32 test finally
      passes on 32-bit kernels.
      
      [1] It isn't obviously correct, but it is nonetheless safe from vm86
          shenanigans as far as I can tell.  A user can't point EIP at
          entry_SYSENTER_32 while in vm86 mode because entry_SYSENTER_32,
          like all kernel addresses, is greater than 0xffff and would thus
          violate the CS segment limit.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/b2cdbc037031c07ecf2c40a96069318aec0e7971.1457578375.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
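The simplified check described in commit 7536656f, comparing the hardware ESP against the SYSENTER stack rather than inspecting the saved frame, amounts to a range test. The sketch below is a C model only: the real check lives in assembly and uses per-CPU accesses, and the names `on_sysenter_stack`, `stack_end` and `stack_size` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true when esp lies on a stack that ends at stack_end and is
 * stack_size bytes long.  The stack grows down, so "distance below the
 * end" is computed; if esp is above stack_end the unsigned subtraction
 * wraps to a huge value and the comparison fails, so one compare covers
 * both bounds. */
static bool on_sysenter_stack(uint32_t esp, uint32_t stack_end,
                              uint32_t stack_size)
{
    assert(stack_size > 0);
    return (uint32_t)(stack_end - esp) < stack_size;
}
```

Because the input is the live stack pointer and not a value read from the interrupt frame, user code has no way to influence the result.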
    • x86/entry: Vastly simplify SYSENTER TF (single-step) handling · f2b37575
      Andy Lutomirski committed
      Due to a blatant design error, SYSENTER doesn't clear TF (single-step).
      
      As a result, if a user does SYSENTER with TF set, we will single-step
      through the kernel until something clears TF.  There is absolutely
      nothing we can do to prevent this short of turning off SYSENTER [1].
      
      Simplify the handling considerably with two changes:
      
        1. We already sanitize EFLAGS in SYSENTER to clear NT and AC.  We can
           add TF to that list of flags to sanitize with no overhead whatsoever.
      
        2. Teach do_debug() to ignore single-step traps in the SYSENTER prologue.
      
      That's all we need to do.
      
      Don't get too excited -- our handling is still buggy on 32-bit
      kernels.  There's nothing wrong with the SYSENTER code itself, but
      the #DB prologue has a clever fixup for traps on the very first
      instruction of entry_SYSENTER_32, and the fixup doesn't work quite
      correctly.  The next two patches will fix that.
      
      [1] We could probably prevent it by forcing BTF on at all times and
          making sure we clear TF before any branches in the SYSENTER
          code.  Needless to say, this is a bad idea.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/a30d2ea06fe4b621fe6a9ef911b02c0f38feb6f2.1457578375.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
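The first change listed in commit f2b37575, adding TF to the set of flags sanitized on SYSENTER, can be modeled as a bit test and mask. This is only a sketch: the actual entry code is assembly and may reset the flags register wholesale instead of masking individual bits. The bit positions are the architectural EFLAGS positions; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural EFLAGS bit positions. */
#define EFLAGS_TF 0x00000100u   /* trap flag (single-step) */
#define EFLAGS_NT 0x00004000u   /* nested task */
#define EFLAGS_AC 0x00040000u   /* alignment check / access control */

#define EFLAGS_SANITIZE_MASK (EFLAGS_TF | EFLAGS_NT | EFLAGS_AC)

/* Fast-path test: is any of the unwanted flags set? */
static bool flags_need_sanitizing(uint32_t flags)
{
    return (flags & EFLAGS_SANITIZE_MASK) != 0;
}

/* Slow path: clear the unwanted flags and keep everything else. */
static uint32_t sanitize_flags(uint32_t flags)
{
    return flags & ~EFLAGS_SANITIZE_MASK;
}
```

Since NT and AC are already tested in one mask, widening the mask to include TF adds no extra instruction to the common case where none of the flags is set.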
    • x86/entry/32: Restore FLAGS on SYSEXIT · c2c9b52f
      Andy Lutomirski committed
      We weren't restoring FLAGS at all on SYSEXIT.  Apparently no one cared.
      
      With this patch applied, native kernels should always honor
      task_pt_regs()->flags, which opens the door for some sys_iopl()
      cleanups.  I'll do those as a separate series, though, since getting
      it right will involve tweaking some paravirt ops.
      
      ( The short version is that, before this patch, sys_iopl(), invoked via
        SYSENTER, wasn't guaranteed to ever transfer the updated
        regs->flags, so sys_iopl() had to change the hardware flags register
        as well. )
      Reported-by: Brian Gerst <brgerst@gmail.com>
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/3f98b207472dc9784838eb5ca2b89dcc845ce269.1457578375.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/entry/32: Filter NT and speed up AC filtering in SYSENTER · 67f590e8
      Andy Lutomirski committed
      This makes the 32-bit code work just like the 64-bit code.  It should
      speed up syscalls on 32-bit kernels on Skylake by something like 20
      cycles (by analogy to the 64-bit compat case).
      
      It also cleans up NT just like we do for the 64-bit case.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/07daef3d44bd1ed62a2c866e143e8df64edb40ee.1457578375.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  15. 08 Mar, 2016 1 commit
    • x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled · 58a5aac5
      Andy Lutomirski committed
      x86_64 has very clean espfix handling on paravirt: espfix64 is set
      up in native_iret, so paravirt systems that override iret bypass
      espfix64 automatically.  This is robust and straightforward.
      
      x86_32 is messier.  espfix is set up before the IRET paravirt patch
      point, so it can't be directly conditionalized on whether we use
      native_iret.  We also can't easily move it into native_iret without
      regressing performance due to a bizarre consideration.  Specifically,
      on 64-bit kernels, the logic is:
      
        if (regs->ss & 0x4)
                setup_espfix;
      
      On 32-bit kernels, the logic is:
      
        if ((regs->ss & 0x4) && (regs->cs & 0x3) == 3 &&
            (regs->flags & X86_EFLAGS_VM) == 0)
                setup_espfix;
      
      The performance of setup_espfix itself is essentially irrelevant, but
      the comparison happens on every IRET so its performance matters.  On
      x86_64, there's no need for any registers except flags to implement
      the comparison, so we fold the whole thing into native_iret.  On
      x86_32, we don't do that because we need a free register to
      implement the comparison efficiently.  We therefore do espfix setup
      before restoring registers on x86_32.
      
      This patch gets rid of the explicit paravirt_enabled check by
      introducing X86_BUG_ESPFIX on 32-bit systems and using an ALTERNATIVE
      to skip espfix on paravirt systems where iret != native_iret.  This is
      also messy, but it's at least in line with other things we do.
      
      This improves espfix performance by removing a branch, but no one
      cares.  More importantly, it removes a paravirt_enabled user, which is
      good because paravirt_enabled is ill-defined and is going away.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: boris.ostrovsky@oracle.com
      Cc: david.vrabel@citrix.com
      Cc: konrad.wilk@oracle.com
      Cc: lguest@lists.ozlabs.org
      Cc: xen-devel@lists.xensource.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
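The two espfix conditions quoted in commit 58a5aac5 can be written out as compilable C predicates to make the difference in inputs visible. This is a sketch, not kernel code: the function names are hypothetical, the register values are passed as plain integers, and `EFLAGS_VM` is defined locally with the architectural bit position of the VM flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EFLAGS_VM 0x00020000u   /* virtual-8086 mode flag */

/* 64-bit logic from the commit message: only the table-indicator
 * bit of the saved SS selector matters. */
static bool espfix64_needed(uint16_t ss)
{
    return (ss & 0x4) != 0;
}

/* 32-bit logic from the commit message: SS must reference the LDT,
 * the return must be to privilege level 3, and vm86 mode must be off. */
static bool espfix32_needed(uint16_t ss, uint16_t cs, uint32_t flags)
{
    return (ss & 0x4) != 0 &&
           (cs & 0x3) == 3 &&
           (flags & EFLAGS_VM) == 0;
}
```

The 32-bit predicate reads three saved values instead of one, which is consistent with the message's point that it needs a free register to evaluate efficiently.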
  16. 24 Feb, 2016 1 commit
  17. 30 Jan, 2016 1 commit
  18. 21 Dec, 2015 2 commits
  19. 19 Dec, 2015 1 commit
  20. 23 Nov, 2015 2 commits
  21. 18 Oct, 2015 2 commits
  22. 09 Oct, 2015 3 commits
  23. 07 Oct, 2015 1 commit
  24. 05 Aug, 2015 2 commits
    • x86/entry/32: Migrate to C exit path · 5d73fc70
      Andy Lutomirski committed
      This removes the hybrid asm-and-C implementation of exit work.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/2baa438619ea6c027b40ec9fceacca52f09c74d09.1438378274.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/entry/32: Remove 32-bit syscall audit optimizations · c5f69fde
      Andy Lutomirski committed
      The asm audit optimizations are ugly and obfuscate the code too
      much. Remove them.
      
      This will regress performance if syscall auditing is enabled on
      32-bit kernels and SYSENTER is in use. If this becomes a
      problem, interested parties are encouraged to implement the
      equivalent of the 64-bit opportunistic SYSRET optimization.
      
      Alternatively, a case could be made that, on 32-bit kernels, a
      less messy asm audit optimization could be done. 32-bit kernels
      don't have the complicated partial register saving tricks that
      64-bit kernels have, so the SYSENTER post-syscall path could
      just call the audit hooks directly.  Any reimplementation of
      this ought to demonstrate that it only calls the audit hook once
      per syscall, though, which does not currently appear to be true.
      
      Someone would have to make the case that doing so would be
      better than implementing opportunistic SYSEXIT, though.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/212be39dd8c90b44c4b7bbc678128d6b88bdb9912.1438378274.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>