1. May 29, 2009 (7 commits)
  2. May 27, 2009 (4 commits)
  3. May 25, 2009 (1 commit)
  4. May 23, 2009 (1 commit)
  5. May 22, 2009 (1 commit)
  6. May 16, 2009 (1 commit)
    • x86: Fix performance regression caused by paravirt_ops on native kernels · b4ecc126
      By Jeremy Fitzhardinge
      Xiaohui Xin and some other folks at Intel have been looking into what's
      behind the performance hit of paravirt_ops when running native.
      
      It appears that the hit is entirely due to the paravirtualized
      spinlocks introduced by:
      
       | commit 8efcbab6
       | Date:   Mon Jul 7 12:07:51 2008 -0700
       |
       |     paravirt: introduce a "lock-byte" spinlock implementation
      
      The extra call/return in the spinlock path is somehow
      causing an increase in the cycles/instruction of somewhere around 2-7%
      (seems to vary quite a lot from test to test).  The working theory is
      that the CPU's pipeline is getting upset about the
      call->call->locked-op->return->return, and seems to be failing to
      speculate (though I haven't seen anything definitive about the precise
      reasons).  This doesn't entirely make sense, because the performance
      hit is also visible on unlock and other operations which don't involve
      locked instructions.  But spinlock operations clearly swamp all the
      other pvops operations, even though I can't imagine that they're
      nearly as common (there's only a .05% increase in instructions
      executed).
      
      If I disable just the pv-spinlock calls, my tests show that pvops is
      identical to non-pvops performance on native (my measurements show that
      it is actually about .1% faster, but Xiaohui shows a .05% slowdown).
      
      Summary of results, averaging 10 runs of the "mmperf" test, using a
      no-pvops build as baseline:
      
      		nopv		Pv-nospin	Pv-spin
      CPU cycles	100.00%		99.89%		102.18%
      instructions	100.00%		100.10%		100.15%
      CPI		100.00%		99.79%		102.03%
      cache ref	100.00%		100.84%		100.28%
      cache miss	100.00%		90.47%		88.56%
      cache miss rate	100.00%		89.72%		88.31%
      branches	100.00%		99.93%		100.04%
      branch miss	100.00%		103.66%		107.72%
      branch miss rt	100.00%		103.73%		107.67%
      wallclock	100.00%		99.90%		102.20%
      
      The clear effect here is that the 2% increase in CPI is
      directly reflected in the final wallclock time.
      
      (The other interesting effect is that the more ops are
      out of line calls via pvops, the lower the cache access
      and miss rates.  Not too surprising, but it suggests that
      the non-pvops kernel is over-inlined.  On the flipside,
      the branch misses go up correspondingly...)
      
      So, what's the fix?
      
      Paravirt patching turns all the pvops calls into direct calls, so
      _spin_lock etc. do end up having direct calls.  For example, the
      compiler-generated code for paravirtualized _spin_lock is:
      
      <_spin_lock+0>:		mov    %gs:0xb4c8,%rax
      <_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
      <_spin_lock+15>:	callq  *0xffffffff805a5b30
      <_spin_lock+22>:	retq
      
      The indirect call will get patched to:
      <_spin_lock+0>:		mov    %gs:0xb4c8,%rax
      <_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
      <_spin_lock+15>:	callq <__ticket_spin_lock>
      <_spin_lock+20>:	nop; nop		/* or whatever 2-byte nop */
      <_spin_lock+22>:	retq
      
      One possibility is to inline _spin_lock, etc., when building an
      optimised kernel (i.e., when there's no spinlock/preempt
      instrumentation/debugging enabled).  That will remove the outer
      call/return pair, returning the instruction stream to a single
      call/return, which will presumably execute the same as the non-pvops
      case.  The downsides arel 1) it will replicate the
      preempt_disable/enable code at eack lock/unlock callsite; this code is
      fairly small, but not nothing; and 2) the spinlock definitions are
      already a very heavily tangled mass of #ifdefs and other preprocessor
      magic, and making any changes will be non-trivial.
      
      The other obvious answer is to disable pv-spinlocks.  Making them a
      separate config option is fairly easy, and it would be trivial to
      enable them only when Xen is enabled (as the only non-default user).
      But it doesn't really address the common case of a distro build which
      is going to have Xen support enabled, and leaves the open question of
      whether the native performance cost of pv-spinlocks is worth the
      performance improvement on a loaded Xen system (10% saving of overall
      system CPU when guests block rather than spin).  Still, it is a
      reasonable short-term workaround.
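
      A minimal sketch of the shape of that config-option approach,
      assuming a CONFIG_PARAVIRT_SPINLOCKS-style option and this era's
      ticket-lock and pvops names (surrounding code heavily simplified):

      #ifdef CONFIG_PARAVIRT_SPINLOCKS
      /* pvops build: one extra (patchable) call per lock operation */
      static __always_inline void __raw_spin_lock(raw_spinlock_t *lock)
      {
              pv_lock_ops.spin_lock(lock);  /* indirect; patched to direct */
      }
      #else
      /* native build: direct, inlineable ticket lock, no pvops layer */
      static __always_inline void __raw_spin_lock(raw_spinlock_t *lock)
      {
              __ticket_spin_lock(lock);
      }
      #endif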
      
      [ Impact: fix pvops performance regression when running native ]
      Analysed-by: N"Xin Xiaohui" <xiaohui.xin@intel.com>
      Analysed-by: N"Li Xin" <xin.li@intel.com>
      Analysed-by: N"Nakajima Jun" <jun.nakajima@intel.com>
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Acked-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Xen-devel <xen-devel@lists.xensource.com>
      LKML-Reference: <4A0B62F7.5030802@goop.org>
      [ fixed the help text ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  7. May 15, 2009 (1 commit)
  8. May 14, 2009 (1 commit)
    • x86/function-graph: fix constraint for recording old return value · aa512a27
      By Steven Rostedt
      After upgrading from gcc 4.2.2 to 4.4.0, the function graph tracer broke.
      Investigating, I found that in the asm that replaces the return value,
      gcc was using the same register for the old value as it was for the
      new value.
      
      	mov	(addr), old
      	mov	new, (addr)
      
      But if old and new are the same register, we clobber new with old!
      I first thought this was a bug in gcc 4.4.0 and reported it:
      
        http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40132
      
      Andrew Pinski responded (quickly), saying that it was correct gcc behavior
      and the code needed to denote old as an "early clobber".
      
      Instead of "=r"(old), we need "=&r"(old).
      
      [ Impact: keep function graph tracer from breaking with gcc 4.4.0 ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  9. May 11, 2009 (1 commit)
    • x86: mtrr: Fix high_width computation when phys-addr is >= 44bit · 917a0153
      By Yinghai Lu
      Found one system where the CPU address width is 44 bits and the
      MTRR printout is not right:
      
       [    0.000000] MTRR variable ranges enabled:
       [    0.000000]   0 base 0   00000000 mask FF0 00000000 write-back
       [    0.000000]   1 base 10  00000000 mask FFF 80000000 write-back
       [    0.000000]   2 base 0   80000000 mask FFF 80000000 uncachable
       [    0.000000]   3 base 0   7F800000 mask FFF FF800000 uncachable
      
      Li Zefan and Frederic pointed out that high_width could somehow be -4.
      
      It turns out that when phys_addr is 44 bits, size_or_mask will be
      ffffffff,00000000, so ffs(size_or_mask), which only looks at the
      low 32 bits, will return 0.
      
      Check the low 32 bits first to get the correct high_width.
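
      A small standalone sketch of the pitfall and the low/high-half
      check (illustrative only, not the exact mtrr code):

      #include <stdint.h>
      #include <strings.h>          /* ffs() operates on an int */

      /* Find the first set bit of a 64-bit mask.  ffs() only sees the
       * low 32 bits, so a mask like ffffffff,00000000 (44-bit
       * phys_addr) makes a plain ffs() return 0. */
      static int ffs64(uint64_t mask)
      {
              uint32_t lo = (uint32_t)mask;

              if (lo)
                      return ffs(lo);                       /* 1-based */
              if (mask >> 32)
                      return 32 + ffs((uint32_t)(mask >> 32));
              return 0;                                     /* no bits */
      }
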
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Also-analyzed-by: Frederic Weisbecker <fweisbec@gmail.com>
      Also-analyzed-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Zhaolei <zhaolei@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4A026540.8060504@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  10. May 10, 2009 (1 commit)
  11. May 08, 2009 (3 commits)
    • x86: MCE: make cmci_discover_lock irq-safe · e5299926
      By Hidetoshi Seto
      Lockdep reports the warning below when Li tries to offline a CPU:
      
      [  110.835487] =================================
      [  110.835616] [ INFO: inconsistent lock state ]
      [  110.835688] 2.6.30-rc4-00336-g8c9ed899 #52
      [  110.835757] ---------------------------------
      [  110.835828] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
      [  110.835908] swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
      [  110.835982]  (cmci_discover_lock){?.+...}, at: [<ffffffff80236dc0>] cmci_clear+0x30/0x9b
      
      cmci_clear() can be called via smp_call_function_single().
      
      It is better to disable interrupts while holding
      cmci_discover_lock, turning it into an irq-safe lock - we can
      deadlock otherwise.
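
      A kernel-style sketch of the conversion (the call site is
      simplified; only the lock/unlock pattern is the point):

      static DEFINE_SPINLOCK(cmci_discover_lock);

      static void cmci_discover(int banks)
      {
              unsigned long flags;

              /* was: spin_lock(&cmci_discover_lock); */
              spin_lock_irqsave(&cmci_discover_lock, flags);
              /* ... probe and claim CMCI banks ... */
              spin_unlock_irqrestore(&cmci_discover_lock, flags);
      }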
      
      [ Impact: fix possible deadlock in the MCE code ]
      Reported-by: Shaohua Li <shaohua.li@intel.com>
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4A03ED38.8000700@jp.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, kexec: fix crashdump panic with CONFIG_KEXEC_JUMP · 6407df5c
      By Huang Ying
      Tim Starling reported that crashdump will panic, with a kernel
      compiled with CONFIG_KEXEC_JUMP, due to a NULL pointer dereference
      in machine_kexec_32.c:machine_kexec() when dereferencing
      kexec_image.  Referring to:
      
      http://bugzilla.kernel.org/show_bug.cgi?id=13265
      
      This patch fixes the BUG by replacing the global variable
      reference (kexec_image) in machine_kexec() with the local variable
      reference (image), which is more appropriate and will not be NULL.
      
      The same BUG exists in machine_kexec_64.c too, so it is fixed in
      the same way.
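
      A sketch of the shape of the change (simplified; image loading
      and the rest of machine_kexec() are elided):

      void machine_kexec(struct kimage *image)
      {
      #ifdef CONFIG_KEXEC_JUMP
              /* was: if (kexec_image->preserve_context) ...
               * kexec_image is NULL in the crashdump path, while the
               * 'image' argument passed in is always valid. */
              if (image->preserve_context)
                      save_processor_state();
      #endif
              /* ... relocate and jump into the new kernel ... */
      }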
      
      [ Impact: fix crash on kexec ]
      Reported-by: Tim Starling <tstarling@wikimedia.org>
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      LKML-Reference: <1241751101.6259.85.camel@yhuang-dev.sh.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86: fix boot hang in early_reserve_e820() · 61438766
      By Jan Beulich
      If the first non-reserved (sub-)range doesn't fit the size requested,
      an endless loop will be entered. If a range returned from
      find_e820_area_size() turns out to be too small, that range must
      be skipped before calling the function again.
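
      A sketch of the fixed loop shape (the find_e820_area_size()
      calling convention here is assumed, not quoted from the patch):

      u64 start = startt, size = 0;

      for (;;) {
              start = find_e820_area_size(start, &size, align);
              if (start == -1ULL)
                      return 0;       /* ran out of e820 ranges       */
              if (size >= sizet)
                      break;          /* range is big enough: use it  */
              start += size;          /* skip the undersized range -- */
                                      /* this advance was missing     */
      }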
      
      [ Impact: fixes boot hang on some platforms ]
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  12. May 05, 2009 (1 commit)
    • x86: show number of core_siblings instead of thread_siblings in /proc/cpuinfo · 35d11680
      By Andreas Herrmann
      Commit 7ad728f9
      (cpumask: x86: convert cpu_sibling_map/cpu_core_map to cpumask_var_t)
      changed the output of /proc/cpuinfo for siblings:
      
      Example on an AMD Phenom:
      
        physical id : 0
        siblings    : 1
        core id     : 3
        cpu cores   : 4
      
      Before that commit it was:
      
        physical id : 0
        siblings    : 4
        core id     : 3
        cpu cores   : 4
      
      Instead of cpu_core_mask it now uses cpu_sibling_mask to count siblings.
      This is due to the following hunk of above commit:
      
      |  --- a/arch/x86/kernel/cpu/proc.c
      |  +++ b/arch/x86/kernel/cpu/proc.c
      |  @@ -14,7 +14,7 @@ static void show_cpuinfo_core(struct seq_file *m, struct cpuinf
      |          if (c->x86_max_cores * smp_num_siblings > 1) {
      |                  seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
      |                  seq_printf(m, "siblings\t: %d\n",
      |  -                          cpus_weight(per_cpu(cpu_core_map, cpu)));
      |  +                          cpumask_weight(cpu_sibling_mask(cpu)));
      |                  seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id);
      |                  seq_printf(m, "cpu cores\t: %d\n", c->booted_cores);
      |                  seq_printf(m, "apicid\t\t: %d\n", c->apicid);
      
      This was a mistake, because the impact line shows that this side-effect
      was not anticipated:
      
         Impact: reduce per-cpu size for CONFIG_CPUMASK_OFFSTACK=y
      
      So revert the respective hunk to restore the old behavior.
      
      [ Impact: fix sibling-info regression in /proc/cpuinfo ]
      Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      LKML-Reference: <20090504182859.GA29045@alberich.amd.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  13. May 04, 2009 (1 commit)
  14. May 02, 2009 (1 commit)
  15. Apr 24, 2009 (1 commit)
  16. Apr 23, 2009 (2 commits)
  17. Apr 22, 2009 (3 commits)
  18. Apr 21, 2009 (4 commits)
    • x86: avoid theoretical spurious NMI backtraces with CONFIG_CPUMASK_OFFSTACK=y · fcc5c4a2
      By Rusty Russell
      In theory (though not shown in practice) alloc_cpumask_var() doesn't zero
      memory, so CPUs might print an "NMI backtrace for cpu %d" once on boot.
      
      (Bug introduced in fcef8576).
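
      One way to avoid the stale bits, sketched below (whether upstream
      used zalloc_cpumask_var() or passed __GFP_ZERO is an assumption):

      /* Allocate the backtrace mask zeroed, so no leftover bit makes a
       * CPU print "NMI backtrace for cpu %d" spuriously at boot. */
      if (!zalloc_cpumask_var(&backtrace_mask, GFP_KERNEL))
              printk(KERN_WARNING "NMI watchdog: backtrace mask alloc failed\n");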
      
      [ Impact: avoid theoretical syslog noise in rare configs ]
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <alpine.DEB.2.00.0904202113520.10097@gandalf.stny.rr.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: fix boot crash in NMI watchdog with CONFIG_CPUMASK_OFFSTACK=y and flat APIC · 2f537a9f
      By Rusty Russell
      fcef8576 converted backtrace_mask to a
      cpumask_var_t, and assumed check_nmi_watchdog was called before
      nmi_watchdog_tick was ever called.  Steven's oops shows I was wrong.
      
      This is something of a bandaid: I'm not sure we *should* be calling
      nmi_watchdog_tick before check_nmi_watchdog.  Note that gcc eliminates
      this test for the CONFIG_CPUMASK_OFFSTACK=n case.
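
      A sketch of the bandaid (the exact upstream test is an
      assumption; cpumask_available() is a later helper, used here to
      illustrate the NULL-vs-static distinction):

      /* nmi_watchdog_tick() can run before check_nmi_watchdog() has
       * allocated backtrace_mask.  With CONFIG_CPUMASK_OFFSTACK=n the
       * mask is static storage, the test is constant-true, and gcc
       * eliminates it. */
      if (cpumask_available(backtrace_mask) &&
          cpumask_test_cpu(cpu, backtrace_mask)) {
              /* ... do the NMI backtrace for this cpu ... */
      }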
      
      [ Impact: fix boot crash in rare configs ]
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      LKML-Reference: <alpine.DEB.2.00.0904202113520.10097@gandalf.stny.rr.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86-64: fix FPU corruption with signals and preemption · 06c38d5e
      By Suresh Siddha
      In the 64-bit signal delivery path, clear_used_math() was
      happening before saving the currently active FPU state onto the
      user stack for signal handling.  Between clear_used_math() and the
      state store onto the user stack, we can potentially get a page
      fault for the user address and block.  In fact, while testing we
      were hitting the might_fault() in __clear_user(), which can do a
      schedule().
      
      At a later point in time, we will schedule back into this process
      and resume the state save (using the "xsave/fxsave" instruction),
      which can lead to a DNA (device-not-available) fault.  And as
      used_math was cleared before, we will reinit the FP state in the
      DNA fault handler and continue.  This reinit results in losing the
      FPU state of the process.
      
      Move clear_used_math() to a point after the FPU state has been stored
      onto the user stack.
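
      A sketch of the reordering (save_i387_xstate() was the era's save
      helper; return-value handling is simplified):

      /* before (buggy): a DNA fault taken while saving could reinit
       * and thereby lose the FPU state, because used_math was already
       * cleared:
       *
       *     clear_used_math();
       *     save_i387_xstate(buf);      // may fault and schedule
       */

      /* after: mark the math state unused only once the state is
       * safely on the user stack */
      if (save_i387_xstate(buf) < 0)
              return -1;
      clear_used_math();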
      
      This issue has been present for a long time (even before the xsave
      changes and the x86 merge).  But it is easily exposed in the 2.6.28.x and 2.6.29.x
      series because of the __clear_user() in this path, which has an explicit
      __cond_resched() leading to a context switch with CONFIG_PREEMPT_VOLUNTARY.
      
      [ Impact: fix FPU state corruption ]
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: <stable@kernel.org>			[2.6.28.x, 2.6.29.x]
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    • x86/uv: fix for no memory at paddr 0 · fc61e663
      By Jack Steiner
      Fix an end case where the memory at physical address 0 does not
      really exist AND one of the sockets on blade 0 has no active CPUs.
      
      The memory that _appears_ to be at physical address 0 is actually
      memory located at a different address that has been remapped by
      the chipset so that it appears to be at physical address 0.
      
      When determining the UV pnode, the algorithm incorrectly used the
      relocated physical address instead of the actual (global) address.
      
      [ Impact: boot failure on partitioned systems ]
      Signed-off-by: Jack Steiner <steiner@sgi.com>
      LKML-Reference: <20090420132530.GA23156@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  19. Apr 20, 2009 (4 commits)
    • acpi-cpufreq: Do not let get_measured_perf depend on internal variable · d876dfbb
      By Thomas Renninger
      Take the already available policy->cpuinfo.max_freq and get rid of
      the acpi-cpufreq-specific max_freq variable.
      
      This implies that P0 is always the highest frequency, which should
      always be true, as the ACPI spec says: "As a result, the zeroth
      entry describes the highest performance state."
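
      A sketch of the substitution inside get_measured_perf() (the
      aperf/mperf percentage math is simplified):

      /* was: retval = drv_data->max_freq * perf_percent / 100;
       * drv_data->max_freq was a driver-private copy of the P0
       * frequency */
      retval = policy->cpuinfo.max_freq * perf_percent / 100;
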
      Signed-off-by: Thomas Renninger <trenn@suse.de>
      Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
    • d91758f5
    • acpi-cpufreq: Cleanup: Use printk_once · e0e8c4e5
      By Thomas Renninger
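
      For reference, printk_once() fires only the first time the
      statement is reached; a minimal illustration of the pattern it
      replaces (message text made up):

      /* was: a hand-rolled 'static int printed' guard around printk() */
      printk_once(KERN_INFO "acpi-cpufreq: example one-shot message\n");
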
      Signed-off-by: Thomas Renninger <trenn@suse.de>
      Signed-off-by: Len Brown <len.brown@intel.com>
    • x86, acpi_cpufreq: Fix the NULL pointer dereference in get_measured_perf · 093f13e2
      By Venkatesh Pallipadi
      Fix for a regression that was introduced by earlier commit
      18b2646f on Mon Apr 6 11:26:08 2009.
      
      The regression resulted in the error below on systems with
      software coordination, where per_cpu acpi data is not initialized
      for secondary CPUs in a P-state domain.
      
      On Tue, 2009-04-14 at 23:01 -0700, Zhang, Yanmin wrote:
      > My machine hung with kernel 2.6.30-rc2 when a script read
      > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor.
      >
      > The oops happens in get_measured_perf:
      >
      >         cur.aperf.whole = readin.aperf.whole -
      >                                 per_cpu(drv_data, cpu)->saved_aperf;
      >
      > Because per_cpu(drv_data, cpu) == NULL.
      >
      > So function get_measured_perf should check if per_cpu(drv_data,
      > cpu) == NULL and return 0 if it's NULL.
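
      A sketch of that guard at the top of get_measured_perf() (context
      simplified):

      static unsigned int get_measured_perf(struct cpufreq_policy *policy,
                                            unsigned int cpu)
      {
              /* Secondary CPUs in a software-coordinated P-state domain
               * may have no per-cpu driver data; bail out early. */
              if (!per_cpu(drv_data, cpu))
                      return 0;

              /* ... aperf/mperf sampling as before ... */
      }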
      
      --------------sys log------------------
      
      BUG: unable to handle kernel NULL pointer dereference at
      0000000000000020
      IP: [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9
      PGD a7dd88067 PUD a7ccf5067 PMD 0
      Oops: 0000 [#1] SMP
      last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
      CPU 0
      Modules linked in: video output
      Pid: 2091, comm: kondemand/0 Not tainted 2.6.30-rc2 #1 MP Server
      RIP: 0010:[<ffffffff8021af75>]  [<ffffffff8021af75>]
      get_measured_perf+0x4a/0xf9
      RSP: 0018:ffff880a7d56de20  EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 00000046241a42b6 RCX: ffff88004d219000
      RDX: 000000000000b660 RSI: 0000000000000020 RDI: 0000000000000001
      RBP: ffff880a7f052000 R08: 00000046241a42b6 R09: ffffffff807639f0
      R10: 00000000ffffffea R11: ffffffff802207f4 R12: ffff880a7f052000
      R13: ffff88004d20e460 R14: 0000000000ddd5a6 R15: 0000000000000001
      FS:  0000000000000000(0000) GS:ffff88004d200000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 0000000000000020 CR3: 0000000a7f1bf000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process kondemand/0 (pid: 2091, threadinfo ffff880a7d56c000, task
      ffff880a7d4d18c0)
      Stack:
       ffff880a7f052078 ffffffff803efd54 00000046241a42b6 000000462ffa9e95
       0000000000000001 0000000000000001 00000000ffffffea ffffffff8064f41a
       0000000000000012 0000000000000012 ffff880a7f052000 ffffffff80650547
      Call Trace:
       [<ffffffff803efd54>] ? kobject_get+0x12/0x17
       [<ffffffff8064f41a>] ? __cpufreq_driver_getavg+0x42/0x57
       [<ffffffff80650547>] ? do_dbs_timer+0x147/0x272
       [<ffffffff80650400>] ? do_dbs_timer+0x0/0x272
       [<ffffffff802474ca>] ? worker_thread+0x15b/0x1f5
       [<ffffffff8024a02c>] ? autoremove_wake_function+0x0/0x2e
       [<ffffffff8024736f>] ? worker_thread+0x0/0x1f5
       [<ffffffff80249f0d>] ? kthread+0x54/0x83
       [<ffffffff8020c87a>] ? child_rip+0xa/0x20
       [<ffffffff80249eb9>] ? kthread+0x0/0x83
       [<ffffffff8020c870>] ? child_rip+0x0/0x20
      Code: 99 a6 03 00 31 c9 85 c0 0f 85 c3 00 00 00 89 df 4c 8b 44 24 10 48
      c7 c2 60 b6 00 00 48 8b 0c fd e0 30 a5 80 4c 89 c3 48 8b 04 0a <48> 2b
      58 20 48 8b 44 24 18 48 89 1c 24 48 8b 34 0a 48 2b 46 28
      RIP  [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9
       RSP <ffff880a7d56de20>
      CR2: 0000000000000020
      ---[ end trace 2b8fac9a49e19ad4 ]---
      Tested-by: N"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NLen Brown <len.brown@intel.com>
  20. Apr 18, 2009 (1 commit)
    • lockdep, x86: account for irqs enabled in paranoid_exit · 0300e7f1
      By Steven Rostedt
      I hit the check_flags error of lockdep:
      
       WARNING: at kernel/lockdep.c:2893 check_flags+0x1a7/0x1d0()
       [...]
       hardirqs last  enabled at (12567): [<ffffffff8026206a>] local_bh_enable+0xaa/0x110
       hardirqs last disabled at (12569): [<ffffffff80610c76>] int3+0x16/0x40
       softirqs last  enabled at (12566): [<ffffffff80514d2b>] lock_sock_nested+0xfb/0x110
       softirqs last disabled at (12568): [<ffffffff8058454e>] tcp_prequeue_process+0x2e/0xa0
      
      The check_flags warning of lockdep tells me that lockdep thought interrupts
      were disabled, but they were really enabled.
      
      The numbers in the parentheses above show the order of events:
      
       12566: softirqs last enabled:  lock_sock_nested
       12567: hardirqs last enabled:  local_bh_enable
       12568: softirqs last disabled: tcp_prequeue_process
       12569: hardirqs last disabled: int3
      
      int3 is a breakpoint!
      
      Examining this further: I have CONFIG_NET_TCPPROBE enabled, which
      adds breakpoints into the kernel.
      
      The paranoid_exit path on return from int3 does not account for
      interrupts being enabled on return to the kernel.  This code is a
      bit tricky, since it is also used by the NMI handler (when lockdep
      is off), and we must be careful about the swapgs.  We cannot call
      kernel code after the swapgs has been performed.
      
      [ Impact: fix lockdep check_flags warning + self-turn-off ]
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>