1. 07 6月, 2013 1 次提交
  2. 31 5月, 2013 2 次提交
    • P
      x86: Allow FPU to be used at interrupt time even with eagerfpu · 5187b28f
      Pekka Riikonen 提交于
      With the addition of eagerfpu the irq_fpu_usable() now returns false
      negatives especially in the case of ksoftirqd and interrupted idle task,
      two common cases for FPU use for example in networking/crypto.  With
      eagerfpu=off FPU use is possible in those contexts.  This is because of
      the eagerfpu check in interrupted_kernel_fpu_idle():
      
      ...
        * For now, with eagerfpu we will return interrupted kernel FPU
        * state as not-idle. TBD: Ideally we can change the return value
        * to something like __thread_has_fpu(current). But we need to
        * be careful of doing __thread_clear_has_fpu() before saving
        * the FPU etc for supporting nested uses etc. For now, take
        * the simple route!
      ...
       	if (use_eager_fpu())
       		return 0;
      
      As eagerfpu is automatically "on" on those CPUs that also have the
      features like AES-NI this patch changes the eagerfpu check to return 1 in
      case the kernel_fpu_begin() has not been said yet.  Once it has been the
      __thread_has_fpu() will start returning 0.
      
      Notice that with eagerfpu the __thread_has_fpu is always true initially.
      FPU use is thus always possible no matter what task is under us, unless
      the state has already been saved with kernel_fpu_begin().
      
      [ hpa: this is a performance regression, not a correctness regression,
        but since it can be quite serious on CPUs which need encryption at
        interrupt time I am marking this for urgent/stable. ]
      Signed-off-by: NPekka Riikonen <priikone@iki.fi>
      Link: http://lkml.kernel.org/r/alpine.GSO.2.00.1305131356320.18@git.silcnet.org
      Cc: <stable@vger.kernel.org> v3.7+
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      5187b28f
    • J
      x86, crc32-pclmul: Fix build with older binutils · 2baad612
      Jan Beulich 提交于
      binutils prior to 2.18 (e.g. the ones found on SLE10) don't support
      assembling PEXTRD, so a macro based approach like the one for PCLMULQDQ
      in the same file should be used.
      
      This requires making the helper macros capable of recognizing 32-bit
      general purpose register operands.
      
      [ hpa: tagging for stable as it is a low risk build fix ]
      Signed-off-by: NJan Beulich <jbeulich@suse.com>
      Link: http://lkml.kernel.org/r/51A6142A02000078000D99D8@nat28.tlf.novell.com
      Cc: Alexander Boyko <alexander_boyko@xyratex.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: <stable@vger.kernel.org> v3.9
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      2baad612
  3. 29 5月, 2013 2 次提交
    • S
      xen: Clean up apic ipi interface · 1db01b49
      Stefan Bader 提交于
      Commit f447d56d introduced the
      implementation of the PV apic ipi interface. But there were some
      odd things (it seems none of which cause really any issue but
      maybe they should be cleaned up anyway):
       - xen_send_IPI_mask_allbutself (and by that xen_send_IPI_allbutself)
         ignore the passed in vector and only use the CALL_FUNCTION_SINGLE
         vector. While xen_send_IPI_all and xen_send_IPI_mask use the vector.
       - physflat_send_IPI_allbutself is declared unnecessarily. It is never
         used.
      
      This patch tries to clean up those things.
      Signed-off-by: NStefan Bader <stefan.bader@canonical.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      1db01b49
    • Z
      x86-64, init: Fix a possible wraparound bug in switchover in head_64.S · e9d0626e
      Zhang Yanfei 提交于
      In head_64.S, a switchover has been used to handle kernel crossing
      1G, 512G boundaries.
      
      And commit 8170e6be
          x86, 64bit: Use a #PF handler to materialize early mappings on demand
      said:
          During the switchover in head_64.S, before #PF handler is available,
          we use three pages to handle kernel crossing 1G, 512G boundaries with
          sharing page by playing games with page aliasing: the same page is
          mapped twice in the higher-level tables with appropriate wraparound.
      
      But from the switchover code, when we set up the PUD table:
      114         addq    $4096, %rdx
      115         movq    %rdi, %rax
      116         shrq    $PUD_SHIFT, %rax
      117         andl    $(PTRS_PER_PUD-1), %eax
      118         movq    %rdx, (4096+0)(%rbx,%rax,8)
      119         movq    %rdx, (4096+8)(%rbx,%rax,8)
      
      It seems line 119 has a potential bug there. For example,
      if the kernel is loaded at physical address 511G+1008M, that is
          000000000 111111111 111111000 000000000000000000000
      and the kernel _end is 512G+2M, that is
          000000001 000000000 000000001 000000000000000000000
      So in this example, when using the 2nd page to setup PUD (line 114~119),
      rax is 511.
      In line 118, we put rdx which is the address of the PMD page (the 3rd page)
      into entry 511 of the PUD table. But in line 119, the entry we calculate from
      (4096+8)(%rbx,%rax,8) has exceeded the PUD page. IMO, the entry in line
      119 should be wraparound into entry 0 of the PUD table.
      
      The patch fixes the bug.
      Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Link: http://lkml.kernel.org/r/5191DE5A.3020302@cn.fujitsu.comSigned-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: <stable@vger.kernel.org> v3.9
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      e9d0626e
  4. 28 5月, 2013 1 次提交
  5. 21 5月, 2013 2 次提交
  6. 15 5月, 2013 1 次提交
    • J
      time: Revert ALWAYS_USE_PERSISTENT_CLOCK compile time optimizaitons · b4f711ee
      John Stultz 提交于
      Kay Sievers noted that the ALWAYS_USE_PERSISTENT_CLOCK config,
      which enables some minor compile time optimization to avoid
      uncessary code in mostly the suspend/resume path could cause
      problems for userland.
      
      In particular, the dependency for RTC_HCTOSYS on
      !ALWAYS_USE_PERSISTENT_CLOCK, which avoids setting the time
      twice and simplifies suspend/resume, has the side effect
      of causing the /sys/class/rtc/rtcN/hctosys flag to always be
      zero, and this flag is commonly used by udev to setup the
      /dev/rtc symlink to /dev/rtcN, which can cause pain for
      older applications.
      
      While the udev rules could use some work to be less fragile,
      breaking userland should strongly be avoided. Additionally
      the compile time optimizations are fairly minor, and the code
      being optimized is likely to be reworked in the future, so
      lets revert this change.
      Reported-by: NKay Sievers <kay@vrfy.org>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      Cc: stable <stable@vger.kernel.org> #3.9
      Cc: Feng Tang <feng.tang@intel.com>
      Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Link: http://lkml.kernel.org/r/1366828376-18124-1-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      b4f711ee
  7. 14 5月, 2013 1 次提交
    • L
      x86, efi: initial the local variable of DataSize to zero · eccaf52f
      Lee, Chun-Yi 提交于
      That will be better initial the value of DataSize to zero for the input of
      GetVariable(), otherwise we will feed a random value. The debug log of input
      DataSize like this:
      
      ...
      [  195.915612] EFI Variables Facility v0.08 2004-May-17
      [  195.915819] efi: size: 18446744071581821342
      [  195.915969] efi:  size': 18446744071581821342
      [  195.916324] efi: size: 18446612150714306560
      [  195.916632] efi:  size': 18446612150714306560
      [  195.917159] efi: size: 18446612150714306560
      [  195.917453] efi:  size': 18446612150714306560
      ...
      
      The size' is value that was returned by BIOS.
      
      After applied this patch:
      [   82.442042] EFI Variables Facility v0.08 2004-May-17
      [   82.442202] efi: size: 0
      [   82.442360] efi:  size': 1039
      [   82.443828] efi: size: 0
      [   82.444127] efi:  size': 2616
      [   82.447057] efi: size: 0
      [   82.447356] efi:  size': 5832
      ...
      
      Found on Acer Aspire V3 BIOS, it will not return the size of data if we input a
      non-zero DataSize.
      
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NLee, Chun-Yi <jlee@suse.com>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      eccaf52f
  8. 10 5月, 2013 4 次提交
  9. 09 5月, 2013 5 次提交
    • P
      KVM: emulator: emulate SALC · 326f578f
      Paolo Bonzini 提交于
      This is an almost-undocumented instruction available in 32-bit mode.
      I say "almost" undocumented because AMD documents it in their opcode
      maps just to say that it is unavailable in 64-bit mode (sections
      "A.2.1 One-Byte Opcodes" and "B.3 Invalid and Reassigned Instructions
      in 64-Bit Mode").
      
      It is roughly equivalent to "sbb %al, %al" except it does not
      set the flags.  Use fastop to emulate it, but do not use the opcode
      directly because it would fail if the host is 64-bit!
      Reported-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Cc: stable@vger.kernel.org # 3.9
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NGleb Natapov <gleb@redhat.com>
      326f578f
    • P
      KVM: emulator: emulate XLAT · 7fa57952
      Paolo Bonzini 提交于
      This is used by SGABIOS, KVM breaks with emulate_invalid_guest_state=1.
      It is just a MOV in disguise, with a funny source address.
      Reported-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Cc: stable@vger.kernel.org # 3.9
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NGleb Natapov <gleb@redhat.com>
      7fa57952
    • P
      KVM: emulator: emulate AAM · a035d5c6
      Paolo Bonzini 提交于
      This is used by SGABIOS, KVM breaks with emulate_invalid_guest_state=1.
      
      AAM needs the source operand to be unsigned; do the same in AAD as well
      for consistency, even though it does not affect the result.
      Reported-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Cc: stable@vger.kernel.org # 3.9
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NGleb Natapov <gleb@redhat.com>
      a035d5c6
    • K
      x86/microcode: Add local mutex to fix physical CPU hot-add deadlock · 074d72ff
      Konrad Rzeszutek Wilk 提交于
      This can easily be triggered if a new CPU is added (via
      ACPI hotplug mechanism) and from user-space you do:
      
         echo 1 > /sys/devices/system/cpu/cpu3/online
      
      (or wait for UDEV to do it) on a newly appeared physical CPU.
      
      The deadlock is that the "store_online" in drivers/base/cpu.c
      takes the cpu_hotplug_driver_lock() lock, then calls "cpu_up".
      "cpu_up" eventually ends up calling "save_mc_for_early"
      which also takes the cpu_hotplug_driver_lock() lock.
      
      And here is that lockdep thinks of it:
      
       smpboot: Stack at about ffff880075c39f44
       smpboot: CPU3: has booted.
       microcode: CPU3 sig=0x206a7, pf=0x2, revision=0x25
      
       =============================================
       [ INFO: possible recursive locking detected ]
       3.9.0upstream-10129-g167af0e #1 Not tainted
       ---------------------------------------------
       sh/2487 is trying to acquire lock:
        (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81075512>] cpu_hotplug_driver_lock+0x12/0x20
      
       but task is already holding lock:
        (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81075512>] cpu_hotplug_driver_lock+0x12/0x20
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(x86_cpu_hotplug_driver_mutex);
         lock(x86_cpu_hotplug_driver_mutex);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       6 locks held by sh/2487:
        #0:  (sb_writers#5){.+.+.+}, at: [<ffffffff811ca48d>] vfs_write+0x17d/0x190
        #1:  (&buffer->mutex){+.+.+.}, at: [<ffffffff812464ef>] sysfs_write_file+0x3f/0x160
        #2:  (s_active#20){.+.+.+}, at: [<ffffffff81246578>] sysfs_write_file+0xc8/0x160
        #3:  (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81075512>] cpu_hotplug_driver_lock+0x12/0x20
        #4:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff810961c2>] cpu_maps_update_begin+0x12/0x20
        #5:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff810962a7>] cpu_hotplug_begin+0x27/0x60
      Suggested-and-Acked-by: NBorislav Petkov <bp@alien8.de>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: fenghua.yu@intel.com
      Cc: xen-devel@lists.xensource.com
      Cc: stable@vger.kernel.org # for v3.9
      Link: http://lkml.kernel.org/r/1368029583-23337-1-git-send-email-konrad.wilk@oracle.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      074d72ff
    • G
      KVM: VMX: fix halt emulation while emulating invalid guest sate · 8d76c49e
      Gleb Natapov 提交于
      The invalid guest state emulation loop does not check halt_request
      which causes 100% cpu loop while guest is in halt and in invalid
      state, but more serious issue is that this leaves halt_request set, so
      random instruction emulated by vm86 #GP exit can be interpreted
      as halt which causes guest hang. Fix both problems by handling
      halt_request in emulation loop.
      Reported-by: NTomas Papan <tomas.papan@gmail.com>
      Tested-by: NTomas Papan <tomas.papan@gmail.com>
      Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
      CC: stable@vger.kernel.org
      Signed-off-by: NGleb Natapov <gleb@redhat.com>
      8d76c49e
  10. 08 5月, 2013 4 次提交
  11. 07 5月, 2013 3 次提交
  12. 06 5月, 2013 1 次提交
    • K
      xen/vcpu/pvhvm: Fix vcpu hotplugging hanging. · 7f1fc268
      Konrad Rzeszutek Wilk 提交于
      If a user did:
      
      	echo 0 > /sys/devices/system/cpu/cpu1/online
      	echo 1 > /sys/devices/system/cpu/cpu1/online
      
      we would (this a build with DEBUG enabled) get to:
      smpboot: ++++++++++++++++++++=_---CPU UP  1
      .. snip..
      smpboot: Stack at about ffff880074c0ff44
      smpboot: CPU1: has booted.
      
      and hang. The RCU mechanism would kick in an try to IPI the CPU1
      but the IPIs (and all other interrupts) would never arrive at the
      CPU1. At first glance at least. A bit digging in the hypervisor
      trace shows that (using xenanalyze):
      
      [vla] d4v1 vec 243 injecting
         0.043163027 --|x d4v1 intr_window vec 243 src 5(vector) intr f3
      ]  0.043163639 --|x d4v1 vmentry cycles 1468
      ]  0.043164913 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254
         0.043164913 --|x d4v1 inj_virq vec 243  real
        [vla] d4v1 vec 243 injecting
         0.043164913 --|x d4v1 intr_window vec 243 src 5(vector) intr f3
      ]  0.043165526 --|x d4v1 vmentry cycles 1472
      ]  0.043166800 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254
         0.043166800 --|x d4v1 inj_virq vec 243  real
        [vla] d4v1 vec 243 injecting
      
      there is a pending event (subsequent debugging shows it is the IPI
      from the VCPU0 when smpboot.c on VCPU1 has done
      "set_cpu_online(smp_processor_id(), true)") and the guest VCPU1 is
      interrupted with the callback IPI (0xf3 aka 243) which ends up calling
      __xen_evtchn_do_upcall.
      
      The __xen_evtchn_do_upcall seems to do *something* but not acknowledge
      the pending events. And the moment the guest does a 'cli' (that is the
      ffffffff81673254 in the log above) the hypervisor is invoked again to
      inject the IPI (0xf3) to tell the guest it has pending interrupts.
      This repeats itself forever.
      
      The culprit was the per_cpu(xen_vcpu, cpu) pointer. At the bootup
      we set each per_cpu(xen_vcpu, cpu) to point to the
      shared_info->vcpu_info[vcpu] but later on use the VCPUOP_register_vcpu_info
      to register per-CPU  structures (xen_vcpu_setup).
      This is used to allow events for more than 32 VCPUs and for performance
      optimizations reasons.
      
      When the user performs the VCPU hotplug we end up calling the
      the xen_vcpu_setup once more. We make the hypercall which returns
      -EINVAL as it does not allow multiple registration calls (and
      already has re-assigned where the events are being set). We pick
      the fallback case and set per_cpu(xen_vcpu, cpu) to point to the
      shared_info->vcpu_info[vcpu] (which is a good fallback during bootup).
      However the hypervisor is still setting events in the register
      per-cpu structure (per_cpu(xen_vcpu_info, cpu)).
      
      As such when the events are set by the hypervisor (such as timer one),
      and when we iterate in __xen_evtchn_do_upcall we end up reading stale
      events from the shared_info->vcpu_info[vcpu] instead of the
      per_cpu(xen_vcpu_info, cpu) structures. Hence we never acknowledge the
      events that the hypervisor has set and the hypervisor keeps on reminding
      us to ack the events which we never do.
      
      The fix is simple. Don't on the second time when xen_vcpu_setup is
      called over-write the per_cpu(xen_vcpu, cpu) if it points to
      per_cpu(xen_vcpu_info).
      Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
      CC: stable@vger.kernel.org
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      7f1fc268
  13. 05 5月, 2013 1 次提交
  14. 04 5月, 2013 2 次提交
  15. 03 5月, 2013 4 次提交
  16. 01 5月, 2013 6 次提交
    • S
      Kconfig: consolidate CONFIG_DEBUG_STRICT_USER_COPY_CHECKS · 446f24d1
      Stephen Boyd 提交于
      The help text for this config is duplicated across the x86, parisc, and
      s390 Kconfig.debug files.  Arnd Bergman noted that the help text was
      slightly misleading and should be fixed to state that enabling this
      option isn't a problem when using pre 4.4 gcc.
      
      To simplify the rewording, consolidate the text into lib/Kconfig.debug
      and modify it there to be more explicit about when you should say N to
      this config.
      
      Also, make the text a bit more generic by stating that this option
      enables compile time checks so we can cover architectures which emit
      warnings vs.  ones which emit errors.  The details of how an
      architecture decided to implement the checks isn't as important as the
      concept of compile time checking of copy_from_user() calls.
      
      While we're doing this, remove all the copy_from_user_overflow() code
      that's duplicated many times and place it into lib/ so that any
      architecture supporting this option can get the function for free.
      Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
      Acked-by: NArnd Bergmann <arnd@arndb.de>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Acked-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Acked-by: NHelge Deller <deller@gmx.de>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      446f24d1
    • O
      coredump: factor out the setting of PF_DUMPCORE · 079148b9
      Oleg Nesterov 提交于
      Cleanup.  Every linux_binfmt->core_dump() sets PF_DUMPCORE, move this into
      zap_threads() called by do_coredump().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NMandeep Singh Baines <msb@chromium.org>
      Cc: Neil Horman <nhorman@redhat.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      079148b9
    • T
      dump_stack: unify debug information printed by show_regs() · a43cb95d
      Tejun Heo 提交于
      show_regs() is inherently arch-dependent but it does make sense to print
      generic debug information and some archs already do albeit in slightly
      different forms.  This patch introduces a generic function to print debug
      information from show_regs() so that different archs print out the same
      information and it's much easier to modify what's printed.
      
      show_regs_print_info() prints out the same debug info as dump_stack()
      does plus task and thread_info pointers.
      
      * Archs which didn't print debug info now do.
      
        alpha, arc, blackfin, c6x, cris, frv, h8300, hexagon, ia64, m32r,
        metag, microblaze, mn10300, openrisc, parisc, score, sh64, sparc,
        um, xtensa
      
      * Already prints debug info.  Replaced with show_regs_print_info().
        The printed information is superset of what used to be there.
      
        arm, arm64, avr32, mips, powerpc, sh32, tile, unicore32, x86
      
      * s390 is special in that it used to print arch-specific information
        along with generic debug info.  Heiko and Martin think that the
        arch-specific extra isn't worth keeping s390 specfic implementation.
        Converted to use the generic version.
      
      Note that now all archs print the debug info before actual register
      dumps.
      
      An example BUG() dump follows.
      
       kernel BUG at /work/os/work/kernel/workqueue.c:4841!
       invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7
       Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
       task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000
       RIP: 0010:[<ffffffff8234a07e>]  [<ffffffff8234a07e>] init_workqueues+0x4/0x6
       RSP: 0000:ffff88007c861ec8  EFLAGS: 00010246
       RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001
       RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a
       RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a
       R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
       FS:  0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Stack:
        ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650
        0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d
        ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760
       Call Trace:
        [<ffffffff81000312>] do_one_initcall+0x122/0x170
        [<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8
        [<ffffffff81c47760>] ? rest_init+0x140/0x140
        [<ffffffff81c4776e>] kernel_init+0xe/0xf0
        [<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0
        [<ffffffff81c47760>] ? rest_init+0x140/0x140
        ...
      
      v2: Typo fix in x86-32.
      
      v3: CPU number dropped from show_regs_print_info() as
          dump_stack_print_info() has been updated to print it.  s390
          specific implementation dropped as requested by s390 maintainers.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NJesper Nilsson <jesper.nilsson@axis.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>		[tile bits]
      Acked-by: Richard Kuo <rkuo@codeaurora.org>		[hexagon bits]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a43cb95d
    • T
      dump_stack: implement arch-specific hardware description in task dumps · 98e5e1bf
      Tejun Heo 提交于
      x86 and ia64 can acquire extra hardware identification information
      from DMI and print it along with task dumps; however, the usage isn't
      consistent.
      
      * x86 show_regs() collects vendor, product and board strings and print
        them out with PID, comm and utsname.  Some of the information is
        printed again later in the same dump.
      
      * warn_slowpath_common() explicitly accesses the DMI board and prints
        it out with "Hardware name:" label.  This applies to both x86 and
        ia64 but is irrelevant on all other archs.
      
      * ia64 doesn't show DMI information on other non-WARN dumps.
      
      This patch introduces arch-specific hardware description used by
      dump_stack().  It can be set by calling dump_stack_set_arch_desc()
      during boot and, if exists, printed out in a separate line with
      "Hardware name:" label.
      
      dmi_set_dump_stack_arch_desc() is added which sets arch-specific
      description from DMI data.  It uses dmi_ids_string[] which is set from
      dmi_present() used for DMI debug message.  It is superset of the
      information x86 show_regs() is using.  The function is called from x86
      and ia64 boot code right after dmi_scan_machine().
      
      This makes the explicit DMI handling in warn_slowpath_common()
      unnecessary.  Removed.
      
      show_regs() isn't yet converted to use generic debug information
      printing and this patch doesn't remove the duplicate DMI handling in
      x86 show_regs().  The next patch will unify show_regs() handling and
      remove the duplication.
      
      An example WARN dump follows.
      
       WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505()
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #3
       Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
        0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48
        ffffffff8108f500 ffffffff82228240 0000000000000040 ffffffff8234a08e
        0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58
       Call Trace:
        [<ffffffff81c614dc>] dump_stack+0x19/0x1b
        [<ffffffff8108f500>] warn_slowpath_common+0x70/0xa0
        [<ffffffff8108f54a>] warn_slowpath_null+0x1a/0x20
        [<ffffffff8234a0c3>] init_workqueues+0x35/0x505
        ...
      
      v2: Use the same string as the debug message from dmi_present() which
          also contains BIOS information.  Move hardware name into its own
          line as warn_slowpath_common() did.  This change was suggested by
          Bjorn Helgaas.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      98e5e1bf
    • T
      dump_stack: consolidate dump_stack() implementations and unify their behaviors · 196779b9
      Tejun Heo 提交于
      Both dump_stack() and show_stack() are currently implemented by each
      architecture.  show_stack(NULL, NULL) dumps the backtrace for the
      current task as does dump_stack().  On some archs, dump_stack() prints
      extra information - pid, utsname and so on - in addition to the
      backtrace while the two are identical on other archs.
      
      The usages in arch-independent code of the two functions indicate
      show_stack(NULL, NULL) should print out bare backtrace while
      dump_stack() is used for debugging purposes when something went wrong,
      so it does make sense to print additional information on the task which
      triggered dump_stack().
      
      There's no reason to require archs to implement two separate but mostly
      identical functions.  It leads to unnecessary subtle information.
      
      This patch expands the dummy fallback dump_stack() implementation in
      lib/dump_stack.c such that it prints out debug information (taken from
      x86) and invokes show_stack(NULL, NULL) and drops arch-specific
      dump_stack() implementations in all archs except blackfin.  Blackfin's
      dump_stack() does something wonky that I don't understand.
      
      Debug information can be printed separately by calling
      dump_stack_print_info() so that arch-specific dump_stack()
      implementation can still emit the same debug information.  This is used
      in blackfin.
      
      This patch brings the following behavior changes.
      
      * On some archs, an extra level in backtrace for show_stack() could be
        printed.  This is because the top frame was determined in
        dump_stack() on those archs while generic dump_stack() can't do that
        reliably.  It can be compensated by inlining dump_stack() but not
        sure whether that'd be necessary.
      
      * Most archs didn't use to print debug info on dump_stack().  They do
        now.
      
      An example WARN dump follows.
      
       WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505()
       Hardware name: empty
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #9
        0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48
        ffffffff8108f50f ffffffff82228240 0000000000000040 ffffffff8234a03c
        0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58
       Call Trace:
        [<ffffffff81c614dc>] dump_stack+0x19/0x1b
        [<ffffffff8108f50f>] warn_slowpath_common+0x7f/0xc0
        [<ffffffff8108f56a>] warn_slowpath_null+0x1a/0x20
        [<ffffffff8234a071>] init_workqueues+0x35/0x505
        ...
      
      v2: CPU number added to the generic debug info as requested by s390
          folks and dropped the s390 specific dump_stack().  This loses %ksp
          from the debug message which the maintainers think isn't important
          enough to keep the s390-specific dump_stack() implementation.
      
          dump_stack_print_info() is moved to kernel/printk.c from
          lib/dump_stack.c.  Because linkage is per objecct file,
          dump_stack_print_info() living in the same lib file as generic
          dump_stack() means that archs which implement custom dump_stack()
          - at this point, only blackfin - can't use dump_stack_print_info()
          as that will bring in the generic version of dump_stack() too.  v1
          The v1 patch broke build on blackfin due to this issue.  The build
          breakage was reported by Fengguang Wu.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NVineet Gupta <vgupta@synopsys.com>
      Acked-by: NJesper Nilsson <jesper.nilsson@axis.com>
      Acked-by: NVineet Gupta <vgupta@synopsys.com>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	[s390 bits]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Acked-by: Richard Kuo <rkuo@codeaurora.org>		[hexagon bits]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      196779b9
    • T
      x86: don't show trace beyond show_stack(NULL, NULL) · a77f2a4e
      Tejun Heo 提交于
      There are multiple ways a task can be dumped - explicit call to
      dump_stack(), triggering WARN() or BUG(), through sysrq-t and so on.
      Most of what gets printed is upto each architecture and the current
      state is not particularly pretty.  Different pieces of information are
      presented differently depending on which path the dump takes and which
      architecture it's running on.  This is messy for no good reason and
      makes it exceedingly difficult to add or modify debug information to
      task dumps.
      
      In all archs except for s390, there's nothing arch-specific about the
      printed debug information.  This patchset updates all those archs to use
      the same helpers to consistently print out the same debug information.
      
      An example WARN dump after this patchset.
      
       WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505()
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #3
       Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
        0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48
        ffffffff8108f500 ffffffff82228240 0000000000000040 ffffffff8234a08e
        0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58
       Call Trace:
        [<ffffffff81c614dc>] dump_stack+0x19/0x1b
        [<ffffffff8108f500>] warn_slowpath_common+0x70/0xa0
        [<ffffffff8108f54a>] warn_slowpath_null+0x1a/0x20
        [<ffffffff8234a0c3>] init_workqueues+0x35/0x505
        ...
      
      And BUG dump.
      
       kernel BUG at kernel/workqueue.c:4841!
       invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7
       Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
       task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000
       RIP: 0010:[<ffffffff8234a07e>]  [<ffffffff8234a07e>] init_workqueues+0x4/0x6
       RSP: 0000:ffff88007c861ec8  EFLAGS: 00010246
       RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001
       RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a
       RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a
       R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
       FS:  0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Stack:
        ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650
        0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d
        ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760
       Call Trace:
        [<ffffffff81000312>] do_one_initcall+0x122/0x170
        [<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8
        [<ffffffff81c47760>] ? rest_init+0x140/0x140
        [<ffffffff81c4776e>] kernel_init+0xe/0xf0
        [<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0
        [<ffffffff81c47760>] ? rest_init+0x140/0x140
        ...
      
      This patchset contains the following seven patches.
      
       0001-x86-don-t-show-trace-beyond-show_stack-NULL-NULL.patch
       0002-sparc32-make-show_stack-acquire-fp-if-_ksp-is-not-sp.patch
       0003-dump_stack-consolidate-dump_stack-implementations-an.patch
       0004-dmi-morph-dmi_dump_ids-into-dmi_format_ids-which-for.patch
       0005-dump_stack-implement-arch-specific-hardware-descript.patch
       0006-dump_stack-unify-debug-information-printed-by-show_r.patch
       0007-arc-print-fatal-signals-reduce-duplicated-informatio.patch
      
      0001-0002 update stack dumping functions in x86 and sparc32 in
      preparation.
      
      0003 makes all arches except blackfin use generic dump_stack().
      blackfin still uses the generic helper to print the same info.
      
      0004-0005 properly abstract DMI identifier printing in WARN() and
      show_regs() so that all dumps print out the information.  This enables
      show_regs() to use the same debug info message.
      
      0006 updates show_regs() of all arches to use a common generic helper
      to print debug info.
      
      0007 removes somem duplicate information from arc dumps.
      
      While this patchset changes how debug info is printed on some archs,
      the printed information is always superset of what used to be there.
      
      This patchset makes task dump debug messages consistent and enables
      adding more information.  Workqueue is scheduled to add worker
      information including the workqueue in use and work item specific
      description.
      
      While this patch touches a lot of archs, it isn't too likely to cause
      non-trivial conflicts with arch-specfic changes and would probably be
      best to route together either through -mm.
      
      x86 is tested but other archs are either only compile tested or not
      tested at all.  Changes to most archs are generally trivial.
      
      This patch:
      
      show_stack(current or NULL, NULL) is used to print the backtrace of the
      current task.  As trace beyond the function itself isn't of much
      interest to anyone, don't show it by determining sp and bp in
      show_stack()'s frame and passing them to show_stack_log_lvl().
      
      This brings show_stack(NULL, NULL)'s behavior in line with
      dump_stack().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a77f2a4e