1. 02 12月, 2016 1 次提交
    • P
      ACPI / APEI: Fix NMI notification handling · a545715d
      Prarit Bhargava 提交于
      When removing and adding cpu 0 on a system with GHES NMI the following stack
      trace is seen when re-adding the cpu:
      
      WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1349 setup_local_APIC+
      Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache coretemp intel_ra
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc6+ #2
      Call Trace:
       dump_stack+0x63/0x8e
       __warn+0xd1/0xf0
       warn_slowpath_null+0x1d/0x20
       setup_local_APIC+0x275/0x370
       apic_ap_setup+0xe/0x20
       start_secondary+0x48/0x180
       set_init_arg+0x55/0x55
       early_idt_handler_array+0x120/0x120
       x86_64_start_reservations+0x2a/0x2c
       x86_64_start_kernel+0x13d/0x14c
      
      During the cpu bringup, wakeup_cpu_via_init_nmi() is called and issues an
      NMI on CPU 0.  The GHES NMI handler, ghes_notify_nmi() runs the
      ghes_proc_irq_work work queue which ends up setting IRQ_WORK_VECTOR
      (0xf6).  The "faulty" IR line set at arch/x86/kernel/apic/apic.c:1349 is  also
      0xf6 (specifically APIC IRR for irqs 255 to 224 is 0x400000) which confirms
      that something has set the IRQ_WORK_VECTOR line prior to the APIC being
      initialized.
      
      Commit 2383844d ("GHES: Elliminate double-loop in the NMI handler")
      incorrectly modified the behavior such that the handler returns
      NMI_HANDLED only if an error was processed, and incorrectly runs the ghes
      work queue for every NMI.
      
      This patch modifies the ghes_proc_irq_work() to run as it did prior to
      2383844d ("GHES: Elliminate double-loop in the NMI handler") by
      properly returning NMI_HANDLED and only calling the work queue if
      NMI_HANDLED has been set.
      
      Fixes: 2383844d (GHES: Elliminate double-loop in the NMI handler)
      Signed-off-by: NPrarit Bhargava <prarit@redhat.com>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      a545715d
  2. 28 11月, 2016 3 次提交
  3. 27 11月, 2016 8 次提交
  4. 26 11月, 2016 22 次提交
  5. 25 11月, 2016 6 次提交
    • J
      parisc: Also flush data TLB in flush_icache_page_asm · 5035b230
      John David Anglin 提交于
      This is the second issue I noticed in reviewing the parisc TLB code.
      
      The fic instruction may use either the instruction or data TLB in
      flushing the instruction cache.  Thus, on machines with a split TLB, we
      should also flush the data TLB after setting up the temporary alias
      registers.
      
      Although this has no functional impact, I changed the pdtlb and pitlb
      instructions to consistently use the index register %r0.  These
      instructions do not support integer displacements.
      
      Tested on rp3440 and c8000.
      Signed-off-by: NJohn David Anglin  <dave.anglin@bell.net>
      Cc: <stable@vger.kernel.org> # v3.16+
      Signed-off-by: NHelge Deller <deller@gmx.de>
      5035b230
    • J
      parisc: Fix race in pci-dma.c · c0452fb9
      John David Anglin 提交于
      We are still troubled by occasional random segmentation faults and
      memory memory corruption on SMP machines.  The causes quite a few
      package builds to fail on the Debian buildd machines for parisc.  When
      gcc-6 failed to build three times in a row, I looked again at the TLB
      related code.  I found a couple of issues.  This is the first.
      
      In general, we need to ensure page table updates and corresponding TLB
      purges are atomic.  The attached patch fixes an instance in pci-dma.c
      where the page table update was not guarded by the TLB lock.
      
      Tested on rp3440 and c8000.  So far, no further random segmentation
      faults have been observed.
      Signed-off-by: NJohn David Anglin  <dave.anglin@bell.net>
      Cc: <stable@vger.kernel.org> # v3.16+
      Signed-off-by: NHelge Deller <deller@gmx.de>
      c0452fb9
    • H
      parisc: Switch to generic sched_clock implementation · 43b1f6ab
      Helge Deller 提交于
      Drop the open-coded sched_clock() function and replace it by the provided
      GENERIC_SCHED_CLOCK implementation.  We have seen quite some hung tasks in the
      past, which seem to be fixed by this patch.
      Signed-off-by: NHelge Deller <deller@gmx.de>
      Cc: <stable@vger.kernel.org> # v4.7+
      Signed-off-by: NHelge Deller <deller@gmx.de>
      43b1f6ab
    • J
      parisc: Fix races in parisc_setup_cache_timing() · 741dc7bf
      John David Anglin 提交于
      Helge reported to me the following startup crash:
      
      [    0.000000] Linux version 4.8.0-1-parisc64-smp (debian-kernel@lists.debian.org) (gcc version 5.4.1 20161019 (GCC) ) #1 SMP Debian 4.8.7-1 (2016-11-13)
      [    0.000000] The 64-bit Kernel has started...
      [    0.000000] Kernel default page size is 4 KB. Huge pages enabled with 1 MB physical and 2 MB virtual size.
      [    0.000000] Determining PDC firmware type: System Map.
      [    0.000000] model 9000/785/J5000
      [    0.000000] Total Memory: 2048 MB
      [    0.000000] Memory: 2018528K/2097152K available (9272K kernel code, 3053K rwdata, 1319K rodata, 1024K init, 840K bss, 78624K reserved, 0K cma-reserved)
      [    0.000000] virtual kernel memory layout:
      [    0.000000]     vmalloc : 0x0000000000008000 - 0x000000003f000000   (1007 MB)
      [    0.000000]     memory  : 0x0000000040000000 - 0x00000000c0000000   (2048 MB)
      [    0.000000]       .init : 0x0000000040100000 - 0x0000000040200000   (1024 kB)
      [    0.000000]       .data : 0x0000000040b0e000 - 0x0000000040f533e0   (4372 kB)
      [    0.000000]       .text : 0x0000000040200000 - 0x0000000040b0e000   (9272 kB)
      [    0.768910] Brought up 1 CPUs
      [    0.992465] NET: Registered protocol family 16
      [    2.429981] Releasing cpu 1 now, hpa=fffffffffffa2000
      [    2.635751] CPU(s): 2 out of 2 PA8500 (PCX-W) at 440.000000 MHz online
      [    2.726692] Setting cache flush threshold to 1024 kB
      [    2.729932] Not-handled unaligned insn 0x43ffff80
      [    2.798114] Setting TLB flush threshold to 140 kB
      [    2.928039] Unaligned handler failed, ret = -1
      [    3.000419]       _______________________________
      [    3.000419]      < Your System ate a SPARC! Gah! >
      [    3.000419]       -------------------------------
      [    3.000419]              \   ^__^
      [    3.000419]                  (__)\       )\/\
      [    3.000419]                   U  ||----w |
      [    3.000419]                      ||     ||
      [    9.340055] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0-1-parisc64-smp #1 Debian 4.8.7-1
      [    9.448082] task: 00000000bfd48060 task.stack: 00000000bfd50000
      [    9.528040]
      [   10.760029] IASQ: 0000000000000000 0000000000000000 IAOQ: 000000004025d154 000000004025d158
      [   10.868052]  IIR: 43ffff80    ISR: 0000000000340000  IOR: 000001ff54150960
      [   10.960029]  CPU:        1   CR30: 00000000bfd50000 CR31: 0000000011111111
      [   11.052057]  ORIG_R28: 000000004021e3b4
      [   11.100045]  IAOQ[0]: irq_exit+0x94/0x120
      [   11.152062]  IAOQ[1]: irq_exit+0x98/0x120
      [   11.208031]  RP(r2): irq_exit+0xb8/0x120
      [   11.256074] Backtrace:
      [   11.288067]  [<00000000402cd944>] cpu_startup_entry+0x1e4/0x598
      [   11.368058]  [<0000000040109528>] smp_callin+0x2c0/0x2f0
      [   11.436308]  [<00000000402b53fc>] update_curr+0x18c/0x2d0
      [   11.508055]  [<00000000402b73b8>] dequeue_entity+0x2c0/0x1030
      [   11.584040]  [<00000000402b3cc0>] set_next_entity+0x80/0xd30
      [   11.660069]  [<00000000402c1594>] pick_next_task_fair+0x614/0x720
      [   11.740085]  [<000000004020dd34>] __schedule+0x394/0xa60
      [   11.808054]  [<000000004020e488>] schedule+0x88/0x118
      [   11.876039]  [<0000000040283d3c>] rescuer_thread+0x4d4/0x5b0
      [   11.948090]  [<000000004028fc4c>] kthread+0x1ec/0x248
      [   12.016053]  [<0000000040205020>] end_fault_vector+0x20/0xc0
      [   12.092239]  [<00000000402050c0>] _switch_to_ret+0x0/0xf40
      [   12.164044]
      [   12.184036] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0-1-parisc64-smp #1 Debian 4.8.7-1
      [   12.244040] Backtrace:
      [   12.244040]  [<000000004021c480>] show_stack+0x68/0x80
      [   12.244040]  [<00000000406f332c>] dump_stack+0xec/0x168
      [   12.244040]  [<000000004021c74c>] die_if_kernel+0x25c/0x430
      [   12.244040]  [<000000004022d320>] handle_unaligned+0xb48/0xb50
      [   12.244040]
      [   12.632066] ---[ end trace 9ca05a7215c7bbb2 ]---
      [   12.692036] Kernel panic - not syncing: Attempted to kill the idle task!
      
      We have the insn 0x43ffff80 in IIR but from IAOQ we should have:
         4025d150:   0f f3 20 df     ldd,s r19(r31),r31
         4025d154:   0f 9f 00 9c     ldw r31(ret0),ret0
         4025d158:   bf 80 20 58     cmpb,*<> r0,ret0,4025d18c <irq_exit+0xcc>
      
      Cpu0 has just completed running parisc_setup_cache_timing:
      
      [    2.429981] Releasing cpu 1 now, hpa=fffffffffffa2000
      [    2.635751] CPU(s): 2 out of 2 PA8500 (PCX-W) at 440.000000 MHz online
      [    2.726692] Setting cache flush threshold to 1024 kB
      [    2.729932] Not-handled unaligned insn 0x43ffff80
      [    2.798114] Setting TLB flush threshold to 140 kB
      [    2.928039] Unaligned handler failed, ret = -1
      
      From the backtrace, cpu1 is in smp_callin:
      
      void __init smp_callin(void)
      {
             int slave_id = cpu_now_booting;
      
             smp_cpu_init(slave_id);
             preempt_disable();
      
             flush_cache_all_local(); /* start with known state */
             flush_tlb_all_local(NULL);
      
             local_irq_enable();  /* Interrupts have been off until now */
      
             cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
      
      So, it has just flushed its caches and the TLB. It would seem either the
      flushes in parisc_setup_cache_timing or smp_callin have corrupted kernel
      memory.
      
      The attached patch reworks parisc_setup_cache_timing to remove the races
      in setting the cache and TLB flush thresholds. It also corrects the
      number of bytes flushed in the TLB calculation.
      
      The patch flushes the cache and TLB on cpu0 before starting the
      secondary processors so that they are started from a known state.
      
      Tested with a few reboots on c8000.
      Signed-off-by: NJohn David Anglin  <dave.anglin@bell.net>
      Cc: <stable@vger.kernel.org> # v3.18+
      Signed-off-by: NHelge Deller <deller@gmx.de>
      741dc7bf
    • V
      mfd: wm8994-core: Don't use managed regulator bulk get API · 1a41741f
      Viresh Kumar 提交于
      The kernel WARNs and then crashes today if wm8994_device_init() fails
      after calling devm_regulator_bulk_get().
      
      That happens because there are multiple devices involved here and the
      order in which managed resources are freed isn't correct.
      
      The regulators are added as children of wm8994->dev.  Whereas,
      devm_regulator_bulk_get() receives wm8994->dev as the device, though it
      gets the same regulators which were added as children of wm8994->dev
      earlier.
      
      During failures, the children are removed first and the core eventually
      calls regulator_unregister() for them. As regulator_put() was never done
      for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at
      
      	WARN_ON(rdev->open_count);
      
      And eventually it crashes from debugfs_remove_recursive().
      
      --------x------------------x----------------
      
       wm8994 3-001a: Device is not a WM8994, ID is 0
       ------------[ cut here ]------------
       WARNING: CPU: 0 PID: 1 at /mnt/ssd/all/work/repos/devel/linux/drivers/regulator/core.c:4072 regulator_unregister+0xc8/0xd0
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc6-00154-g54fe84cbd50b #41
       Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
       [<c010e24c>] (unwind_backtrace) from [<c010af38>] (show_stack+0x10/0x14)
       [<c010af38>] (show_stack) from [<c032a1c4>] (dump_stack+0x88/0x9c)
       [<c032a1c4>] (dump_stack) from [<c011a98c>] (__warn+0xe8/0x100)
       [<c011a98c>] (__warn) from [<c011aa54>] (warn_slowpath_null+0x20/0x28)
       [<c011aa54>] (warn_slowpath_null) from [<c0384a0c>] (regulator_unregister+0xc8/0xd0)
       [<c0384a0c>] (regulator_unregister) from [<c0406434>] (release_nodes+0x16c/0x1dc)
       [<c0406434>] (release_nodes) from [<c04039c4>] (__device_release_driver+0x8c/0x110)
       [<c04039c4>] (__device_release_driver) from [<c0403a64>] (device_release_driver+0x1c/0x28)
       [<c0403a64>] (device_release_driver) from [<c0402b24>] (bus_remove_device+0xd8/0x104)
       [<c0402b24>] (bus_remove_device) from [<c03ffcd8>] (device_del+0x10c/0x218)
       [<c03ffcd8>] (device_del) from [<c0404e4c>] (platform_device_del+0x1c/0x88)
       [<c0404e4c>] (platform_device_del) from [<c0404ec4>] (platform_device_unregister+0xc/0x20)
       [<c0404ec4>] (platform_device_unregister) from [<c0428bc0>] (mfd_remove_devices_fn+0x5c/0x64)
       [<c0428bc0>] (mfd_remove_devices_fn) from [<c03ff9d8>] (device_for_each_child_reverse+0x4c/0x78)
       [<c03ff9d8>] (device_for_each_child_reverse) from [<c04288c4>] (mfd_remove_devices+0x20/0x30)
       [<c04288c4>] (mfd_remove_devices) from [<c042758c>] (wm8994_device_init+0x2ac/0x7f0)
       [<c042758c>] (wm8994_device_init) from [<c04f14a8>] (i2c_device_probe+0x178/0x1fc)
       [<c04f14a8>] (i2c_device_probe) from [<c04036fc>] (driver_probe_device+0x214/0x2c0)
       [<c04036fc>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0)
       [<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c)
       [<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218)
       [<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8)
       [<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84)
       [<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170)
       [<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)
       [<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114)
       [<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c)
       ---[ end trace 0919d3d0bc998260 ]---
      
       [snip..]
      
       Unable to handle kernel NULL pointer dereference at virtual address 00000078
       pgd = c0004000
       [00000078] *pgd=00000000
       Internal error: Oops: 5 [#1] PREEMPT SMP ARM
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W       4.8.0-rc6-00154-g54fe84cbd50b #41
       Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
       task: ee874000 task.stack: ee878000
       PC is at down_write+0x14/0x54
       LR is at debugfs_remove_recursive+0x30/0x150
      
       [snip..]
      
       [<c06e489c>] (down_write) from [<c02e9954>] (debugfs_remove_recursive+0x30/0x150)
       [<c02e9954>] (debugfs_remove_recursive) from [<c0382b78>] (_regulator_put+0x24/0xac)
       [<c0382b78>] (_regulator_put) from [<c0382c1c>] (regulator_put+0x1c/0x2c)
       [<c0382c1c>] (regulator_put) from [<c0406434>] (release_nodes+0x16c/0x1dc)
       [<c0406434>] (release_nodes) from [<c04035d4>] (driver_probe_device+0xec/0x2c0)
       [<c04035d4>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0)
       [<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c)
       [<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218)
       [<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8)
       [<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84)
       [<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170)
       [<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)
       [<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114)
       [<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c)
       Code: e1a04000 f590f000 e3a03001 e34f3fff (e1902f9f)
       ---[ end trace 0919d3d0bc998262 ]---
      
      --------x------------------x----------------
      
      Fix the kernel warnings and crashes by using regulator_bulk_get()
      instead of devm_regulator_bulk_get() and explicitly freeing the supplies
      in exit paths.
      
      Tested on Exynos 5250, dual core ARM A15 machine.
      Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
      Acked-by: NCharles Keepax <ckeepax@opensource.wolfsonmicro.com>
      Signed-off-by: NLee Jones <lee.jones@linaro.org>
      1a41741f
    • V
      mfd: wm8994-core: Disable regulators before removing them · 3cfc43df
      Viresh Kumar 提交于
      The order in which resources were freed in wm8994_device_exit() isn't
      correct. The regulators are removed before they are disabled.
      
      Fix it by reordering code a bit, which makes it exact opposite of
      wm8994_device_init() as well.
      Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
      Acked-by: NCharles Keepax <ckeepax@opensource.wolfsonmicro.com>
      Signed-off-by: NLee Jones <lee.jones@linaro.org>
      3cfc43df