1. 30 Apr, 2017 1 commit
  2. 24 Apr, 2017 2 commits
  3. 21 Apr, 2017 1 commit
    • ARCv2: entry: save Accumulator register pair (r58:59) if present · 3d5e8012
      Vineet Gupta committed
      Accumulator is present in configs with FPU and/or DSP MPY (mpy > 6)
      
      Instead of doing this in pt_regs (and thus on every kernel entry/exit),
      this could have been done in context switch (and for user tasks only), as
      currently the kernel doesn't clobber these registers of its own accord.
      However, we will soon start using 64-bit multiply instructions in the kernel,
      which can clobber these. The gcc folks also plan to start using these
      as GPRs, hence it is better to always save/restore them.
      
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  4. 19 Apr, 2017 4 commits
    • x86/build: convert function graph '-Os' error to warning · a5859c6d
      Josh Poimboeuf committed
      For pre-4.6.0 versions of GCC, which don't have '-mfentry', the
      '-maccumulate-outgoing-args' option is required for function graph
      tracing in order to avoid GCC bug 42109.
      
      However, GCC ignores '-maccumulate-outgoing-args' when '-Os' is
      also set.
      
      Currently we force a build error to prevent that scenario, but that
      breaks randconfigs.  So change the error to a warning which also
      disables CONFIG_CC_OPTIMIZE_FOR_SIZE.
      
      Reported-by: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kbuild test robot <fengguang.wu@intel.com>
      Cc: kbuild-all@01.org
      Link: http://lkml.kernel.org/r/20170418214429.o7fbwbmf4nqosezy@treble
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mce: Make the MCE notifier a blocking one · 0dc9c639
      Vishal Verma committed
      The NFIT MCE handler callback (for handling media errors on NVDIMMs)
      takes a mutex to add the location of a memory error to a list. But since
      the notifier call chain for machine checks (x86_mce_decoder_chain) is
      atomic, we get a lockdep splat like:
      
        BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
        in_atomic(): 1, irqs_disabled(): 0, pid: 4, name: kworker/0:0
        [..]
        Call Trace:
         dump_stack
         ___might_sleep
         __might_sleep
         mutex_lock_nested
         ? __lock_acquire
         nfit_handle_mce
         notifier_call_chain
         atomic_notifier_call_chain
         ? atomic_notifier_call_chain
         mce_gen_pool_process
      
      Convert the notifier to a blocking one which gets to run only in process
      context.
      
      Boris: remove the notifier call in atomic context in print_mce(). For
      now, let's print the MCE on the atomic path so that we can make sure
      they go out and get logged at least.
      
      Fixes: 6839a6d9 ("nfit: do an ARS scrub on hitting a latent media error")
      Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      Acked-by: Tony Luck <tony.luck@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Cc: x86-ml <x86@kernel.org>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20170411224457.24777-1-vishal.l.verma@intel.com
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • sparc64: Fix hugepage page table free · 544f8f93
      Nitin Gupta committed
      Make sure the start address is aligned to a PMD_SIZE
      boundary when freeing the page table backing a hugepage
      region. The issue was causing segfaults when a region
      backed by 64K pages was unmapped, since such a region
      is in general not PMD_SIZE aligned.
      
      Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
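The alignment step in the sparc64 hugepage commit above can be illustrated with a minimal sketch. It is not the code from the patch: the helper name `pmd_align_down` is hypothetical, and the 8 MB PMD span is an assumption chosen for illustration.

```c
#include <stdint.h>

/* Assumed for illustration only: one PMD entry spans 8 MB. */
#define PMD_SHIFT 23
#define PMD_SIZE  ((uint64_t)1 << PMD_SHIFT)
#define PMD_MASK  (~(PMD_SIZE - 1))

/* Hypothetical helper: round an address down to the start of the
 * PMD-sized region that contains it. */
static uint64_t pmd_align_down(uint64_t addr)
{
    return addr & PMD_MASK;
}
```

A region backed by 64K pages can start at any 64K multiple, so an address such as 0x810000 is not PMD-aligned; masking it yields 0x800000, the boundary from which a page-table free would have to start.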
    • sparc64: Use LOCKDEP_SMALL, not PROVE_LOCKING_SMALL · 395102db
      Daniel Jordan committed
      CONFIG_PROVE_LOCKING_SMALL shrinks the memory usage of lockdep so the
      kernel text, data, and bss fit in the required 32MB limit, but this
      option is not set for every config that enables lockdep.
      
      A 4.10 kernel fails to boot with the console output
      
          Kernel: Using 8 locked TLB entries for main kernel image.
          hypervisor_tlb_lock[2000000:0:8000000071c007c3:1]: errors with f
          Program terminated
      
      with these config options
      
          CONFIG_LOCKDEP=y
          CONFIG_LOCK_STAT=y
          CONFIG_PROVE_LOCKING=n
      
      To fix, rename CONFIG_PROVE_LOCKING_SMALL to CONFIG_LOCKDEP_SMALL, and
      enable this option with CONFIG_LOCKDEP=y so we get the reduced memory
      usage every time lockdep is turned on.
      
      Tested that CONFIG_LOCKDEP_SMALL is set to 'y' if and only if
      CONFIG_LOCKDEP is set to 'y'.  When other lockdep-related config options
      that select CONFIG_LOCKDEP are enabled (e.g. CONFIG_LOCK_STAT or
      CONFIG_PROVE_LOCKING), verified that CONFIG_LOCKDEP_SMALL is also
      enabled.
      
      Fixes: e6b5f1be ("config: Adding the new config parameter CONFIG_PROVE_LOCKING_SMALL for sparc")
      Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
      Reviewed-by: Babu Moger <babu.moger@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 18 Apr, 2017 2 commits
    • powerpc/64: Fix HMI exception on LE with CONFIG_RELOCATABLE=y · be5c5e84
      Michael Ellerman committed
      Prior to commit 2337d207 ("powerpc/64: CONFIG_RELOCATABLE support for hmi
      interrupts"), the branch from hmi_exception_early() to hmi_exception_realmode()
      was just a bl hmi_exception_realmode, which the linker would turn into a bl to
      the local entry point of hmi_exception_realmode. This was broken when
      CONFIG_RELOCATABLE=y because hmi_exception_realmode() is not in the low part of
      the kernel text that is copied down to 0x0.
      
      But in fixing that, we added a new bug on little endian kernels. Because the
      branch is now a bctrl when CONFIG_RELOCATABLE=y, we branch to the global entry
      point of hmi_exception_realmode(). The global entry point must be called with
      r12 containing the address of hmi_exception_realmode(), because it uses that
      value to calculate the TOC value (r2).
      
      This may manifest as a checkstop, because we take a junk value from r12 which
      came from HSRR1, add a small constant to it and then use that as the TOC
      pointer. The HSRR1 value will have 0x9 as the top nibble, which puts it above
      RAM and somewhere in MMIO space.
      
      Fix it by changing the BRANCH_LINK_TO_FAR() macro to always use r12 to load the
      label we're branching to. This means r12 will be set up correctly on LE, fixing
      this bug, and r12 is also volatile across function calls on BE so it's a good
      choice anyway.
      
      Fixes: 2337d207 ("powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts")
      Reported-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Acked-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/kprobe: Fix oops when kprobed on 'stdu' instruction · 9e1ba4f2
      Ravi Bangoria committed
      If we set a kprobe on a 'stdu' instruction on powerpc64, we see a kernel
      OOPS:
      
        Bad kernel stack pointer cd93c840 at c000000000009868
        Oops: Bad kernel stack pointer, sig: 6 [#1]
        ...
        GPR00: c000001fcd93cb30 00000000cd93c840 c0000000015c5e00 00000000cd93c840
        ...
        NIP [c000000000009868] resume_kernel+0x2c/0x58
        LR [c000000000006208] program_check_common+0x108/0x180
      
      On a 64-bit system when the user probes on a 'stdu' instruction, the kernel does
      not emulate actual store in emulate_step() because it may corrupt the exception
      frame. So the kernel does the actual store operation in exception return code
      i.e. resume_kernel().
      
      resume_kernel() loads the saved stack pointer from memory using lwz, which only
      loads the low 32-bits of the address, causing the kernel crash.
      
      Fix this by loading the 64-bit value instead.
      
      Fixes: be96f633 ("powerpc: Split out instruction analysis part of emulate_step()")
      Cc: stable@vger.kernel.org # v3.18+
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Reviewed-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      [mpe: Change log massage, add stable tag]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
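The truncation described in the kprobe commit above can be modelled in plain C. This is only a sketch of the effect of a 32-bit load on a 64-bit value: the helper names are hypothetical, and the full 64-bit stack pointer used in the usage note is an assumption chosen so that its low word matches the value printed in the oops.

```c
#include <stdint.h>

/* Models a 32-bit zero-extending load that yields only the low word
 * of a saved 64-bit value (the effect attributed to lwz above). */
static uint64_t load_low32(uint64_t saved)
{
    return (uint64_t)(uint32_t)saved;
}

/* Models a full 64-bit load, which preserves the whole value. */
static uint64_t load_full64(uint64_t saved)
{
    return saved;
}
```

With an assumed saved pointer of 0xc000001fcd93c840, `load_low32` yields 0x00000000cd93c840, which has the same shape as the "Bad kernel stack pointer cd93c840" in the report, while `load_full64` returns the value unchanged.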
  6. 16 Apr, 2017 1 commit
  7. 15 Apr, 2017 2 commits
    • parisc: fix bugs in pa_memcpy · 409c1b25
      Mikulas Patocka committed
      The patch 554bfece ("parisc: Fix access
      fault handling in pa_memcpy()") reimplements the pa_memcpy function.
      Unfortunately, it makes the kernel unbootable. The crash happens in the
      function ide_complete_cmd where memcpy is called with the same source
      and destination address.
      
      This patch fixes a few bugs in pa_memcpy:
      
      * When jumping to .Lcopy_loop_16 for the first time, don't skip the
        instruction "ldi 31,t0" (this bug made the kernel unbootable)
      * Use the COND macro when comparing length, so that the comparison is
        64-bit (a theoretical issue, in case the length is greater than
        0xffffffff)
      * Don't use the COND macro after the "extru" instruction (the PA-RISC
        specification says that the upper 32-bits of extru result are undefined,
        although they are set to zero in practice)
      * Fix exception addresses in .Lcopy16_fault and .Lcopy8_fault
      * Rename .Lcopy_loop_4 to .Lcopy_loop_8 (so that it is consistent with
        .Lcopy8_fault)
      
      Cc: <stable@vger.kernel.org> # v4.9+
      Fixes: 554bfece ("parisc: Fix access fault handling in pa_memcpy()")
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Helge Deller <deller@gmx.de>
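The length-comparison item in the pa_memcpy commit above (a 64-bit compare so that lengths above 0xffffffff are handled) can be sketched in C. This models only the arithmetic, not the PA-RISC assembly or the COND macro; the function names and the threshold of 16 are hypothetical.

```c
#include <stdint.h>

/* Compares only the low 32 bits of the length, as a 32-bit compare
 * would; lengths above 0xffffffff are misjudged. */
static int is_short_copy_32(uint64_t len, uint32_t threshold)
{
    return (uint32_t)len < threshold;
}

/* Compares the full 64-bit length. */
static int is_short_copy_64(uint64_t len, uint32_t threshold)
{
    return len < (uint64_t)threshold;
}
```

For a length of 0x100000004 and a threshold of 16, the 32-bit version sees 4 and reports a short copy, whereas the 64-bit version correctly reports that the length is not below the threshold.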
    • ARC: [plat-eznps] Fix build error · 6492f09e
      Noam Camus committed
      Make ATOMIC_INIT available for all ARC platforms (including plat-eznps)
      
      Cc: <stable@vger.kernel.org>	# 4.9+
      Signed-off-by: Noam Camus <noamca@mellanox.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  8. 14 Apr, 2017 3 commits
  9. 13 Apr, 2017 6 commits
    • x86/efi: Don't try to reserve runtime regions · 6f6266a5
      Omar Sandoval committed
      Reserving a runtime region results in splitting the EFI memory
      descriptors for the runtime region. This results in runtime region
      descriptors with bogus memory mappings, leading to interesting crashes
      like the following during a kexec:
      
        general protection fault: 0000 [#1] SMP
        Modules linked in:
        CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc1 #53
        Hardware name: Wiwynn Leopard-Orv2/Leopard-DDR BW, BIOS LBM05   09/30/2016
        RIP: 0010:virt_efi_set_variable()
        ...
        Call Trace:
         efi_delete_dummy_variable()
         efi_enter_virtual_mode()
         start_kernel()
         ? set_init_arg()
         x86_64_start_reservations()
         x86_64_start_kernel()
         start_cpu()
        ...
        Kernel panic - not syncing: Fatal exception
      
      Runtime regions will not be freed and do not need to be reserved, so
      skip the memmap modification in this case.
      
      Signed-off-by: Omar Sandoval <osandov@fb.com>
      Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: <stable@vger.kernel.org> # v4.9+
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Jones <pjones@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Fixes: 8e80632f ("efi/esrt: Use efi_mem_reserve() and avoid a kmalloc()")
      Link: http://lkml.kernel.org/r/20170412152719.9779-2-matt@codeblueprint.co.uk
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • MIPS: PCI: add controllers before the specified head · edb0b6a0
      Mathias Kresin committed
      With commit 23dac14d ("MIPS: PCI: Use struct list_head lists") new
      controllers are added after the specified head, whereas they were added
      before the specified head previously.
      
      Use list_add_tail to restore the former order.
      
      This patch fixes the following PCI error on lantiq:
      
        pci 0000:01:00.0: BAR 0: error updating (0x1c000004 != 0x000000)
      
      Fixes: 23dac14d ("MIPS: PCI: Use struct list_head lists")
      Signed-off-by: Mathias Kresin <dev@kresin.me>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/15808/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
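The ordering difference behind the MIPS PCI commit above can be reproduced with a small userspace sketch. It reimplements a circular doubly linked list with the same insertion semantics as the kernel's `list_add()` and `list_add_tail()`; the `struct controller` type and the `first_id` helper are hypothetical and exist only for the demonstration.

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

/* Link n between two adjacent nodes. */
static void list_insert(struct list_head *n, struct list_head *prev,
                        struct list_head *next)
{
    next->prev = n;
    n->next = next;
    n->prev = prev;
    prev->next = n;
}

/* Insert right after head: the newest entry is visited first. */
static void list_add(struct list_head *n, struct list_head *head)
{
    list_insert(n, head, head->next);
}

/* Insert right before head: entries are visited in insertion order. */
static void list_add_tail(struct list_head *n, struct list_head *head)
{
    list_insert(n, head->prev, head);
}

/* Hypothetical stand-in for a PCI controller record. */
struct controller { int id; struct list_head node; };

/* Return the id of the first controller on the list, or -1 if empty. */
static int first_id(struct list_head *head)
{
    if (head->next == head)
        return -1;
    return ((struct controller *)((char *)head->next -
            offsetof(struct controller, node)))->id;
}
```

Registering controllers 1 and then 2 with `list_add` leaves controller 2 at the front of the list, while `list_add_tail` keeps controller 1 first, which is the registration order the commit restores.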
    • x86, pmem: fix broken __copy_user_nocache cache-bypass assumptions · 11e63f6d
      Dan Williams committed
      Before we rework the "pmem api" to stop abusing __copy_user_nocache()
      for memcpy_to_pmem() we need to fix cases where we may strand dirty data
      in the cpu cache. The problem occurs when copy_from_iter_pmem() is used
      for arbitrary data transfers from userspace. There is no guarantee that
      these transfers, performed by dax_iomap_actor(), will have aligned
      destinations or aligned transfer lengths. Backstop the usage of
      __copy_user_nocache() with explicit cache management in these unaligned
      cases.
      
      Yes, copy_from_iter_pmem() is now too big for an inline, but addressing
      that is saved for a later patch that moves the entirety of the "pmem
      api" into the pmem driver directly.
      
      Fixes: 5de490da ("pmem: add copy_from_iter_pmem() and clear_pmem()")
      Cc: <stable@vger.kernel.org>
      Cc: <x86@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • MIPS: KGDB: Use kernel context for sleeping threads · 162b270c
      James Hogan committed
      KGDB is a kernel debug stub and it can't be used to debug userland as it
      can only safely access kernel memory.
      
      On MIPS however KGDB has always got the register state of sleeping
      processes from the userland register context at the beginning of the
      kernel stack. This is meaningless for kernel threads (which never enter
      userland), and for user threads it prevents the user seeing what it is
      doing while in the kernel:
      
      (gdb) info threads
        Id   Target Id         Frame
        ...
        3    Thread 2 (kthreadd) 0x0000000000000000 in ?? ()
        2    Thread 1 (init)   0x000000007705c4b4 in ?? ()
        1    Thread -2 (shadowCPU0) 0xffffffff8012524c in arch_kgdb_breakpoint () at arch/mips/kernel/kgdb.c:201
      
      Get the register state instead from the (partial) kernel register
      context stored in the task's thread_struct for resume() to restore. All
      threads now correctly appear to be in context_switch():
      
      (gdb) info threads
        Id   Target Id         Frame
        ...
        3    Thread 2 (kthreadd) context_switch (rq=<optimized out>, cookie=..., next=<optimized out>, prev=0x0) at kernel/sched/core.c:2903
        2    Thread 1 (init)   context_switch (rq=<optimized out>, cookie=..., next=<optimized out>, prev=0x0) at kernel/sched/core.c:2903
        1    Thread -2 (shadowCPU0) 0xffffffff8012524c in arch_kgdb_breakpoint () at arch/mips/kernel/kgdb.c:201
      
      Call-clobbered registers which aren't saved and exception registers
      (BadVAddr & Cause) which can't be easily determined without stack
      unwinding are reported as 0. The PC is taken from the return address,
      such that the state presented matches that found immediately after
      returning from resume().
      
      Fixes: 88547001 ("[MIPS] kgdb: add arch support for the kernel's kgdb core")
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: linux-mips@linux-mips.org
      Cc: stable@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/15829/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • MIPS: smp-cps: Fix potentially uninitialised value of core · bac06cf0
      Matt Redfearn committed
      Turning on DEBUG in smp-cps.c, or compiling the kernel with
      CONFIG_DYNAMIC_DEBUG enabled, results in the build error:
      
      arch/mips/kernel/smp-cps.c: In function 'play_dead':
      ./include/linux/dynamic_debug.h:126:3: error: 'core' may be used
      uninitialized in this function [-Werror=maybe-uninitialized]
      
      Fix this by always initialising the variable.
      
      Fixes: 0d2808f3 ("MIPS: smp-cps: Add support for CPU hotplug of MIPSr6 processors")
      Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/15848/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • mm: Tighten x86 /dev/mem with zeroing reads · a4866aa8
      Kees Cook committed
      Under CONFIG_STRICT_DEVMEM, reading System RAM through /dev/mem is
      disallowed. However, on x86, the first 1MB was always allowed for BIOS
      and similar things, regardless of it actually being System RAM. It was
      possible for heap to end up getting allocated in low 1MB RAM, and then
      read by things like x86info or dd, which would trip hardened usercopy:
      
      usercopy: kernel memory exposure attempt detected from ffff880000090000 (dma-kmalloc-256) (4096 bytes)
      
      This changes the x86 exception for the low 1MB by reading back zeros for
      System RAM areas instead of blindly allowing them. More work is needed to
      extend this to mmap, but currently mmap doesn't go through usercopy, so
      hardened usercopy won't Oops the kernel.
      
      Reported-by: Tommi Rantala <tommi.t.rantala@nokia.com>
      Tested-by: Tommi Rantala <tommi.t.rantala@nokia.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
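The "read back zeros for System RAM" behaviour described in the /dev/mem commit above can be sketched as a page-by-page copy loop. This is a userspace illustration under stated assumptions, not the kernel's implementation: `read_with_zeroing`, the `is_ram_fn` predicate type, the 4096-byte page size, and the example predicate `odd_pages_are_ram` are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for this sketch */

/* Hypothetical predicate: nonzero if the page with this index is System RAM. */
typedef int (*is_ram_fn)(size_t pfn);

/* Example predicate for demonstration only: odd-numbered pages count as RAM. */
static int odd_pages_are_ram(size_t pfn)
{
    return (int)(pfn & 1);
}

/*
 * Copy len bytes starting at offset off of the backing buffer mem into dst,
 * substituting zeros for bytes that fall in pages flagged as System RAM.
 */
static void read_with_zeroing(unsigned char *dst, const unsigned char *mem,
                              size_t off, size_t len, is_ram_fn is_ram)
{
    while (len > 0) {
        size_t pfn = off / SKETCH_PAGE_SIZE;
        size_t in_page = SKETCH_PAGE_SIZE - (off % SKETCH_PAGE_SIZE);
        size_t chunk = len < in_page ? len : in_page;

        if (is_ram(pfn))
            memset(dst, 0, chunk);          /* hide RAM contents */
        else
            memcpy(dst, mem + off, chunk);  /* pass through non-RAM */

        dst += chunk;
        off += chunk;
        len -= chunk;
    }
}
```

A read that straddles a page boundary is split at the boundary, so the part in a non-RAM page is returned as-is and the part in a RAM page comes back as zeros instead of causing the whole read to fail.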
  10. 12 Apr, 2017 4 commits
  11. 11 Apr, 2017 5 commits
  12. 10 Apr, 2017 3 commits
    • MIPS: cevt-r4k: Fix out-of-bounds array access · 9d7f29cd
      James Hogan committed
      calculate_min_delta() may incorrectly access a 4th element of buf2[]
      which only has 3 elements. This may trigger undefined behaviour and has
      been reported to cause strange crashes in start_kernel() sometime after
      timer initialization when built with GCC 5.3, possibly due to
      register/stack corruption:
      
      sched_clock: 32 bits at 200MHz, resolution 5ns, wraps every 10737418237ns
      CPU 0 Unable to handle kernel paging request at virtual address ffffb0aa, epc == 8067daa8, ra == 8067da84
      Oops[#1]:
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.18 #51
      task: 8065e3e0 task.stack: 80644000
      $ 0   : 00000000 00000001 00000000 00000000
      $ 4   : 8065b4d0 00000000 805d0000 00000010
      $ 8   : 00000010 80321400 fffff000 812de408
      $12   : 00000000 00000000 00000000 ffffffff
      $16   : 00000002 ffffffff 80660000 806a666c
      $20   : 806c0000 00000000 00000000 00000000
      $24   : 00000000 00000010
      $28   : 80644000 80645ed0 00000000 8067da84
      Hi    : 00000000
      Lo    : 00000000
      epc   : 8067daa8 start_kernel+0x33c/0x500
      ra    : 8067da84 start_kernel+0x318/0x500
      Status: 11000402 KERNEL EXL
      Cause : 4080040c (ExcCode 03)
      BadVA : ffffb0aa
      PrId  : 0501992c (MIPS 1004Kc)
      Modules linked in:
      Process swapper/0 (pid: 0, threadinfo=80644000, task=8065e3e0, tls=00000000)
      Call Trace:
      [<8067daa8>] start_kernel+0x33c/0x500
      Code: 24050240  0c0131f9  24849c64 <a200b0a8> 41606020  000000c0  0c1a45e6 00000000  0c1a5f44
      
      UBSAN also detects the same issue:
      
      ================================================================
      UBSAN: Undefined behaviour in arch/mips/kernel/cevt-r4k.c:85:41
      load of address 80647e4c with insufficient space
      for an object of type 'unsigned int'
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.18 #47
      Call Trace:
      [<80028f70>] show_stack+0x88/0xa4
      [<80312654>] dump_stack+0x84/0xc0
      [<8034163c>] ubsan_epilogue+0x14/0x50
      [<803417d8>] __ubsan_handle_type_mismatch+0x160/0x168
      [<8002dab0>] r4k_clockevent_init+0x544/0x764
      [<80684d34>] time_init+0x18/0x90
      [<8067fa5c>] start_kernel+0x2f0/0x500
      =================================================================
      
      buf2[] is intentionally only 3 elements so that the last element is the
      median once 5 samples have been inserted, so explicitly prevent the
      possibility of comparing against the 4th element rather than extending
      the array.
      
      Fixes: 1fa40555 ("MIPS: cevt-r4k: Dynamically calculate min_delta_ns")
      Reported-by: Rabin Vincent <rabinv@axis.com>
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Tested-by: Rabin Vincent <rabinv@axis.com>
      Cc: linux-mips@linux-mips.org
      Cc: <stable@vger.kernel.org> # 4.7.x-
      Patchwork: https://patchwork.linux-mips.org/patch/15892/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
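The idea in the cevt-r4k commit above, that a 3-element sorted buffer holds the median in its last slot once 5 samples have been inserted, can be sketched as follows. This is an independent illustration rather than the kernel's calculate_min_delta(): the names `insert_sample` and `median_of_five` are hypothetical. The `k < used` test plays the role of the bounds check, so the loop never compares against a 4th element.

```c
#include <stddef.h>

#define KEEP 3  /* number of smallest samples retained */

/*
 * Insert sample into the sorted buffer buf, which currently holds
 * min(seen, KEEP) values. Only the KEEP smallest samples are retained.
 */
static void insert_sample(unsigned int buf[KEEP], size_t seen,
                          unsigned int sample)
{
    size_t used = seen < KEEP ? seen : KEEP;
    size_t k = 0;

    /* Find the insertion point; k < used keeps every read in bounds. */
    while (k < used && buf[k] <= sample)
        k++;

    /* Buffer is full and the sample exceeds everything kept: drop it. */
    if (k == KEEP)
        return;

    /* Shift larger entries up by one, dropping the largest if full. */
    size_t last = used < KEEP ? used : KEEP - 1;
    for (size_t i = last; i > k; i--)
        buf[i] = buf[i - 1];
    buf[k] = sample;
}

/* After 5 insertions the buffer holds the 3 smallest samples in order,
 * so its last element is the 3rd smallest of 5, i.e. the median. */
static unsigned int median_of_five(const unsigned int samples[5])
{
    unsigned int buf[KEEP] = {0};

    for (size_t n = 0; n < 5; n++)
        insert_sample(buf, n, samples[n]);
    return buf[KEEP - 1];
}
```

Keeping only three slots avoids storing all five samples; the cost is that the insertion loop must stop at the end of the buffer rather than assume a further element exists.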
    • MIPS: perf: fix deadlock · f2b42866
      Rabin Vincent committed
      mipsxx_pmu_handle_shared_irq() calls irq_work_run() while holding the
      pmuint_rwlock for read.  irq_work_run() can, via perf_pending_event(),
      call try_to_wake_up() which can try to take rq->lock.
      
      However, perf can also call perf_pmu_enable() (and thus take the
      pmuint_rwlock for write) while holding the rq->lock, from
      finish_task_switch() via perf_event_context_sched_in().
      
      This leads to an ABBA deadlock:
      
       PID: 3855   TASK: 8f7ce288  CPU: 2   COMMAND: "process"
        #0 [89c39ac8] __delay at 803b5be4
        #1 [89c39ac8] do_raw_spin_lock at 8008fdcc
        #2 [89c39af8] try_to_wake_up at 8006e47c
        #3 [89c39b38] pollwake at 8018eab0
        #4 [89c39b68] __wake_up_common at 800879f4
        #5 [89c39b98] __wake_up at 800880e4
        #6 [89c39bc8] perf_event_wakeup at 8012109c
        #7 [89c39be8] perf_pending_event at 80121184
        #8 [89c39c08] irq_work_run_list at 801151f0
        #9 [89c39c38] irq_work_run at 80115274
       #10 [89c39c50] mipsxx_pmu_handle_shared_irq at 8002cc7c
      
       PID: 1481   TASK: 8eaac6a8  CPU: 3   COMMAND: "process"
        #0 [8de7f900] do_raw_write_lock at 800900e0
        #1 [8de7f918] perf_event_context_sched_in at 80122310
        #2 [8de7f938] __perf_event_task_sched_in at 80122608
        #3 [8de7f958] finish_task_switch at 8006b8a4
        #4 [8de7f998] __schedule at 805e4dc4
        #5 [8de7f9f8] schedule at 805e5558
        #6 [8de7fa10] schedule_hrtimeout_range_clock at 805e9984
        #7 [8de7fa70] poll_schedule_timeout at 8018e8f8
        #8 [8de7fa88] do_select at 8018f338
        #9 [8de7fd88] core_sys_select at 8018f5cc
       #10 [8de7fee0] sys_select at 8018f854
       #11 [8de7ff28] syscall_common at 80028fc8
      
      The lock seems to be there to protect the hardware counters so there is
      no need to hold it across irq_work_run().
      
      Signed-off-by: Rabin Vincent <rabinv@axis.com>
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • MIPS: Malta: Fix i8259 irqchip setup · 9eec1c01
      Matt Redfearn committed
      Since commit 4cfffcfa ("irqchip/mips-gic: Fix local interrupts"),
      the gic driver has been allocating virq's for local interrupts during
      its initialisation. Unfortunately on Malta platforms, these are the
      first IRQs to be allocated and so are allocated virqs 1-3. The i8259
      driver uses a legacy irq domain which expects to map virqs 0-15. Probing
      of that driver therefore fails because some of those virqs are already
      taken, with the warning:
      
      WARNING: CPU: 0 PID: 0 at kernel/irq/irqdomain.c:344
      irq_domain_associate+0x1e8/0x228
      error: virq1 is already associated
      Modules linked in:
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc6-00011-g4cfffcfa #368
      Stack : 00000000 00000000 807ae03a 0000004d 00000000 806c1010 0000000b ffff0a01
              80725467 807258f4 806a64a4 00000000 00000000 807a9acc 00000100 80713e68
              806d5598 8017593c 8072bf90 8072bf94 806ac358 00000000 806abb60 80713ce4
              00000100 801b22d4 806d5598 8017593c 807ae03a 00000000 80713ce4 80720000
              00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
              ...
      Call Trace:
      [<8010c480>] show_stack+0x88/0xa4
      [<80376758>] dump_stack+0x88/0xd0
      [<8012c4a8>] __warn+0x104/0x118
      [<8012c4ec>] warn_slowpath_fmt+0x30/0x3c
      [<8017edfc>] irq_domain_associate+0x1e8/0x228
      [<8017efd0>] irq_domain_add_legacy+0x7c/0xb0
      [<80764c50>] __init_i8259_irqs+0x64/0xa0
      [<80764ca4>] i8259_of_init+0x18/0x74
      [<8076ddc0>] of_irq_init+0x19c/0x310
      [<80752dd8>] arch_init_irq+0x28/0x19c
      [<80750a08>] start_kernel+0x2a8/0x434
      
      Fix this by reserving the required i8259 virqs in malta platform code
      before probing any irq chips.
      
      Fixes: 4cfffcfa ("irqchip/mips-gic: Fix local interrupts")
      Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/15919/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
  13. 07 Apr, 2017 6 commits
    • Revert "Revert "arm64: hugetlb: partial revert of 66b3923a"" · 6ae979ab
      Will Deacon committed
      The use of the contiguous bit by our hugetlb implementation violates
      the break-before-make requirements of the architecture and can lead to
      silent data corruption or TLB conflict aborts. Once again, disable these
      hugetlb sizes whilst it gets worked out.
      
      This reverts commit ab2e1b89.
      
      Conflicts:
      	arch/arm64/mm/hugetlbpage.c
      
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • powerpc/crypto/crc32c-vpmsum: Fix missing preempt_disable() · 4749228f
      Michael Ellerman committed
      In crc32c_vpmsum() we call enable_kernel_altivec() without first
      disabling preemption, which is not allowed:
      
        WARNING: CPU: 9 PID: 2949 at ../arch/powerpc/kernel/process.c:277 enable_kernel_altivec+0x100/0x120
        Modules linked in: dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c vmx_crypto ...
        CPU: 9 PID: 2949 Comm: docker Not tainted 4.11.0-rc5-compiler_gcc-6.3.1-00033-g308ac756 #381
        ...
        NIP [c00000000001e320] enable_kernel_altivec+0x100/0x120
        LR [d000000003df0910] crc32c_vpmsum+0x108/0x150 [crc32c_vpmsum]
        Call Trace:
          0xc138fd09 (unreliable)
          crc32c_vpmsum+0x108/0x150 [crc32c_vpmsum]
          crc32c_vpmsum_update+0x3c/0x60 [crc32c_vpmsum]
          crypto_shash_update+0x88/0x1c0
          crc32c+0x64/0x90 [libcrc32c]
          dm_bm_checksum+0x48/0x80 [dm_persistent_data]
          sb_check+0x84/0x120 [dm_thin_pool]
          dm_bm_validate_buffer.isra.0+0xc0/0x1b0 [dm_persistent_data]
          dm_bm_read_lock+0x80/0xf0 [dm_persistent_data]
          __create_persistent_data_objects+0x16c/0x810 [dm_thin_pool]
          dm_pool_metadata_open+0xb0/0x1a0 [dm_thin_pool]
          pool_ctr+0x4cc/0xb60 [dm_thin_pool]
          dm_table_add_target+0x16c/0x3c0
          table_load+0x184/0x400
          ctl_ioctl+0x2f0/0x560
          dm_ctl_ioctl+0x38/0x50
          do_vfs_ioctl+0xd8/0x920
          SyS_ioctl+0x68/0xc0
          system_call+0x38/0xfc
      
      It used to be sufficient just to call pagefault_disable(), because that
      also disabled preemption. But the two were decoupled in commit 8222dbe2
      ("sched/preempt, mm/fault: Decouple preemption from the page fault
      logic") in mid 2015.
      
      So add the missing preempt_disable/enable(). We should also call
      disable_kernel_fp(): although it does nothing by default, there is a
      debug switch to make it active, and all enables should be paired with
      disables.
      
      Fixes: 6dd7a82c ("crypto: powerpc - Add POWER8 optimised crc32c")
      Cc: stable@vger.kernel.org # v4.8+
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • sparc: remove unused wp_works_ok macro · 86e1066f
      Mathias Krause committed
      It has been unused for ages; it used to be required for ksyms.c back
      in the v1.1 times.
      
      Signed-off-by: Mathias Krause <minipli@googlemail.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc32: Export vac_cache_size to fix build error · 9d262d95
      Guenter Roeck committed
      sparc32:allmodconfig fails to build with the following error.
      
      ERROR: "vac_cache_size" [drivers/infiniband/sw/rxe/rdma_rxe.ko] undefined!
      
      Fixes: cb886455 ("infiniband: Fix alignment of mmap cookies ...")
      Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Cc: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: Fix memory corruption when THP is enabled · 76811263
      Nitin Gupta committed
      The memory corruption was happening due to incorrect
      TLB/TSB flushing of hugepages.
      
      Reported-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: Fix kernel panic due to erroneous #ifdef surrounding pmd_write() · 9ae34dbd
      Tom Hromatka committed
      This commit moves sparc64's prototype of pmd_write() outside
      of the CONFIG_TRANSPARENT_HUGEPAGE ifdef.
      
      In 2013, commit a7b9403f ("sparc64: Encode huge PMDs using PTE
      encoding.") exposed a path where pmd_write() could be called without
      CONFIG_TRANSPARENT_HUGEPAGE defined.  This can result in the panic below.
      
      The diff is awkward to read, but the changes are straightforward.
      pmd_write() was moved outside of #ifdef CONFIG_TRANSPARENT_HUGEPAGE.
      Also, __HAVE_ARCH_PMD_WRITE was defined.
      
      kernel BUG at include/asm-generic/pgtable.h:576!
                    \|/ ____ \|/
                    "@'/ .. \`@"
                    /_| \__/ |_\
                       \__U_/
      oracle_8114_cdb(8114): Kernel bad sw trap 5 [#1]
      CPU: 120 PID: 8114 Comm: oracle_8114_cdb Not tainted
      4.1.12-61.7.1.el6uek.rc1.sparc64 #1
      task: fff8400700a24d60 ti: fff8400700bc4000 task.ti: fff8400700bc4000
      TSTATE: 0000004411e01607 TPC: 00000000004609f8 TNPC: 00000000004609fc Y:
      00000005    Not tainted
      TPC: <gup_huge_pmd+0x198/0x1e0>
      g0: 000000000001c000 g1: 0000000000ef3954 g2: 0000000000000000 g3: 0000000000000001
      g4: fff8400700a24d60 g5: fff8001fa5c10000 g6: fff8400700bc4000 g7: 0000000000000720
      o0: 0000000000bc5058 o1: 0000000000000240 o2: 0000000000006000 o3: 0000000000001c00
      o4: 0000000000000000 o5: 0000048000080000 sp: fff8400700bc6ab1 ret_pc: 00000000004609f0
      RPC: <gup_huge_pmd+0x190/0x1e0>
      l0: fff8400700bc74fc l1: 0000000000020000 l2: 0000000000002000 l3: 0000000000000000
      l4: fff8001f93250950 l5: 000000000113f800 l6: 0000000000000004 l7: 0000000000000000
      i0: fff8400700ca46a0 i1: bd0000085e800453 i2: 000000026a0c4000 i3: 000000026a0c6000
      i4: 0000000000000001 i5: fff800070c958de8 i6: fff8400700bc6b61 i7: 0000000000460dd0
      I7: <gup_pud_range+0x170/0x1a0>
      Call Trace:
       [0000000000460dd0] gup_pud_range+0x170/0x1a0
       [0000000000460e84] get_user_pages_fast+0x84/0x120
       [00000000006f5a18] iov_iter_get_pages+0x98/0x240
       [00000000005fa744] do_direct_IO+0xf64/0x1e00
       [00000000005fbbc0] __blockdev_direct_IO+0x360/0x15a0
       [00000000101f74fc] ext4_ind_direct_IO+0xdc/0x400 [ext4]
       [00000000101af690] ext4_ext_direct_IO+0x1d0/0x2c0 [ext4]
       [00000000101af86c] ext4_direct_IO+0xec/0x220 [ext4]
       [0000000000553bd4] generic_file_read_iter+0x114/0x140
       [00000000005bdc2c] __vfs_read+0xac/0x100
       [00000000005bf254] vfs_read+0x54/0x100
       [00000000005bf368] SyS_pread64+0x68/0x80
      
      Signed-off-by: Tom Hromatka <tom.hromatka@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>