1. 15 11月, 2016 1 次提交
  2. 12 11月, 2016 5 次提交
  3. 27 10月, 2016 2 次提交
    • N
      powerpc/64s: relocation, register save fixes for system reset interrupt · fb479e44
      Nicholas Piggin 提交于
      This patch does a couple of things. First of all, powernv immediately
      explodes when running a relocated kernel, because the system reset
      exception for handling sleeps does not do correct relocated branches.
      
      Secondly, the sleep handling code trashes the condition and cfar
      registers, which we would like to preserve for debugging purposes (for
      non-sleep case exception).
      
      This patch changes the exception to use the standard format that saves
      registers before any tests or branches are made. It adds the test for
      idle-wakeup as an "extra" to break out of the normal exception path.
      Then it branches to a relocated idle handler that calls the various
      idle handling functions.
      
      After this patch, POWER8 CPU simulator now boots powernv kernel that is
      running at non-zero.
      
      Fixes: 948cf67c ("powerpc: Add NAP mode support on Power7 in HV mode")
      Cc: stable@vger.kernel.org # v3.0+
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Acked-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
      Acked-by: NBalbir Singh <bsingharora@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      fb479e44
    • V
      powerpc/process: Fix CONFIG_ALIVEC typo in restore_tm_state() · 39715bf9
      Valentin Rothberg 提交于
      It should be ALTIVEC, not ALIVEC.
      
      Cyril explains: If a thread performs a transaction with altivec and then
      gets preempted for whatever reason, this bug may cause the kernel to not
      re-enable altivec when that thread runs again. This will result in an
      altivec unavailable fault, when that fault happens inside a user
      transaction the kernel has no choice but to enable altivec and doom the
      transaction.
      
      The result is that transactions using altivec may get aborted more often
      than they should.
      
      The difficulty in catching this with a selftest is my deliberate use of
      the word may above. Optimisations to avoid FPU/altivec/VSX faults mean
      that the kernel will always leave them on for 255 switches. This code
      prevents the kernel turning it off if it got to the 256th switch (and
      userspace was transactional).
      
      Fixes: dc16b553 ("powerpc: Always restore FPU/VEC/VSX if hardware transactional memory in use")
      Reviewed-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NValentin Rothberg <valentinrothberg@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      39715bf9
  4. 24 10月, 2016 2 次提交
    • P
      powerpc/64: Fix race condition in setting lock bit in idle/wakeup code · 09b7e37b
      Paul Mackerras 提交于
      This fixes a race condition where one thread that is entering or
      leaving a power-saving state can inadvertently ignore the lock bit
      that was set by another thread, and potentially also clear it.
      The core_idle_lock_held function is called when the lock bit is
      seen to be set.  It polls the lock bit until it is clear, then
      does a lwarx to load the word containing the lock bit and thread
      idle bits so it can be updated.  However, it is possible that the
      value loaded with the lwarx has the lock bit set, even though an
      immediately preceding lwz loaded a value with the lock bit clear.
      If this happens then we go ahead and update the word despite the
      lock bit being set, and when called from pnv_enter_arch207_idle_mode,
      we will subsequently clear the lock bit.
      
      No identifiable misbehaviour has been attributed to this race.
      
      This fixes it by checking the lock bit in the value loaded by the
      lwarx.  If it is set then we just go back and keep on polling.
      
      Fixes: b32aadc1 ("powerpc/powernv: Fix race in updating core_idle_state")
      Cc: stable@vger.kernel.org # v4.2+
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      09b7e37b
    • P
      powerpc/64: Re-fix race condition between going idle and entering guest · 56c46222
      Paul Mackerras 提交于
      Commit 8117ac6a ("powerpc/powernv: Switch off MMU before entering
      nap/sleep/rvwinkle mode", 2014-12-10) fixed a race condition where one
      thread entering a KVM guest could switch the MMU context to the guest
      while another thread was still in host kernel context with the MMU on.
      That commit moved the point where a thread entering a power-saving
      mode set its kvm_hstate.hwthread_state field in its PACA to
      KVM_HWTHREAD_IN_IDLE from a point where the MMU was on to after the
      MMU had been switched off.  That commit also added a comment
      explaining that we have to switch to real mode before setting
      hwthread_state to avoid this race.
      
      Nevertheless, commit 4eae2c9a ("powerpc/powernv: Make
      pnv_powersave_common more generic", 2016-07-08) subsequently moved
      the setting of hwthread_state back to a point where the MMU is on,
      thus reintroducing the race, despite the comment saying that this
      should not be done being included in full in the context lines of
      the patch that did it.
      
      This fixes the race again and adds a bigger and shoutier comment
      explaining the potential race condition.
      
      Fixes: 4eae2c9a ("powerpc/powernv: Make pnv_powersave_common more generic")
      Cc: stable@vger.kernel.org # v4.8+
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      Reviewed-by: NShreyas B. Prabhu <shreyasbp@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      56c46222
  5. 12 10月, 2016 1 次提交
  6. 11 10月, 2016 2 次提交
    • N
      powerpc/64s: Fix power4_fixup_nap placement · 7c8cb4b5
      Nicholas Piggin 提交于
      power4_fixup_nap is called from the "common" handlers, not the virt/real
      handlers, therefore it should itself be a common handler. Placing it
      down in the trampoline space caused it to go out of reach of its
      callers, requiring a trampoline inserted at the start of the text
      section, which breaks the fixed section address calculations.
      
      Fixes: da2bc464 ("powerpc/64s: Add new exception vector macros")
      Reported-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      7c8cb4b5
    • E
      gcc-plugins: Add latent_entropy plugin · 38addce8
      Emese Revfy 提交于
      This adds a new gcc plugin named "latent_entropy". It is designed to
      extract as much possible uncertainty from a running system at boot time as
      possible, hoping to capitalize on any possible variation in CPU operation
      (due to runtime data differences, hardware differences, SMP ordering,
      thermal timing variation, cache behavior, etc).
      
      At the very least, this plugin is a much more comprehensive example for
      how to manipulate kernel code using the gcc plugin internals.
      
      The need for very-early boot entropy tends to be very architecture or
      system design specific, so this plugin is more suited for those sorts
      of special cases. The existing kernel RNG already attempts to extract
      entropy from reliable runtime variation, but this plugin takes the idea to
      a logical extreme by permuting a global variable based on any variation
      in code execution (e.g. a different value (and permutation function)
      is used to permute the global based on loop count, case statement,
      if/then/else branching, etc).
      
      To do this, the plugin starts by inserting a local variable in every
      marked function. The plugin then adds logic so that the value of this
      variable is modified by randomly chosen operations (add, xor and rol) and
      random values (gcc generates separate static values for each location at
      compile time and also injects the stack pointer at runtime). The resulting
      value depends on the control flow path (e.g., loops and branches taken).
      
      Before the function returns, the plugin mixes this local variable into
      the latent_entropy global variable. The value of this global variable
      is added to the kernel entropy pool in do_one_initcall() and _do_fork(),
      though it does not credit any bytes of entropy to the pool; the contents
      of the global are just used to mix the pool.
      
      Additionally, the plugin can pre-initialize arrays with build-time
      random contents, so that two different kernel builds running on identical
      hardware will not have the same starting values.
      Signed-off-by: NEmese Revfy <re.emese@gmail.com>
      [kees: expanded commit message and code comments]
      Signed-off-by: NKees Cook <keescook@chromium.org>
      38addce8
  7. 08 10月, 2016 2 次提交
    • C
      nmi_backtrace: generate one-line reports for idle cpus · 6727ad9e
      Chris Metcalf 提交于
      When doing an nmi backtrace of many cores, most of which are idle, the
      output is a little overwhelming and very uninformative.  Suppress
      messages for cpus that are idling when they are interrupted and just
      emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN".
      
      We do this by grouping all the cpuidle code together into a new
      .cpuidle.text section, and then checking the address of the interrupted
      PC to see if it lies within that section.
      
      This commit suitably tags x86 and tile idle routines, and only adds in
      the minimal framework for other architectures.
      
      Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.comSigned-off-by: NChris Metcalf <cmetcalf@mellanox.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm]
      Tested-by: NPetr Mladek <pmladek@suse.com>
      Cc: Aaron Tomlin <atomlin@redhat.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6727ad9e
    • S
      powerpc: implement arch_reserved_kernel_pages · 1e76609c
      Srikar Dronamraju 提交于
      Currently significant amount of memory is reserved only in kernel booted
      to capture kernel dump using the fa_dump method.
      
      Kernels compiled with CONFIG_DEFERRED_STRUCT_PAGE_INIT will initialize
      only certain size memory per node.  The certain size takes into account
      the dentry and inode cache sizes.  Currently the cache sizes are
      calculated based on the total system memory including the reserved
      memory.  However such a kernel when booting the same kernel as fadump
      kernel will not be able to allocate the required amount of memory to
      suffice for the dentry and inode caches.  This results in crashes like
      
      Hence only implement arch_reserved_kernel_pages() for CONFIG_FA_DUMP
      configurations.  The amount reserved will be reduced while calculating
      the large caches and will avoid crashes like the below on large systems
      such as 32 TB systems.
      
        Dentry cache hash table entries: 536870912 (order: 16, 4294967296 bytes)
        vmalloc: allocation failure, allocated 4097114112 of 17179934720 bytes
        swapper/0: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC)
        CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6-master+ #3
        Call Trace:
           dump_stack+0xb0/0xf0 (unreliable)
           warn_alloc_failed+0x114/0x160
           __vmalloc_node_range+0x304/0x340
           __vmalloc+0x6c/0x90
           alloc_large_system_hash+0x1b8/0x2c0
           inode_init+0x94/0xe4
           vfs_caches_init+0x8c/0x13c
           start_kernel+0x50c/0x578
           start_here_common+0x20/0xa8
      
      Link: http://lkml.kernel.org/r/1472476010-4709-4-git-send-email-srikar@linux.vnet.ibm.comSigned-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Suggested-by: NMel Gorman <mgorman@techsingularity.net>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1e76609c
  8. 04 10月, 2016 25 次提交