1. 30 4月, 2019 1 次提交
    • N
      powerpc/64s: Reimplement book3s idle code in C · 10d91611
      Nicholas Piggin 提交于
      Reimplement Book3S idle code in C, moving POWER7/8/9 implementation
      speific HV idle code to the powernv platform code.
      
      Book3S assembly stubs are kept in common code and used only to save
      the stack frame and non-volatile GPRs before executing architected
      idle instructions, and restoring the stack and reloading GPRs then
      returning to C after waking from idle.
      
      The complex logic dealing with threads and subcores, locking, SPRs,
      HMIs, timebase resync, etc., is all done in C which makes it more
      maintainable.
      
      This is not a strict translation to C code, there are some
      significant differences:
      
      - Idle wakeup no longer uses the ->cpu_restore call to reinit SPRs,
        but saves and restores them itself.
      
      - The optimisation where EC=ESL=0 idle modes did not have to save GPRs
        or change MSR is restored, because it's now simple to do. ESL=1
        sleeps that do not lose GPRs can use this optimization too.
      
      - KVM secondary entry and cede is now more of a call/return style
        rather than branchy. nap_state_lost is not required because KVM
        always returns via NVGPR restoring path.
      
      - KVM secondary wakeup from offline sequence is moved entirely into
        the offline wakeup, which avoids a hwsync in the normal idle wakeup
        path.
      
      Performance measured with context switch ping-pong on different
      threads or cores, is possibly improved a small amount, 1-3% depending
      on stop state and core vs thread test for shallow states. Deep states
      it's in the noise compared with other latencies.
      
      KVM improvements:
      
      - Idle sleepers now always return to caller rather than branch out
        to KVM first.
      
      - This allows optimisations like very fast return to caller when no
        state has been lost.
      
      - KVM no longer requires nap_state_lost because it controls NVGPR
        save/restore itself on the way in and out.
      
      - The heavy idle wakeup KVM request check can be moved out of the
        normal host idle code and into the not-performance-critical offline
        code.
      
      - KVM nap code now returns from where it is called, which makes the
        flow a bit easier to follow.
      Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      [mpe: Squash the KVM changes in]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      10d91611
  2. 20 4月, 2019 1 次提交
    • M
      powerpc: Add force enable of DAWR on P9 option · c1fe190c
      Michael Neuling 提交于
      This adds a flag so that the DAWR can be enabled on P9 via:
        echo Y > /sys/kernel/debug/powerpc/dawr_enable_dangerous
      
      The DAWR was previously force disabled on POWER9 in:
        96541531 powerpc: Disable DAWR in the base POWER9 CPU features
      Also see Documentation/powerpc/DAWR-POWER9.txt
      
      This is a dangerous setting, USE AT YOUR OWN RISK.
      
      Some users may not care about a bad user crashing their box
      (ie. single user/desktop systems) and really want the DAWR.  This
      allows them to force enable DAWR.
      
      This flag can also be used to disable DAWR access. Once this is
      cleared, all DAWR access should be cleared immediately and your
      machine once again safe from crashing.
      
      Userspace may get confused by toggling this. If DAWR is force
      enabled/disabled between getting the number of breakpoints (via
      PTRACE_GETHWDBGINFO) and setting the breakpoint, userspace will get an
      inconsistent view of what's available. Similarly for guests.
      
      For the DAWR to be enabled in a KVM guest, the DAWR needs to be force
      enabled in the host AND the guest. For this reason, this won't work on
      POWERVM as it doesn't allow the HCALL to work. Writes of 'Y' to the
      dawr_enable_dangerous file will fail if the hypervisor doesn't support
      writing the DAWR.
      
      To double check the DAWR is working, run this kernel selftest:
        tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
      Any errors/failures/skips mean something is wrong.
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      c1fe190c
  3. 11 4月, 2019 1 次提交
  4. 23 3月, 2019 2 次提交
  5. 22 3月, 2019 1 次提交
    • V
      x86/mm/pti: Make local symbols static · 4fe64a62
      Valdis Kletnieks 提交于
      With 'make C=2 W=1', sparse and gcc both complain:
      
        CHECK   arch/x86/mm/pti.c
      arch/x86/mm/pti.c:84:3: warning: symbol 'pti_mode' was not declared. Should it be static?
      arch/x86/mm/pti.c:605:6: warning: symbol 'pti_set_kernel_image_nonglobal' was not declared. Should it be static?
        CC      arch/x86/mm/pti.o
      arch/x86/mm/pti.c:605:6: warning: no previous prototype for 'pti_set_kernel_image_nonglobal' [-Wmissing-prototypes]
        605 | void pti_set_kernel_image_nonglobal(void)
            |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      pti_set_kernel_image_nonglobal() is only used locally. 'pti_mode' exists in
      drivers/hwtracing/intel_th/pti.c as well, but it's a completely unrelated
      local (static) symbol.
      
      Make both static.
      Signed-off-by: NValdis Kletnieks <valdis.kletnieks@vt.edu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/27680.1552376873@turing-police
      
      4fe64a62
  6. 21 3月, 2019 9 次提交
  7. 20 3月, 2019 1 次提交
    • B
      powerpc/mm: Only define MAX_PHYSMEM_BITS in SPARSEMEM configurations · 8bc08689
      Ben Hutchings 提交于
      MAX_PHYSMEM_BITS only needs to be defined if CONFIG_SPARSEMEM is
      enabled, and that was the case before commit 4ffe713b
      ("powerpc/mm: Increase the max addressable memory to 2PB").
      
      On 32-bit systems, where CONFIG_SPARSEMEM is not enabled, we now
      define it as 46.  That is larger than the real number of physical
      address bits, and breaks calculations in zsmalloc:
      
        mm/zsmalloc.c:130:49: warning: right shift count is negative
          MAX(32, (ZS_MAX_PAGES_PER_ZSPAGE << PAGE_SHIFT >> OBJ_INDEX_BITS))
                                                         ^~
        ...
        mm/zsmalloc.c:253:21: error: variably modified 'size_class' at file scope
          struct size_class *size_class[ZS_SIZE_CLASSES];
                             ^~~~~~~~~~
      
      Fixes: 4ffe713b ("powerpc/mm: Increase the max addressable memory to 2PB")
      Cc: stable@vger.kernel.org # v4.20+
      Signed-off-by: NBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      8bc08689
  8. 19 3月, 2019 9 次提交
  9. 18 3月, 2019 2 次提交
    • C
      powerpc/6xx: fix setup and use of SPRN_SPRG_PGDIR for hash32 · 4622a2d4
      Christophe Leroy 提交于
      Not only the 603 but all 6xx need SPRN_SPRG_PGDIR to be initialised at
      startup. This patch move it from __setup_cpu_603() to start_here()
      and __secondary_start(), close to the initialisation of SPRN_THREAD.
      
      Previously, virt addr of PGDIR was retrieved from thread struct.
      Now that it is the phys addr which is stored in SPRN_SPRG_PGDIR,
      hash_page() shall not convert it to phys anymore.
      This patch removes the conversion.
      
      Fixes: 93c4a162 ("powerpc/6xx: Store PGDIR physical address in a SPRG")
      Reported-by: NGuenter Roeck <linux@roeck-us.net>
      Tested-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      4622a2d4
    • M
      powerpc/vdso64: Fix CLOCK_MONOTONIC inconsistencies across Y2038 · b5b4453e
      Michael Ellerman 提交于
      Jakub Drnec reported:
        Setting the realtime clock can sometimes make the monotonic clock go
        back by over a hundred years. Decreasing the realtime clock across
        the y2k38 threshold is one reliable way to reproduce. Allegedly this
        can also happen just by running ntpd, I have not managed to
        reproduce that other than booting with rtc at >2038 and then running
        ntp. When this happens, anything with timers (e.g. openjdk) breaks
        rather badly.
      
      And included a test case (slightly edited for brevity):
        #define _POSIX_C_SOURCE 199309L
        #include <stdio.h>
        #include <time.h>
        #include <stdlib.h>
        #include <unistd.h>
      
        long get_time(void) {
          struct timespec tp;
          clock_gettime(CLOCK_MONOTONIC, &tp);
          return tp.tv_sec + tp.tv_nsec / 1000000000;
        }
      
        int main(void) {
          long last = get_time();
          while(1) {
            long now = get_time();
            if (now < last) {
              printf("clock went backwards by %ld seconds!\n", last - now);
            }
            last = now;
            sleep(1);
          }
          return 0;
        }
      
      Which when run concurrently with:
       # date -s 2040-1-1
       # date -s 2037-1-1
      
      Will detect the clock going backward.
      
      The root cause is that wtom_clock_sec in struct vdso_data is only a
      32-bit signed value, even though we set its value to be equal to
      tk->wall_to_monotonic.tv_sec which is 64-bits.
      
      Because the monotonic clock starts at zero when the system boots the
      wall_to_montonic.tv_sec offset is negative for current and future
      dates. Currently on a freshly booted system the offset will be in the
      vicinity of negative 1.5 billion seconds.
      
      However if the wall clock is set past the Y2038 boundary, the offset
      from wall to monotonic becomes less than negative 2^31, and no longer
      fits in 32-bits. When that value is assigned to wtom_clock_sec it is
      truncated and becomes positive, causing the VDSO assembly code to
      calculate CLOCK_MONOTONIC incorrectly.
      
      That causes CLOCK_MONOTONIC to jump ahead by ~4 billion seconds which
      it is not meant to do. Worse, if the time is then set back before the
      Y2038 boundary CLOCK_MONOTONIC will jump backward.
      
      We can fix it simply by storing the full 64-bit offset in the
      vdso_data, and using that in the VDSO assembly code. We also shuffle
      some of the fields in vdso_data to avoid creating a hole.
      
      The original commit that added the CLOCK_MONOTONIC support to the VDSO
      did actually use a 64-bit value for wtom_clock_sec, see commit
      a7f290da ("[PATCH] powerpc: Merge vdso's and add vdso support to
      32 bits kernel") (Nov 2005). However just 3 days later it was
      converted to 32-bits in commit 0c37ec2a ("[PATCH] powerpc: vdso
      fixes (take #2)"), and the bug has existed since then AFAICS.
      
      Fixes: 0c37ec2a ("[PATCH] powerpc: vdso fixes (take #2)")
      Cc: stable@vger.kernel.org # v2.6.15+
      Link: http://lkml.kernel.org/r/HaC.ZfES.62bwlnvAvMP.1STMMj@seznam.czReported-by: NJakub Drnec <jaydee@email.cz>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b5b4453e
  10. 17 3月, 2019 3 次提交
  11. 16 3月, 2019 3 次提交
  12. 15 3月, 2019 3 次提交
    • P
      perf/x86: Fixup typo in stub functions · f764c58b
      Peter Zijlstra 提交于
      Guenter reported a build warning for CONFIG_CPU_SUP_INTEL=n:
      
        > With allmodconfig-CONFIG_CPU_SUP_INTEL, this patch results in:
        >
        > In file included from arch/x86/events/amd/core.c:8:0:
        > arch/x86/events/amd/../perf_event.h:1036:45: warning: ‘struct cpu_hw_event’ declared inside parameter list will not be visible outside of this definition or declaration
        >  static inline int intel_cpuc_prepare(struct cpu_hw_event *cpuc, int cpu)
      
      While harmless (an unsed pointer is an unused pointer, no matter the type)
      it needs fixing.
      Reported-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: d01b1f96 ("perf/x86/intel: Make cpuc allocations consistent")
      Link: http://lkml.kernel.org/r/20190315081410.GR5996@hirez.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f764c58b
    • P
      perf/x86/intel: Fix memory corruption · ede271b0
      Peter Zijlstra 提交于
      Through:
      
        validate_event()
          x86_pmu.get_event_constraints(.idx=-1)
            tfa_get_event_constraints()
              dyn_constraint()
      
      cpuc->constraint_list[-1] is used, which is an obvious out-of-bound access.
      
      In this case, simply skip the TFA constraint code, there is no event
      constraint with just PMC3, therefore the code will never result in the
      empty set.
      
      Fixes: 400816f6 ("perf/x86/intel: Implement support for TSX Force Abort")
      Reported-by: NTony Jones <tonyj@suse.com>
      Reported-by: N"DSouza, Nelson" <nelson.dsouza@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NTony Jones <tonyj@suse.com>
      Tested-by: N"DSouza, Nelson" <nelson.dsouza@intel.com>
      Cc: eranian@google.com
      Cc: jolsa@redhat.com
      Cc: stable@kernel.org
      Link: https://lkml.kernel.org/r/20190314130705.441549378@infradead.org
      ede271b0
    • P
      MIPS: Remove custom MIPS32 __kernel_fsid_t type · f6cab793
      Paul Burton 提交于
      For MIPS32 kernels we have a custom definition of __kernel_fsid_t. This
      differs from the asm-generic version used by all other architectures &
      MIPS64 in one way - it declares the val field as an array of long,
      rather than an array of int. Since int & long have identical size &
      alignment when targeting MIPS32 anyway, this makes little sense.
      
      Beyond the pointlessness this causes problems for code which prints
      entries from the val array, for example the fanotify_encode_fid()
      function [1]. If such code uses a format specified suited to an int then
      it encounters compiler warnings when building for MIPS32, such as:
      
        In file included from include/linux/kernel.h:14:0,
                         from include/linux/list.h:9,
                         from include/linux/preempt.h:11,
                         from include/linux/spinlock.h:51,
                         from include/linux/fdtable.h:11,
                         from fs/notify/fanotify/fanotify.c:3:
        fs/notify/fanotify/fanotify.c: In function 'fanotify_encode_fid':
        include/linux/kern_levels.h:5:18: warning: format '%x' expects argument
          of type 'unsigned int', but argument 2 has type 'long int' [-Wformat=]
      
      Remove the custom __kernel_fsid_t definition & make use of the
      asm-generic version which will have an identical layout in memory
      anyway, in order to remove the inconsistency with other architectures.
      
      One possible regression this could cause if is any code is attempting to
      print entries from the val array with a long-sized format specifier, in
      which case it would begin seeing compiler warnings when built against
      kernel headers including this change. Since such code is exceedingly
      rare, and would have to be MIPS32-specific to expect a long, this seems
      to be a problem that it's extremely unlikely anyone will encounter.
      
      [1] https://lore.kernel.org/linux-mips/CAOQ4uxiEkczB7PNCXegFC-eYb9zAGaio_o=OgHAJHFd7eavBxA@mail.gmail.com/T/#mb43103277c79ef06b884359209e817db1c136140Signed-off-by: NPaul Burton <paul.burton@mips.com>
      Cc: Amir Goldstein <amir73il@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jan Kara <jack@suse.cz>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mips@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      f6cab793
  13. 14 3月, 2019 4 次提交
    • M
      unicore32: simplify linker script generation for decompressor · c649bd59
      Masahiro Yamada 提交于
      When I was searching for unneeded $(KCONFIG_CONFIG) usages, I noticed
      this strange build dependency.
      
      It can use $(call if_changed,...) in case ZTEXTADDR and ZBSSADDR are
      changed, but even a simpler way is to use the pattern rule in
      scripts/Makefile.build. This is what arch/arm/boot/compressed/Makefile
      does.
      
      I did only build test. I confirmed equivalent vmlinux.lds was generated.
      Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      c649bd59
    • M
      h8300: use cc-cross-prefix instead of hardcoding h8300-unknown-linux- · fc2b47b5
      Masahiro Yamada 提交于
      It believe it is a bad idea to hardcode a specific compiler prefix
      that may or may not be installed on a user's system. It is annoying
      when testing features that should not require compilers at all.
      
      For example, mrproper, headers_install, etc. should work without
      any compiler.
      
      They look like follows on my machine.
      
      $ make ARCH=h8300 mrproper
      ./scripts/gcc-version.sh: line 26: h8300-unknown-linux-gcc: command not found
      ./scripts/gcc-version.sh: line 27: h8300-unknown-linux-gcc: command not found
      make: h8300-unknown-linux-gcc: Command not found
      make: h8300-unknown-linux-gcc: Command not found
        [ a bunch of the same error messages continue ]
      
      $ make ARCH=h8300 headers_install
      ./scripts/gcc-version.sh: line 26: h8300-unknown-linux-gcc: command not found
      ./scripts/gcc-version.sh: line 27: h8300-unknown-linux-gcc: command not found
      make: h8300-unknown-linux-gcc: Command not found
        HOSTCC  scripts/basic/fixdep
      make: h8300-unknown-linux-gcc: Command not found
        WRAP    arch/h8300/include/generated/uapi/asm/kvm_para.h
        [ snip ]
      
      The solution is to delete this line, or to use cc-cross-prefix like
      some architectures do. I chose the latter as a moderate fixup.
      
      I added an alternative 'h8300-linux-' because it is available at:
      
      https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/8.1.0/Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      fc2b47b5
    • M
      kbuild: move archive command to scripts/Makefile.lib · 898f5a00
      Masahiro Yamada 提交于
      scripts/Makefile.build and arch/s390/boot/Makefile use the same
      command (thin archiving with symbol table creation).
      
      Avoid the code duplication, and move it to scripts/Makefile.lib.
      Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      898f5a00
    • M
      ia64: prefix header search path with $(srctree)/ · 393492b5
      Masahiro Yamada 提交于
      Currently, the Kbuild core manipulates header search paths in a crazy
      way [1].
      
      To fix this mess, I want all Makefiles to add explicit $(srctree)/ to
      the search paths in the srctree. Some Makefiles are already written in
      that way, but not all. The goal of this work is to make the notation
      consistent, and finally get rid of the gross hacks.
      
      Having whitespaces after -I does not matter since commit 48f6e3cf
      ("kbuild: do not drop -I without parameter").
      
      I removed some header search paths because I was able to build ia64
      without them.
      
      [1]: https://patchwork.kernel.org/patch/9632347/Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      393492b5