1. 16 April 2019, 4 commits
  2. 15 April 2019, 2 commits
  3. 06 April 2019, 1 commit
    • x86/asm: Use stricter assembly constraints in bitops · 5b77e95d
      Committed by Alexander Potapenko
      There are a number of problems with how arch/x86/include/asm/bitops.h
      currently uses assembly constraints for the memory regions that
      bitops modify:
      
      1) Use memory clobber in bitops that touch arbitrary memory
      
      Certain bit operations that read/write bits take a base pointer and an
      arbitrarily large offset to address the bit relative to that base.
      Inline assembly constraints aren't expressive enough to tell the
      compiler that the assembly directive is going to touch a specific memory
      location of unknown size, therefore we have to use the "memory" clobber
      to indicate that the assembly is going to access memory locations other
      than those listed in the inputs/outputs.
      
      To indicate that BTR/BTS instructions don't necessarily touch the first
      sizeof(long) bytes of the argument, we also move the address to assembly
      inputs.
      
      This particular change leads to a size increase in 124 kernel functions
      in a defconfig build. For some of them the diff is only in NOP
      operations; others end up re-reading values from memory and may
      potentially slow down execution. But without these clobbers the
      compiler is free to cache the contents of the bitmaps and use them as
      if they weren't changed by the inline assembly.
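
      As a hedged illustration, not the kernel's exact code, a bitop along
      these lines passes the address as a plain input and adds the "memory"
      clobber:

        /*
         * Sketch only, assuming x86-64 GCC/Clang extended asm; the real
         * kernel code uses LOCK_PREFIX and its own address macros.
         */
        static inline void sketch_set_bit(long nr, volatile unsigned long *addr)
        {
                /*
                 * The address is an input, not an output, because BTS with
                 * a large nr writes beyond the first sizeof(long) bytes of
                 * *addr; the "memory" clobber makes the compiler discard
                 * any cached view of the bitmap.
                 */
                asm volatile("bts %1, %0"
                             : /* no outputs */
                             : "m" (*(volatile long *)addr), "r" (nr)
                             : "memory");
        }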
      
      2) Use byte-sized arguments for operations touching single bytes.
      
      Passing a long value to ANDB/ORB/XORB instructions makes the compiler
      treat sizeof(long) bytes as being clobbered, which isn't the case. This
      may theoretically lead to worse code in the case of heavy optimization.
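
      A hedged sketch of the byte-sized form, assuming CONST_MASK-style
      addressing (byte index nr >> 3, mask 1 << (nr & 7)); listing a
      byte-sized memory operand tells the compiler that ORB clobbers
      exactly one byte:

        static inline void sketch_set_bit_const(long nr, volatile unsigned long *addr)
        {
                /* Only the addressed byte is declared clobbered, matching ORB. */
                asm volatile("orb %1, %0"
                             : "+m" (((volatile unsigned char *)addr)[nr >> 3])
                             : "iq" ((unsigned char)(1 << (nr & 7))));
        }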
      
      Practical impact:
      
      I've built a defconfig kernel and looked through some of the functions
      generated by GCC 7.3.0 with and without this clobber, and didn't spot
      any miscompilations.
      
      However there is a (trivial) theoretical case where this code leads to
      miscompilation:
      
        https://lkml.org/lkml/2019/3/28/393
      
      using just GCC 8.3.0 with -O2.  It isn't hard to imagine someone
      writing such a function in the kernel someday.
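
      The linked testcase is not reproduced here, but a minimal sketch of
      the same hazard (hypothetical names, not the kernel's API) looks like:

        /*
         * bad_clear_bit() declares only the first long as modified and has
         * no "memory" clobber, so the compiler may assume bitmap[1] is
         * unchanged across the asm.
         */
        static inline void bad_clear_bit(long nr, volatile unsigned long *addr)
        {
                asm volatile("btr %1, %0"
                             : "+m" (*(volatile long *)addr)
                             : "r" (nr));
        }

        int foo(unsigned long *bitmap)
        {
                unsigned long before = bitmap[1];

                bad_clear_bit(64 + 5, bitmap);  /* actually clears a bit in bitmap[1] */
                return before == bitmap[1];     /* may be constant-folded to 1 */
        }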
      
      So the primary motivation is to fix an existing misuse of the asm
      directive, which happens to work in certain configurations now, but
      isn't guaranteed to work under different circumstances.
      
      [ mingo: Added -stable tag because defconfig only builds a fraction
        of the kernel and the trivial testcase looks normal enough to
        be used in existing or in-development code. ]
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: James Y Knight <jyknight@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20190402112813.193378-1-glider@google.com
      [ Edited the changelog, tidied up one of the defines. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 02 April 2019, 1 commit
    • x86/resctrl: Fix typos in the mba_sc mount option · faa3604e
      Committed by Xiaochen Shen
      The user can control MBA memory bandwidth in MBps (megabytes per
      second) units via the MBA Software Controller (mba_sc) by using the
      "mba_MBps" mount option. For details, see
      Documentation/x86/resctrl_ui.txt.
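
      For illustration only, a minimal user-space sketch that selects the
      software controller via mount(2); the mount point path is the
      conventional one, and error handling is reduced to perror():

        #include <stdio.h>
        #include <sys/mount.h>

        int main(void)
        {
                /* "mba_MBps" is the user-visible option string. */
                if (mount("resctrl", "/sys/fs/resctrl", "resctrl", 0, "mba_MBps"))
                        perror("mount");
                return 0;
        }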
      
      However, commit
      
        23bf1b6b ("kernfs, sysfs, cgroup, intel_rdt: Support fs_context")
      
      changed the mount option name from "mba_MBps" to "mba_mpbs" by mistake.
      
      Change it back to "mba_MBps" because it is user-visible, and correct
      the "Opt_mba_mpbs" spelling to "Opt_mba_mbps".
      
       [ bp: massage commit message. ]
      
      Fixes: 23bf1b6b ("kernfs, sysfs, cgroup, intel_rdt: Support fs_context")
      Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: dhowells@redhat.com
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: pei.p.jia@intel.com
      Cc: Reinette Chatre <reinette.chatre@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/1553896238-22130-1-git-send-email-xiaochen.shen@intel.com
  5. 29 March 2019, 2 commits
  6. 28 March 2019, 2 commits
    • x86/mm: Don't exceed the valid physical address space · 92c77f7c
      Committed by Ralph Campbell
      valid_phys_addr_range() is used to sanity check the physical address range
      of an operation, e.g., access to /dev/mem. It uses __pa(high_memory)
      internally.
      
      If memory is populated at the end of the physical address space, then
      __pa(high_memory) is outside of the physical address space because:
      
         high_memory = (void *)__va(max_pfn * PAGE_SIZE - 1) + 1;
      
      For the comparison in valid_phys_addr_range() this is not an issue, but if
      CONFIG_DEBUG_VIRTUAL is enabled, __pa() maps to __phys_addr(), which
      verifies that the resulting physical address is within the valid physical
      address space of the CPU. So in the case that memory is populated at the
      end of the physical address space, this is not true and triggers a
      VIRTUAL_BUG_ON().
      
      Use __pa(high_memory - 1) to prevent the conversion from going beyond
      the end of valid physical addresses.
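
      A sketch of the resulting check, simplified from drivers/char/mem.c
      and consistent with the changelog:

        static inline int valid_phys_addr_range(phys_addr_t addr, size_t count)
        {
                /*
                 * __pa(high_memory - 1) is the last valid physical address;
                 * add 1 back for the exclusive upper bound of the range.
                 */
                return addr + count <= __pa(high_memory - 1) + 1;
        }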
      
      Fixes: be62a320 ("x86/mm: Limit mmap() of /dev/mem to valid physical addresses")
      Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Craig Bergstrom <craigb@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hans Verkuil <hans.verkuil@cisco.com>
      Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sander Eikelenboom <linux@eikelenboom.it>
      Cc: Sean Young <sean@mess.org>
      
      Link: https://lkml.kernel.org/r/20190326001817.15413-2-rcampbell@nvidia.com
    • x86/retpolines: Disable switch jump tables when retpolines are enabled · a9d57ef1
      Committed by Daniel Borkmann
      Commit ce02ef06 ("x86, retpolines: Raise limit for generating indirect
      calls from switch-case") raised the limit under retpolines to 20 switch
      cases before gcc starts to emit jump tables, effectively disabling the
      emission of slow indirect calls in this area.
      
      After this was brought to the attention of the gcc folks [0], Martin
      Liska fixed gcc to align with clang by not generating switch jump
      tables at all under retpolines. This takes effect in gcc starting
      from stable version 8.4.0. Given that the kernel supports compilation
      with older versions of gcc where the fix is not available or
      backported, we need to keep the extra KBUILD_CFLAGS around for some
      time and generally set -fno-jump-tables to align with what more
      recent gcc does automatically today.
      
      Switches with more than 20 cases are not expected to be fast-path
      critical, but it is still good to align with the gcc behavior for
      versions < 8.4.0 in order to have consistency across supported gcc
      versions. vmlinux size grows slightly, by 0.27%, for older gcc. The
      flag is only set to work around affected gcc; there is no change for
      clang.
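
      For context, a hedged sketch of the construct this affects; under
      retpolines, gcc < 8.4.0 with -fno-jump-tables lowers a dense switch
      like this to compare/branch sequences instead of an indirect jump
      through a table:

        int classify(int c)
        {
                switch (c) {            /* imagine 20+ dense cases here */
                case 0:  return 1;
                case 1:  return 2;
                case 2:  return 4;
                /* ... */
                default: return 0;
                }
        }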
      
        [0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86952
      Suggested-by: Martin Liska <mliska@suse.cz>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Magnus Karlsson <magnus.karlsson@intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: H.J. Lu <hjl.tools@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Link: https://lkml.kernel.org/r/20190325135620.14882-1-daniel@iogearbox.net
  7. 27 March 2019, 2 commits
  8. 25 March 2019, 1 commit
  9. 23 March 2019, 2 commits
  10. 22 March 2019, 1 commit
    • x86/mm/pti: Make local symbols static · 4fe64a62
      Committed by Valdis Kletnieks
      With 'make C=2 W=1', sparse and gcc both complain:
      
        CHECK   arch/x86/mm/pti.c
      arch/x86/mm/pti.c:84:3: warning: symbol 'pti_mode' was not declared. Should it be static?
      arch/x86/mm/pti.c:605:6: warning: symbol 'pti_set_kernel_image_nonglobal' was not declared. Should it be static?
        CC      arch/x86/mm/pti.o
      arch/x86/mm/pti.c:605:6: warning: no previous prototype for 'pti_set_kernel_image_nonglobal' [-Wmissing-prototypes]
        605 | void pti_set_kernel_image_nonglobal(void)
            |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      pti_set_kernel_image_nonglobal() is only used locally. 'pti_mode' exists in
      drivers/hwtracing/intel_th/pti.c as well, but it's a completely unrelated
      local (static) symbol.
      
      Make both static.
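
      The shape of the fix, as a sketch (the enum values shown are
      assumptions, not copied from arch/x86/mm/pti.c):

        static enum pti_mode {
                PTI_AUTO = 0,
                PTI_FORCE_OFF,
                PTI_FORCE_ON
        } pti_mode;

        static void pti_set_kernel_image_nonglobal(void)
        {
                /* body unchanged; only the linkage changes */
        }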
      Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/27680.1552376873@turing-police
      
  11. 21 March 2019, 9 commits
  12. 20 March 2019, 1 commit
    • powerpc/mm: Only define MAX_PHYSMEM_BITS in SPARSEMEM configurations · 8bc08689
      Committed by Ben Hutchings
      MAX_PHYSMEM_BITS only needs to be defined if CONFIG_SPARSEMEM is
      enabled, and that was the case before commit 4ffe713b
      ("powerpc/mm: Increase the max addressable memory to 2PB").
      
      On 32-bit systems, where CONFIG_SPARSEMEM is not enabled, we now
      define it as 46.  That is larger than the real number of physical
      address bits, and breaks calculations in zsmalloc:
      
        mm/zsmalloc.c:130:49: warning: right shift count is negative
          MAX(32, (ZS_MAX_PAGES_PER_ZSPAGE << PAGE_SHIFT >> OBJ_INDEX_BITS))
                                                         ^~
        ...
        mm/zsmalloc.c:253:21: error: variably modified 'size_class' at file scope
          struct size_class *size_class[ZS_SIZE_CLASSES];
                             ^~~~~~~~~~
      
      Fixes: 4ffe713b ("powerpc/mm: Increase the max addressable memory to 2PB")
      Cc: stable@vger.kernel.org # v4.20+
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  13. 19 March 2019, 9 commits
  14. 18 March 2019, 2 commits
    • powerpc/6xx: fix setup and use of SPRN_SPRG_PGDIR for hash32 · 4622a2d4
      Committed by Christophe Leroy
      Not only the 603 but all 6xx CPUs need SPRN_SPRG_PGDIR to be
      initialised at startup. This patch moves that initialisation from
      __setup_cpu_603() to start_here() and __secondary_start(), close to
      the initialisation of SPRN_THREAD.
      
      Previously, the virtual address of the PGDIR was retrieved from the
      thread struct. Now that the physical address is stored in
      SPRN_SPRG_PGDIR, hash_page() must no longer convert it to a physical
      address. This patch removes the conversion.
      
      Fixes: 93c4a162 ("powerpc/6xx: Store PGDIR physical address in a SPRG")
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/vdso64: Fix CLOCK_MONOTONIC inconsistencies across Y2038 · b5b4453e
      Committed by Michael Ellerman
      Jakub Drnec reported:
        Setting the realtime clock can sometimes make the monotonic clock go
        back by over a hundred years. Decreasing the realtime clock across
        the y2k38 threshold is one reliable way to reproduce. Allegedly this
        can also happen just by running ntpd, I have not managed to
        reproduce that other than booting with rtc at >2038 and then running
        ntp. When this happens, anything with timers (e.g. openjdk) breaks
        rather badly.
      
      And included a test case (slightly edited for brevity):
        #define _POSIX_C_SOURCE 199309L
        #include <stdio.h>
        #include <time.h>
        #include <stdlib.h>
        #include <unistd.h>
      
        long get_time(void) {
          struct timespec tp;
          clock_gettime(CLOCK_MONOTONIC, &tp);
          return tp.tv_sec + tp.tv_nsec / 1000000000;
        }
      
        int main(void) {
          long last = get_time();
          while(1) {
            long now = get_time();
            if (now < last) {
              printf("clock went backwards by %ld seconds!\n", last - now);
            }
            last = now;
            sleep(1);
          }
          return 0;
        }
      
      Which when run concurrently with:
       # date -s 2040-1-1
       # date -s 2037-1-1
      
      Will detect the clock going backward.
      
      The root cause is that wtom_clock_sec in struct vdso_data is only a
      32-bit signed value, even though we set its value to be equal to
      tk->wall_to_monotonic.tv_sec, which is 64 bits.
      
      Because the monotonic clock starts at zero when the system boots, the
      wall_to_monotonic.tv_sec offset is negative for current and future
      dates. Currently, on a freshly booted system, the offset will be in
      the vicinity of negative 1.5 billion seconds.
      
      However if the wall clock is set past the Y2038 boundary, the offset
      from wall to monotonic becomes less than negative 2^31, and no longer
      fits in 32 bits. When that value is assigned to wtom_clock_sec it is
      truncated and becomes positive, causing the VDSO assembly code to
      calculate CLOCK_MONOTONIC incorrectly.
      
      That causes CLOCK_MONOTONIC to jump ahead by ~4 billion seconds which
      it is not meant to do. Worse, if the time is then set back before the
      Y2038 boundary CLOCK_MONOTONIC will jump backward.
      
      We can fix it simply by storing the full 64-bit offset in the
      vdso_data, and using that in the VDSO assembly code. We also shuffle
      some of the fields in vdso_data to avoid creating a hole.
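
      Sketched from the changelog (surrounding fields and their order are
      assumptions), the change widens the offset and repacks its neighbours
      so no padding hole appears:

        struct vdso_data {
                /* ... */
                __s64 wtom_clock_sec;   /* was __s32: truncated once < -2^31 */
                __s32 wtom_clock_nsec;
                /* neighbouring fields reordered to avoid a padding hole */
                /* ... */
        };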
      
      The original commit that added the CLOCK_MONOTONIC support to the VDSO
      did actually use a 64-bit value for wtom_clock_sec, see commit
      a7f290da ("[PATCH] powerpc: Merge vdso's and add vdso support to
      32 bits kernel") (Nov 2005). However just 3 days later it was
      converted to 32 bits in commit 0c37ec2a ("[PATCH] powerpc: vdso
      fixes (take #2)"), and the bug has existed since then AFAICS.
      
      Fixes: 0c37ec2a ("[PATCH] powerpc: vdso fixes (take #2)")
      Cc: stable@vger.kernel.org # v2.6.15+
      Link: http://lkml.kernel.org/r/HaC.ZfES.62bwlnvAvMP.1STMMj@seznam.cz
      Reported-by: Jakub Drnec <jaydee@email.cz>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  15. 17 March 2019, 1 commit