1. 17 5月, 2016 1 次提交
    • D
      bpf: split HAVE_BPF_JIT into cBPF and eBPF variant · 6077776b
      Daniel Borkmann 提交于
      Split the HAVE_BPF_JIT into two for distinguishing cBPF and eBPF JITs.
      
      Current cBPF ones:
      
        # git grep -n HAVE_CBPF_JIT arch/
        arch/arm/Kconfig:44:    select HAVE_CBPF_JIT
        arch/mips/Kconfig:18:   select HAVE_CBPF_JIT if !CPU_MICROMIPS
        arch/powerpc/Kconfig:129:       select HAVE_CBPF_JIT
        arch/sparc/Kconfig:35:  select HAVE_CBPF_JIT
      
      Current eBPF ones:
      
        # git grep -n HAVE_EBPF_JIT arch/
        arch/arm64/Kconfig:61:  select HAVE_EBPF_JIT
        arch/s390/Kconfig:126:  select HAVE_EBPF_JIT if PACK_STACK && HAVE_MARCH_Z196_FEATURES
        arch/x86/Kconfig:94:    select HAVE_EBPF_JIT                    if X86_64
      
      Later code also needs this facility to check for eBPF JITs.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6077776b
  2. 15 5月, 2016 1 次提交
    • Z
      arm64: bpf: jit JMP_JSET_{X,K} · 98397fc5
      Zi Shen Lim 提交于
      Original implementation commit e54bcde3 ("arm64: eBPF JIT compiler")
      had the relevant code paths, but due to an oversight always fail jiting.
      
      As a result, we had been falling back to BPF interpreter whenever a BPF
      program has JMP_JSET_{X,K} instructions.
      
      With this fix, we confirm that the corresponding tests in lib/test_bpf
      continue to pass, and also jited.
      
      ...
      [    2.784553] test_bpf: #30 JSET jited:1 188 192 197 PASS
      [    2.791373] test_bpf: #31 tcpdump port 22 jited:1 325 677 625 PASS
      [    2.808800] test_bpf: #32 tcpdump complex jited:1 323 731 991 PASS
      ...
      [    3.190759] test_bpf: #237 JMP_JSET_K: if (0x3 & 0x2) return 1 jited:1 110 PASS
      [    3.192524] test_bpf: #238 JMP_JSET_K: if (0x3 & 0xffffffff) return 1 jited:1 98 PASS
      [    3.211014] test_bpf: #249 JMP_JSET_X: if (0x3 & 0x2) return 1 jited:1 120 PASS
      [    3.212973] test_bpf: #250 JMP_JSET_X: if (0x3 & 0xffffffff) return 1 jited:1 89 PASS
      ...
      
      Fixes: e54bcde3 ("arm64: eBPF JIT compiler")
      Signed-off-by: NZi Shen Lim <zlim.lnx@gmail.com>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Acked-by: NYang Shi <yang.shi@linaro.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      98397fc5
  3. 12 5月, 2016 5 次提交
    • A
      perf/x86/intel/pt: Generate PMI in the STOP region as well · ab92b232
      Alexander Shishkin 提交于
      Currently, the PT driver always sets the PMI bit one region (page) before
      the STOP region so that we can wake up the consumer before we run out of
      room in the buffer and have to disable the event. However, we also need
      an interrupt in the last output region, so that we actually get to disable
      the event (if no more room from new data is available at that point),
      otherwise hardware just quietly refuses to start, but the event is
      scheduled in and we end up losing trace data till the event gets removed.
      
      For a cpu-wide event it is even worse since there may not be any
      re-scheduling at all and no chance for the ring buffer code to notice
      that its buffer is filled up and the event needs to be disabled (so that
      the consumer can re-enable it when it finishes reading the data out). In
      other words, all the trace data will be lost after the buffer gets filled
      up.
      
      This patch makes PT also generate a PMI when the last output region is
      full.
      Reported-by: NMarkus Metzger <markus.t.metzger@intel.com>
      Signed-off-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <stable@vger.kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: vince@deater.net
      Link: http://lkml.kernel.org/r/1462886313-13660-2-git-send-email-alexander.shishkin@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ab92b232
    • A
      perf/x86: Fix undefined shift on 32-bit kernels · 6d6f2833
      Andrey Ryabinin 提交于
      Jim reported:
      
      	UBSAN: Undefined behaviour in arch/x86/events/intel/core.c:3708:12
      	shift exponent 35 is too large for 32-bit type 'long unsigned int'
      
      The use of 'unsigned long' type obviously is not correct here, make it
      'unsigned long long' instead.
      Reported-by: NJim Cromie <jim.cromie@gmail.com>
      Signed-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <stable@vger.kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Imre Palik <imrep@amazon.de>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: 2c33645d ("perf/x86: Honor the architectural performance monitoring version")
      Link: http://lkml.kernel.org/r/1462974711-10037-1-git-send-email-aryabinin@virtuozzo.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6d6f2833
    • P
      perf/x86/msr: Fix SMI overflow · 3c3116b7
      Peter Zijlstra 提交于
      We compute 'delta' and properly sign extend it and then ignore it and
      recompute the raw value, loosing the sign extention.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: kan.liang@intel.com
      Cc: linux-kernel@vger.kernel.org
      Cc: luto@kernel.org
      Cc: ray.huang@amd.com
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      3c3116b7
    • H
      perf/x86/intel/uncore: Fix CHA registers configuration procedure for Knights Landing platform · ec336c87
      hchrzani 提交于
      CHA events in Knights Landing platform require programming filter registers properly.
      Remote node, local node and NonNearMemCachable bits should be set to 1 at all times.
      Signed-off-by: NHubert Chrzaniuk <hubert.chrzaniuk@intel.com>
      Signed-off-by: NLawrence F Meadows <lawrence.f.meadows@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: bp@suse.de
      Cc: harish.chegondi@intel.com
      Cc: hpa@zytor.com
      Cc: izumi.taku@jp.fujitsu.com
      Cc: kan.liang@intel.com
      Cc: lukasz.anaczkowski@intel.com
      Cc: vthakkar1994@gmail.com
      Fixes: 77af0037 ('perf/x86/intel/uncore: Add Knights Landing uncore PMU support')
      Link: http://lkml.kernel.org/r/1462779419-17115-2-git-send-email-hubert.chrzaniuk@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ec336c87
    • M
      x86/extable: ensure entries are swapped completely when sorting · 50c73890
      Mathias Krause 提交于
      The x86 exception table sorting was changed in commit 29934b0f
      ("x86/extable: use generic search and sort routines") to use the arch
      independent code in lib/extable.c.  However, the patch was mangled
      somehow on its way into the kernel from the last version posted at [1].
      The committed version kind of attempted to incorporate the changes of
      commit 548acf19 ("x86/mm: Expand the exception table logic to allow
      new handling options") as in _completely_ _ignoring_ the x86 specific
      'handler' member of struct exception_table_entry.  This effectively
      broke the sorting as entries will only partly be swapped now.
      
      Fortunately, the x86 Kconfig selects BUILDTIME_EXTABLE_SORT, so the
      exception table doesn't need to be sorted at runtime. However, in case
      that ever changes, we better not break the exception table sorting just
      because of that.
      
      [ Ard Biesheuvel points out that BUILDTIME_EXTABLE_SORT applies to the
        core image only, but we still rely on the sorting routines for modules
        in that case - Linus ]
      
      Fix this by providing a swap_ex_entry_fixup() macro that takes care of
      the 'handler' member.
      
      [1] https://lkml.org/lkml/2016/1/27/232Signed-off-by: NMathias Krause <minipli@googlemail.com>
      Fixes: 29934b0f ("x86/extable: use generic search and sort routines")
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      50c73890
  4. 11 5月, 2016 4 次提交
  5. 10 5月, 2016 3 次提交
  6. 07 5月, 2016 1 次提交
  7. 06 5月, 2016 6 次提交
    • D
      parisc: fix a bug when syscall number of tracee is __NR_Linux_syscalls · f0b22d1b
      Dmitry V. Levin 提交于
      Do not load one entry beyond the end of the syscall table when the
      syscall number of a traced process equals to __NR_Linux_syscalls.
      Similar bug with regular processes was fixed by commit 3bb457af
      ("[PARISC] Fix bug when syscall nr is __NR_Linux_syscalls").
      
      This bug was found by strace test suite.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NDmitry V. Levin <ldv@altlinux.org>
      Acked-by: NHelge Deller <deller@gmx.de>
      Signed-off-by: NHelge Deller <deller@gmx.de>
      f0b22d1b
    • C
      x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO · 886123fb
      Chen Yu 提交于
      Currently we read the tsc radio: ratio = (MSR_PLATFORM_INFO >> 8) & 0x1f;
      
      Thus we get bit 8-12 of MSR_PLATFORM_INFO, however according to the SDM
      (35.5), the ratio bits are bit 8-15.
      
      Ignoring the upper bits can result in an incorrect tsc ratio, which causes the
      TSC calibration and the Local APIC timer frequency to be incorrect.
      
      Fix this problem by masking 0xff instead.
      
      [ tglx: Massaged changelog ]
      
      Fixes: 7da7c156 "x86, tsc: Add static (MSR) TSC calibration on Intel Atom SoCs"
      Signed-off-by: NChen Yu <yu.c.chen@intel.com>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: stable@vger.kernel.org
      Cc: Bin Gao <bin.gao@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: http://lkml.kernel.org/r/1462505619-5516-1-git-send-email-yu.c.chen@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      886123fb
    • A
      mm: thp: kvm: fix memory corruption in KVM with THP enabled · 127393fb
      Andrea Arcangeli 提交于
      After the THP refcounting change, obtaining a compound pages from
      get_user_pages() no longer allows us to assume the entire compound page
      is immediately mappable from a secondary MMU.
      
      A secondary MMU doesn't want to call get_user_pages() more than once for
      each compound page, in order to know if it can map the whole compound
      page.  So a secondary MMU needs to know from a single get_user_pages()
      invocation when it can map immediately the entire compound page to avoid
      a flood of unnecessary secondary MMU faults and spurious
      atomic_inc()/atomic_dec() (pages don't have to be pinned by MMU notifier
      users).
      
      Ideally instead of the page->_mapcount < 1 check, get_user_pages()
      should return the granularity of the "page" mapping in the "mm" passed
      to get_user_pages().  However it's non trivial change to pass the "pmd"
      status belonging to the "mm" walked by get_user_pages up the stack (up
      to the caller of get_user_pages).  So the fix just checks if there is
      not a single pte mapping on the page returned by get_user_pages, and in
      turn if the caller can assume that the whole compound page is mapped in
      the current "mm" (in a pmd_trans_huge()).  In such case the entire
      compound page is safe to map into the secondary MMU without additional
      get_user_pages() calls on the surrounding tail/head pages.  In addition
      of being faster, not having to run other get_user_pages() calls also
      reduces the memory footprint of the secondary MMU fault in case the pmd
      split happened as result of memory pressure.
      
      Without this fix after a MADV_DONTNEED (like invoked by QEMU during
      postcopy live migration or balloning) or after generic swapping (with a
      failure in split_huge_page() that would only result in pmd splitting and
      not a physical page split), KVM would map the whole compound page into
      the shadow pagetables, despite regular faults or userfaults (like
      UFFDIO_COPY) may map regular pages into the primary MMU as result of the
      pte faults, leading to the guest mode and userland mode going out of
      sync and not working on the same memory at all times.
      
      Any other secondary MMU notifier manager (KVM is just one of the many
      MMU notifier users) will need the same information if it doesn't want to
      run a flood of get_user_pages_fast and it can support multiple
      granularity in the secondary MMU mappings, so I think it is justified to
      be exposed not just to KVM.
      
      The other option would be to move transparent_hugepage_adjust to
      mm/huge_memory.c but that currently has all kind of KVM data structures
      in it, so it's definitely not a cut-and-paste work, so I couldn't do a
      fix as cleaner as this one for 4.6.
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: "Li, Liang Z" <liang.z.li@intel.com>
      Cc: Amit Shah <amit.shah@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      127393fb
    • V
      ARM: 8573/1: domain: move {set,get}_domain under config guard · ec953b70
      Vladimir Murzin 提交于
      Recursive undefined instrcution falut is seen with R-class taking an
      exception. The reson for that is __show_regs() tries to get domain
      information, but domains is not available on !MMU cores, like R/M
      class.
      
      Fix it by puting {set,get}_domain functions under CONFIG_CPU_CP15_MMU
      guard and providing stubs for the case where domains is not supported.
      Signed-off-by: NVladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      ec953b70
    • J
      ARM: 8572/1: nommu: change memory reserve for the vectors · 5b526bd9
      Jean-Philippe Brucker 提交于
      Commit 19accfd3 (ARM: move vector stubs) moved the vector stubs in an
      additional page above the base vector one. This change wasn't taken into
      account by the nommu memreserve.
      This patch ensures that the kernel won't overwrite any vector stub on
      nommu.
      
      [changed the MPU side too]
      Signed-off-by: NJean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: NVladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      5b526bd9
    • J
      ARM: 8571/1: nommu: fix PMSAv7 setup · 695665b0
      Jean-Philippe Brucker 提交于
      Commit 1c2f87c2 (ARM: 8025/1: Get rid of meminfo) broke the support for
      MPU on ARMv7-R. This patch adapts the code inside CONFIG_ARM_MPU to use
      memblocks appropriately.
      
      MPU initialisation only uses the first memory region, and removes all
      subsequent ones. Because looping over all regions that need removal is
      inefficient, and memblock_remove already handles memory ranges, we can
      flatten the 'for_each_memblock' part.
      Signed-off-by: NJean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: NVladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      695665b0
  8. 05 5月, 2016 8 次提交
  9. 04 5月, 2016 1 次提交
    • J
      x86/efi-bgrt: Switch all pr_err() to pr_notice() for invalid BGRT · 7f9b474c
      Josh Boyer 提交于
      The promise of pretty boot splashes from firmware via BGRT was at
      best only that; a promise.  The kernel diligently checks to make
      sure the BGRT data firmware gives it is valid, and dutifully warns
      the user when it isn't.  However, it does so via the pr_err log
      level which seems unnecessary.  The user cannot do anything about
      this and there really isn't an error on the part of Linux to
      correct.
      
      This lowers the log level by using pr_notice instead.  Users will
      no longer have their boot process uglified by the kernel reminding
      us that firmware can and often is broken when the 'quiet' kernel
      parameter is specified.  Ironic, considering BGRT is supposed to
      make boot pretty to begin with.
      Signed-off-by: NJosh Boyer <jwboyer@fedoraproject.org>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Môshe van der Sterre <me@moshe.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1462303781-8686-4-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      7f9b474c
  10. 03 5月, 2016 1 次提交
  11. 02 5月, 2016 1 次提交
    • A
      powerpc: Fix bad inline asm constraint in create_zero_mask() · b4c11211
      Anton Blanchard 提交于
      In create_zero_mask() we have:
      
      	addi	%1,%2,-1
      	andc	%1,%1,%2
      	popcntd	%0,%1
      
      using the "r" constraint for %2. r0 is a valid register in the "r" set,
      but addi X,r0,X turns it into an li:
      
      	li	r7,-1
      	andc	r7,r7,r0
      	popcntd	r4,r7
      
      Fix this by using the "b" constraint, for which r0 is not a valid
      register.
      
      This was found with a kernel build using gcc trunk, narrowed down to
      when -frename-registers was enabled at -O2. It is just luck however
      that we aren't seeing this on older toolchains.
      
      Thanks to Segher for working with me to find this issue.
      
      Cc: stable@vger.kernel.org
      Fixes: d0cebfa6 ("powerpc: word-at-a-time optimization for 64-bit Little Endian")
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b4c11211
  12. 30 4月, 2016 2 次提交
  13. 28 4月, 2016 6 次提交