1. May 12, 2016 (4 commits)
  2. May 10, 2016 (1 commit)
  3. May 07, 2016 (1 commit)
  4. May 06, 2016 (6 commits)
    • parisc: fix a bug when syscall number of tracee is __NR_Linux_syscalls · f0b22d1b
      Authored by Dmitry V. Levin
      Do not load one entry beyond the end of the syscall table when the
      syscall number of a traced process equals __NR_Linux_syscalls.
      Similar bug with regular processes was fixed by commit 3bb457af
      ("[PARISC] Fix bug when syscall nr is __NR_Linux_syscalls").
      
      This bug was found by the strace test suite.
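
      A minimal C sketch of the tightened check (illustrative only: the
      real check lives in the parisc assembly entry path, and the helper
      here is hypothetical):

        #include <errno.h>

        #define __NR_Linux_syscalls 378  /* placeholder; arch-defined */

        /* valid indices are 0 .. __NR_Linux_syscalls - 1, so equality
         * must be rejected too, or we read one entry past the table end */
        long dispatch_traced_syscall(unsigned long nr,
                                     long (*const table[])(void))
        {
                if (nr >= __NR_Linux_syscalls)  /* was effectively '>' */
                        return -ENOSYS;
                return table[nr]();
        }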
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Acked-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Helge Deller <deller@gmx.de>
    • x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO · 886123fb
      Authored by Chen Yu
      Currently we read the TSC ratio as: ratio = (MSR_PLATFORM_INFO >> 8) & 0x1f;
      
      Thus we get bits 8-12 of MSR_PLATFORM_INFO; however, according to the SDM
      (35.5), the ratio bits are bits 8-15.
      
      Ignoring the upper bits can result in an incorrect TSC ratio, which causes the
      TSC calibration and the Local APIC timer frequency to be incorrect.
      
      Fix this problem by masking with 0xff instead.
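
      A minimal sketch of the corrected read (standalone C; the MSR read
      itself is elided):

        #include <stdint.h>

        /* MSR_PLATFORM_INFO[15:8] holds the ratio */
        static inline uint32_t tsc_ratio(uint64_t msr_platform_info)
        {
                return (msr_platform_info >> 8) & 0xff; /* was: & 0x1f */
        }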
      
      [ tglx: Massaged changelog ]
      
      Fixes: 7da7c156 "x86, tsc: Add static (MSR) TSC calibration on Intel Atom SoCs"
      Signed-off-by: Chen Yu <yu.c.chen@intel.com>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: stable@vger.kernel.org
      Cc: Bin Gao <bin.gao@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: http://lkml.kernel.org/r/1462505619-5516-1-git-send-email-yu.c.chen@intel.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • mm: thp: kvm: fix memory corruption in KVM with THP enabled · 127393fb
      Authored by Andrea Arcangeli
      After the THP refcounting change, obtaining a compound page from
      get_user_pages() no longer allows us to assume the entire compound page
      is immediately mappable from a secondary MMU.
      
      A secondary MMU doesn't want to call get_user_pages() more than once for
      each compound page, in order to know if it can map the whole compound
      page.  So a secondary MMU needs to know from a single get_user_pages()
      invocation when it can immediately map the entire compound page, to avoid
      a flood of unnecessary secondary MMU faults and spurious
      atomic_inc()/atomic_dec() (pages don't have to be pinned by MMU notifier
      users).
      
      Ideally instead of the page->_mapcount < 1 check, get_user_pages()
      should return the granularity of the "page" mapping in the "mm" passed
      to get_user_pages().  However, it's a non-trivial change to pass the "pmd"
      status belonging to the "mm" walked by get_user_pages up the stack (up
      to the caller of get_user_pages).  So the fix just checks if there is
      not a single pte mapping on the page returned by get_user_pages, and in
      turn if the caller can assume that the whole compound page is mapped in
      the current "mm" (in a pmd_trans_huge()).  In such case the entire
      compound page is safe to map into the secondary MMU without additional
      get_user_pages() calls on the surrounding tail/head pages.  Besides
      being faster, avoiding the extra get_user_pages() calls also reduces
      the memory footprint of the secondary MMU fault in case the pmd split
      happened as a result of memory pressure.
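
      The check added by this commit is roughly the sketch below
      (simplified; the real helper sits with the other page-flag helpers
      and the tail-page accounting has more cases):

        /* A THP is mappable as a whole in the secondary MMU only while
         * it is still mapped by a huge pmd, i.e. no individual pte maps
         * it. _mapcount starts at -1, so "< 0" means no pte mapping. */
        static inline int PageTransCompoundMap(struct page *page)
        {
                return PageTransCompound(page) &&
                       atomic_read(&page->_mapcount) < 0;
        }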
      
      Without this fix, after a MADV_DONTNEED (as invoked by QEMU during
      postcopy live migration or ballooning) or after generic swapping (with a
      failure in split_huge_page() that would only result in pmd splitting and
      not a physical page split), KVM would map the whole compound page into
      the shadow pagetables, even though regular faults or userfaults (like
      UFFDIO_COPY) may map regular pages into the primary MMU as a result of
      the pte faults, leading to guest mode and userland mode going out of
      sync and not working on the same memory at all times.
      
      Any other secondary MMU notifier manager (KVM is just one of the many
      MMU notifier users) will need the same information if it doesn't want to
      run a flood of get_user_pages_fast() calls and it supports multiple
      granularities in its secondary MMU mappings, so I think it is justified
      to expose this beyond KVM.
      
      The other option would be to move transparent_hugepage_adjust to
      mm/huge_memory.c, but that function currently has all kinds of KVM data
      structures in it, so it's definitely not a cut-and-paste job, and I
      couldn't produce a cleaner fix than this one for 4.6.
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: "Li, Liang Z" <liang.z.li@intel.com>
      Cc: Amit Shah <amit.shah@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ARM: 8573/1: domain: move {set,get}_domain under config guard · ec953b70
      Authored by Vladimir Murzin
      A recursive undefined instruction fault is seen when an R-class core
      takes an exception. The reason is that __show_regs() tries to get
      domain information, but domains are not available on !MMU cores, such
      as the R/M class.
      
      Fix it by putting the {set,get}_domain functions under the
      CONFIG_CPU_CP15_MMU guard and providing stubs for the case where
      domains are not supported.
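
      A sketch of the guard along the lines the commit describes (the
      CP15 access is the standard DACR read/write; the stub bodies are
      the obvious no-ops):

        #ifdef CONFIG_CPU_CP15_MMU
        static inline unsigned int get_domain(void)
        {
                unsigned int domain;
                asm("mrc p15, 0, %0, c3, c0 @ get domain" : "=r" (domain));
                return domain;
        }

        static inline void set_domain(unsigned int val)
        {
                asm volatile("mcr p15, 0, %0, c3, c0 @ set domain"
                             : : "r" (val) : "memory");
        }
        #else
        static inline unsigned int get_domain(void) { return 0; }
        static inline void set_domain(unsigned int val) { }
        #endif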
      Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 8572/1: nommu: change memory reserve for the vectors · 5b526bd9
      Authored by Jean-Philippe Brucker
      Commit 19accfd3 (ARM: move vector stubs) moved the vector stubs into an
      additional page above the base vector page. This change wasn't taken
      into account by the nommu memory reservation.
      This patch ensures that the kernel won't overwrite any vector stub on
      nommu.
      
      [changed the MPU side too]
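
      A sketch of the reservation after the change, assuming the nommu
      reserve helper (illustrative, not the verbatim diff):

        /* the stubs live one page above the base vector page, so the
         * nommu reservation must now cover two pages */
        memblock_reserve(CONFIG_VECTORS_BASE, 2 * PAGE_SIZE); /* was: PAGE_SIZE */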
      Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 8571/1: nommu: fix PMSAv7 setup · 695665b0
      Authored by Jean-Philippe Brucker
      Commit 1c2f87c2 (ARM: 8025/1: Get rid of meminfo) broke MPU support on
      ARMv7-R. This patch adapts the code inside CONFIG_ARM_MPU to use
      memblocks appropriately.
      
      MPU initialisation only uses the first memory region, and removes all
      subsequent ones. Because looping over all regions that need removal is
      inefficient, and memblock_remove already handles memory ranges, we can
      flatten the 'for_each_memblock' part.
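
      A hedged sketch of the flattened logic (variable names illustrative;
      the real code derives the MPU region from the first memblock):

        /* keep only the first memblock region for the MPU: trim
         * everything from its end upwards with a single call instead of
         * iterating over every later region */
        struct memblock_region *first = &memblock.memory.regions[0];
        phys_addr_t first_end = first->base + first->size;

        memblock_remove(first_end, ULLONG_MAX - first_end);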
      Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  5. May 05, 2016 (13 commits)
  6. May 04, 2016 (1 commit)
    • x86/efi-bgrt: Switch all pr_err() to pr_notice() for invalid BGRT · 7f9b474c
      Authored by Josh Boyer
      The promise of pretty boot splashes from firmware via BGRT was at
      best only that: a promise.  The kernel diligently checks to make
      sure the BGRT data firmware gives it is valid, and dutifully warns
      the user when it isn't.  However, it does so via the pr_err log
      level which seems unnecessary.  The user cannot do anything about
      this and there really isn't an error on the part of Linux to
      correct.
      
      This lowers the log level by using pr_notice instead.  Users who
      specify the 'quiet' kernel parameter will no longer have their boot
      process uglified by the kernel reminding them that firmware can be,
      and often is, broken.  Ironic, considering BGRT is supposed to
      make boot pretty to begin with.
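
      The change itself is mechanical, roughly:

        if (bgrt_tab->version != 1) {
                /* was pr_err(); the user can't act on this anyway */
                pr_notice("Ignoring BGRT: invalid version %u (expected 1)\n",
                          bgrt_tab->version);
                return;
        }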
      Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
      Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Môshe van der Sterre <me@moshe.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1462303781-8686-4-git-send-email-matt@codeblueprint.co.uk
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. May 02, 2016 (1 commit)
    • powerpc: Fix bad inline asm constraint in create_zero_mask() · b4c11211
      Authored by Anton Blanchard
      In create_zero_mask() we have:
      
      	addi	%1,%2,-1
      	andc	%1,%1,%2
      	popcntd	%0,%1
      
      using the "r" constraint for %2. r0 is a valid register in the "r" set,
      but addi X,r0,X turns it into an li:
      
      	li	r7,-1
      	andc	r7,r7,r0
      	popcntd	r4,r7
      
      Fix this by using the "b" constraint, for which r0 is not a valid
      register.
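
      The fixed helper then reads roughly as follows (from the powerpc
      word-at-a-time code; the constraint change is the entire fix):

        static inline unsigned long create_zero_mask(unsigned long bits)
        {
                unsigned long leading_zero_bits, temp;

                asm("addi   %1,%2,-1\n\t"
                    "andc   %1,%1,%2\n\t"
                    "popcntd %0,%1"
                    : "=r" (leading_zero_bits), "=&r" (temp)
                    : "b" (bits)); /* "b" excludes r0, so addi stays addi */
                return leading_zero_bits;
        }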
      
      This was found with a kernel build using gcc trunk, narrowed down to
      when -frename-registers was enabled at -O2. It is just luck, however,
      that we aren't seeing this on older toolchains.
      
      Thanks to Segher for working with me to find this issue.
      
      Cc: stable@vger.kernel.org
      Fixes: d0cebfa6 ("powerpc: word-at-a-time optimization for 64-bit Little Endian")
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  8. April 30, 2016 (1 commit)
    • ARM: davinci: only use NVMEM when available · 04b9665b
      Authored by Arnd Bergmann
      The davinci platform contains code that calls into the nvmem
      subsystem, but that subsystem might be a loadable module, causing a
      link error:
      
      arch/arm/mach-davinci/built-in.o: In function `davinci_get_mac_addr':
      :(.text+0x1088): undefined reference to `nvmem_device_read'
      arch/arm/mach-davinci/built-in.o: In function `read_factory_config':
      :(.text+0x214c): undefined reference to `nvmem_device_read'
      
      Also, when NVMEM is completely disabled, the functions fail with
      non-obvious error messages.
      
      This change ensures we only call the API functions when the code is
      actually reachable from the board file, and otherwise prints a unique
      log message.
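
      A hedged sketch of the guard (identifiers here are illustrative,
      not the exact board-file code):

        if (IS_BUILTIN(CONFIG_NVMEM)) {
                ret = nvmem_device_read(nvmem, 0, ETH_ALEN, mac_addr);
        } else {
                pr_warn("davinci: NVMEM not built in, MAC address unavailable\n");
                ret = -ENOSYS;
        }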
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: bec3c11b ("misc: at24: replace memory_accessor with nvmem_device_read")
      Signed-off-by: Sekhar Nori <nsekhar@ti.com>
      Signed-off-by: Kevin Hilman <khilman@baylibre.com>
  9. April 28, 2016 (6 commits)
  10. April 27, 2016 (6 commits)
    • perf core: Allow setting up max frame stack depth via sysctl · c5dfd78e
      Authored by Arnaldo Carvalho de Melo
      The default remains 127, which is good for most cases, and not even hit
      most of the time, but for some cases, as reported by Brendan, stacks
      1024+ frames deep are appearing on the radar for things like groovy and
      ruby.
      
      And in some workloads putting a _lower_ cap on this may make sense. A
      per-event limit still needs to be put in place, though.
      
      The new file is:
      
        # cat /proc/sys/kernel/perf_event_max_stack
        127
      
      Changing it:
      
        # echo 256 > /proc/sys/kernel/perf_event_max_stack
        # cat /proc/sys/kernel/perf_event_max_stack
        256
      
      But as soon as there is some event using callchains we get:
      
        # echo 512 > /proc/sys/kernel/perf_event_max_stack
        -bash: echo: write error: Device or resource busy
        #
      
      Because we only allocate the callchain percpu data structures when
      there is a user, changing the max is easy: it's just a matter of
      having no callchain users at that point.
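
      A sketch of the handler enforcing that behaviour (signature per
      4.6-era sysctl handlers; body simplified):

        int perf_event_max_stack_handler(struct ctl_table *table, int write,
                                         void __user *buffer, size_t *lenp,
                                         loff_t *ppos)
        {
                int ret;

                mutex_lock(&callchain_mutex);
                if (write && atomic_read(&nr_callchain_events))
                        ret = -EBUSY; /* buffers sized for the old max exist */
                else
                        ret = proc_dointvec_minmax(table, write, buffer,
                                                   lenp, ppos);
                mutex_unlock(&callchain_mutex);

                return ret;
        }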
      Reported-and-Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
      Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: David Ahern <dsahern@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/r/20160426002928.GB16708@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • ARC: add support for reserved memory defined by device tree · 1b10cb21
      Authored by Alexey Brodkin
      Enable reserved memory initialization from device tree.
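
      Presumably by hooking the generic device-tree scan into the arch
      memory setup, along these lines (sketch; function placement
      assumed):

        void __init setup_arch_memory(void)
        {
                /* ... memblock initialization ... */
                early_init_fdt_scan_reserved_mem(); /* /reserved-memory nodes */
        }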
      Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARC: support generic per-device coherent dma mem · 32ed9a0e
      Authored by Alexey Brodkin
      Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • nios2: memset: use the right constraint modifier for the %4 output operand · a8950e49
      Authored by Romain Perier
      Depending on the size of the area to be memset'ed, the nios2 memset
      implementation either uses a naive loop (for buffers of 8 bytes or
      less) or a more optimized implementation (for buffers larger than 8
      bytes). The optimized implementation does 4-byte stores rather than
      1-byte stores to speed up memset.
      
      However, we discovered that on our nios2 platform, memset() was not properly setting the
      buffer to the expected value. A memset of 0xff would not set the entire buffer to 0xff, but to:
      
      0xff 0x00 0xff 0x00 0xff 0x00 0xff 0x00 ...
      
      Which is obviously incorrect. Our investigation has revealed that the problem lies in the
      incorrect constraints used in the inline assembly.
      
      The following piece of assembly, from the nios2 memset implementation, is supposed to
      create a 4-byte value that repeats 4 times the 1-byte pattern passed as memset argument:
      
      /* fill8 %3, %5 (c & 0xff) */
      "       slli    %4, %5, 8\n"
      "       or      %4, %4, %5\n"
      "       slli    %3, %4, 16\n"
      "       or      %3, %3, %4\n"
      
      However, depending on the compiler and optimization level, this code might be compiled as:
      
      34:	280a923a 	slli	r5,r5,8
      38:	294ab03a 	or	r5,r5,r5
      3c:	2808943a 	slli	r4,r5,16
      40:	2148b03a 	or	r4,r4,r5
      
      This is wrong because r5 gets used both for %5 and %4, which causes the
      final pattern stored in r4 to be 0xff00ff00 rather than the expected
      0xffffffff.
      
      %4 is defined with the "=r" constraint, i.e. as an output operand.
      However, as explained in
      http://www.ethernut.de/en/documents/arm-inline-asm.html, this does not
      prevent gcc from using the same register for an output operand (%4) and
      an input operand (%5). By using the constraint modifier '&', we indicate
      that the register should be used for output only. With this change, we
      get the following assembly output:
      
      34:	2810923a 	slli	r8,r5,8
      38:	4150b03a 	or	r8,r8,r5
      3c:	400e943a 	slli	r7,r8,16
      40:	3a0eb03a 	or	r7,r7,r8
      
      Which correctly produces the 0xffffffff pattern when 0xff is passed as the memset() pattern.
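
      A minimal standalone illustration of the corrected constraints:

        static unsigned int fill32(unsigned char c)
        {
                unsigned int word, lo16;

                asm("slli %1, %2, 8\n\t"
                    "or   %1, %1, %2\n\t"
                    "slli %0, %1, 16\n\t"
                    "or   %0, %0, %1"
                    : "=&r" (word), "=&r" (lo16) /* '&': may not alias %2 */
                    : "r" (c));
                return word;
        }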
      
      It is worth mentioning the observed consequence of this bug: we were
      hitting the kernel BUG() in mm/bootmem.c:__free(), which verifies, when
      marking a page as free, that it was previously marked as occupied (i.e.
      that the bit was set to 1). The entire bootmem bitmap is set to 0xff
      via a memset() during bootmem initialization. The bootmem_free() call
      right after the initialization was finding some bits set to 0, which
      didn't make sense since the bitmap had just been memset'ed to 0xff.
      Except that, due to the bug explained above, the bitmap was in fact
      initialized to 0xff00ff00.
      
      Thanks to Marek Vasut for his help and feedback.
      Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
      Acked-by: Marek Vasut <marex@denx.de>
      Acked-by: Ley Foon Tan <lftan@altera.com>
    • powerpc: wire up preadv2 and pwritev2 syscalls · d701cca6
      Authored by Rui Salvaterra
      Wire up preadv2/pwritev2 in the same way as preadv/pwritev. Fixes two
      build warnings on ppc64.
      
      mpe: Lightly tested with fio (slightly hacked to add the syscall
      wrappers):
      
        fio-4217  [009] ....  1304.635300: sys_preadv2(fd: 3, vec:
        10025821de0, vlen: 1, pos_l: 6253000, pos_h: 0, flags: 1)
        fio-4217  [009] ....  1304.635474: sys_preadv2 -> 0x1000
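
      The wiring follows the existing preadv/pwritev pattern, roughly
      (syscall numbers and wrapper class shown are assumptions, not the
      verbatim diff):

        /* arch/powerpc/include/uapi/asm/unistd.h */
        #define __NR_preadv2  380
        #define __NR_pwritev2 381

        /* arch/powerpc/include/asm/systbl.h */
        COMPAT_SYS_SPU(preadv2)
        COMPAT_SYS_SPU(pwritev2)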
      Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • Revert "x86/mm/32: Set NX in __supported_pte_mask before enabling paging" · e16d8a6c
      Authored by Andy Lutomirski
      This reverts commit 320d25b6.
      
      This change was problematic for a couple of reasons:
      
      1. It missed some entry points (Xen things and 64-bit native).
      
      2. The entry it changed can be executed more than once.  This isn't
         really a problem, but it conflated per-cpu state setup and global
         state setup.
      
      3. It broke 64-bit non-NX.  64-bit non-NX worked the other way around
         from 32-bit -- __supported_pte_mask had NX set initially and was
         *cleared* in x86_configure_nx.  With the patch applied, it never
         got cleared (see the sketch below).
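
      A sketch of the 64-bit flow that point 3 relies on (this mirrors
      the x86_configure_nx logic as I understand it):

        void x86_configure_nx(void)
        {
                if (boot_cpu_has(X86_FEATURE_NX))
                        __supported_pte_mask |= _PAGE_NX;
                else
                        __supported_pte_mask &= ~_PAGE_NX; /* the clear */
        }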
      Reported-and-tested-by: Meelis Roos <mroos@linux.ee>
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/59bd15f7f4b56b633a611b7f70876c6d2ad01a98.1461685884.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>