1. 10 8月, 2016 3 次提交
    • T
      x86/mm/KASLR: Increase BRK pages for KASLR memory randomization · fb754f95
      Thomas Garnier 提交于
      Default implementation expects 6 pages maximum are needed for low page
      allocations. If KASLR memory randomization is enabled, the worse case
      of e820 layout would require 12 pages (no large pages). It is due to the
      PUD level randomization and the variable e820 memory layout.
      
      This bug was found while doing extensive testing of KASLR memory
      randomization on different type of hardware.
      Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Cc: Aleksey Makarov <aleksey.makarov@linaro.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: kernel-hardening@lists.openwall.com
      Fixes: 021182e5 ("Enable KASLR for physical mapping memory regions")
      Link: http://lkml.kernel.org/r/1470762665-88032-2-git-send-email-thgarnie@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      fb754f95
    • T
      x86/mm/KASLR: Fix physical memory calculation on KASLR memory randomization · c7d2361f
      Thomas Garnier 提交于
      Initialize KASLR memory randomization after max_pfn is initialized. Also
      ensure the size is rounded up. It could create problems on machines
      with more than 1Tb of memory on certain random addresses.
      Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Cc: Aleksey Makarov <aleksey.makarov@linaro.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: kernel-hardening@lists.openwall.com
      Fixes: 021182e5 ("Enable KASLR for physical mapping memory regions")
      Link: http://lkml.kernel.org/r/1470762665-88032-1-git-send-email-thgarnie@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c7d2361f
    • A
      x86, kasan, ftrace: Put APIC interrupt handlers into .irqentry.text · 469f0023
      Alexander Potapenko 提交于
      Dmitry Vyukov has reported unexpected KASAN stackdepot growth:
      
        https://github.com/google/kasan/issues/36
      
      ... which is caused by the APIC handlers not being present in .irqentry.text:
      
      When building with CONFIG_FUNCTION_GRAPH_TRACER=y or CONFIG_KASAN=y, put the
      APIC interrupt handlers into the .irqentry.text section. This is needed
      because both KASAN and function graph tracer use __irqentry_text_start and
      __irqentry_text_end to determine whether a function is an IRQ entry point.
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NAlexander Potapenko <glider@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: aryabinin@virtuozzo.com
      Cc: kasan-dev@googlegroups.com
      Cc: kcc@google.com
      Cc: rostedt@goodmis.org
      Link: http://lkml.kernel.org/r/1468575763-144889-1-git-send-email-glider@google.com
      [ Minor edits. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      469f0023
  2. 09 8月, 2016 2 次提交
    • L
      unsafe_[get|put]_user: change interface to use a error target label · 1bd4403d
      Linus Torvalds 提交于
      When I initially added the unsafe_[get|put]_user() helpers in commit
      5b24a7a2 ("Add 'unsafe' user access functions for batched
      accesses"), I made the mistake of modeling the interface on our
      traditional __[get|put]_user() functions, which return zero on success,
      or -EFAULT on failure.
      
      That interface is fairly easy to use, but it's actually fairly nasty for
      good code generation, since it essentially forces the caller to check
      the error value for each access.
      
      In particular, since the error handling is already internally
      implemented with an exception handler, and we already use "asm goto" for
      various other things, we could fairly easily make the error cases just
      jump directly to an error label instead, and avoid the need for explicit
      checking after each operation.
      
      So switch the interface to pass in an error label, rather than checking
      the error value in the caller.  Best do it now before we start growing
      more users (the signal handling code in particular would be a good place
      to use the new interface).
      
      So rather than
      
      	if (unsafe_get_user(x, ptr))
      		... handle error ..
      
      the interface is now
      
      	unsafe_get_user(x, ptr, label);
      
      where an error during the user mode fetch will now just cause a jump to
      'label' in the caller.
      
      Right now the actual _implementation_ of this all still ends up being a
      "if (err) goto label", and does not take advantage of any exception
      label tricks, but for "unsafe_put_user()" in particular it should be
      fairly straightforward to convert to using the exception table model.
      
      Note that "unsafe_get_user()" is much harder to convert to a clever
      exception table model, because current versions of gcc do not allow the
      use of "asm goto" (for the exception) with output values (for the actual
      value to be fetched).  But that is hopefully not a limitation in the
      long term.
      
      [ Also note that it might be a good idea to switch unsafe_get_user() to
        actually _return_ the value it fetches from user space, but this
        commit only changes the error handling semantics ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1bd4403d
    • V
      x86/hweight: Don't clobber %rdi · 65ea11ec
      Ville Syrjälä 提交于
      The caller expects %rdi to remain intact, push+pop it make that happen.
      
      Fixes the following kind of explosions on my core2duo machine when
      trying to reboot or shut down:
      
        general protection fault: 0000 [#1] PREEMPT SMP
        Modules linked in: i915 i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea drm netconsole configfs binfmt_misc iTCO_wdt psmouse pcspkr snd_hda_codec_idt e100 coretemp hwmon snd_hda_codec_generic i2c_i801 mii i2c_smbus lpc_ich mfd_core snd_hda_intel uhci_hcd snd_hda_codec snd_hwdep snd_hda_core ehci_pci 8250 ehci_hcd snd_pcm 8250_base usbcore evdev serial_core usb_common parport_pc parport snd_timer snd soundcore
        CPU: 0 PID: 3070 Comm: reboot Not tainted 4.8.0-rc1-perf-dirty #69
        Hardware name:                  /D946GZIS, BIOS TS94610J.86A.0087.2007.1107.1049 11/07/2007
        task: ffff88012a0b4080 task.stack: ffff880123850000
        RIP: 0010:[<ffffffff81003c92>]  [<ffffffff81003c92>] x86_perf_event_update+0x52/0xc0
        RSP: 0018:ffff880123853b60  EFLAGS: 00010087
        RAX: 0000000000000001 RBX: ffff88012fc0a3c0 RCX: 000000000000001e
        RDX: 0000000000000000 RSI: 0000000040000000 RDI: ffff88012b014800
        RBP: ffff880123853b88 R08: ffffffffffffffff R09: 0000000000000000
        R10: ffffea0004a012c0 R11: ffffea0004acedc0 R12: ffffffff80000001
        R13: ffff88012b0149c0 R14: ffff88012b014800 R15: 0000000000000018
        FS:  00007f8b155cd700(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007f8b155f5000 CR3: 000000012a2d7000 CR4: 00000000000006f0
        Stack:
         ffff88012fc0a3c0 ffff88012b014800 0000000000000004 0000000000000001
         ffff88012fc1b750 ffff880123853bb0 ffffffff81003d59 ffff88012b014800
         ffff88012fc0a3c0 ffff88012b014800 ffff880123853bd8 ffffffff81003e13
        Call Trace:
         [<ffffffff81003d59>] x86_pmu_stop+0x59/0xd0
         [<ffffffff81003e13>] x86_pmu_del+0x43/0x140
         [<ffffffff8111705d>] event_sched_out.isra.105+0xbd/0x260
         [<ffffffff8111738d>] __perf_remove_from_context+0x2d/0xb0
         [<ffffffff8111745d>] __perf_event_exit_context+0x4d/0x70
         [<ffffffff810c8826>] generic_exec_single+0xb6/0x140
         [<ffffffff81117410>] ? __perf_remove_from_context+0xb0/0xb0
         [<ffffffff81117410>] ? __perf_remove_from_context+0xb0/0xb0
         [<ffffffff810c898f>] smp_call_function_single+0xdf/0x140
         [<ffffffff81113d27>] perf_event_exit_cpu_context+0x87/0xc0
         [<ffffffff81113d73>] perf_reboot+0x13/0x40
         [<ffffffff8107578a>] notifier_call_chain+0x4a/0x70
         [<ffffffff81075ad7>] __blocking_notifier_call_chain+0x47/0x60
         [<ffffffff81075b06>] blocking_notifier_call_chain+0x16/0x20
         [<ffffffff81076a1d>] kernel_restart_prepare+0x1d/0x40
         [<ffffffff81076ae2>] kernel_restart+0x12/0x60
         [<ffffffff81076d56>] SYSC_reboot+0xf6/0x1b0
         [<ffffffff811a823c>] ? mntput_no_expire+0x2c/0x1b0
         [<ffffffff811a83e4>] ? mntput+0x24/0x40
         [<ffffffff811894fc>] ? __fput+0x16c/0x1e0
         [<ffffffff811895ae>] ? ____fput+0xe/0x10
         [<ffffffff81072fc3>] ? task_work_run+0x83/0xa0
         [<ffffffff81001623>] ? exit_to_usermode_loop+0x53/0xc0
         [<ffffffff8100105a>] ? trace_hardirqs_on_thunk+0x1a/0x1c
         [<ffffffff81076e6e>] SyS_reboot+0xe/0x10
         [<ffffffff814c4ba5>] entry_SYSCALL_64_fastpath+0x18/0xa3
        Code: 7c 4c 8d af c0 01 00 00 49 89 fe eb 10 48 09 c2 4c 89 e0 49 0f b1 55 00 4c 39 e0 74 35 4d 8b a6 c0 01 00 00 41 8b 8e 60 01 00 00 <0f> 33 8b 35 6e 02 8c 00 48 c1 e2 20 85 f6 7e d2 48 89 d3 89 cf
        RIP  [<ffffffff81003c92>] x86_perf_event_update+0x52/0xc0
         RSP <ffff880123853b60>
        ---[ end trace 7ec95181faf211be ]---
        note: reboot[3070] exited with preempt_count 2
      
      Cc: Borislav Petkov <bp@suse.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Fixes: f5967101 ("x86/hweight: Get rid of the special calling convention")
      Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      65ea11ec
  3. 05 8月, 2016 14 次提交
  4. 04 8月, 2016 21 次提交
    • J
      arm: jump label may reference text in __exit · ddb45306
      Jason Baron 提交于
      The jump table can reference text found in an __exit section.  Thus,
      instead of discarding it at build time, include EXIT_TEXT as part of
      __init and it will be released when the system boots.
      
      Link: http://lkml.kernel.org/r/60284113bb759121e8ae3e99af1535647e52123f.1467837322.git.jbaron@akamai.comSigned-off-by: NJason Baron <jbaron@akamai.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ddb45306
    • C
      tile: support static_key usage in non-module __exit sections · c14b4bcf
      Chris Metcalf 提交于
      Previously, all the __exit sections were just dropped by the link phase.
      However, if there are static_key (jump label) constructs in __exit
      sections that are not modules, the link fails with the message:
      
         `.exit.text' referenced in section `__jump_table' of xxx.o:
         defined in discarded section `.exit.text' of xxx.o
      
      Support this usage by keeping the .exit.text sections in the final image
      if JUMP_LABEL is defined, then discarding them once initialization is
      complete.
      
      Link: http://lkml.kernel.org/r/bfd7c107c610c30e992868ebfe2a5d796a097464.1467837322.git.jbaron@akamai.comSigned-off-by: NJason Baron <jbaron@akamai.com>
      Signed-off-by: NChris Metcalf <cmetcalf@mellanox.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c14b4bcf
    • J
      sparc: support static_key usage in non-module __exit sections · 10d7227b
      Jason Baron 提交于
      The jump table can reference text found in an __exit section.  Thus,
      instead of discarding it at build/link time, include EXIT_TEXT as part
      of __init and release it at system boot time.
      
      Without this patch the link fails with:
      
          `.exit.text' referenced in section `__jump_table' of xxx.o:
          defined in discarded section `.exit.text' of xxx.o
      
      Link: http://lkml.kernel.org/r/d822da427ab07a02a394602eca687104ff682f83.1467837322.git.jbaron@akamai.comSigned-off-by: NJason Baron <jbaron@akamai.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      10d7227b
    • J
      powerpc: add explicit #include <asm/asm-compat.h> for jump label · 5411fd7f
      Jason Baron 提交于
      The stringify_in_c() macro may not be included. Make the dependency
      explicit.
      
      Link: http://lkml.kernel.org/r/564720c5328edd53c9d56db325be7215440eec3e.1467837322.git.jbaron@akamai.comSigned-off-by: NJason Baron <jbaron@akamai.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Joe Perches <joe@perches.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5411fd7f
    • K
      dma-mapping: use unsigned long for dma_attrs · 00085f1e
      Krzysztof Kozlowski 提交于
      The dma-mapping core and the implementations do not change the DMA
      attributes passed by pointer.  Thus the pointer can point to const data.
      However the attributes do not have to be a bitfield.  Instead unsigned
      long will do fine:
      
      1. This is just simpler.  Both in terms of reading the code and setting
         attributes.  Instead of initializing local attributes on the stack
         and passing pointer to it to dma_set_attr(), just set the bits.
      
      2. It brings safeness and checking for const correctness because the
         attributes are passed by value.
      
      Semantic patches for this change (at least most of them):
      
          virtual patch
          virtual context
      
          @r@
          identifier f, attrs;
      
          @@
          f(...,
          - struct dma_attrs *attrs
          + unsigned long attrs
          , ...)
          {
          ...
          }
      
          @@
          identifier r.f;
          @@
          f(...,
          - NULL
          + 0
           )
      
      and
      
          // Options: --all-includes
          virtual patch
          virtual context
      
          @r@
          identifier f, attrs;
          type t;
      
          @@
          t f(..., struct dma_attrs *attrs);
      
          @@
          identifier r.f;
          @@
          f(...,
          - NULL
          + 0
           )
      
      Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.comSigned-off-by: NKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Acked-by: NVineet Gupta <vgupta@synopsys.com>
      Acked-by: NRobin Murphy <robin.murphy@arm.com>
      Acked-by: NHans-Christian Noren Egtvedt <egtvedt@samfundet.no>
      Acked-by: Mark Salter <msalter@redhat.com> [c6x]
      Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
      Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
      Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
      Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
      Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
      Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
      Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
      Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
      Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
      Acked-by: NBjorn Andersson <bjorn.andersson@linaro.org>
      Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
      Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
      Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      00085f1e
    • M
      tree-wide: replace config_enabled() with IS_ENABLED() · 97f2645f
      Masahiro Yamada 提交于
      The use of config_enabled() against config options is ambiguous.  In
      practical terms, config_enabled() is equivalent to IS_BUILTIN(), but the
      author might have used it for the meaning of IS_ENABLED().  Using
      IS_ENABLED(), IS_BUILTIN(), IS_MODULE() etc.  makes the intention
      clearer.
      
      This commit replaces config_enabled() with IS_ENABLED() where possible.
      This commit is only touching bool config options.
      
      I noticed two cases where config_enabled() is used against a tristate
      option:
      
       - config_enabled(CONFIG_HWMON)
        [ drivers/net/wireless/ath/ath10k/thermal.c ]
      
       - config_enabled(CONFIG_BACKLIGHT_CLASS_DEVICE)
        [ drivers/gpu/drm/gma500/opregion.c ]
      
      I did not touch them because they should be converted to IS_BUILTIN()
      in order to keep the logic, but I was not sure it was the authors'
      intention.
      
      Link: http://lkml.kernel.org/r/1465215656-20569-1-git-send-email-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Acked-by: NKees Cook <keescook@chromium.org>
      Cc: Stas Sergeev <stsp@list.ru>
      Cc: Matt Redfearn <matt.redfearn@imgtec.com>
      Cc: Joshua Kinard <kumba@gentoo.org>
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: "Dmitry V. Levin" <ldv@altlinux.org>
      Cc: yu-cheng yu <yu-cheng.yu@intel.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Johannes Berg <johannes@sipsolutions.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Will Drewry <wad@chromium.org>
      Cc: Nikolay Martynov <mar.kolya@gmail.com>
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
      Cc: Rafal Milecki <zajec5@gmail.com>
      Cc: James Cowgill <James.Cowgill@imgtec.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Alex Smith <alex.smith@imgtec.com>
      Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
      Cc: Qais Yousef <qais.yousef@imgtec.com>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Mikko Rapeli <mikko.rapeli@iki.fi>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Brian Norris <computersforpeace@gmail.com>
      Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: "Luis R. Rodriguez" <mcgrof@do-not-panic.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Roland McGrath <roland@hack.frob.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Kalle Valo <kvalo@qca.qualcomm.com>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Tony Wu <tung7970@gmail.com>
      Cc: Huaitong Han <huaitong.han@intel.com>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrea Gelmini <andrea.gelmini@gelma.net>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Rabin Vincent <rabin@rab.in>
      Cc: "Maciej W. Rozycki" <macro@imgtec.com>
      Cc: David Daney <david.daney@cavium.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      97f2645f
    • S
      arm64: Fix copy-on-write referencing in HugeTLB · 747a70e6
      Steve Capper 提交于
      set_pte_at(.) will set or unset the PTE_RDONLY hardware bit before
      writing the entry to the table.
      
      This can cause problems with the copy-on-write logic in hugetlb_cow:
       *) hugetlb_cow(.) called to handle a write fault on read only pte,
       *) Before the copy-on-write updates the new page table a call is
          made to pte_same(huge_ptep_get(ptep), pte)), to check for a race,
       *) Because set_pte_at(.) changed the pte, *ptep != pte, and the
          hugetlb_cow(.) code erroneously assumes that it lost the race,
       *) The new page is subsequently freed without being used.
      
      On arm64 this problem only becomes apparent when we apply:
      67961f9d mm/hugetlb: fix huge page reserve accounting for private
      mappings
      
      When one runs the libhugetlbfs test suite, there are allocation errors
      and hugetlbfs pages become erroneously locked in memory as reserved.
      (There is a high HugePages_Rsvd: count).
      
      In this patch we introduce pte_same which ignores the PTE_RDONLY bit,
      allowing for the libhugetlbfs test suite to pass as expected and
      without leaking any reserved HugeTLB pages.
      Reported-by: NHuang Shijie <shijie.huang@arm.com>
      Signed-off-by: NSteve Capper <steve.capper@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      747a70e6
    • B
      nvmx: mark ept single context invalidation as supported · 45e11817
      Bandan Das 提交于
      Commit 4b855078 ("KVM: nVMX: Don't advertise single
      context invalidation for invept") removed advertising
      single context invalidation since the spec does not mandate it.
      However, some hypervisors (such as ESX) require it to be present
      before willing to use ept in a nested environment. Advertise it
      and fallback to the global case.
      Signed-off-by: NBandan Das <bsd@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      45e11817
    • B
      nvmx: remove comment about missing nested vpid support · 03331b4b
      Bandan Das 提交于
      Nested vpid is already supported and both single/global
      modes are advertised to the guest
      Signed-off-by: NBandan Das <bsd@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      03331b4b
    • W
      KVM: lapic: fix access preemption timer stuff even if kernel_irqchip=off · 91005300
      Wanpeng Li 提交于
      BUG: unable to handle kernel NULL pointer dereference at 000000000000008c
      IP: [<ffffffffc04e0180>] kvm_lapic_hv_timer_in_use+0x10/0x20 [kvm]
      PGD 0
      Oops: 0000 [#1] SMP
      Call Trace:
       kvm_arch_vcpu_load+0x86/0x260 [kvm]
       vcpu_load+0x46/0x60 [kvm]
       kvm_vcpu_ioctl+0x79/0x7c0 [kvm]
       ? __lock_is_held+0x54/0x70
       do_vfs_ioctl+0x96/0x6a0
       ? __fget_light+0x2a/0x90
       SyS_ioctl+0x79/0x90
       do_syscall_64+0x7c/0x1e0
       entry_SYSCALL64_slow_path+0x25/0x25
      RIP  [<ffffffffc04e0180>] kvm_lapic_hv_timer_in_use+0x10/0x20 [kvm]
       RSP <ffff8800db1f3d70>
      CR2: 000000000000008c
      ---[ end trace a55fb79d2b3b4ee8 ]---
      
      This can be reproduced steadily by kernel_irqchip=off.
      
      We should not access preemption timer stuff if lapic is emulated in userspace.
      This patch fix it by avoiding access preemption timer stuff when kernel_irqchip=off.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Yunhong Jiang <yunhong.jiang@intel.com>
      Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      91005300
    • P
      x86: vdso: use __pvclock_read_cycles · abe9efa7
      Paolo Bonzini 提交于
      The new simplified __pvclock_read_cycles does the same computation
      as vread_pvclock, except that (because it takes the pvclock_vcpu_time_info
      pointer) it has to be moved inside the loop.  Since the loop is expected to
      never roll, this makes no difference.
      Acked-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      abe9efa7
    • P
      pvclock: introduce seqcount-like API · 3aed64f6
      Paolo Bonzini 提交于
      The version field in struct pvclock_vcpu_time_info basically implements
      a seqcount.  Wrap it with the usual read_begin and read_retry functions,
      and use these APIs instead of peppering the code with smp_rmb()s.
      While at it, change it to the more pedantically correct virt_rmb().
      
      With this change, __pvclock_read_cycles can be simplified noticeably.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3aed64f6
    • M
      powerpc/mm: Move register_process_table() out of ppc_md · eea8148c
      Michael Ellerman 提交于
      We want to initialise register_process_table() before ppc_md is setup,
      so that it can be called as part of MMU init (at least on Radix ATM).
      
      That no longer works because probe_machine() requires that ppc_md be
      empty before it's called, and we now do probe_machine() much later.
      
      So make register_process_table a global for now. It will probably move
      into a mmu_radix_ops struct at some point in the future.
      
      This was broken by me when applying commit 7025776e "powerpc/mm:
      Move hash table ops to a separate structure" due to conflicts with other
      patches.
      
      Fixes: 7025776e ("powerpc/mm: Move hash table ops to a separate structure")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      eea8148c
    • M
      powerpc/perf: Fix incorrect event codes in power9-event-list · 1a058f16
      Madhavan Srinivasan 提交于
      These have been changed in the hardware, update Linux's version.
      Signed-off-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      1a058f16
    • V
      um: Support kcov · 915eed20
      Vegard Nossum 提交于
      This adds support for kcov to UML.
      
      There is a small problem where UML will randomly segfault during boot;
      this is because current_thread_info() occasionally returns an invalid
      (non-NULL) pointer and we try to dereference it in
      __sanitizer_cov_trace_pc(). I consider this a bug in UML itself and this
      patch merely exposes it.
      
      [v2: disable instrumentation in UML-specific code]
      
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Thomas Meyer <thomas@m3y3r.de>
      Cc: user-mode-linux-devel <user-mode-linux-devel@lists.sourceforge.net>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: NRichard Weinberger <richard@nod.at>
      915eed20
    • R
      um: Enable TRACE_IRQFLAGS_SUPPORT · 8e99bc70
      Richard Weinberger 提交于
      Now we have everything we need, so enable
      TRACE_IRQFLAGS_SUPPORT.
      Signed-off-by: NRichard Weinberger <richard@nod.at>
      8e99bc70
    • D
      um: Use asm-generic/irqflags.h · 3e938957
      Daniel Wagner 提交于
      Instead proving its own arch_local_irq_save() and arch_irqs_disabled()
      version use the generic version from asm-generic/irqflags.h.
      
      A nice side effect is that um gets a few additional arch_ functions
      as well.
      Signed-off-by: NDaniel Wagner <daniel.wagner@bmw-carit.de>
      [rw: Massaged commit message]
      Signed-off-by: NRichard Weinberger <richard@nod.at>
      3e938957
    • R
      um: Fix possible deadlock in sig_handler_common() · 57a05d83
      Richard Weinberger 提交于
      We are in atomic context and must not sleep.
      Sleeping here is possible since malloc() maps
      to kmalloc() with GFP_KERNEL.
      
      Cc: stable@vger.kernel.org
      Fixes: b6024b21 ("um: extend fpstate to _xstate to support YMM registers")
      Signed-off-by: NRichard Weinberger <richard@nod.at>
      57a05d83
    • R
      um: Select HAVE_DEBUG_KMEMLEAK · 5609a3d3
      Richard Weinberger 提交于
      Now we have the infrastructure to support kmemleak.
      Enable the HAVE flag.
      Signed-off-by: NRichard Weinberger <richard@nod.at>
      5609a3d3
    • R
      um: Setup physical memory in setup_arch() · b6323697
      Richard Weinberger 提交于
      Currently UML sets up physical memory very early,
      long before setup_arch() was called by the kernel main
      function.
      This can cause problems when code paths in UML's memory setup
      code assume that the kernel is already running.
      i.e. when kmemleak is enabled it will evaluate current()
      in free_bootmem(). That early current() is undefined and
      UML explodes.
      
      Solve the problem by setting up physical memory in setup_arch(),
      at this stage the kernel has materialized and basic infrastructure
      such as current() works.
      Signed-off-by: NRichard Weinberger <richard@nod.at>
      b6323697
    • A
      um: Eliminate null test after alloc_bootmem · fed4c726
      Amitoj Kaur Chawla 提交于
      alloc_bootmem function never returns NULL. Thus a NULL test after a
      call to this function is unnecessary.
      
      The Coccinelle semantic patch used to make this change is follows:
      @@
      expression E;
      statement S;
      @@
      
      E =
      alloc_bootmem(...)
      ... when != E
      - if (E == NULL) S
      Signed-off-by: NAmitoj Kaur Chawla <amitoj1606@gmail.com>
      Signed-off-by: NRichard Weinberger <richard@nod.at>
      fed4c726