1. 19 11月, 2015 2 次提交
  2. 14 11月, 2015 1 次提交
  3. 12 11月, 2015 5 次提交
    • H
      perf/x86/intel/rapl: Remove the unused RAPL_EVENT_DESC() macro · 41ac18eb
      Huang Rui 提交于
      Signed-off-by: NHuang Rui <ray.huang@amd.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Li <tony.li@amd.com>
      Link: http://lkml.kernel.org/r/1446630233-3166-1-git-send-email-ray.huang@amd.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      41ac18eb
    • H
      x86/fpu: Fix get_xsave_addr() behavior under virtualization · a05917b6
      Huaitong Han 提交于
      KVM uses the get_xsave_addr() function in a different fashion from
      the native kernel, in that the 'xsave' parameter belongs to guest vcpu,
      not the currently running task.
      
      But 'xsave' is replaced with current task's (host) xsave structure, so
      get_xsave_addr() will incorrectly return the bad xsave address to KVM.
      
      Fix it so that the passed in 'xsave' address is used - as intended
      originally.
      Signed-off-by: NHuaitong Han <huaitong.han@intel.com>
      Reviewed-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dave.hansen@intel.com
      Link: http://lkml.kernel.org/r/1446800423-21622-1-git-send-email-huaitong.han@intel.com
      [ Tidied up the changelog. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      a05917b6
    • D
      x86/fpu: Fix 32-bit signal frame handling · ab6b5294
      Dave Hansen 提交于
      (This should have gone to LKML originally. Sorry for the extra
       noise, folks on the cc.)
      
      Background:
      
      Signal frames on x86 have two formats:
      
        1. For 32-bit executables (whether on a real 32-bit kernel or
           under 32-bit emulation on a 64-bit kernel) we have a
          'fpregset_t' that includes the "FSAVE" registers.
      
        2. For 64-bit executables (on 64-bit kernels obviously), the
           'fpregset_t' is smaller and does not contain the "FSAVE"
           state.
      
      When creating the signal frame, we have to be aware of whether
      we are running a 32 or 64-bit executable so we create the
      correct format signal frame.
      
      Problem:
      
      save_xstate_epilog() uses 'fx_sw_reserved_ia32' whenever it is
      called for a 32-bit executable.  This is for real 32-bit and
      ia32 emulation.
      
      But, fpu__init_prepare_fx_sw_frame() only initializes
      'fx_sw_reserved_ia32' when emulation is enabled, *NOT* for real
      32-bit kernels.
      
      This leads to really wierd situations where 32-bit programs
      lose their extended state when returning from a signal handler.
      The kernel copies the uninitialized (zero) 'fx_sw_reserved_ia32'
      out to userspace in save_xstate_epilog().  But when returning
      from the signal, the kernel errors out in check_for_xstate()
      when it does not see FP_XSTATE_MAGIC1 present (because it was
      zeroed).  This leads to the FPU/XSAVE state being initialized.
      
      For MPX, this leads to the most permissive state and means we
      silently lose bounds violations.  I think this would also mean
      that we could lose *ANY* FPU/SSE/AVX state.  I'm not sure why
      no one has spotted this bug.
      
      I believe this was broken by:
      
      	72a671ce ("x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels")
      
      way back in 2012.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dave@sr71.net
      Cc: fenghua.yu@intel.com
      Cc: yu-cheng.yu@intel.com
      Link: http://lkml.kernel.org/r/20151111002354.A0799571@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ab6b5294
    • D
      x86/mpx: Fix 32-bit address space calculation · f3119b83
      Dave Hansen 提交于
      I received a bug report that running 32-bit MPX binaries on
      64-bit kernels was broken.  I traced it down to this little code
      snippet.  We were switching our "number of bounds directory
      entries" calculation correctly.  But, we didn't switch the other
      side of the calculation: the virtual space size.
      
      This meant that we were calculating an absurd size for
      bd_entry_virt_space() on 32-bit because we used the 64-bit
      virt_space.
      
      This was _also_ broken for 32-bit kernels running on 64-bit
      hardware since boot_cpu_data.x86_virt_bits=48 even when running
      in 32-bit mode.
      
      Correct that and properly handle all 3 possible cases:
      
       1. 32-bit binary on 64-bit kernel
       2. 64-bit binary on 64-bit kernel
       3. 32-bit binary on 32-bit kernel
      
      This manifested in having bounds tables not properly unmapped.
      It "leaked" memory but had no functional impact otherwise.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20151111181934.FA7FAC34@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f3119b83
    • D
      x86/mpx: Do proper get_user() when running 32-bit binaries on 64-bit kernels · 46561c39
      Dave Hansen 提交于
      When you call get_user(foo, bar), you effectively do a
      
      	copy_from_user(&foo, bar, sizeof(*bar));
      
      Note that the sizeof() is implicit.
      
      When we reach out to userspace to try to zap an entire "bounds
      table" we need to go read a "bounds directory entry" in order to
      locate the table's address.  The size of a "directory entry"
      depends on the binary being run and is always the size of a
      pointer.
      
      But, when we have a 64-bit kernel and a 32-bit application, the
      directory entry is still only 32-bits long, but we fetch it with
      a 64-bit pointer which makes get_user() does a 64-bit fetch.
      Reading 4 extra bytes isn't harmful, unless we are at the end of
      and run off the table.  It might also cause the zero page to get
      faulted in unnecessarily even if you are not at the end.
      
      Fix it up by doing a special 32-bit get_user() via a cast when
      we have 32-bit userspace.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20151111181931.3ACF6822@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      46561c39
  4. 10 11月, 2015 18 次提交
  5. 07 11月, 2015 12 次提交
  6. 06 11月, 2015 2 次提交
    • J
      livepatch: Fix crash with !CONFIG_DEBUG_SET_MODULE_RONX · e2391a2d
      Josh Poimboeuf 提交于
      When loading a patch module on a kernel with
      !CONFIG_DEBUG_SET_MODULE_RONX, the following crash occurs:
      
        [  205.988776] livepatch: enabling patch 'kpatch_meminfo_string'
        [  205.989829] BUG: unable to handle kernel paging request at ffffffffa08d2fc0
        [  205.989863] IP: [<ffffffff8154fecb>] do_init_module+0x8c/0x1ba
        [  205.989888] PGD 1a10067 PUD 1a11063 PMD 7bcde067 PTE 3740e161
        [  205.989915] Oops: 0003 [#1] SMP
        [  205.990187] CPU: 2 PID: 14570 Comm: insmod Tainted: G           O  K 4.1.12
        [  205.990214] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
        [  205.990249] task: ffff8800374aaa90 ti: ffff8800794b8000 task.ti: ffff8800794b8000
        [  205.990276] RIP: 0010:[<ffffffff8154fecb>]  [<ffffffff8154fecb>] do_init_module+0x8c/0x1ba
        [  205.990307] RSP: 0018:ffff8800794bbd58  EFLAGS: 00010246
        [  205.990327] RAX: 0000000000000000 RBX: ffffffffa08d2fc0 RCX: 0000000000000000
        [  205.990356] RDX: 01ffff8000000080 RSI: 0000000000000000 RDI: ffffffff81a54b40
        [  205.990382] RBP: ffff88007b4c4d80 R08: 0000000000000007 R09: 0000000000000000
        [  205.990408] R10: 0000000000000008 R11: ffffea0001f18840 R12: 0000000000000000
        [  205.990433] R13: 0000000000000001 R14: ffffffffa08d2fc0 R15: ffff88007bd0bc40
        [  205.990459] FS:  00007f1128fbc700(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
        [  205.990488] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [  205.990509] CR2: ffffffffa08d2fc0 CR3: 000000002606e000 CR4: 00000000001406e0
        [  205.990536] Stack:
        [  205.990545]  ffff8800794bbec8 0000000000000001 ffffffffa08d3010 ffffffff810ecea9
        [  205.990576]  ffffffff810e8e40 000000000005f360 ffff88007bd0bc50 ffffffffa08d3240
        [  205.990608]  ffffffffa08d52c0 ffffffffa08d3210 ffff8800794bbed8 ffff8800794bbf1c
        [  205.990639] Call Trace:
        [  205.990651]  [<ffffffff810ecea9>] ? load_module+0x1e59/0x23a0
        [  205.990672]  [<ffffffff810e8e40>] ? store_uevent+0x40/0x40
        [  205.990693]  [<ffffffff810e99b5>] ? copy_module_from_fd.isra.49+0xb5/0x140
        [  205.990718]  [<ffffffff810ed5bd>] ? SyS_finit_module+0x7d/0xa0
        [  205.990741]  [<ffffffff81556832>] ? system_call_fastpath+0x16/0x75
        [  205.990763] Code: f9 00 00 00 74 23 49 c7 c0 92 e1 60 81 48 8d 53 18 89 c1 4c 89 c6 48 c7 c7 f0 85 7d 81 31 c0 e8 71 fa ff ff e8 58 0e 00 00 31 f6 <c7> 03 00 00 00 00 48 89 da 48 c7 c7 20 c7 a5 81 e8 d0 ec b3 ff
        [  205.990916] RIP  [<ffffffff8154fecb>] do_init_module+0x8c/0x1ba
        [  205.990940]  RSP <ffff8800794bbd58>
        [  205.990953] CR2: ffffffffa08d2fc0
      
      With !CONFIG_DEBUG_SET_MODULE_RONX, module text and rodata pages are
      writable, and the debug_align() macro allows the module struct to share
      a page with executable text.  When klp_write_module_reloc() calls
      set_memory_ro() on the page, it effectively turns the module struct into
      a read-only structure, resulting in a page fault when load_module() does
      "mod->state = MODULE_STATE_LIVE".
      Reported-by: NCyril B. <cbay@alwaysdata.com>
      Tested-by: NCyril B. <cbay@alwaysdata.com>
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: NJiri Kosina <jkosina@suse.cz>
      e2391a2d
    • E
      mm: mlock: add new mlock system call · a8ca5d0e
      Eric B Munson 提交于
      With the refactored mlock code, introduce a new system call for mlock.
      The new call will allow the user to specify what lock states are being
      added.  mlock2 is trivial at the moment, but a follow on patch will add a
      new mlock state making it useful.
      Signed-off-by: NEric B Munson <emunson@akamai.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a8ca5d0e