1. 20 7月, 2018 1 次提交
    • J
      x86/ldt: Define LDT_END_ADDR · 8195d869
      Joerg Roedel 提交于
      It marks the end of the address-space range reserved for the LDT. The
      LDT-code will use it when unmapping the LDT for user-space.
      Signed-off-by: NJoerg Roedel <jroedel@suse.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NPavel Machek <pavel@ucw.cz>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Cc: linux-mm@kvack.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Waiman Long <llong@redhat.com>
      Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca>
      Cc: joro@8bytes.org
      Link: https://lkml.kernel.org/r/1531906876-13451-35-git-send-email-joro@8bytes.org
      8195d869
  2. 17 4月, 2018 1 次提交
  3. 12 4月, 2018 1 次提交
    • D
      x86/mm: Do not auto-massage page protections · fb43d6cb
      Dave Hansen 提交于
      A PTE is constructed from a physical address and a pgprotval_t.
      __PAGE_KERNEL, for instance, is a pgprot_t and must be converted
      into a pgprotval_t before it can be used to create a PTE.  This is
      done implicitly within functions like pfn_pte() by massage_pgprot().
      
      However, this makes it very challenging to set bits (and keep them
      set) if your bit is being filtered out by massage_pgprot().
      
      This moves the bit filtering out of pfn_pte() and friends.  For
      users of PAGE_KERNEL*, filtering will be done automatically inside
      those macros but for users of __PAGE_KERNEL*, they need to do their
      own filtering now.
      
      Note that we also just move pfn_pte/pmd/pud() over to check_pgprot()
      instead of massage_pgprot().  This way, we still *look* for
      unsupported bits and properly warn about them if we find them.  This
      might happen if an unfiltered __PAGE_KERNEL* value was passed in,
      for instance.
      
      - printk format warning fix from: Arnd Bergmann <arnd@arndb.de>
      - boot crash fix from:            Tom Lendacky <thomas.lendacky@amd.com>
      - crash bisected by:              Mike Galbraith <efault@gmx.de>
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Reported-and-fixed-by: NArnd Bergmann <arnd@arndb.de>
      Fixed-by: NTom Lendacky <thomas.lendacky@amd.com>
      Bisected-by: NMike Galbraith <efault@gmx.de>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20180406205509.77E1D7F6@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      fb43d6cb
  4. 31 12月, 2017 2 次提交
  5. 24 12月, 2017 2 次提交
    • T
      x86/ldt: Make the LDT mapping RO · 9f5cb6b3
      Thomas Gleixner 提交于
      Now that the LDT mapping is in a known area when PAGE_TABLE_ISOLATION is
      enabled its a primary target for attacks, if a user space interface fails
      to validate a write address correctly. That can never happen, right?
      
      The SDM states:
      
          If the segment descriptors in the GDT or an LDT are placed in ROM, the
          processor can enter an indefinite loop if software or the processor
          attempts to update (write to) the ROM-based segment descriptors. To
          prevent this problem, set the accessed bits for all segment descriptors
          placed in a ROM. Also, remove operating-system or executive code that
          attempts to modify segment descriptors located in ROM.
      
      So its a valid approach to set the ACCESS bit when setting up the LDT entry
      and to map the table RO. Fixup the selftest so it can handle that new mode.
      
      Remove the manual ACCESS bit setter in set_tls_desc() as this is now
      pointless. Folded the patch from Peter Ziljstra.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      9f5cb6b3
    • A
      x86/pti: Put the LDT in its own PGD if PTI is on · f55f0501
      Andy Lutomirski 提交于
      With PTI enabled, the LDT must be mapped in the usermode tables somewhere.
      The LDT is per process, i.e. per mm.
      
      An earlier approach mapped the LDT on context switch into a fixmap area,
      but that's a big overhead and exhausted the fixmap space when NR_CPUS got
      big.
      
      Take advantage of the fact that there is an address space hole which
      provides a completely unused pgd. Use this pgd to manage per-mm LDT
      mappings.
      
      This has a down side: the LDT isn't (currently) randomized, and an attack
      that can write the LDT is instant root due to call gates (thanks, AMD, for
      leaving call gates in AMD64 but designing them wrong so they're only useful
      for exploits).  This can be mitigated by making the LDT read-only or
      randomizing the mapping, either of which is strightforward on top of this
      patch.
      
      This will significantly slow down LDT users, but that shouldn't matter for
      important workloads -- the LDT is only used by DOSEMU(2), Wine, and very
      old libc implementations.
      
      [ tglx: Cleaned it up. ]
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      f55f0501
  6. 23 12月, 2017 2 次提交
    • T
      x86/ldt: Prevent LDT inheritance on exec · a4828f81
      Thomas Gleixner 提交于
      The LDT is inherited across fork() or exec(), but that makes no sense
      at all because exec() is supposed to start the process clean.
      
      The reason why this happens is that init_new_context_ldt() is called from
      init_new_context() which obviously needs to be called for both fork() and
      exec().
      
      It would be surprising if anything relies on that behaviour, so it seems to
      be safe to remove that misfeature.
      
      Split the context initialization into two parts. Clear the LDT pointer and
      initialize the mutex from the general context init and move the LDT
      duplication to arch_dup_mmap() which is only called on fork().
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirsky <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: dan.j.williams@intel.com
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: kirill.shutemov@linux.intel.com
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      a4828f81
    • P
      x86/ldt: Rework locking · c2b3496b
      Peter Zijlstra 提交于
      The LDT is duplicated on fork() and on exec(), which is wrong as exec()
      should start from a clean state, i.e. without LDT. To fix this the LDT
      duplication code will be moved into arch_dup_mmap() which is only called
      for fork().
      
      This introduces a locking problem. arch_dup_mmap() holds mmap_sem of the
      parent process, but the LDT duplication code needs to acquire
      mm->context.lock to access the LDT data safely, which is the reverse lock
      order of write_ldt() where mmap_sem nests into context.lock.
      
      Solve this by introducing a new rw semaphore which serializes the
      read/write_ldt() syscall operations and use context.lock to protect the
      actual installment of the LDT descriptor.
      
      So context.lock stabilizes mm->context.ldt and can nest inside of the new
      semaphore or mmap_sem.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirsky <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: dan.j.williams@intel.com
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: kirill.shutemov@linux.intel.com
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      c2b3496b
  7. 17 12月, 2017 1 次提交
  8. 02 11月, 2017 1 次提交
    • G
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman 提交于
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information it it.
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from of the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging was:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there was new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: NPhilippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2441318
  9. 24 10月, 2017 1 次提交
  10. 20 10月, 2017 1 次提交
  11. 27 7月, 2017 1 次提交
    • A
      x86/ldt/64: Refresh DS and ES when modify_ldt changes an entry · a6323757
      Andy Lutomirski 提交于
      On x86_32, modify_ldt() implicitly refreshes the cached DS and ES
      segments because they are refreshed on return to usermode.
      
      On x86_64, they're not refreshed on return to usermode.  To improve
      determinism and match x86_32's behavior, refresh them when we update
      the LDT.
      
      This avoids a situation in which the DS points to a descriptor that is
      changed but the old cached segment persists until the next reschedule.
      If this happens, then the user-visible state will change
      nondeterministically some time after modify_ldt() returns, which is
      unfortunate.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Chang Seok <chang.seok.bae@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      a6323757
  12. 08 6月, 2017 1 次提交
  13. 05 6月, 2017 1 次提交
    • A
      x86/mm: Rework lazy TLB to track the actual loaded mm · 3d28ebce
      Andy Lutomirski 提交于
      Lazy TLB state is currently managed in a rather baroque manner.
      AFAICT, there are three possible states:
      
       - Non-lazy.  This means that we're running a user thread or a
         kernel thread that has called use_mm().  current->mm ==
         current->active_mm == cpu_tlbstate.active_mm and
         cpu_tlbstate.state == TLBSTATE_OK.
      
       - Lazy with user mm.  We're running a kernel thread without an mm
         and we're borrowing an mm_struct.  We have current->mm == NULL,
         current->active_mm == cpu_tlbstate.active_mm, cpu_tlbstate.state
         != TLBSTATE_OK (i.e. TLBSTATE_LAZY or 0).  The current cpu is set
         in mm_cpumask(current->active_mm).  CR3 points to
         current->active_mm->pgd.  The TLB is up to date.
      
       - Lazy with init_mm.  This happens when we call leave_mm().  We
         have current->mm == NULL, current->active_mm ==
         cpu_tlbstate.active_mm, but that mm is only relelvant insofar as
         the scheduler is tracking it for refcounting.  cpu_tlbstate.state
         != TLBSTATE_OK.  The current cpu is clear in
         mm_cpumask(current->active_mm).  CR3 points to swapper_pg_dir,
         i.e. init_mm->pgd.
      
      This patch simplifies the situation.  Other than perf, x86 stops
      caring about current->active_mm at all.  We have
      cpu_tlbstate.loaded_mm pointing to the mm that CR3 references.  The
      TLB is always up to date for that mm.  leave_mm() just switches us
      to init_mm.  There are no longer any special cases for mm_cpumask,
      and switch_mm() switches mms without worrying about laziness.
      
      After this patch, cpu_tlbstate.state serves only to tell the TLB
      flush code whether it may switch to init_mm instead of doing a
      normal flush.
      
      This makes fairly extensive changes to xen_exit_mmap(), which used
      to look a bit like black magic.
      
      Perf is unchanged.  With or without this change, perf may behave a bit
      erratically if it tries to read user memory in kernel thread context.
      We should build on this patch to teach perf to never look at user
      memory when cpu_tlbstate.loaded_mm != current->mm.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      3d28ebce
  14. 13 12月, 2016 1 次提交
  15. 10 12月, 2016 2 次提交
    • T
      x86/ldt: Make all size computations unsigned · 990e9dc3
      Thomas Gleixner 提交于
      ldt->size can never be negative. The helper functions take 'unsigned int'
      arguments which are assigned from ldt->size. The related user space
      user_desc struct member entry_number is unsigned as well.
      
      But ldt->size itself and a few local variables which are related to
      ldt->size are type 'int' which makes no sense whatsoever and results in
      typecasts which make the eyes bleed.
      
      Clean it up and convert everything which is related to ldt->size to
      unsigned it.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      990e9dc3
    • D
      x86/ldt: Make a size argument unsigned · 296dc580
      Dan Carpenter 提交于
      My static checker complains that we put an upper bound on the "size"
      argument but not a lower bound.  The checker is not smart enough to know
      the possible ranges of "old_mm->context.ldt->size" from
      init_new_context_ldt() so it thinks maybe it could be negative.
      
      Let's make it unsigned to silence the warning and future proof the code
      a bit.
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: kernel-janitors@vger.kernel.org
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20161208105602.GA11382@elgon.mountainSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      296dc580
  16. 19 2月, 2016 1 次提交
    • D
      x86/mm: Factor out LDT init from context init · 39a0526f
      Dave Hansen 提交于
      The arch-specific mm_context_t is a great place to put
      protection-key allocation state.
      
      But, we need to initialize the allocation state because pkey 0 is
      always "allocated".  All of the runtime initialization of
      mm_context_t is done in *_ldt() manipulation functions.  This
      renames the existing LDT functions like this:
      
      	init_new_context() -> init_new_context_ldt()
      	destroy_context() -> destroy_context_ldt()
      
      and makes init_new_context() and destroy_context() available for
      generic use.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20160212210234.DB34FCC5@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      39a0526f
  17. 14 9月, 2015 1 次提交
    • J
      x86/ldt: Fix small LDT allocation for Xen · f454b478
      Jan Beulich 提交于
      While the following commit:
      
        37868fe1 ("x86/ldt: Make modify_ldt synchronous")
      
      added a nice comment explaining that Xen needs page-aligned
      whole page chunks for guest descriptor tables, it then
      nevertheless used kzalloc() on the small size path.
      
      As I'm unaware of guarantees for kmalloc(PAGE_SIZE, ) to return
      page-aligned memory blocks, I believe this needs to be switched
      back to __get_free_page() (or better get_zeroed_page()).
      Signed-off-by: NJan Beulich <jbeulich@suse.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/55E735D6020000780009F1E6@prv-mh.provo.novell.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f454b478
  18. 31 7月, 2015 1 次提交
    • A
      x86/ldt: Make modify_ldt synchronous · 37868fe1
      Andy Lutomirski 提交于
      modify_ldt() has questionable locking and does not synchronize
      threads.  Improve it: redesign the locking and synchronize all
      threads' LDTs using an IPI on all modifications.
      
      This will dramatically slow down modify_ldt in multithreaded
      programs, but there shouldn't be any multithreaded programs that
      care about modify_ldt's performance in the first place.
      
      This fixes some fallout from the CVE-2015-5157 fixes.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: security@kernel.org <security@kernel.org>
      Cc: <stable@vger.kernel.org>
      Cc: xen-devel <xen-devel@lists.xen.org>
      Link: http://lkml.kernel.org/r/4c6978476782160600471bd865b318db34c7b628.1438291540.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      37868fe1
  19. 22 5月, 2014 1 次提交
  20. 15 5月, 2014 1 次提交
    • L
      x86-64, modify_ldt: Make support for 16-bit segments a runtime option · fa81511b
      Linus Torvalds 提交于
      Checkin:
      
      b3b42ac2 x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels
      
      disabled 16-bit segments on 64-bit kernels due to an information
      leak.  However, it does seem that people are genuinely using Wine to
      run old 16-bit Windows programs on Linux.
      
      A proper fix for this ("espfix64") is coming in the upcoming merge
      window, but as a temporary fix, create a sysctl to allow the
      administrator to re-enable support for 16-bit segments.
      
      It adds a "/proc/sys/abi/ldt16" sysctl that defaults to zero (off). If
      you hit this issue and care about your old Windows program more than
      you care about a kernel stack address information leak, you can do
      
         echo 1 > /proc/sys/abi/ldt16
      
      as root (add it to your startup scripts), and you should be ok.
      
      The sysctl table is only added if you have COMPAT support enabled on
      x86-64, but I assume anybody who runs old windows binaries very much
      does that ;)
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Link: http://lkml.kernel.org/r/CA%2B55aFw9BPoD10U1LfHbOMpHWZkvJTkMcfCs9s3urPr1YyWBxw@mail.gmail.com
      Cc: <stable@vger.kernel.org>
      fa81511b
  21. 05 5月, 2014 1 次提交
  22. 01 5月, 2014 1 次提交
    • H
      x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack · 3891a04a
      H. Peter Anvin 提交于
      The IRET instruction, when returning to a 16-bit segment, only
      restores the bottom 16 bits of the user space stack pointer.  This
      causes some 16-bit software to break, but it also leaks kernel state
      to user space.  We have a software workaround for that ("espfix") for
      the 32-bit kernel, but it relies on a nonzero stack segment base which
      is not available in 64-bit mode.
      
      In checkin:
      
          b3b42ac2 x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels
      
      we "solved" this by forbidding 16-bit segments on 64-bit kernels, with
      the logic that 16-bit support is crippled on 64-bit kernels anyway (no
      V86 support), but it turns out that people are doing stuff like
      running old Win16 binaries under Wine and expect it to work.
      
      This works around this by creating percpu "ministacks", each of which
      is mapped 2^16 times 64K apart.  When we detect that the return SS is
      on the LDT, we copy the IRET frame to the ministack and use the
      relevant alias to return to userspace.  The ministacks are mapped
      readonly, so if IRET faults we promote #GP to #DF which is an IST
      vector and thus has its own stack; we then do the fixup in the #DF
      handler.
      
      (Making #GP an IST exception would make the msr_safe functions unsafe
      in NMI/MC context, and quite possibly have other effects.)
      
      Special thanks to:
      
      - Andy Lutomirski, for the suggestion of using very small stack slots
        and copy (as opposed to map) the IRET frame there, and for the
        suggestion to mark them readonly and let the fault promote to #DF.
      - Konrad Wilk for paravirt fixup and testing.
      - Borislav Petkov for testing help and useful comments.
      Reported-by: NBrian Gerst <brgerst@gmail.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andrew Lutomriski <amluto@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Dirk Hohndel <dirk@hohndel.org>
      Cc: Arjan van de Ven <arjan.van.de.ven@intel.com>
      Cc: comex <comexk@gmail.com>
      Cc: Alexander van Heukelum <heukelum@fastmail.fm>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: <stable@vger.kernel.org> # consider after upstream merge
      3891a04a
  23. 12 4月, 2014 1 次提交
    • H
      x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels · b3b42ac2
      H. Peter Anvin 提交于
      The IRET instruction, when returning to a 16-bit segment, only
      restores the bottom 16 bits of the user space stack pointer.  We have
      a software workaround for that ("espfix") for the 32-bit kernel, but
      it relies on a nonzero stack segment base which is not available in
      32-bit mode.
      
      Since 16-bit support is somewhat crippled anyway on a 64-bit kernel
      (no V86 mode), and most (if not quite all) 64-bit processors support
      virtualization for the users who really need it, simply reject
      attempts at creating a 16-bit segment when running on top of a 64-bit
      kernel.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Link: http://lkml.kernel.org/n/tip-kicdm89kzw9lldryb1br9od0@git.kernel.org
      Cc: <stable@vger.kernel.org>
      b3b42ac2
  24. 29 3月, 2012 1 次提交
  25. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  26. 24 9月, 2009 1 次提交
  27. 03 1月, 2009 1 次提交
  28. 26 7月, 2008 1 次提交
  29. 24 7月, 2008 1 次提交
  30. 22 7月, 2008 1 次提交
  31. 19 7月, 2008 1 次提交
  32. 26 6月, 2008 1 次提交
  33. 23 5月, 2008 1 次提交
  34. 04 2月, 2008 1 次提交
  35. 30 1月, 2008 2 次提交