1. 24 Feb, 2013 2 commits
    • x86/mm/pageattr: Prevent PSE and GLOBAL leftovers to confuse pmd/pte_present and pmd_huge · a8aed3e0
      Andrea Arcangeli committed
      Without this patch any kernel code that reads kernel memory in
      non present kernel pte/pmds (as set by pageattr.c) will crash.
      
      With this kernel code:
      
      static struct page *crash_page;
      static unsigned long *crash_address;
      [..]
      	crash_page = alloc_pages(GFP_KERNEL, 9);
      	crash_address = page_address(crash_page);
      	if (set_memory_np((unsigned long)crash_address, 1))
      		printk("set_memory_np failure\n");
      [..]
      
      The kernel will crash if one tries to read the memory at the
      not present address from inside the "crash tool".
      
      crash> p crash_address
      crash_address = $8 = (long unsigned int *) 0xffff88023c000000
      crash> rd 0xffff88023c000000
      [ *lockup* ]
      
      The lockup happens because _PAGE_GLOBAL and _PAGE_PROTNONE
      share the same bit, and pageattr leaves _PAGE_GLOBAL set on a
      kernel pte which is then mistaken as _PAGE_PROTNONE (so
      pte_present returns true by mistake and the kernel fault then
      gets confused and loops).
      
      With THP the same can happen after we taught pmd_present to
      check _PAGE_PROTNONE and _PAGE_PSE in commit
      027ef6c8 ("mm: thp: fix pmd_present for
      split_huge_page and PROT_NONE with THP").  THP has the same
      problem with _PAGE_GLOBAL as the 4k pages, but it also has a
      problem with _PAGE_PSE, which must be cleared too.
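
The bit collision described above can be modelled in user space. This is a minimal sketch, not the kernel implementation: the `model_*` and `make_not_present_*` helpers are hypothetical names, and the present checks are simplified versions of what the message describes. The bit positions are the x86 ones, with `_PAGE_PROTNONE` aliased to `_PAGE_GLOBAL` as stated in the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-table flag bits; PROTNONE shares its bit with GLOBAL. */
#define _PAGE_PRESENT  (1ULL << 0)
#define _PAGE_RW       (1ULL << 1)
#define _PAGE_PSE      (1ULL << 7)
#define _PAGE_GLOBAL   (1ULL << 8)
#define _PAGE_PROTNONE _PAGE_GLOBAL

/* Simplified model of pte_present(): present or PROT_NONE. */
static bool model_pte_present(uint64_t flags)
{
	return (flags & (_PAGE_PRESENT | _PAGE_PROTNONE)) != 0;
}

/* Simplified model of pmd_present() once it also checks _PAGE_PSE. */
static bool model_pmd_present(uint64_t flags)
{
	return (flags & (_PAGE_PRESENT | _PAGE_PROTNONE | _PAGE_PSE)) != 0;
}

/* Hypothetical "before" helper: only the present bit is cleared. */
static uint64_t make_not_present_old(uint64_t flags)
{
	return flags & ~_PAGE_PRESENT;
}

/* Hypothetical "after" helper: GLOBAL and PSE are cleared as well. */
static uint64_t make_not_present_fixed(uint64_t flags)
{
	return flags & ~(_PAGE_PRESENT | _PAGE_GLOBAL | _PAGE_PSE);
}
```

With a kernel mapping that starts as `_PAGE_PRESENT | _PAGE_RW | _PAGE_GLOBAL`, the old helper leaves an entry that the model still reports as present, while the fixed helper does not.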
      
      After the patch is applied copy_user correctly returns -EFAULT
      and doesn't lock up anymore.
      
      crash> p crash_address
      crash_address = $9 = (long unsigned int *) 0xffff88023c000000
      crash> rd 0xffff88023c000000
      rd: read error: kernel virtual address: ffff88023c000000  type:
      "64-bit KVADDR"
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Shaohua Li <shaohua.li@intel.com>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • memory-hotplug: common APIs to support page tables hot-remove · ae9aae9e
      Wen Congyang committed
      When memory is removed, the corresponding pagetables should also be
      removed.  This patch introduces some common APIs to support removing
      the vmemmap pagetable and the x86_64 direct mapping pagetable.
      
      All pages of virtual mapping in removed memory cannot be freed if some
      pages used as PGD/PUD include not only removed memory but also other
      memory.  So this patch uses the following way to check whether a page
      can be freed or not.
      
      1) When removing memory, the page structs of the removed memory are
         filled with 0xFD.
      
      2) When all page structs on a PT/PMD are filled with 0xFD, the PT/PMD
         can be cleared.  In this case, the page used as PT/PMD can be freed.
      
      For direct mapping pages, update direct_pages_count[level] when we free
      their pagetables.  And do not free the pages again because they were
      freed when offlining.
      
      For vmemmap pages, free the pages and their pagetables.
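
The 0xFD marking scheme can be sketched in user space as follows. This is an illustration under stated assumptions rather than the patch's code: the `SKETCH_*` constants and both helper names are hypothetical, and a plain byte buffer stands in for a page that holds page structs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_FILL_BYTE 0xFD	/* marker value named in the commit message */

/* Mark [offset, offset + len) of a page as belonging to removed memory. */
static void mark_range_removed(unsigned char *page, size_t offset, size_t len)
{
	memset(page + offset, SKETCH_FILL_BYTE, len);
}

/* True when every byte carries the marker, i.e. the page may be freed. */
static bool page_fully_removed(const unsigned char *page)
{
	for (size_t i = 0; i < SKETCH_PAGE_SIZE; i++)
		if (page[i] != SKETCH_FILL_BYTE)
			return false;
	return true;
}
```

Marking only half of a page leaves `page_fully_removed()` false; once the second half is marked too, the check succeeds and the page can be released.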
      
      For larger pages, do not split them into smaller ones because there is
      no way to know if the larger page has been split.  As a result, there is
      no way to decide when to split.  We deal with the larger pages in the
      following way:
      
      1) For direct mapped pages, all the pages were freed when they were
         offlined.  And since memory offline is done section by section, all
         the memory ranges being removed are aligned to PAGE_SIZE.  So we only
         need to deal with unaligned pages when freeing vmemmap pages.
      
      2) For vmemmap pages being used to store page_struct, if part of the
         larger page is still in use, just fill the unused part with 0xFD.  And
         when the whole page is filled with 0xFD, then free the larger page.
      
      [akpm@linux-foundation.org: fix typo in comment]
      [tangchen@cn.fujitsu.com: do not calculate direct mapping pages when freeing vmemmap pagetables]
      [tangchen@cn.fujitsu.com: do not free direct mapping pages twice]
      [tangchen@cn.fujitsu.com: do not free page split from hugepage one by one]
      [tangchen@cn.fujitsu.com: do not split pages when freeing pagetable pages]
      [akpm@linux-foundation.org: use pmd_page_vaddr()]
      [akpm@linux-foundation.org: fix used-uninitialised bug]
      Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Wu Jianguo <wujianguo@huawei.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 26 Jan, 2013 2 commits
  3. 16 Dec, 2012 1 commit
  4. 18 Nov, 2012 1 commit
  5. 17 Nov, 2012 1 commit
  6. 30 Oct, 2012 1 commit
    • x86-64/efi: Use EFI to deal with platform wall clock (again) · bd52276f
      Jan Beulich committed
      Other than ix86, x86-64 on EFI so far didn't set the
      {g,s}et_wallclock accessors to the EFI routines, thus
      incorrectly using raw RTC accesses instead.
      
      Simply removing the #ifdef around the respective code isn't
      enough, however: While so far early get-time calls were done in
      physical mode, this doesn't work properly for x86-64, as virtual
      addresses would still need to be set up for all runtime regions
      (which wasn't the case on the system I have access to), so
      instead the patch moves the call to efi_enter_virtual_mode()
      ahead (which in turn allows to drop all code related to calling
      efi-get-time in physical mode).
      
      Additionally the earlier calling of efi_set_executable()
      requires the CPA code to cope, i.e. during early boot it must be
      avoided to call cpa_flush_array(), as the first thing this
      function does is a BUG_ON(irqs_disabled()).
      
      Also make the two EFI functions in question here static -
      they're not being referenced elsewhere.
      
      History:
      
          This commit was originally merged as bacef661 ("x86-64/efi:
          Use EFI to deal with platform wall clock") but it resulted in some
          ASUS machines no longer booting due to a firmware bug, and so was
          reverted in f026cfa8. A pre-emptive fix for the buggy ASUS
          firmware was merged in 03a1c254975e ("x86, efi: 1:1 pagetable
          mapping for virtual EFI calls") so now this patch can be
          reapplied.
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Tested-by: Matt Fleming <matt.fleming@intel.com>
      Acked-by: Matthew Garrett <mjg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com> [added commit history]
  7. 15 Aug, 2012 1 commit
  8. 11 Jun, 2012 1 commit
  9. 06 Jun, 2012 1 commit
    • x86-64/efi: Use EFI to deal with platform wall clock · bacef661
      Jan Beulich committed
      Other than ix86, x86-64 on EFI so far didn't set the
      {g,s}et_wallclock accessors to the EFI routines, thus
      incorrectly using raw RTC accesses instead.
      
      Simply removing the #ifdef around the respective code isn't
      enough, however: While so far early get-time calls were done in
      physical mode, this doesn't work properly for x86-64, as virtual
      addresses would still need to be set up for all runtime regions
      (which wasn't the case on the system I have access to), so
      instead the patch moves the call to efi_enter_virtual_mode()
      ahead (which in turn allows to drop all code related to calling
      efi-get-time in physical mode).
      
      Additionally the earlier calling of efi_set_executable()
      requires the CPA code to cope, i.e. during early boot it must be
      avoided to call cpa_flush_array(), as the first thing this
      function does is a BUG_ON(irqs_disabled()).
      
      Also make the two EFI functions in question here static -
      they're not being referenced elsewhere.
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Tested-by: Matt Fleming <matt.fleming@intel.com>
      Acked-by: Matthew Garrett <mjg@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/4FBFBF5F020000780008637F@nat28.tlf.novell.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  10. 06 Dec, 2011 2 commits
  11. 18 Mar, 2011 1 commit
  12. 10 Mar, 2011 1 commit
  13. 03 Feb, 2011 1 commit
  14. 18 Nov, 2010 2 commits
    • x86: Add NX protection for kernel data · 5bd5a452
      Matthieu Castet committed
      This patch expands functionality of CONFIG_DEBUG_RODATA to set main
      (static) kernel data area as NX.
      
      The following steps are taken to achieve this:
      
       1. Linker script is adjusted so .text always starts and ends on a page boundary
       2. Linker script is adjusted so .rodata always starts and ends on a page boundary
       3. NX is set for all pages from _etext through _end in mark_rodata_ro.
       4. free_init_pages() sets released memory NX in arch/x86/mm/init.c
       5. BIOS ROM is set to x when pcibios is used.
      
      The results of patch application may be observed in the diff of kernel page
      table dumps:
      
      pcibios:
      
       -- data_nx_pt_before.txt       2009-10-13 07:48:59.000000000 -0400
       ++ data_nx_pt_after.txt        2009-10-13 07:26:46.000000000 -0400
        0x00000000-0xc0000000           3G                           pmd
        ---[ Kernel Mapping ]---
       -0xc0000000-0xc0100000           1M     RW             GLB x  pte
       +0xc0000000-0xc00a0000         640K     RW             GLB NX pte
       +0xc00a0000-0xc0100000         384K     RW             GLB x  pte
       -0xc0100000-0xc03d7000        2908K     ro             GLB x  pte
       +0xc0100000-0xc0318000        2144K     ro             GLB x  pte
       +0xc0318000-0xc03d7000         764K     ro             GLB NX pte
       -0xc03d7000-0xc0600000        2212K     RW             GLB x  pte
       +0xc03d7000-0xc0600000        2212K     RW             GLB NX pte
        0xc0600000-0xf7a00000         884M     RW         PSE GLB NX pmd
        0xf7a00000-0xf7bfe000        2040K     RW             GLB NX pte
        0xf7bfe000-0xf7c00000           8K                           pte
      
      No pcibios:
      
       -- data_nx_pt_before.txt       2009-10-13 07:48:59.000000000 -0400
       ++ data_nx_pt_after.txt        2009-10-13 07:26:46.000000000 -0400
        0x00000000-0xc0000000           3G                           pmd
        ---[ Kernel Mapping ]---
       -0xc0000000-0xc0100000           1M     RW             GLB x  pte
       +0xc0000000-0xc0100000           1M     RW             GLB NX pte
       -0xc0100000-0xc03d7000        2908K     ro             GLB x  pte
       +0xc0100000-0xc0318000        2144K     ro             GLB x  pte
       +0xc0318000-0xc03d7000         764K     ro             GLB NX pte
       -0xc03d7000-0xc0600000        2212K     RW             GLB x  pte
       +0xc03d7000-0xc0600000        2212K     RW             GLB NX pte
        0xc0600000-0xf7a00000         884M     RW         PSE GLB NX pmd
        0xf7a00000-0xf7bfe000        2040K     RW             GLB NX pte
        0xf7bfe000-0xf7c00000           8K                           pte
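
The sizes in the dumps above can be cross-checked with a little page arithmetic. The sketch below assumes 4K pages; the `SKETCH_*` constants and the helper names are hypothetical, and it only illustrates how a page count for an NX range could be derived from page-aligned boundaries.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)

/* Round an address up to the next 4K page boundary. */
static uint32_t page_align_up(uint32_t addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Number of 4K pages between two addresses, both rounded up to a page. */
static uint32_t pages_in_range(uint32_t start, uint32_t end)
{
	return (page_align_up(end) - page_align_up(start)) >> SKETCH_PAGE_SHIFT;
}

/* Size of a range in KiB, the unit used in the page table dump. */
static uint32_t range_kib(uint32_t start, uint32_t end)
{
	return (end - start) >> 10;
}
```

For example, the two NX ranges from 0xc0318000 to 0xc0600000 add up to 764K + 2212K, which is 744 pages of 4K each.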
      
      The patch has been originally developed for Linux 2.6.34-rc2 x86 by
      Siarhei Liakh <sliakh.lkml@gmail.com> and Xuxian Jiang <jiang@cs.ncsu.edu>.
      
       -v1:  initial patch for 2.6.30
       -v2:  patch for 2.6.31-rc7
       -v3:  moved all code into arch/x86, adjusted credits
       -v4:  fixed ifdef, removed credits from CREDITS
       -v5:  fixed an address calculation bug in mark_nxdata_nx()
       -v6:  added acked-by and PT dump diff to commit log
       -v7:  minor adjustments for -tip
       -v8:  rework with the merge of "Set first MB as RW+NX"
      Signed-off-by: Siarhei Liakh <sliakh.lkml@gmail.com>
      Signed-off-by: Xuxian Jiang <jiang@cs.ncsu.edu>
      Signed-off-by: Matthieu CASTET <castet.matthieu@free.fr>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: James Morris <jmorris@namei.org>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Kees Cook <kees.cook@canonical.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <4CE2F82E.60601@free.fr>
      [ minor cleanliness edits ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Fix improper large page preservation · 64edc8ed
      matthieu castet committed
      This patch fixes a bug in try_preserve_large_page() which may
      result in improper large page preservation and improper
      application of page attributes to the memory area outside of the
      original change request.
      
      More specifically, the problem manifests itself when set_memory_*()
      is called for several pages at the beginning of the large page and
      try_preserve_large_page() erroneously concludes that the change can
      be applied to whole large page.
      
      The fix consists of 3 parts:
      
        1. Addition of "required" protection attributes in
           static_protections(), so .data and .bss can be guaranteed to
           stay "RW"
      
        2. static_protections() is now called for every small
           page within large page to determine compatibility of new
           protection attributes (instead of just small pages within the
           requested range).
      
        3. Large page can be preserved only if attribute change is
           large-page-aligned and covers whole large page.
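
Part 3 of the fix can be illustrated with a small predicate. This is a sketch under assumptions, not the code of try_preserve_large_page(): it assumes 4K small pages and 2MB large pages, and the function and constant names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_LPAGE_SIZE (2ULL << 20)	/* 2MB large page, an assumption */

/*
 * A large page may only be kept intact when the requested change starts
 * on a large page boundary and spans at least the whole large page.
 */
static bool covers_whole_large_page(uint64_t addr, uint64_t numpages)
{
	uint64_t lpage_mask = ~(SKETCH_LPAGE_SIZE - 1);
	bool aligned = (addr == (addr & lpage_mask));
	bool whole = (numpages >= (SKETCH_LPAGE_SIZE >> SKETCH_PAGE_SHIFT));

	return aligned && whole;
}
```

A request for a few pages at the start of a large page is aligned but does not cover it, so the predicate returns false. That is the case the message says was previously misjudged.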
      
       -v1: Try_preserve_large_page() patch for Linux 2.6.34-rc2
       -v2: Replaced pfn check with address check for kernel rw-data
      Signed-off-by: Siarhei Liakh <sliakh.lkml@gmail.com>
      Signed-off-by: Xuxian Jiang <jiang@cs.ncsu.edu>
      Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: James Morris <jmorris@namei.org>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Kees Cook <kees.cook@canonical.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <4CE2F7F3.8030809@free.fr>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  15. 06 Apr, 2010 1 commit
  16. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo committed
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  17. 23 Feb, 2010 1 commit
    • x86_64, cpa: Don't work hard in preserving kernel 2M mappings when using 4K already · 281ff33b
      Suresh Siddha committed
      We currently enforce the !RW mapping for the kernel mapping that maps
      holes between different text, rodata and data sections. However, kernel
      identity mappings will have different RWX permissions to the pages mapping to
      text and to the pages padding (which are freed) the text, rodata sections.
      Hence kernel identity mappings will be broken to smaller pages. For 64-bit,
      kernel text and kernel identity mappings are different, so we can enable
      protection checks that come with CONFIG_DEBUG_RODATA, as well as retain 2MB
      large page mappings for kernel text.
      
      Konrad reported a boot failure with the Linux Xen paravirt guest because of
      this. In this paravirt guest case, the kernel text mapping and the kernel
      identity mapping share the same page-table pages. Thus forcing the !RW mapping
      for some of the kernel mappings also causes the kernel identity mappings to be
      read-only resulting in the boot failure. Linux Xen paravirt guest also
      uses 4k mappings and doesn't use 2M mapping.
      
      Fix this issue and retain large page performance advantage for native kernels
      by not working hard and not enforcing !RW for the kernel text mapping,
      if the current mapping is already using small page mapping.
      Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1266522700.2909.34.camel@sbs-t61.sc.intel.com>
      Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: stable@kernel.org	[2.6.32, 2.6.33]
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  18. 17 Nov, 2009 1 commit
    • x86, pageattr: Make set_memory_(x|nx) aware of NX support · 583140af
      H. Peter Anvin committed
      Make set_memory_x/set_memory_nx directly aware of if NX is supported
      in the system or not, rather than requiring that every caller assesses
      that support independently.
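
The idea can be sketched in user space as an early return inside the setter itself. Everything prefixed `sketch_` is a hypothetical stand-in: the mask variable plays the role of the kernel's supported-PTE mask, and a counter replaces the real page attribute change.

```c
#include <stdint.h>

#define _PAGE_NX (1ULL << 63)	/* x86-64 no-execute bit */

/* Stand-in for the kernel's supported PTE bit mask (assumption). */
static uint64_t sketch_supported_pte_mask = ~0ULL;

/* Counts how often the attribute-changing path would have been taken. */
static int sketch_attr_changes;

/* The NX check lives in the setter, so callers need no check of their own. */
static int sketch_set_memory_nx(unsigned long addr, int numpages)
{
	(void)addr;
	(void)numpages;

	if (!(sketch_supported_pte_mask & _PAGE_NX))
		return 0;	/* NX unsupported: nothing to do */

	sketch_attr_changes++;	/* a real version would update page attributes */
	return 0;
}
```

Callers can then invoke the setter unconditionally; on a system without NX it simply succeeds without touching any mapping.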
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Tim Starling <tstarling@wikimedia.org>
      Cc: Hannes Eder <hannes@hanneseder.net>
      LKML-Reference: <1258154897-6770-4-git-send-email-hpa@zytor.com>
      Acked-by: Kees Cook <kees.cook@canonical.com>
  19. 03 Nov, 2009 2 commits
  20. 28 Oct, 2009 1 commit
    • tracing: allow to change permissions for text with dynamic ftrace enabled · 883242dd
      Steven Rostedt committed
      The commit 74e08179
      x86-64: align RODATA kernel section to 2MB with CONFIG_DEBUG_RODATA
      prevents text sections from becoming read/write using set_memory_rw.
      
      The dynamic ftrace changes all text pages to read/write just before
      converting the calls to tracing to nops, and vice versa.
      
      I originally just added a flag to allow this transaction when ftrace
      did the change, but I also found that when the CPA testing was running
      it would remove the read/write as well, and ftrace does not do the text
      conversion on boot up, and the CPA changes caused the dynamic tracer
      to fail on self tests.
      
      The current solution I have is simply not to prevent
      change_page_attr from setting the RW bit for kernel text pages.
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  21. 20 Oct, 2009 1 commit
    • x86-64: align RODATA kernel section to 2MB with CONFIG_DEBUG_RODATA · 74e08179
      Suresh Siddha committed
      CONFIG_DEBUG_RODATA chops the large pages spanning boundaries of kernel
      text/rodata/data to small 4KB pages as they are mapped with different
      attributes (text as RO, RODATA as RO and NX etc).
      
      On x86_64, preserve the large page mappings for kernel text/rodata/data
      boundaries when CONFIG_DEBUG_RODATA is enabled. This is done by allowing the
      RODATA section to be hugepage aligned and having the same RWX attributes
      for the 2MB page boundaries.
      
      Extra Memory pages padding the sections will be freed during the end of the boot
      and the kernel identity mappings will have different RWX permissions compared to
      the kernel text mappings.
      
      Kernel identity mappings to these physical pages will be mapped with smaller
      pages but large page mappings are still retained for kernel text,rodata,data
      mappings.
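
The 2MB alignment and the resulting padding can be sketched with plain address arithmetic. The helper names and the `SKETCH_HPAGE_SIZE` constant are hypothetical; the sketch only shows how much padding a section boundary needs to reach the next 2MB boundary.

```c
#include <stdint.h>

#define SKETCH_HPAGE_SIZE (2ULL << 20)	/* 2MB huge page */

/* Round an address up to the next 2MB boundary. */
static uint64_t hpage_align_up(uint64_t addr)
{
	return (addr + SKETCH_HPAGE_SIZE - 1) & ~(SKETCH_HPAGE_SIZE - 1);
}

/* Bytes of padding needed so that a section ends on a 2MB boundary. */
static uint64_t padding_to_hpage(uint64_t addr)
{
	return hpage_align_up(addr) - addr;
}
```

The padding computed this way corresponds to the extra pages that, according to the message, are freed at the end of boot.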
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20091014220254.190119924@sbs-t61.sc.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  22. 12 Sep, 2009 1 commit
    • agp/intel: Fix the pre-9xx chipset flush. · e517a5e9
      Eric Anholt committed
      Ever since we enabled GEM, the pre-9xx chipsets (particularly 865) have had
      serious stability issues.  Back in May a wbinvd was added to the DRM to
      work around much of the problem.  Some failure remained -- easily visible
      by dragging a window around on an X -retro desktop, or by looking at bugzilla.
      
      The chipset flush was on the right track -- hitting the right amount of
      memory, and it appears to be the only way to flush on these chipsets, but the
      flush page was mapped uncached.  As a result, the writes trying to clear the
      writeback cache ended up bypassing the cache, and not flushing anything!  The
      wbinvd would flush out other writeback data and often cause the data we wanted
      to get flushed, but not always.  By removing the setting of the page to UC
      and instead just clflushing the data we write to try to flush it, we get the
      desired behavior with no wbinvd.
      
      This exports clflush_cache_range(), which was lying around and happened to
      basically match the code I was otherwise going to copy from the DRM.
      Signed-off-by: Eric Anholt <eric@anholt.net>
      Signed-off-by: Brice Goglin <Brice.Goglin@ens-lyon.org>
      Cc: stable@kernel.org
  23. 10 Sep, 2009 1 commit
  24. 14 Aug, 2009 1 commit
  25. 04 Aug, 2009 1 commit
  26. 31 Jul, 2009 1 commit
    • x86, pat: Fix set_memory_wc related corruption · bdc6340f
      Pallipadi, Venkatesh committed
      Changeset 3869c4aa
      that went in after 2.6.30-rc1 was a seemingly small change to _set_memory_wc()
      to make it compliant with SDM requirements. But it introduced a nasty bug, which
      can result in crashes and/or strange corruption when set_memory_wc is used.
      One such crash was reported here:
      http://lkml.org/lkml/2009/7/30/94
      
      Actually, that changeset introduced two bugs.
      * change_page_attr_set() takes &addr as first argument and the addr value
        might have changed on return, even for a single page change_page_attr_set()
        call. That will make the second change_page_attr_set() in this routine
        operate on an unrelated addr, which can eventually cause strange corruptions
        and bad page state crashes.
      * The second change_page_attr_set() call, before setting _PAGE_CACHE_WC, should
        clear the earlier _PAGE_CACHE_UC_MINUS, as otherwise cache attribute will not
        be WC (will be UC instead).
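
Both bugs can be reproduced in a small user-space model. This is a sketch under assumptions: the cache attribute encodings follow the PWT/PCD bit layout used by x86 at the time, `mock_change_attr()` is a hypothetical stand-in for a callee that advances the address it is handed, and the remaining helper names are invented for illustration.

```c
#include <stdint.h>

#define _PAGE_PWT (1ULL << 3)
#define _PAGE_PCD (1ULL << 4)

#define _PAGE_CACHE_MASK     (_PAGE_PCD | _PAGE_PWT)
#define _PAGE_CACHE_WC       (_PAGE_PWT)
#define _PAGE_CACHE_UC_MINUS (_PAGE_PCD)
#define _PAGE_CACHE_UC       (_PAGE_PCD | _PAGE_PWT)

/* Buggy: sets the WC bit on top of whatever cache bits are already there. */
static uint64_t set_wc_buggy(uint64_t prot)
{
	return prot | _PAGE_CACHE_WC;
}

/* Fixed: clears the old cache attribute before setting WC. */
static uint64_t set_wc_fixed(uint64_t prot)
{
	return (prot & ~_PAGE_CACHE_MASK) | _PAGE_CACHE_WC;
}

/* Hypothetical callee that modifies the address passed by pointer. */
static void mock_change_attr(unsigned long *addr)
{
	*addr += 4096;
}

/* Buggy: the address used for a second call has already been advanced. */
static unsigned long second_call_addr_buggy(unsigned long addr)
{
	mock_change_attr(&addr);
	return addr;
}

/* Fixed: a copy taken up front preserves the original address. */
static unsigned long second_call_addr_fixed(unsigned long addr)
{
	unsigned long addr_copy = addr;

	mock_change_attr(&addr);
	return addr_copy;
}
```

Starting from `_PAGE_CACHE_UC_MINUS`, the buggy helper yields `_PAGE_CACHE_UC` rather than `_PAGE_CACHE_WC`, which matches the second bullet above.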
      
      The patch below fixes both these problems. Sending a single patch to fix both
      the problems, as the change is to the same line of code. The change to have an
      addr_copy is not very clean. But, it is simpler than making more changes
      through various routines in pageattr.c.
      
      A huge thanks to Jerome for reporting this problem and providing a simple test
      case that helped us root cause the problem.
      Reported-by: Jerome Glisse <glisse@freedesktop.org>
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20090730214319.GA1889@linux-os.sc.intel.com>
      Acked-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  27. 04 Jul, 2009 1 commit
    • x86,percpu: generalize lpage first chunk allocator · 8c4bfc6e
      Tejun Heo committed
      Generalize and move x86 setup_pcpu_lpage() into
      pcpu_lpage_first_chunk().  setup_pcpu_lpage() now is a simple wrapper
      around the generalized version.  Other than taking size parameters and
      using arch supplied callbacks to allocate/free/map memory,
      pcpu_lpage_first_chunk() is identical to the original implementation.
      
      This simplifies arch code and will help converting more archs to
      dynamic percpu allocator.
      
      While at it, factor out pcpu_calc_fc_sizes() which is common to
      pcpu_embed_first_chunk() and pcpu_lpage_first_chunk().
      
      [ Impact: code reorganization and generalization ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@elte.hu>
  28. 22 Jun, 2009 2 commits
    • x86: fix pageattr handling for lpage percpu allocator and re-enable it · e59a1bb2
      Tejun Heo committed
      lpage allocator aliases a PMD page for each cpu and returns whatever
      is unused to the page allocator.  When the pageattr of the recycled
      pages are changed, this makes the two aliases point to the overlapping
      regions with different attributes which isn't allowed and known to
      cause subtle data corruption in certain cases.
      
      This can be handled in a similar manner to the x86_64 highmap alias.
      pageattr code should detect if the target pages have PMD alias and
      split the PMD alias and synchronize the attributes.
      
      The percpu allocator is updated to keep the allocated PMD pages map sorted
      in ascending address order and provide pcpu_lpage_remapped() function
      which binary searches the array to determine whether the given address
      is aliased and if so to which address.  pageattr is updated to use
      pcpu_lpage_remapped() to detect the PMD alias and split it up as
      necessary from cpa_process_alias().
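
The lookup described above amounts to a binary search over a sorted table of remapped PMD-sized pages. The sketch below is a user-space illustration, not the kernel's pcpu_lpage_remapped(): the struct layout, the 2MB PMD size, and the convention of returning 0 for "not aliased" are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PMD_SIZE (2ULL << 20)	/* assumed size of one PMD mapping */

struct sketch_lpage_map {
	uint64_t orig;	/* start of the original PMD-sized page */
	uint64_t alias;	/* start of the address it is also mapped at */
};

/*
 * Binary search a table sorted by ascending orig address. Returns the
 * aliased address for addr, or 0 when addr falls in no remapped page.
 */
static uint64_t sketch_lpage_remapped(const struct sketch_lpage_map *map,
				      size_t n, uint64_t addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < map[mid].orig)
			hi = mid;
		else if (addr >= map[mid].orig + SKETCH_PMD_SIZE)
			lo = mid + 1;
		else
			return map[mid].alias + (addr - map[mid].orig);
	}
	return 0;
}
```

Keeping the table sorted is what makes the logarithmic lookup possible, which matters because the check runs on every page attribute change that might touch an aliased page.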
      
      Jan Beulich spotted the original problem and incorrect usage of vaddr
      instead of laddr for lookup.
      
      With this, lpage percpu allocator should work correctly.  Re-enable
      it.
      
      [ Impact: fix subtle lpage pageattr bug and re-enable lpage ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Jan Beulich <JBeulich@novell.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ingo Molnar <mingo@elte.hu>
    • x86: reorganize cpa_process_alias() · 992f4c1c
      Tejun Heo committed
      Reorganize cpa_process_alias() so that new alias condition can be
      added easily.
      
      Jan Beulich spotted a problem in the original cleanup thread which
      incorrectly assumed the two existing conditions were mutually
      exclusive.
      
      [ Impact: code reorganization ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ingo Molnar <mingo@elte.hu>
  29. 15 Jun, 2009 1 commit
  30. 27 May, 2009 1 commit
  31. 23 May, 2009 2 commits
  32. 10 Apr, 2009 2 commits