1. 22 1月, 2009 1 次提交
  2. 20 1月, 2009 1 次提交
    • N
      x86: optimise x86's do_page_fault (C entry point for the page fault path) · 92181f19
      Nick Piggin 提交于
      Impact: cleanup, restructure code to improve assembly
      
      gcc isn't _all_ that smart about spilling registers to stack or reusing
      stack slots, even with branch annotations. do_page_fault contained a lot
      of functionality, so split unlikely paths into their own functions, and
      mark them as noinline just to be sure. I consider this actually to be
      somewhat of a cleanup too: the main function now contains about half
      the number of lines so the normal path is easier to read, while the error
      cases are also nicely split away.
      
      Also, ensure the order of arguments to functions is always the same: regs,
      addr, error_code. This can reduce code size a tiny bit, and just looks neater
      too.
      
      And add a couple of branch annotations.
      
      Before:
        do_page_fault:
                subq    $360, %rsp      #,
      
      After:
        do_page_fault:
                subq    $56, %rsp       #,
      
      bloat-o-meter:
        add/remove: 8/0 grow/shrink: 0/1 up/down: 2222/-1680 (542)
        function                                     old     new   delta
        __bad_area_nosemaphore                         -     506    +506
        no_context                                     -     474    +474
        vmalloc_fault                                  -     424    +424
        spurious_fault                                 -     358    +358
        mm_fault_error                                 -     272    +272
        bad_area_access_error                          -      89     +89
        bad_area                                       -      89     +89
        bad_area_nosemaphore                           -      10     +10
        do_page_fault                               2464     784   -1680
      
      Yes, the total size increases by 542 bytes, due to the extra function calls.
      But these will very rarely be called (except for vmalloc_fault) in a normal
      workload. Importantly, do_page_fault is less than 1/3rd it's original size,
      and touches far less stack.
      
      Existing gotos and branch hints did move a lot of the infrequently used text
      out of the fastpath, but that's even further improved after this patch.
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      92181f19
  3. 17 1月, 2009 1 次提交
  4. 16 1月, 2009 1 次提交
  5. 15 1月, 2009 1 次提交
  6. 14 1月, 2009 7 次提交
    • H
      [CVE-2009-0029] Rename old_readdir to sys_old_readdir · e55380ed
      Heiko Carstens 提交于
      This way it matches the generic system call name convention.
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      e55380ed
    • I
      x86: change the default cache size to 64 bytes · 0a2a18b7
      Ingo Molnar 提交于
      Right now the generic cacheline size is 128 bytes - that is wasteful
      when structures are aligned, as all modern x86 CPUs have an (effective)
      cacheline sizes of 64 bytes.
      
      It was set to 128 bytes due to some cacheline aliasing problems on
      older P4 systems, but those are many years old and we dont optimize
      for them anymore. (They'll still get the 128 bytes cacheline size if
      the kernel is specifically built for Pentium 4)
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Acked-by: NArjan van de Ven <arjan@linux.intel.com>
      0a2a18b7
    • F
      x86, tlb flush_data: replace per_cpu with an array · 09b3ec73
      Frederik Deweerdt 提交于
      Impact: micro-optimization, memory reduction
      
      On x86_64 flush tlb data is stored in per_cpu variables. This is
      unnecessary because only the first NUM_INVALIDATE_TLB_VECTORS entries
      are accessed.
      
      This patch aims at making the code less confusing (there's nothing
      really "per_cpu") by using a plain array. It also would save some memory
      on most distros out there (Ubuntu x86_64 has NR_CPUS=64 by default).
      
      [ Ravikiran G Thirumalai also pointed out that the correct alignment
        is ____cacheline_internodealigned_in_smp, so that there's no
        bouncing on vsmp. ]
      Signed-off-by: NFrederik Deweerdt <frederik.deweerdt@xprog.eu>
      Acked-by: NRavikiran Thirumalai <kiran@scalex86.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      09b3ec73
    • V
      x86 PAT: remove CPA WARN_ON for zero pte · 58dab916
      venkatesh.pallipadi@intel.com 提交于
      Impact: reduce scope of debug check - avoid warnings
      
      The logic to find whether identity map exists or not using
      high_memory or max_low_pfn_mapped/max_pfn_mapped are not complete
      as the memory withing the range may not be mapped if there is a
      unusable hole in e820.
      
      Specifically, on my test system I started seeing these warnings with
      tools like hwinfo, acpidump trying to map ACPI region.
      
      [   27.400018] ------------[ cut here ]------------
      [   27.400344] WARNING: at /home/venkip/src/linus/linux-2.6/arch/x86/mm/pageattr.c:560 __change_page_attr_set_clr+0xf3/0x8b8()
      [   27.400821] Hardware name: X7DB8
      [   27.401070] CPA: called for zero pte. vaddr = ffff8800cff6a000 cpa->vaddr = ffff8800cff6a000
      [   27.401569] Modules linked in:
      [   27.401882] Pid: 4913, comm: dmidecode Not tainted 2.6.28-05716-gfe0bdec6 #586
      [   27.402141] Call Trace:
      [   27.402488]  [<ffffffff80237c21>] warn_slowpath+0xd3/0x10f
      [   27.402749]  [<ffffffff80274ade>] ? find_get_page+0xb3/0xc9
      [   27.403028]  [<ffffffff80274a2b>] ? find_get_page+0x0/0xc9
      [   27.403333]  [<ffffffff80226425>] __change_page_attr_set_clr+0xf3/0x8b8
      [   27.403628]  [<ffffffff8028ec99>] ? __purge_vmap_area_lazy+0x192/0x1a1
      [   27.403883]  [<ffffffff8028eb52>] ? __purge_vmap_area_lazy+0x4b/0x1a1
      [   27.404172]  [<ffffffff80290268>] ? vm_unmap_aliases+0x1ab/0x1bb
      [   27.404512]  [<ffffffff80290105>] ? vm_unmap_aliases+0x48/0x1bb
      [   27.404766]  [<ffffffff80226d28>] change_page_attr_set_clr+0x13e/0x2e6
      [   27.405026]  [<ffffffff80698fa7>] ? _spin_unlock+0x26/0x2a
      [   27.405292]  [<ffffffff80227e6a>] ? reserve_memtype+0x19b/0x4e3
      [   27.405590]  [<ffffffff80226ffd>] _set_memory_wb+0x22/0x24
      [   27.405844]  [<ffffffff80225d28>] ioremap_change_attr+0x26/0x28
      [   27.406097]  [<ffffffff80228355>] reserve_pfn_range+0x1a3/0x235
      [   27.406427]  [<ffffffff80228430>] track_pfn_vma_new+0x49/0xb3
      [   27.406686]  [<ffffffff80286c46>] remap_pfn_range+0x94/0x32c
      [   27.406940]  [<ffffffff8022878d>] ? phys_mem_access_prot_allowed+0xb5/0x1a8
      [   27.407209]  [<ffffffff803e9bf4>] mmap_mem+0x75/0x9d
      [   27.407523]  [<ffffffff8028b3b4>] mmap_region+0x2cf/0x53e
      [   27.407776]  [<ffffffff8028b8cc>] do_mmap_pgoff+0x2a9/0x30d
      [   27.408034]  [<ffffffff8020f4a4>] sys_mmap+0x92/0xce
      [   27.408339]  [<ffffffff8020b65b>] system_call_fastpath+0x16/0x1b
      [   27.408614] ---[ end trace 4b16ad70c09a602d ]---
      [   27.408871] dmidecode:4913 reserve_pfn_range ioremap_change_attr failed write-back for cff6a000-cff6b000
      
      This is wih track_pfn_vma_new trying to keep identity map in sync.
      The address cff6a000 is the ACPI region according to e820.
      
      [    0.000000] BIOS-provided physical RAM map:
      [    0.000000]  BIOS-e820: 0000000000000000 - 000000000009c000 (usable)
      [    0.000000]  BIOS-e820: 000000000009c000 - 00000000000a0000 (reserved)
      [    0.000000]  BIOS-e820: 00000000000cc000 - 00000000000d0000 (reserved)
      [    0.000000]  BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
      [    0.000000]  BIOS-e820: 0000000000100000 - 00000000cff60000 (usable)
      [    0.000000]  BIOS-e820: 00000000cff60000 - 00000000cff69000 (ACPI data)
      [    0.000000]  BIOS-e820: 00000000cff69000 - 00000000cff80000 (ACPI NVS)
      [    0.000000]  BIOS-e820: 00000000cff80000 - 00000000d0000000 (reserved)
      [    0.000000]  BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
      [    0.000000]  BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
      [    0.000000]  BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
      [    0.000000]  BIOS-e820: 00000000ff000000 - 0000000100000000 (reserved)
      [    0.000000]  BIOS-e820: 0000000100000000 - 0000000230000000 (usable)
      
      And is not mapped as per init_memory_mapping.
      
      [    0.000000] init_memory_mapping: 0000000000000000-00000000cff60000
      [    0.000000] init_memory_mapping: 0000000100000000-0000000230000000
      
      We can add logic to check for this. But, there can also be other holes in
      identity map when we have 1GB of aligned reserved space in e820.
      
      This patch handles it by removing the WARN_ON and returning a specific
      error value (EFAULT) to indicate that the address does not have any
      identity mapping.
      
      The code that tries to keep identity map in sync can ignore
      this error, with other callers of cpa still getting error here.
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      58dab916
    • V
      x86 PAT: return compatible mapping to remap_pfn_range callers · cdecff68
      venkatesh.pallipadi@intel.com 提交于
      Impact: avoid warning message, potentially solve 3D performance regression
      
      Change x86 PAT code to return compatible memtype if the exact memtype that
      was requested in remap_pfn_rage and friends is not available due to some
      conflict.
      
      This is done by returning the compatible type in pgprot parameter of
      track_pfn_vma_new(), and the caller uses that memtype for page table.
      
      Note that track_pfn_vma_copy() which is basically called during fork gets the
      prot from existing page table and should not have any conflict. Hence we use
      strict memtype check there and do not allow compatible memtypes.
      
      This patch fixes the bug reported here:
      
        http://marc.info/?l=linux-kernel&m=123108883716357&w=2
      
      Specifically the error message:
      
        X:5010 map pfn expected mapping type write-back for d0000000-d0101000,
        got write-combining
      
      Should go away.
      Reported-and-bisected-by: NKevin Winchester <kjwinchester@gmail.com>
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cdecff68
    • V
      x86 PAT: change track_pfn_vma_new to take pgprot_t pointer param · e4b866ed
      venkatesh.pallipadi@intel.com 提交于
      Impact: cleanup
      
      Change the protection parameter for track_pfn_vma_new() into a pgprot_t pointer.
      Subsequent patch changes the x86 PAT handling to return a compatible
      memtype in pgprot_t, if what was requested cannot be allowed due to conflicts.
      No fuctionality change in this patch.
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e4b866ed
    • V
      x86 PAT: consolidate old memtype new memtype check into a function · afc7d20c
      venkatesh.pallipadi@intel.com 提交于
      Impact: cleanup
      
      Move the new memtype old memtype allowed check to header so that is can be
      shared by other users. Subsequent patch uses this in pat.c in remap_pfn_range()
      code path. No functionality change in this patch.
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      afc7d20c
  7. 13 1月, 2009 5 次提交
  8. 10 1月, 2009 2 次提交
    • L
      x86: make 'constant_test_bit()' take an unsigned bit number · c4295fbb
      Linus Torvalds 提交于
      Ingo noticed that using signed arithmetic seems to confuse the gcc
      inliner, and make it potentially decide that it's all too complicated.
      
      (Yeah, yeah, it's a constant. It's always positive. Still..)
      
      Based-on: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c4295fbb
    • A
      x86: only scan the root bus in early PCI quirks · 8659c406
      Andi Kleen 提交于
      We found a situation on Linus' machine that the Nvidia timer quirk hit on
      a Intel chipset system.  The problem is that the system has a fancy Nvidia
      card with an own PCI bridge, and the early-quirks code looking for any
      NVidia bridge triggered on it incorrectly.  This didn't lead a boot
      failure by luck, but the timer routing code selecting the wrong timer
      first and some ugly messages.  It might lead to real problems on other
      systems.
      
      I checked all the devices which are currently checked for by early_quirks
      and it turns out they are all located in the root bus zero.
      
      So change the early-quirks loop to only scan bus 0.  This incidently also
      saves quite some unnecessary scanning work, because early_quirks doesn't
      go through all the non root busses.
      
      The graphics card is not on bus 0, so it is not matched anymore.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8659c406
  9. 09 1月, 2009 3 次提交
  10. 08 1月, 2009 18 次提交