  1. 27 Jan, 2009 1 commit
  2. 23 Jan, 2009 12 commits
  3. 21 Jan, 2009 19 commits
  4. 20 Jan, 2009 8 commits
    • x86: optimise x86's do_page_fault (C entry point for the page fault path) · 92181f19
      Committed by Nick Piggin
      Impact: cleanup, restructure code to improve assembly
      
      gcc isn't _all_ that smart about spilling registers to stack or reusing
      stack slots, even with branch annotations. do_page_fault contained a lot
      of functionality, so split unlikely paths into their own functions, and
      mark them as noinline just to be sure. I consider this actually to be
      somewhat of a cleanup too: the main function now contains about half
      the number of lines so the normal path is easier to read, while the error
      cases are also nicely split away.
      
      Also, ensure the order of arguments to functions is always the same: regs,
      addr, error_code. This can reduce code size a tiny bit, and just looks neater
      too.
      
      And add a couple of branch annotations.
      
      Before:
        do_page_fault:
                subq    $360, %rsp      #,
      
      After:
        do_page_fault:
                subq    $56, %rsp       #,
      
      bloat-o-meter:
        add/remove: 8/0 grow/shrink: 0/1 up/down: 2222/-1680 (542)
        function                                     old     new   delta
        __bad_area_nosemaphore                         -     506    +506
        no_context                                     -     474    +474
        vmalloc_fault                                  -     424    +424
        spurious_fault                                 -     358    +358
        mm_fault_error                                 -     272    +272
        bad_area_access_error                          -      89     +89
        bad_area                                       -      89     +89
        bad_area_nosemaphore                           -      10     +10
        do_page_fault                               2464     784   -1680
      
      Yes, the total size increases by 542 bytes, due to the extra function calls.
      But these will very rarely be called (except for vmalloc_fault) in a normal
      workload. Importantly, do_page_fault is less than 1/3rd its original size,
      and touches far less stack.
      
      Existing gotos and branch hints did move a lot of the infrequently used text
      out of the fastpath, but that's even further improved after this patch.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
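The restructuring described in this commit message can be illustrated with a minimal userspace sketch. It is not the kernel code: `handle_request` and `handle_bad_request` are hypothetical names, and the `likely`/`unlikely`/`noinline` macros are local stand-ins defined with GCC/Clang builtins. The idea shown is that the rare path, with its large stack buffer, is moved into a separate non-inlined function so the common path keeps a small stack frame, and both functions take their arguments in the same order.

```c
#include <assert.h>
#include <stdio.h>

/* Local stand-ins for the kernel's macros (assumes GCC or Clang). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
#define noinline    __attribute__((noinline))

/*
 * Hypothetical rare path. The 256-byte buffer is allocated in this
 * function's own frame, so it costs nothing unless the error occurs.
 */
static noinline int handle_bad_request(unsigned long addr, int error_code)
{
	char msg[256];

	assert(error_code != 0);
	snprintf(msg, sizeof(msg), "bad request %d at %#lx", error_code, addr);
	fputs(msg, stderr);
	fputc('\n', stderr);
	return -1;
}

/*
 * Hypothetical hot path. Arguments are passed in the same order as in
 * the helper, so no register shuffling is needed before the call.
 */
int handle_request(unsigned long addr, int error_code)
{
	if (unlikely(error_code != 0))
		return handle_bad_request(addr, error_code);

	return 0;
}
```

Compiling with `gcc -O2 -S` and comparing the `sub ..., %rsp` instruction in `handle_request` with and without `noinline` is one way to observe the effect on the stack frame; the exact numbers depend on the compiler version.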
    • Merge commit 'v2.6.29-rc2' into x86/mm · 0ce1c383
      Committed by Ingo Molnar
    • x86, cpumask: fix tlb flush race · 5766b842
      Committed by Ingo Molnar
      Impact: fix bootup crash
      
      The cpumask is now passed in as a reference to mm->cpu_vm_mask, not on
      the stack - hence it is not constant anymore during the TLB flush.
      
      That way it could race and some static sanity checks would trigger:
      
      [  238.154287] ------------[ cut here ]------------
      [  238.156039] kernel BUG at arch/x86/kernel/tlb_32.c:130!
      [  238.156039] invalid opcode: 0000 [#1] SMP
      [  238.156039] last sysfs file: /sys/class/net/eth2/address
      [  238.156039] Modules linked in:
      [  238.156039]
      [  238.156039] Pid: 6493, comm: ifup-eth Not tainted (2.6.29-rc2-tip #1) P4DC6
      [  238.156039] EIP: 0060:[<c0118f87>] EFLAGS: 00010202 CPU: 2
      [  238.156039] EIP is at native_flush_tlb_others+0x35/0x158
      [  238.156039] EAX: c0ef972c EBX: f6143301 ECX: 00000000 EDX: 00000000
      [  238.156039] ESI: f61433a8 EDI: f6143200 EBP: f34f3e00 ESP: f34f3df0
      [  238.156039]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
      [  238.156039] Process ifup-eth (pid: 6493, ti=f34f2000 task=f399ab00 task.ti=f34f2000)
      [  238.156039] Stack:
      [  238.156039]  ffffffff f61433a8 ffffffff f6143200 f34f3e18 c0118e9c 00000000 f6143200
      [  238.156039]  f61433a8 f5bec738 f34f3e28 c0119435 c2b5b830 f6143200 f34f3e34 c01c2dc3
      [  238.156039]  bffd9000 f34f3e60 c01c3051 00000000 ffffffff f34f3e4c 00000000 00000071
      [  238.156039] Call Trace:
      [  238.156039]  [<c0118e9c>] ? flush_tlb_others+0x52/0x5b
      [  238.156039]  [<c0119435>] ? flush_tlb_mm+0x7f/0x8b
      [  238.156039]  [<c01c2dc3>] ? tlb_finish_mmu+0x2d/0x55
      [  238.156039]  [<c01c3051>] ? exit_mmap+0x124/0x170
      [  238.156039]  [<c013e965>] ? mmput+0x40/0xf5
      [  238.156039]  [<c01e4788>] ? flush_old_exec+0x640/0x94b
      [  238.156039]  [<c01ddb4e>] ? fsnotify_access+0x37/0x39
      [  238.156039]  [<c01e3435>] ? kernel_read+0x39/0x4b
      [  238.156039]  [<c021bc8a>] ? load_elf_binary+0x4a1/0x11bb
      [  238.156039]  [<c01c0af9>] ? might_fault+0x51/0x9c
      [  238.156039]  [<c010a2cc>] ? paravirt_read_tsc+0x20/0x4f
      [  238.156039]  [<c010a406>] ? native_sched_clock+0x5d/0x60
      [  238.156039]  [<c01e2fda>] ? search_binary_handler+0xab/0x2c4
      [  238.156039]  [<c021b7e9>] ? load_elf_binary+0x0/0x11bb
      [  238.156039]  [<c04ae9a5>] ? _raw_read_unlock+0x21/0x46
      [  238.156039]  [<c021b7e9>] ? load_elf_binary+0x0/0x11bb
      [  238.156039]  [<c01e2fe1>] ? search_binary_handler+0xb2/0x2c4
      [  238.156039]  [<c01e4076>] ? do_execve+0x21c/0x2ee
      [  238.156039]  [<c01029b7>] ? sys_execve+0x51/0x8c
      [  238.156039]  [<c0103eaf>] ? sysenter_do_call+0x12/0x43
      
      Fix it by not assuming that the cpumask is constant.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
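The general hazard described in this commit message, a mask that other CPUs can change while it is being examined, can be sketched in userspace C. This is not the kernel fix itself: `shared_cpu_mask` and `remote_cpus_to_flush` are hypothetical names, and a 64-bit atomic integer stands in for a cpumask. The sketch reads the shared mask exactly once into a local snapshot and treats an empty result as a normal outcome rather than asserting on it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared mask, one bit per CPU; other threads may update it. */
static _Atomic uint64_t shared_cpu_mask;

/*
 * Return the set of remote CPUs that should be notified.
 *
 * The shared mask is loaded once. All later decisions use the local
 * snapshot, so a concurrent update cannot make two checks disagree.
 * The result may be 0 if every other CPU has already left the mask;
 * callers must handle that case instead of treating it as a bug.
 */
uint64_t remote_cpus_to_flush(unsigned int self)
{
	uint64_t snapshot;

	assert(self < 64);
	snapshot = atomic_load(&shared_cpu_mask);

	/* Exclude the calling CPU from the snapshot. */
	return snapshot & ~(UINT64_C(1) << self);
}
```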
    • linker script: kill PERCPU_VADDR_PREALLOC() · 6b7c38d5
      Committed by Tejun Heo
      Impact: cleanup
      
      With .data.percpu.first in place, PERCPU_VADDR_PREALLOC() is no longer
      necessary.  Kill it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • x86: remove pda.h · 0d974d45
      Committed by Brian Gerst
      Impact: cleanup
      Signed-off-by: Brian Gerst <brgerst@gmail.com>
    • x86: move stack_canary into irq_stack · 947e76cd
      Committed by Brian Gerst
      Impact: x86_64 percpu area layout change, irq_stack now at the beginning
      
      Now that the PDA is empty except for the stack canary, it can be removed.
      The irqstack is moved to the start of the per-cpu section.  If the stack
      protector is enabled, the canary overlaps the bottom 48 bytes of the irqstack.
      
      tj: * updated subject
          * dropped asm relocation of irq_stack_ptr
          * updated comments a bit
          * rebased on top of stack canary changes
      Signed-off-by: Brian Gerst <brgerst@gmail.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
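The 48-byte overlap mentioned in this commit message can be made concrete with a small layout sketch. It assumes the x86_64 GCC stack-protector convention of reading the canary from offset 40 of the `%gs` segment, so 40 bytes of padding plus an 8-byte canary cover the first 48 bytes of the area. The union below is modeled on that idea and is not copied from the patch; the `IRQ_STACK_SIZE` value is an illustrative assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative size only; the real value is set by the kernel config. */
#define IRQ_STACK_SIZE 16384

/*
 * The IRQ stack and the canary share the same storage. Because the
 * stack grows down from the top of the array, the lowest 48 bytes are
 * the last to be used and can hold the canary at the fixed offset.
 */
union irq_stack_union {
	char irq_stack[IRQ_STACK_SIZE];
	struct {
		char     gs_base[40];   /* padding up to the canary offset */
		uint64_t stack_canary;  /* sits at offset 40 */
	};
};

/* Compile-time checks of the layout described above (C11). */
static_assert(offsetof(union irq_stack_union, stack_canary) == 40,
	      "canary must sit at offset 40");
static_assert(sizeof(union irq_stack_union) == IRQ_STACK_SIZE,
	      "overlay must not enlarge the stack area");
```

Placing this union at the very start of the per-cpu section is what lets a `%gs`-relative access at offset 40 land on the canary without a separate PDA structure.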
    • x86: rework __per_cpu_load adjustments · 8c7e58e6
      Committed by Brian Gerst
      Impact: cleanup
      
      Use cpu_number to determine if the adjustment is necessary.
      Signed-off-by: Brian Gerst <brgerst@gmail.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>