1. 08 Nov, 2008 1 commit
    • sched: improve sched_clock() performance · 0d12cdd5
      Ingo Molnar committed
      in scheduler-intense workloads native_read_tsc() overhead accounts for
      20% of the system overhead:
      
       659567 system_call                              41222.9375
       686796 schedule                                 435.7843
       718382 __switch_to                              665.1685
       823875 switch_mm                                4526.7857
       1883122 native_read_tsc                          55385.9412
       9761990 total                                      2.8468
      
      this is in large part due to the rdtsc_barrier() that is done before
      and after reading the TSC.
      
      But sched_clock() is not a precise clock in the GTOD sense, so using
      such barriers is completely pointless. So remove the barriers and only
      use them in vget_cycles().
      
      This improves lat_ctx performance by about 5%.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
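The trade-off described in this commit can be sketched in user space. The helper names below are hypothetical, and the use of LFENCE as the barrier is an assumption made for illustration; the kernel's rdtsc_barrier() is not reproduced here. The sketch is x86-only and relies on the compiler intrinsics `__rdtsc()` and `_mm_lfence()`.

```c
#include <stdint.h>
#include <x86intrin.h>	/* __rdtsc() and _mm_lfence(); x86 compilers only */

/*
 * Relaxed read: a bare RDTSC. The CPU may execute it out of order with
 * respect to neighbouring instructions, which is acceptable for a
 * scheduler-style clock that does not need GTOD precision.
 */
static inline uint64_t tsc_read_relaxed(void)
{
	return __rdtsc();
}

/*
 * Ordered read: a fence before and after RDTSC keeps the read from being
 * reordered around surrounding instructions, at the cost of extra cycles
 * on every call. (Assumption: LFENCE is used as the fence here.)
 */
static inline uint64_t tsc_read_ordered(void)
{
	uint64_t t;

	_mm_lfence();
	t = __rdtsc();
	_mm_lfence();
	return t;
}
```

A hot path would call tsc_read_relaxed(), while a caller that needs a precise timestamp would call tsc_read_ordered().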
  2. 06 Nov, 2008 3 commits
    • x86: remove VISWS and PARAVIRT around NR_IRQS puzzle · 7db282fa
      Yinghai Lu committed
      Impact: fix warning message when PARAVIRT is set in config
      
      Remove stale #ifdef components from our IRQ sizing logic.
      x86/Voyager is the only holdout.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: size NR_IRQS on 32-bit systems the same way as 64-bit · 1b489768
      Yinghai Lu committed
      Impact: make NR_IRQS big enough for system with lots of apic/pins
      
      If lots of IO_APIC's are there (or can be there), size the same way
      as 64-bit, depending on MAX_IO_APICS and NR_CPUS.
      
      This fixes the boot problem reported by Ben Hutchings on a 32-bit
      server with 5 IO-APICs and 240 IO-APIC pins.
      Signed-off-by: Yinghai <yinghai@kernel.org>
      Tested-by: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
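The commit message only says that the size depends on MAX_IO_APICS and NR_CPUS. As a rough sketch, assume a rule of the form "vector count plus 32 IRQs per unit, where the unit count is the smaller of the two limits". Both that formula and every constant below are assumptions for illustration, not values taken from the commit.

```c
/* Illustrative constants; not taken from the commit. */
#define SKETCH_NR_VECTORS	256
#define SKETCH_IRQS_PER_UNIT	32

/*
 * Hypothetical sizing rule: scale the IRQ table with whichever limit
 * (CPUs or IO-APICs) is smaller, on top of the base vector count.
 */
static int sketch_nr_irqs(int nr_cpus, int max_io_apics)
{
	int units = nr_cpus < max_io_apics ? nr_cpus : max_io_apics;

	return SKETCH_NR_VECTORS + SKETCH_IRQS_PER_UNIT * units;
}
```

Under these assumed numbers, a configuration with 8 CPUs and room for 64 IO-APICs yields 512 IRQs, enough to cover the 240 IO-APIC pins mentioned in the report.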
    • sched: re-tune balancing · 9fcd18c9
      Ingo Molnar committed
      Impact: improve wakeup affinity on NUMA systems, tweak SMP systems
      
      Given the fixes+tweaks to the wakeup-buddy code, re-tweak the domain
      balancing defaults on NUMA and SMP systems.
      
      Turn on SD_WAKE_AFFINE which was off on x86 NUMA - there's no reason
      why we would not want to have wakeup affinity across nodes as well.
      (we already do this in the standard NUMA template.)
      
      lat_ctx on a NUMA box is particularly happy about this change:
      
      before:
      
       |   phoenix:~/l> ./lat_ctx -s 0 2
       |   "size=0k ovr=2.60
       |   2 5.70
      
      after:
      
       |   phoenix:~/l> ./lat_ctx -s 0 2
       |   "size=0k ovr=2.65
       |   2 2.07
      
      a 2.75x speedup.
      
      pipe-test is similarly happy about it too:
      
       |  phoenix:~/sched-tests> ./pipe-test
       |   18.26 usecs/loop.
       |   14.70 usecs/loop.
       |   14.38 usecs/loop.
       |   10.55 usecs/loop.              # +WAKE_AFFINE on domain0+domain1
       |   8.63 usecs/loop.
       |   8.59 usecs/loop.
       |   9.03 usecs/loop.
       |   8.94 usecs/loop.
       |   8.96 usecs/loop.
       |   8.63 usecs/loop.
      
      Also:
      
       - disable SD_BALANCE_NEWIDLE on NUMA and SMP domains (keep it for siblings)
       - enable SD_WAKE_BALANCE on SMP domains
      
      Sysbench+postgresql improves across the board, quite significantly:
      
                 .28-rc3-11474e2c  .28-rc3-11474e2c-tune
      -------------------------------------------------
          1:             571              688    +17.08%
          2:            1236             1206    -2.55%
          4:            2381             2642    +9.89%
          8:            4958             5164    +3.99%
         16:            9580             9574    -0.07%
         32:            7128             8118    +12.20%
         64:            7342             8266    +11.18%
        128:            7342             8064    +8.95%
        256:            7519             7884    +4.62%
        512:            7350             7731    +4.93%
      -------------------------------------------------
        SUM:           55412            59341    +6.62%
      
      So it's a win both for the runup portion, the peak area and the tail.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
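The headline figures in this commit can be re-derived from the raw numbers it quotes. The sketch below uses hypothetical helper names. It assumes the percentage column is computed relative to the tuned result, which reproduces the SUM row; individual rows may differ in the last digit, presumably because the printed inputs are rounded.

```c
/*
 * Gain of `after` over `before`, as a percentage of `after`.
 * Assumption: this is the convention behind the table's last column.
 */
static double gain_pct_of_after(double before, double after)
{
	return (after - before) / after * 100.0;
}

/* Speedup factor for a latency-style metric, where lower is better. */
static double latency_speedup(double before, double after)
{
	return before / after;
}
```

With the numbers above, latency_speedup(5.70, 2.07) is about 2.75, matching the quoted lat_ctx speedup, and gain_pct_of_after(55412, 59341) is about 6.62, matching the SUM row.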
  3. 03 Nov, 2008 1 commit
  4. 31 Oct, 2008 6 commits
  5. 30 Oct, 2008 1 commit
  6. 29 Oct, 2008 1 commit
    • x86: start annotating early ioremap pointers with __iomem · 1d6cf1fe
      Harvey Harrison committed
      Impact: some new sparse warnings in e820.c etc, but no functional change.
      
      As with regular ioremap, iounmap etc, annotate with __iomem.
      
      Fixes the following sparse warnings, will produce some new ones
      elsewhere in arch/x86 that will get worked out over time.
      
      arch/x86/mm/ioremap.c:402:9: warning: cast removes address space of expression
      arch/x86/mm/ioremap.c:406:10: warning: cast adds address space to expression (<asn:2>)
      arch/x86/mm/ioremap.c:782:19: warning: Using plain integer as NULL pointer
      Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  7. 28 Oct, 2008 2 commits
  8. 25 Oct, 2008 1 commit
  9. 24 Oct, 2008 1 commit
    • x86: use GFP_DMA for 24bit coherent_dma_mask · 75bebb7f
      FUJITA Tomonori committed
      dma_alloc_coherent (include/asm-x86/dma-mapping.h) first tries an
      allocation without GFP_DMA; if the allocated address does not fit the
      device's coherent_dma_mask, it retries with GFP_DMA. This spares the
      precious GFP_DMA zone where possible, and it is also how the old
      dma_alloc_coherent (arch/x86/kernel/pci-dma.c) worked.
      
      However, if the coherent_dma_mask of a device is 24-bit, there is no
      point in going through the above GFP_DMA retry mechanism. We had better
      use GFP_DMA in the first place.
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Tested-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
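The decision described in this commit can be sketched as a small user-space function. The flag values and names below are hypothetical stand-ins, not the kernel's definitions, and the function only models the "24-bit mask goes straight to GFP_DMA" rule rather than the full allocation path.

```c
#include <stdint.h>

/* Hypothetical stand-ins for kernel flags; values are illustrative only. */
#define SKETCH_GFP_KERNEL	0x1u
#define SKETCH_GFP_DMA		0x2u
#define SKETCH_DMA_24BIT_MASK	0x0000000000ffffffULL

/*
 * If the device can only address 24 bits, request the DMA zone up front
 * instead of allocating elsewhere first and retrying on a mask mismatch.
 * Wider masks keep the caller's flags unchanged.
 */
static unsigned int sketch_coherent_gfp(uint64_t coherent_dma_mask,
					unsigned int gfp)
{
	if (coherent_dma_mask <= SKETCH_DMA_24BIT_MASK)
		gfp |= SKETCH_GFP_DMA;
	return gfp;
}
```

A device with a 24-bit mask gets SKETCH_GFP_DMA set on the first attempt, while a device with a 32-bit mask does not and would still go through the retry path described above.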
  10. 23 Oct, 2008 4 commits