1. 23 Mar, 2006 10 commits
    • M
      [PATCH] x86: Make _syscallX() macros compile in PIC mode · aeefc956
      Markus Gutschke committed
      GCC reserves %ebx when compiling position-independent code on i386.  This
      means the _syscallX() macros in include/asm-i386/unistd.h will not
      compile.  This patch changes the existing macros to take special care to
      preserve %ebx.

      The bug can be tracked at http://bugzilla.kernel.org/show_bug.cgi?id=6204
      Signed-off-by: Markus Gutschke <markus@google.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • C
      [PATCH] i386 spinlocks: disable interrupts only if we enabled them · 42c059e0
      Chuck Ebbert committed
      _raw_spin_lock_flags() is entered with interrupts disabled.  If it cannot
      obtain a spinlock, it checks the flags that were passed and re-enables
      interrupts before spinning if that's how the flags are set.  When the
      spinlock might be available, it disables interrupts (even if they are
      already disabled) before trying to get the lock.  Change that so interrupts
      are only disabled if they have been enabled.  This costs nine bytes of
      duplicated spinloop code.
      
      Fastpath before patch:
              jle <keep looping>      not-taken conditional jump
              cli                     disable interrupts
              jmp <try for lock>      unconditional jump
      
      Fastpath after patch, if interrupts were not enabled:
              jg <try for lock>       taken conditional branch
      Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • J
      [PATCH] Fix the implicit declaration of mtrr_centaur_report_mcr in arch/i386/kernel/cpu/centaur.c · 52f4a91a
      Jesper Juhl committed
      arch/i386/kernel/cpu/centaur.c: In function `centaur_mcr_insert':
      arch/i386/kernel/cpu/centaur.c:33: warning: implicit declaration of function `mtrr_centaur_report_mcr'
      Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • J
      [PATCH] i386: fix uses of user_mode() vs. user_mode_vm() · db753bdf
      Jan Beulich committed
      >commit 76381fee
      >Author: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
      >Date:   Thu Jun 23 00:08:46 2005 -0700
      >
      >    [PATCH] xen: x86_64: use more usermode macro
      >
      >    Make use of the user_mode macro where it's possible.  This is useful for Xen
      >    because it will need only to redefine only the macro to a hypervisor call.
      
      I am of the opinion that the above changeset is incomplete, i.e.  it missed
      converting some previous uses of user_mode to user_mode_vm.  While most of
      them could be considered just cosmetic, at least the one in die_nmi
      doesn't appear to be.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Cc: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: James Bottomley <James.Bottomley@steeleye.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • J
      [PATCH] i386: actively synchronize vmalloc area when registering certain callbacks · 101f12af
      Jan Beulich committed
      Registering a callback handler through register_die_notifier() is obviously
      primarily intended for use by modules.  However, the way these currently
      get called makes it basically impossible for them to actually be used by
      modules, as there is, on non-PAE configurations, a good chance (the larger
      the module, the better) for the system to crash as a result.

      This is because the callback gets invoked

      (a) in the page fault path before the top level page table propagation
          gets carried out (hence a fault to propagate the top level page table
          entry/entries mapping to module's code/data would nest infinitely) and

      (b) in the NMI path, where nested faults must absolutely not happen,
          since otherwise the IRET from the nested fault re-enables NMIs,
          potentially resulting in nested NMI occurrences.

      Besides the modular aspect, similar problems would even arise for
      in-kernel consumers of the API if they touched ioremap()ed or vmalloc()ed
      memory inside their handlers.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • S
      [PATCH] x86: early printk handling fixes · 99b7de33
      Stas Sergeev committed
      The history is that -mm kernels have not worked for me for a few months
      now.  It started with crashes somewhere after starting init, and for the
      last month there has been no boot at all, just "Uncompressing...  OK,
      booting kernel", and silence.  The early console didn't work either.  With
      the latest releases this degraded into an infinite stream of "Unknown
      interrupt or fault" messages.  So today my patience ran out and I started
      to think about how I could collect at least some info for the bug report.
      Attached is a patch that allows gathering some valuable debug info on the
      problem by making the early console more usable.  I can't properly test
      the patch, as the kernel still doesn't boot, so I'll explain it in detail
      in the hope that someone else can justify the intrusive changes.

      arch_hooks.h: added prototypes for setup_early_printk() and early_printk().

      setup.c: killed the wrong setup_early_printk() prototype.  Moved
      setup_early_printk() a bit earlier, as it was not "early enough" to cover
      the bug I was fighting with.

      early_printk.c: made it start printing from the bottom of the screen,
      otherwise the messages interfere with those of the boot loader, so you
      can't read them.
      Signed-off-by: Stas Sergeev <stsp@aknet.ru>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Zwane Mwaikambo <zwane@arm.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • C
      [PATCH] i386: remove duplicate declaration of mp_bus_id_to_pci_bus · 7c63ee5c
      Chris Wright committed
      mp_bus_id_to_pci_bus is declared identically twice.
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • N
      [PATCH] Compilation fix for ES7000 when no ACPI is specified in config (i386) · e5428ede
      Natalie.Protasevich@unisys.com committed
      ES7000 platform code cleanup for compilation errors and a warning.
      Ifdef'd the ACPI-related parts in the ES7000 platform code.  They were
      causing compile errors in certain configurations (without ACPI defined).  I
      think this approach would be best (as opposed to Kconfig changes) since it
      only touches the subarch...

      Signed-off-by: <Natalie.Protasevich@unisys.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • E
      [PATCH] i386: Add a temporary to make put_user more type safe · 30e931d4
      Eric W. Biederman committed
      In some code I am developing I had occasion to change the type of a
      variable.  This made the value put_user was putting to user space wrong.
      But the code continued to build cleanly without errors.
      
      Introducing a temporary fixes this problem and at least with gcc-3.3.5 does
      not cause gcc any problems with optimizing out the temporary.  gcc-4.x
      using SSA internally ought to be even better at optimizing out temporaries,
      so I don't expect a temporary to become a problem.  Especially because in
      all correct cases the types on both sides of the assignment to the
      temporary are the same.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • G
      [PATCH] x86: SMP alternatives · 9a0b5817
      Gerd Hoffmann committed
      Implement SMP alternatives, i.e.  switching at runtime between different
      code versions for UP and SMP.  The code can patch both SMP->UP and UP->SMP.
      The UP->SMP case is useful for CPU hotplug.
      
      With CONFIG_CPU_HOTPLUG enabled the code switches to UP at boot time and
      when the number of CPUs goes down to 1, and switches to SMP when the number
      of CPUs goes up to 2.
      
      Without CONFIG_CPU_HOTPLUG or on non-SMP-capable systems the code is
      patched once at boot time (if needed) and the tables are released
      afterwards.
      
      The changes in detail:
      
        * The current alternatives bits are moved to a separate file,
          the SMP alternatives code is added there.
      
        * The patch adds some new ELF sections to the kernel:
          .smp_altinstructions
              like .altinstructions, also contains a list
              of alt_instr structs.
          .smp_altinstr_replacement
              like .altinstr_replacement, but also has some space to
              save the original instruction before replacing it.
          .smp_locks
              list of pointers to lock prefixes which can be nop'ed
              out on UP.
          The first two are used to replace more complex instruction
          sequences such as spinlocks and semaphores.  It would be possible
          to deal with the lock prefixes with that as well, but by handling
          them as a special case the table sizes become much smaller.

        * The sections are page-aligned and padded up to page size, so they
          can be freed if they are not needed.

        * Split the code to release init pages into a separate function and
          use it to release the ELF sections if they are unused.
      Signed-off-by: Gerd Hoffmann <kraxel@suse.de>
      Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  2. 22 Mar, 2006 2 commits
    • Z
      [PATCH] Enable mprotect on huge pages · 8f860591
      Zhang, Yanmin committed
      2.6.16-rc3 uses hugetlb on-demand paging, but it doesn't support hugetlb
      mprotect.
      
      From: David Gibson <david@gibson.dropbear.id.au>
      
        Remove a test from the mprotect() path which checks that the mprotect()ed
        range on a hugepage VMA is hugepage aligned (yes, really, the sense of
        is_aligned_hugepage_range() is the opposite of what you'd guess :-/).
      
        In fact, we don't need this test.  If the given addresses match the
        beginning/end of a hugepage VMA they must already be suitably aligned.  If
        they don't, then mprotect_fixup() will attempt to split the VMA.  The very
        first test in split_vma() will check for a badly aligned address on a
        hugepage VMA and return -EINVAL if necessary.
      
      From: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
      
        On i386 and x86-64, pte flag _PAGE_PSE collides with _PAGE_PROTNONE.  The
        identity of a hugetlb pte is lost when changing page protection via
        mprotect.  A page fault that occurs later will trigger a bug check in
        huge_pte_alloc().

        The fix is to always make the new pte a hugetlb pte and also to clean up
        legacy code from the pre-faulting days where _PAGE_PRESENT is forced on.
      Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • H
      [PATCH] don't call check_acpi_pci() on x86 with ACPI disabled · 152475cb
      Herbert Poetzl committed
      check_acpi_pci() is called from arch/i386/kernel/setup.c even if
      CONFIG_ACPI is not defined, but the code in include/asm/acpi.h doesn't
      provide it in this case.
      Signed-off-by: Herbert Pötzl <herbert@13thfloor.at>
      Cc: "Brown, Len" <len.brown@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  3. 09 Mar, 2006 1 commit
    • A
      [PATCH] i386: port ATI timer fix from x86_64 to i386 II · f9262c12
      Andi Kleen committed
      ATI chipsets tend to generate double timer interrupts for the local APIC
      timer when both the 8254 and the IO-APIC timer pins are enabled.  This is
      because they route it to both and the result is anded together and the CPU
      ends up processing it twice.
      
      This patch changes check_timer to disable the 8254 routing for interrupt 0.
      
      I think it would be safe on all chipsets actually (I tested it on a couple
      and it worked everywhere) and Windows seems to do it in a similar way, but
      to be conservative this patch only enables this mode on ATI (and adds
      options to enable/disable it too).
      
      Ported over from a similar x86-64 change.
      
      I reused the ACPI earlyquirk infrastructure for the ATI bridge check, but
      tweaked it a bit to work even without ACPI.
      
      Inspired by a patch from Chuck Ebbert, but redone.
      
      Cc: Chuck Ebbert <76306.1226@compuserve.com>
      Cc: "Brown, Len" <len.brown@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  4. 25 Feb, 2006 2 commits
  5. 18 Feb, 2006 1 commit
  6. 16 Feb, 2006 1 commit
  7. 15 Feb, 2006 2 commits
  8. 12 Feb, 2006 1 commit
    • U
      [PATCH] fstatat64 support · cff2b760
      Ulrich Drepper committed
      The *at patches introduced fstatat and, due to insufficient research, I
      used the newfstat functions generally as the guideline.  The result is that
      on 32-bit platforms we don't have all the information needed to implement
      fstatat64.

      This patch modifies the code to pass up 64-bit information if
      __ARCH_WANT_STAT64 is defined.  I renamed the syscall entry point to make
      this clear.  Other archs will continue to use the existing code.  On x86-64
      the compat code is implemented using a new sys32_ function.  This is what
      is done for the other stat syscalls as well.

      This patch might break some other archs (those which define
      __ARCH_WANT_STAT64 and which already wired up the syscall).  Yet others
      might need changes to accommodate the compatibility mode.  I really don't
      want to do that work because all this stat handling is a mess (more so in
      glibc, but the kernel is also affected).  It should be done by the arch
      maintainers.  I'll provide some stand-alone test shortly.  Those who are
      eager could compile glibc and run 'make check' (no installation needed).

      The patch below has been tested on x86 and x86-64.
      Signed-off-by: Ulrich Drepper <drepper@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  9. 08 Feb, 2006 1 commit
  10. 06 Feb, 2006 1 commit
  11. 04 Feb, 2006 1 commit
    • Z
      [PATCH] Export cpu topology in sysfs · 69dcc991
      Zhang, Yanmin committed
      The patch exports CPU topology through sysfs.
      
      Items (attributes) are similar to /proc/cpuinfo.
      
      1) /sys/devices/system/cpu/cpuX/topology/physical_package_id:
      	represent the physical package id of  cpu X;
      2) /sys/devices/system/cpu/cpuX/topology/core_id:
      	represent the cpu core id to cpu X;
      3) /sys/devices/system/cpu/cpuX/topology/thread_siblings:
      	represent the thread siblings to cpu X in the same core;
      4) /sys/devices/system/cpu/cpuX/topology/core_siblings:
      	represent the thread siblings to cpu X in the same physical package;
      
      To implement it in an architecture-neutral way, a new source file,
      drivers/base/topology.c, exports the 4 attributes.
      
      If one architecture wants to support this feature, it just needs to
      implement 4 defines, typically in file include/asm-XXX/topology.h.
      The 4 defines are:
      #define topology_physical_package_id(cpu)
      #define topology_core_id(cpu)
      #define topology_thread_siblings(cpu)
      #define topology_core_siblings(cpu)
      
      The type of **_id is int.
      The type of siblings is cpumask_t.
      
      To be consistent on all architectures, the 4 attributes should have
      default values if their values are unavailable. The rules are below.
      
      1) physical_package_id: If cpu has no physical package id, -1 is the
      default value.
      
      2) core_id: If cpu doesn't support multi-core, its core id is 0.
      
      3) thread_siblings: Just include itself, if the cpu doesn't support
      HT/multi-thread.
      
      4) core_siblings: Just include itself, if the cpu doesn't support
      multi-core and HT/Multi-thread.
      
      So be careful when declaring the 4 defines in include/asm-XXX/topology.h.
      
      If an attribute isn't defined on an architecture, it won't be exported.
      
      Thanks to Nathan, Greg, Andi, Paul and Venki.
      
      The patch provides defines for i386/x86_64/ia64.
      Signed-off-by: Zhang, Yanmin <yanmin.zhang@intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  12. 02 Feb, 2006 1 commit
    • M
      [PATCH] VMSPLIT config options · 975b3d3d
      Mark Lord committed
      Enable selection of different user/kernel VM splits for i386, including an
      optimized mode for 1GB physical RAM, which gives the kernel a direct (non
      HIGHMEM) mapping to the entire 1GB rather than just the first 896MB.
      
      There is a similarly optimized mode for machines with exactly 2GB
      of physical RAM.
      
      This can speed up the kernel by avoiding having to create/destroy temporary
      HIGHMEM mappings, and by not having to include HIGHMEM support at all on such
      machines.  The flip side is that there's less virtual addressing left for
      userspace in these alternatives, and some binary-only kernel modules may
      misbehave unless rebuilt with the same VMSPLIT option as the main kernel
      image.
      
      Original idea/patch from Jens Axboe, modified based on suggestions from Linus
      et al.
      Signed-off-by: Mark Lord <mlord@pobox.com>
      Signed-off-by: Jens Axboe <axboe@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  13. 19 Jan, 2006 6 commits
  14. 15 Jan, 2006 1 commit
  15. 13 Jan, 2006 6 commits
    • A
      [PATCH] death of get_thread_info/put_thread_info · f5a61d0c
      Al Viro committed
      {get,put}_thread_info() were introduced in 2.5.4 and never
      had been called by anything in the tree.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • A
      [PATCH] i386: task_stack_page() · 65e0fdff
      Al Viro committed
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • A
      [PATCH] i386: fix task_pt_regs() · 07b047fc
      akpm@osdl.org committed
      
      From: Al Viro <viro@ftp.linux.org.uk>
      
      task_pt_regs() needs the same offset-by-8 to match copy_thread()
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • A
      [PATCH] i386: task_thread_info() · 06b425d8
      Al Viro committed
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • A
      [PATCH] scheduler cache-hot-autodetect · 198e2f18
      akpm@osdl.org committed
      
      From: Ingo Molnar <mingo@elte.hu>
      
      This is the latest version of the scheduler cache-hot-auto-tune patch.
      
      The first problem was that detection time scaled with O(N^2), which is
      unacceptable on larger SMP and NUMA systems. To solve this:
      
      - I've added a 'domain distance' function, which is used to cache
        measurement results. Each distance is only measured once. This means
        that e.g. on NUMA distances of 0, 1 and 2 might be measured, on HT
        distances 0 and 1, and on SMP distance 0 is measured. The code walks
        the domain tree to determine the distance, so it automatically follows
        whatever hierarchy an architecture sets up. This cuts down on the boot
        time significantly and removes the O(N^2) limit. The only assumption
        is that migration costs can be expressed as a function of domain
        distance - this covers the overwhelming majority of existing systems,
        and is a good guess even for more asymmetric systems.

        [ People hacking systems that have asymmetries that break this
          assumption (e.g. different CPU speeds) should experiment a bit with
          the cpu_distance() function. Adding a ->migration_distance factor to
          the domain structure would be one possible solution - but let's first
          see the problem systems, if they exist at all. Let's not overdesign. ]
      
      Another problem was that only a single cache-size was used for measuring
      the cost of migration, and most architectures didn't set that variable
      up. Furthermore, a single cache-size does not fit NUMA hierarchies with
      L3 caches and does not fit HT setups, where different CPUs will often
      have different 'effective cache sizes'. To solve this problem:
      
      - Instead of relying on a single cache-size provided by the platform and
        sticking to it, the code now auto-detects the 'effective migration
        cost' between two measured CPUs, via iterating through a wide range of
        cachesizes. The code searches for the maximum migration cost, which
        occurs when the working set of the test-workload falls just below the
        'effective cache size'. I.e. real-life optimized search is done for
        the maximum migration cost, between two real CPUs.
      
        This, amongst other things, has the positive effect that if e.g. two
        CPUs share an L2/L3 cache, a different (and accurate) migration cost
        will be found than between two CPUs on the same system that don't share
        any caches.
      
      (The reliable measurement of migration costs is tricky - see the source
      for details.)
      
      Furthermore I've added various boot-time options to override/tune
      migration behavior.
      
      Firstly, there's a blanket override for autodetection:
      
      	migration_cost=1000,2000,3000
      
      will override the depth 0/1/2 values with 1msec/2msec/3msec values.
      
      Secondly, there's a global factor that can be used to increase (or
      decrease) the autodetected values:
      
      	migration_factor=120
      
      will increase the autodetected values by 20%. This option is useful to
      tune things in a workload-dependent way - e.g. if a workload is
      cache-insensitive then CPU utilization can be maximized by specifying
      migration_factor=0.
      
      I've tested the autodetection code quite extensively on x86, on 3
      P3/Xeon/2MB, and the autodetected values look pretty good:
      
      Dual Celeron (128K L2 cache):
      
       ---------------------
       migration cost matrix (max_cache_size: 131072, cpu: 467 MHz):
       ---------------------
                 [00]    [01]
       [00]:     -     1.7(1)
       [01]:   1.7(1)    -
       ---------------------
       cacheflush times [2]: 0.0 (0) 1.7 (1784008)
       ---------------------
      
      Here the slow memory subsystem dominates system performance, and even
      though caches are small, the migration cost is 1.7 msecs.
      
      Dual HT P4 (512K L2 cache):
      
       ---------------------
       migration cost matrix (max_cache_size: 524288, cpu: 2379 MHz):
       ---------------------
                 [00]    [01]    [02]    [03]
       [00]:     -     0.4(1)  0.0(0)  0.4(1)
       [01]:   0.4(1)    -     0.4(1)  0.0(0)
       [02]:   0.0(0)  0.4(1)    -     0.4(1)
       [03]:   0.4(1)  0.0(0)  0.4(1)    -
       ---------------------
       cacheflush times [2]: 0.0 (33900) 0.4 (448514)
       ---------------------
      
      Here it can be seen that there is no migration cost between two HT
      siblings (CPU#0/2 and CPU#1/3 are separate physical CPUs). A fast memory
      system makes inter-physical-CPU migration pretty cheap: 0.4 msecs.
      
      8-way P3/Xeon [2MB L2 cache]:
      
       ---------------------
       migration cost matrix (max_cache_size: 2097152, cpu: 700 MHz):
       ---------------------
                 [00]    [01]    [02]    [03]    [04]    [05]    [06]    [07]
       [00]:     -    19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
       [01]:  19.2(1)    -    19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
       [02]:  19.2(1) 19.2(1)    -    19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
       [03]:  19.2(1) 19.2(1) 19.2(1)    -    19.2(1) 19.2(1) 19.2(1) 19.2(1)
       [04]:  19.2(1) 19.2(1) 19.2(1) 19.2(1)    -    19.2(1) 19.2(1) 19.2(1)
       [05]:  19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)    -    19.2(1) 19.2(1)
       [06]:  19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)    -    19.2(1)
       [07]:  19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)    -
       ---------------------
       cacheflush times [2]: 0.0 (0) 19.2 (19281756)
       ---------------------
      
      This one has huge caches and a relatively slow memory subsystem - so the
      migration cost is 19 msecs.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Ashok Raj <ashok.raj@intel.com>
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Cc: <wilder@us.ibm.com>
      Signed-off-by: John Hawkes <hawkes@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • I
      [PATCH] sched: add cacheflush() asm · 4dc7a0bb
      Ingo Molnar committed
      Add per-arch sched_cacheflush() which is a write-back cacheflush used by
      the migration-cost calibration code at bootup time.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  16. 12 Jan, 2006 3 commits