1. 03 May, 2007 40 commits
    • [PATCH] i386: Relocate VDSO ELF headers to match mapped location with COMPAT_VDSO · d4f7a2c1
      Jeremy Fitzhardinge committed
      Some versions of libc can't deal with a VDSO which doesn't have its
      ELF headers matching its mapped address.  COMPAT_VDSO maps the VDSO at
      a specific system-wide fixed address.  Previously this was all done at
      build time, on the grounds that the fixed VDSO address is always at
      the top of the address space.  However, a hypervisor may reserve some
      of that address space, pushing the fixmap address down.
      
      This patch does the adjustment dynamically at runtime, depending on
      the runtime location of the VDSO fixmap.
      
      [ Patch has been through several hands: Jan Beulich wrote the original
        version; Zach reworked it, and Jeremy converted it to relocate phdrs
        as well as sections. ]
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: "Jan Beulich" <JBeulich@novell.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Roland McGrath <roland@redhat.com>
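The runtime adjustment described in the VDSO commit above can be pictured with a minimal sketch. It assumes a simplified stand-in for an ELF32 program header and hypothetical names (`sketch_phdr`, `sketch_relocate_phdrs`); it is not the kernel's code, and the addresses in any usage are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for an ELF32 program header (hypothetical type);
 * only the address fields that need relocating are modelled. */
struct sketch_phdr {
    uint32_t p_vaddr;   /* virtual address the segment claims to live at */
    uint32_t p_paddr;   /* physical address field, shifted alongside */
};

/* Shift every program header by the distance between the address the
 * image was linked for and the address it is actually mapped at. */
static void sketch_relocate_phdrs(struct sketch_phdr *phdrs, size_t count,
                                  uint32_t link_base, uint32_t runtime_base)
{
    /* Unsigned wraparound makes this work when runtime_base < link_base. */
    uint32_t delta = runtime_base - link_base;

    for (size_t i = 0; i < count; i++) {
        phdrs[i].p_vaddr += delta;
        phdrs[i].p_paddr += delta;
    }
}
```

The same delta would also have to be applied to section headers and any other absolute addresses in the image, which this sketch leaves out.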
    • [PATCH] i386: clean up identify_cpu · a6c4e076
      Jeremy Fitzhardinge committed
      identify_cpu() is used to identify both the boot CPU and secondary
      CPUs, but it performs some actions which only apply to the boot CPU.
      Those functions are therefore really __init functions, but because
      they're called by identify_cpu(), they must be marked __cpuinit.
      
      This patch splits identify_cpu() into identify_boot_cpu() and
      identify_secondary_cpu(), and calls the appropriate init functions
      from each.  Also, identify_boot_cpu() and all the functions it
      dominates are marked __init.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] i386: Clean up asm-i386/bugs.h · 1353ebb4
      Jeremy Fitzhardinge committed
      Most of asm-i386/bugs.h is code which should be in a C file, so put it there.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
    • [PATCH] x86-64: fix arithmetic in comment · bbf30a16
      Avi Kivity committed
      The xmm space on x86_64 is 256 bytes.
      Signed-off-by: Avi Kivity <avi@qumranet.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: Use X86_EFLAGS_IF in x86-64/irqflags.h. · 5d02d7ae
      Andi Kleen committed
      As per i386 patch: move X86_EFLAGS_IF et al out to a new header:
      processor-flags.h, so we can include it from irqflags.h and use it in
      raw_irqs_disabled_flags().
      
      As a side-effect, we could now use these flags in .S files.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86: fix amd64-agp aperture validation · b92e9fac
      Jan Beulich committed
      Under CONFIG_DISCONTIGMEM, assuming that a !pfn_valid() implies all
      subsequent pfn-s are also invalid is wrong. Thus replace this by
      explicitly checking against the E820 map.
      
      AK: make e820 on x86-64 not initdata
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Acked-by: Mark Langsdorf <mark.langsdorf@amd.com>
    • [PATCH] x86-64: Account for module percpu space separately from kernel percpu · b00742d3
      Jeremy Fitzhardinge committed
      Rather than using a single constant PERCPU_ENOUGH_ROOM, compute it as
      the sum of kernel_percpu + PERCPU_MODULE_RESERVE.  This is now common
      to all architectures; if an architecture wants to set
      PERCPU_ENOUGH_ROOM to something special, then it may do so (ia64 is
      the only one which does).
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Andi Kleen <ak@suse.de>
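The percpu accounting in the commit above amounts to a simple sum. A minimal sketch, assuming a made-up reserve size and hypothetical `sketch_` names (the real value and the way the kernel section size is obtained are not given in the commit message):

```c
#include <stddef.h>

/* Hypothetical reserve for module per-cpu data; the real value may differ. */
#define SKETCH_PERCPU_MODULE_RESERVE 8192

/* Total per-cpu area: what the kernel image itself needs plus a fixed
 * reserve set aside for modules loaded later. */
static size_t sketch_percpu_enough_room(size_t kernel_percpu)
{
    return kernel_percpu + SKETCH_PERCPU_MODULE_RESERVE;
}
```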
    • [PATCH] i386: Add machine_ops interface to abstract halting and rebooting · 07f3331c
      Jeremy Fitzhardinge committed
      machine_ops is an interface for the machine_* functions defined in
      <linux/reboot.h>.  This is intended to allow hypervisors to intercept
      the reboot process, but it could be used to implement other x86
      subarchitecture reboots.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
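An ops interface of the kind the machine_ops commit describes is typically a table of function pointers that the generic entry points dispatch through. The sketch below illustrates that pattern only; the struct layout, field set, and all `sketch_` names are assumptions rather than the kernel's definitions.

```c
/* Hypothetical ops table: one function pointer per machine_* operation. */
struct sketch_machine_ops {
    void (*restart)(char *cmd);
    void (*halt)(void);
    void (*power_off)(void);
};

/* Counters stand in for real hardware side effects so the sketch can run
 * in user space. */
static int sketch_native_halts;
static int sketch_hv_halts;

static void sketch_native_halt(void)     { sketch_native_halts++; }
static void sketch_hypervisor_halt(void) { sketch_hv_halts++; }

/* Default table points at the native implementation. */
static struct sketch_machine_ops sketch_ops = {
    .halt = sketch_native_halt,
};

/* Generic entry point: callers never know which backend runs. */
static void sketch_machine_halt(void)
{
    sketch_ops.halt();
}
```

A hypervisor port would overwrite `sketch_ops.halt` early in boot, after which every caller of `sketch_machine_halt()` reaches the hypervisor's handler without any change at the call sites.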
    • [PATCH] i386: Add smp_ops interface · 01a2f435
      Jeremy Fitzhardinge committed
      Add a smp_ops interface.  This abstracts the API defined by
      <linux/smp.h> for use within arch/i386.  The primary intent is that it
      be used by a paravirtualizing hypervisor to implement SMP, but it
      could also be used by non-APIC-using sub-architectures.
      
      This is related to CONFIG_PARAVIRT, but is implemented unconditionally
      since it is simpler that way and not a highly performance-sensitive
      interface.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [PATCH] i386: cleanup GDT Access · 4fbb5968
      Rusty Russell committed
      Now we have an explicit per-cpu GDT variable, we don't need to keep the
      descriptors around to use them to find the GDT: expose cpu_gdt directly.
      
      We could go further and make load_gdt() pack the descriptor for us, or even
      assume it means "load the current cpu's GDT" which is what it always does.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • [PATCH] x86: sys_ioperm() prototype cleanup · ca906e42
      Adrian Bunk committed
      - there's no reason for duplicating the prototype from
        include/linux/syscalls.h in include/asm-x86_64/unistd.h
      - every file should #include the headers containing the prototypes for
        its global functions
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: use lru instead of page->index and page->private for pgd lists management. · 2bff7383
      Christoph Lameter committed
      x86_64 currently simulates a list using the index and private fields of the
      page struct.  Seems that the code was inherited from i386.  But x86_64 does
      not use the slab to allocate pgds and pmds etc.  So the lru field is not
      used by the slab and therefore available.
      
      This patch uses standard list operations on page->lru to realize pgd
      tracking.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • [PATCH] i386: Use X86_EFLAGS_IF in irqflags.h. · b4531e86
      Andi Kleen committed
      Move X86_EFLAGS_IF et al out to a new header: processor-flags.h, so we
      can include it from irqflags.h and use it in raw_irqs_disabled_flags().
      
      As a side-effect, we could now use these flags in .S files.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
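The check that the commit above refers to can be sketched as follows. The Interrupt Flag is bit 9 of EFLAGS (value 0x200); the function name `sketch_irqs_disabled_flags` is a hypothetical stand-in for raw_irqs_disabled_flags().

```c
/* Interrupt Flag: bit 9 of the x86 EFLAGS register. */
#define X86_EFLAGS_IF 0x00000200

/* Interrupts are disabled exactly when the IF bit is clear in the saved
 * flags word. Using the named constant replaces a bare (1 << 9). */
static inline int sketch_irqs_disabled_flags(unsigned long flags)
{
    return !(flags & X86_EFLAGS_IF);
}
```

Because the constant is a plain preprocessor define, a header holding it can also be included from assembly files, which is the side effect the commit message mentions.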
    • [PATCH] x86: tighten kernel image page access rights · 6fb14755
      Jan Beulich committed
      On x86-64, kernel memory freed after init can be entirely unmapped instead
      of just getting 'poisoned' by overwriting with a debug pattern.
      
      On i386 and x86-64 (under CONFIG_DEBUG_RODATA), kernel text and bug table
      can also be write-protected.
      
      Compared to the first version, this one prevents re-creating deleted
      mappings in the kernel image range on x86-64, if those got removed
      previously. This, together with the original changes, prevents temporarily
      having inconsistent mappings when cacheability attributes are being
      changed on such pages (e.g. from AGP code). While on i386 such duplicate
      mappings don't exist, the same change is done there, too, both for
      consistency and because checking pte_present() before using various other
      pte_XXX functions is a requirement anyway. At once, i386 code gets
      adjusted to use pte_huge() instead of open coding this.
      
      AK: split out cpa() changes
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86: Improve handling of kernel mappings in change_page_attr · d01ad8dd
      Jan Beulich committed
      Fix various broken corner cases in i386 and x86-64 change_page_attr.
      
      AK: split off from tighten kernel image access rights
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] i386: rationalize paravirt wrappers · 90a0a06a
      Rusty Russell committed
      paravirt.c used to implement native versions of all low-level
      functions.  Far cleaner is to have the native versions exposed in the
      headers and as inline native_XXX, and if !CONFIG_PARAVIRT, then simply
      #define XXX native_XXX.
      
      There are several nice side effects:
      
      1) write_dt_entry() now takes the correct "struct Xgt_desc_struct *"
         not "void *".
      
      2) load_TLS is reintroduced to the for loop, not manually unrolled
         with a #error in case the bounds ever change.
      
      3) Macros become inlines, with type checking.
      
      4) Access to the native versions is trivial for KVM, lguest, Xen and
         others who might want it.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Avi Kivity <avi@qumranet.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
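The wrapper scheme the paravirt commit describes (inline native_XXX in headers, with XXX defined to native_XXX when paravirtualization is off) can be sketched in user space. The "register" here is an ordinary variable, and `SKETCH_CONFIG_PARAVIRT`, `write_reg`, `read_reg`, and the ops struct are hypothetical names chosen for illustration.

```c
/* A plain variable stands in for a hardware register. */
static unsigned long sketch_reg;

/* Native versions are typed inline functions, visible to everyone. */
static inline void native_write_reg(unsigned long v) { sketch_reg = v; }
static inline unsigned long native_read_reg(void)    { return sketch_reg; }

#ifdef SKETCH_CONFIG_PARAVIRT
/* Paravirt build: generic names dispatch through a replaceable table. */
struct sketch_paravirt_ops {
    void (*write_reg)(unsigned long v);
    unsigned long (*read_reg)(void);
};
static struct sketch_paravirt_ops sketch_pv_ops = {
    native_write_reg, native_read_reg,
};
#define write_reg(v) sketch_pv_ops.write_reg(v)
#define read_reg()   sketch_pv_ops.read_reg()
#else
/* Non-paravirt build: generic names are simply the native ones. */
#define write_reg native_write_reg
#define read_reg  native_read_reg
#endif
```

Either way the native versions stay directly callable, which is the property the commit message highlights for KVM, lguest, and Xen.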
    • [PATCH] i386: clean up cpu_init() · d2cbcc49
      Rusty Russell committed
      We now have cpu_init() and secondary_cpu_init() doing nothing but calling
      _cpu_init() with the same arguments.  Rename _cpu_init() to cpu_init() and use
      it as a replacement for secondary_cpu_init().
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • [PATCH] i386: Use per-cpu GDT immediately upon boot · bf504672
      Rusty Russell committed
      Now we are no longer dynamically allocating the GDT, we don't need the
      "cpu_gdt_table" at all: we can switch straight from "boot_gdt_table" to the
      per-cpu GDT.  This means initializing the cpu_gdt array in C.
      
      The boot CPU uses the per-cpu var directly, then in smp_prepare_cpus() it
      switches to the per-cpu copy just allocated.  For secondary CPUs, the
      early_gdt_descr is set to point directly to their per-cpu copy.
      
      For UP the code is very simple: it keeps using the "per-cpu" GDT as per SMP,
      but we never have to move.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • [PATCH] i386: Use per-cpu variables for GDT, PDA · ae1ee11b
      Rusty Russell committed
      Allocating PDA and GDT at boot is a pain.  Using simple per-cpu variables adds
      happiness (although we need the GDT page-aligned for Xen, which we do in a
      followup patch).
      
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • [PATCH] i386: Allow i386 crash kernels to handle x86_64 dumps · 79e03011
      Ian Campbell committed
      The specific case I am encountering is kdump under Xen with a 64 bit
      hypervisor and 32 bit kernel/userspace.  The dump created is 64 bit due to
      the hypervisor but the dump kernel is 32 bit for maximum compatibility.
      
      It's possibly less likely to be useful in a purely native scenario but I
      see no reason to disallow it.
      
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: Ian Campbell <ian.campbell@xensource.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Acked-by: Vivek Goyal <vgoyal@in.ibm.com>
      Cc: Horms <horms@verge.net.au>
      Cc: Magnus Damm <magnus.damm@gmail.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • [PATCH] x86-64: Introduce load_TLS to the "for" loop. · eab0c72a
      Rusty Russell committed
      GCC (4.1 at least) unrolls it anyway, but I can't believe this code
      was ever justifiable.  (I've also submitted a patch which cleans up
      i386, which is even uglier).
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
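The loop form of load_TLS that the commit above argues for might look like the sketch below. The slot constants and the flat `uint64_t` view of GDT entries are assumptions made for illustration, and all names carry a `sketch_` or `SKETCH_` prefix to mark them as hypothetical.

```c
#include <stdint.h>

/* Hypothetical layout: three TLS slots starting at GDT index 6. */
#define SKETCH_TLS_MIN     6
#define SKETCH_TLS_ENTRIES 3

/* Copy the thread's TLS descriptors into the GDT with a loop instead of
 * three hand-written assignments guarded by a #error on the count. */
static void sketch_load_tls(const uint64_t tls[SKETCH_TLS_ENTRIES],
                            uint64_t *gdt)
{
    for (unsigned int i = 0; i < SKETCH_TLS_ENTRIES; i++)
        gdt[SKETCH_TLS_MIN + i] = tls[i];
}
```

With a small constant bound the compiler is free to unroll the loop, so the source stays correct if the number of slots ever changes.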
    • [PATCH] i386: Initialize esp0 properly all the time · 692174b9
      Rusty Russell committed
      Whenever we schedule, __switch_to calls load_esp0 which does:
      
      	tss->esp0 = thread->esp0;
      
      This is never initialized for the initial thread (ie "swapper"), so when we're
      scheduling that, we end up setting esp0 to 0.  This is fine: the swapper never
      leaves ring 0, so this field is never used.
      
      lguest, however, gets upset that we're trying to use an unmapped page as our
      kernel stack.  Rather than work around it there, let's initialize it.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • [PATCH] x86-64: configurable fake numa node sizes · 8b8ca80e
      David Rientjes committed
      Extends the numa=fake x86_64 command-line option to allow for configurable
      node sizes.  These nodes can be used in conjunction with cpusets for coarse
      memory resource management.
      
      The old command-line option is still supported:
        numa=fake=32	gives 32 fake NUMA nodes, ignoring the NUMA setup of the
      		actual machine.
      
      But now you may configure your system for the node sizes of your choice:
        numa=fake=2*512,1024,2*256
      		gives two 512M nodes, one 1024M node, two 256M nodes, and
      		the rest of system memory to a sixth node.
      
      The existing hash function is maintained to support the various node sizes
      that are possible with this implementation.
      
      Each node of the same size receives roughly the same amount of available
      pages, regardless of any reserved memory within its address range.  The total
      available pages on the system is calculated and divided by the number of equal
      nodes to allocate.  These nodes are then dynamically allocated and their
      borders extended until such time as their number of available pages reaches
      the required size.
      
      Configurable node sizes are recommended when used in conjunction with cpusets
      for memory control because it eliminates the overhead associated with scanning
      the zonelists of many smaller full nodes on page_alloc().
      
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Christoph Lameter <clameter@engr.sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
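The size-list syntax shown in the numa=fake commit (for example `2*512,1024,2*256`) can be parsed along the following lines. This is a user-space sketch with a hypothetical function name, not the kernel's parser; it handles only the size-list form, not the plain node-count form (`numa=fake=32`), and does not model the final node that receives the remaining memory.

```c
#include <stddef.h>
#include <stdlib.h>

/* Parse "2*512,1024,2*256" into per-node sizes in MB.
 * Returns the number of nodes written, or -1 on malformed input or when
 * more than max nodes are requested. */
static int sketch_parse_fake_nodes(const char *s, unsigned long *sizes,
                                   size_t max)
{
    size_t n = 0;

    while (*s) {
        char *end;
        unsigned long count = 1;
        unsigned long val = strtoul(s, &end, 10);

        if (end == s)
            return -1;                  /* no digits where a number is due */
        if (*end == '*') {              /* "count*size" form */
            count = val;
            s = end + 1;
            val = strtoul(s, &end, 10);
            if (end == s)
                return -1;
        }
        for (unsigned long i = 0; i < count; i++) {
            if (n == max)
                return -1;
            sizes[n++] = val;
        }
        if (*end == ',')
            end++;                      /* step over the separator */
        else if (*end != '\0')
            return -1;
        s = end;
    }
    return (int)n;
}
```

For the example string from the commit message this yields five explicit nodes of 512, 512, 1024, 256, and 256 MB; per the commit, whatever memory is left would form a sixth node.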
    • [PATCH] x86: Log reason why TSC was marked unstable · 5a90cf20
      john stultz committed
      Change mark_tsc_unstable() so it takes a string argument, which holds the
      reason the TSC was marked unstable.
      
      This is then displayed the first time mark_tsc_unstable is called.
      
      This should help us better debug why the TSC was marked unstable on certain
      systems and allow us to make sure we're not being overly paranoid when
      throwing out this troublesome clocksource.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
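A minimal sketch of the behaviour the TSC commit describes: the function takes a reason string and reports it only the first time it is called. The state variables, the function name, and the message wording are assumptions for illustration.

```c
#include <stdio.h>

static int sketch_tsc_unstable;          /* set once, never cleared */
static const char *sketch_tsc_reason;    /* reason given by the first caller */

/* Record and print the reason on the first call; later calls are no-ops,
 * so the log shows what originally disqualified the TSC. */
static void sketch_mark_tsc_unstable(const char *reason)
{
    if (!sketch_tsc_unstable) {
        sketch_tsc_unstable = 1;
        sketch_tsc_reason = reason;
        printf("Marking TSC unstable due to: %s\n", reason);
    }
}
```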
    • [PATCH] i386: modpost apic related warning fixes · 1833d6bc
      Vivek Goyal committed
      o Modpost generates warnings for i386 if compiled with CONFIG_RELOCATABLE=y
      
      WARNING: vmlinux - Section mismatch: reference to .init.text:find_unisys_acpi_oem_table from .text between 'acpi_madt_oem_check' (at offset 0xc0101eda) and 'enable_apic_mode'
      WARNING: vmlinux - Section mismatch: reference to .init.text:acpi_get_table_header_early from .text between 'acpi_madt_oem_check' (at offset 0xc0101ef0) and 'enable_apic_mode'
      WARNING: vmlinux - Section mismatch: reference to .init.text:parse_unisys_oem from .text between 'acpi_madt_oem_check' (at offset 0xc0101f2e) and 'enable_apic_mode'
      WARNING: vmlinux - Section mismatch: reference to .init.text:setup_unisys from .text between 'acpi_madt_oem_check' (at offset 0xc0101f37) and 'enable_apic_mode'
      WARNING: vmlinux - Section mismatch: reference to .init.text:parse_unisys_oem from .text between 'mps_oem_check' (at offset 0xc0101ec7) and 'acpi_madt_oem_check'
      WARNING: vmlinux - Section mismatch: reference to .init.text:es7000_sw_apic from .text between 'enable_apic_mode' (at offset 0xc0101f48) and 'check_apicid_present'
      
      o Some functions which are inline (acpi_madt_oem_check) are not inlined by
        compiler as these functions are accessed using function pointer. These
        functions are put in .text section and they in-turn access __init type
        functions hence modpost generates warnings.
      
      o Do not inline acpi_madt_oem_check, instead make it __init.
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Len Brown <lenb@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • [PATCH] x86-64: Set HASHDIST_DEFAULT to 1 for x86_64 NUMA · e073ae1b
      Ravikiran G Thirumalai committed
      Enable system hashtable memory to be distributed among nodes on x86_64 NUMA
      
      Forcing the kernel to use node interleaved vmalloc instead of bootmem for
      the system hashtable memory (alloc_large_system_hash) reduces the memory
      imbalance on node 0 by around 40MB on a 8 node x86_64 NUMA box:
      
      Before the following patch, on bootup of a 8 node box:
      
      Node 0 MemTotal:      3407488 kB
      Node 0 MemFree:       3206296 kB
      Node 0 MemUsed:        201192 kB
      Node 0 Active:           7012 kB
      Node 0 Inactive:          512 kB
      Node 0 Dirty:               0 kB
      Node 0 Writeback:           0 kB
      Node 0 FilePages:        1912 kB
      Node 0 Mapped:            420 kB
      Node 0 AnonPages:        5612 kB
      Node 0 PageTables:        468 kB
      Node 0 NFS_Unstable:        0 kB
      Node 0 Bounce:              0 kB
      Node 0 Slab:             5408 kB
      Node 0 SReclaimable:      644 kB
      Node 0 SUnreclaim:       4764 kB
      
      After the patch (or using hashdist=1 on the kernel command line):
      
      Node 0 MemTotal:      3407488 kB
      Node 0 MemFree:       3247608 kB
      Node 0 MemUsed:        159880 kB
      Node 0 Active:           3012 kB
      Node 0 Inactive:          616 kB
      Node 0 Dirty:               0 kB
      Node 0 Writeback:           0 kB
      Node 0 FilePages:        2424 kB
      Node 0 Mapped:            380 kB
      Node 0 AnonPages:        1200 kB
      Node 0 PageTables:        396 kB
      Node 0 NFS_Unstable:        0 kB
      Node 0 Bounce:              0 kB
      Node 0 Slab:             6304 kB
      Node 0 SReclaimable:     1596 kB
      Node 0 SUnreclaim:       4708 kB
      
      I guess it is a good idea to keep HASHDIST_DEFAULT "on" for x86_64 NUMA
      since x86_64 has no dearth of vmalloc space?  Or maybe enable hash
      distribution for all 64bit NUMA arches?  The following patch does it only
      for x86_64.
      
      I ran a HPC MPI benchmark -- 'Ansys wingsolid', which takes up quite a bit of
      memory and uses up tlb entries.  This was on a 4 way, 2 socket
      Tyan AMD box (non vsmp), with 8G total memory (4G per node).
      
      The results with and without hash distribution are:
      
      1. Vanilla - runtime of 1188.000s
      2. With hashdist=1 runtime of 1154.000s
      
      Oprofile output for the duration of run is:
      
      1. Vanilla:
      CPU: AMD64 processors, speed 2411.16 MHz (estimated)
      Counted L1_AND_L2_DTLB_MISSES events (L1 and L2 DTLB misses) with a unit
      mask of 0x00 (No unit mask) count 500
      samples  %        app name                 symbol name
      163054    6.5513  libansys1.so             MultiFront::decompose(int, int,
      Elemset *, int *, int, int, int)
      162061    6.5114  libansys3.so             blockSaxpy6L_fd
      162042    6.5107  libansys3.so             blockInnerProduct6L_fd
      156286    6.2794  libansys3.so             maxb33_
      87879     3.5309  libansys1.so             elmatrixmultpcg_
      84857     3.4095  libansys4.so             saxpy_pcg
      58637     2.3560  libansys4.so             .st4560
      46612     1.8728  libansys4.so             .st4282
      43043     1.7294  vmlinux-t                copy_user_generic_string
      41326     1.6604  libansys3.so             blockSaxpyBackSolve6L_fd
      41288     1.6589  libansys3.so             blockInnerProductBackSolve6L_fd
      
      2. With hashdist=1
      CPU: AMD64 processors, speed 2411.13 MHz (estimated)
      Counted L1_AND_L2_DTLB_MISSES events (L1 and L2 DTLB misses) with a unit
      mask of 0x00 (No unit mask) count 500
      samples  %        app name                 symbol name
      162993    6.9814  libansys1.so             MultiFront::decompose(int, int,
      Elemset *, int *, int, int, int)
      160799    6.8874  libansys3.so             blockInnerProduct6L_fd
      160459    6.8729  libansys3.so             blockSaxpy6L_fd
      156018    6.6826  libansys3.so             maxb33_
      84700     3.6279  libansys4.so             saxpy_pcg
      83434     3.5737  libansys1.so             elmatrixmultpcg_
      58074     2.4875  libansys4.so             .st4560
      46000     1.9703  libansys4.so             .st4282
      41166     1.7632  libansys3.so             blockSaxpyBackSolve6L_fd
      41033     1.7575  libansys3.so             blockInnerProductBackSolve6L_fd
      35762     1.5318  libansys1.so             inner_product_sub
      35591     1.5245  libansys1.so             inner_product_sub2
      28259     1.2104  libansys4.so             addVectors
      Signed-off-by: Pravin B. Shelar <pravin.shelar@calsoftinc.com>
      Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
      Signed-off-by: Shai Fultheim <shai@scalex86.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Acked-by: Christoph Lameter <clameter@engr.sgi.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • [PATCH] x86-64: fix x86_64-mm-sched-clock-share · 184c44d2
      Andrew Morton committed
      Fix for the following patch. Provide dummy cpufreq functions when
      CPUFREQ is not compiled in.
      
      Cc: Andi Kleen <ak@suse.de>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: build-time checking · 6a50a664
      Vivek Goyal committed
      o X86_64 kernel should run from 2MB aligned address for two reasons.
      	- Performance.
      	- For relocatable kernels, page tables are updated based on difference
      	  between compile time address and load time physical address.
      	  This difference should be multiple of 2MB as kernel text and data
      	  is mapped using 2MB pages and PMD should be pointing to a 2MB
      	  aligned address. Life is simpler if both compile time and load time
      	  kernel addresses are 2MB aligned.
      
      o Flag the error at compile time if one is trying to build a kernel which
        does not meet alignment restrictions.
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
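A compile-time alignment check of the kind the commit above describes can be sketched with C11 `_Static_assert`. This is a swapped-in technique chosen for illustration; the commit message does not say which mechanism the kernel uses, and the `SKETCH_` constants are hypothetical.

```c
/* Hypothetical configured start address and the required alignment. */
#define SKETCH_PHYSICAL_START 0x200000UL
#define SKETCH_ALIGN_2MB      0x200000UL

/* Refuse to build if the configured start is not a multiple of 2MB. */
_Static_assert((SKETCH_PHYSICAL_START & (SKETCH_ALIGN_2MB - 1)) == 0,
               "kernel start address must be 2MB aligned");

/* Runtime form of the same test, useful for a load-time address. */
static int sketch_is_2mb_aligned(unsigned long addr)
{
    return (addr & (SKETCH_ALIGN_2MB - 1)) == 0;
}
```

Changing `SKETCH_PHYSICAL_START` to, say, 0x100000 makes the translation unit fail to compile, which is the point of flagging the error at build time.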
    • [PATCH] x86-64: Relocatable Kernel Support · 1ab60e0f
      Vivek Goyal committed
      This patch modifies the x86_64 kernel so that it can be loaded and run
      at any 2M aligned address, below 512G.  The technique used is to
      compile the decompressor with -fPIC and modify it so the decompressor
      is fully relocatable.  For the main kernel the page tables are
      modified so the kernel remains at the same virtual address.  In
      addition a variable phys_base is kept that holds the physical address
      the kernel is loaded at.  __pa_symbol is modified to add that when
      we take the address of a kernel symbol.
      
      When loaded with a normal bootloader the decompressor will decompress
      the kernel to 2M and it will run there.  This both ensures the
      relocation code is always working, and makes it easier to use 2M
      pages for the kernel and the cpu.
      
      AK: changed to not make RELOCATABLE default in Kconfig
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86: __pa and __pa_symbol address space separation · 0dbf7028
      Vivek Goyal committed
      Currently __pa_symbol is for use with symbols in the kernel address
      map and __pa is for use with pointers into the physical memory map.
      But the code is implemented so you can usually interchange the two.
      
      __pa, which is much more common, can be implemented much more cheaply
      if it doesn't have to worry about any other kernel address
      spaces.  This is especially true with a relocatable kernel, as
      __pa_symbol needs to perform an extra variable read to resolve
      the address.
      
      There is a third macro that is added for the vsyscall data,
      __pa_vsymbol, for finding the physical addresses of vsyscall pages.
      
      Most of this patch is simply sorting through the references to
      __pa or __pa_symbol and using the proper one.  A little of
      it is continuing to use a physical address when we have it
      instead of recalculating it several times.
      
      swapper_pgd is now NULL.  leave_mm now uses init_mm.pgd
      and init_mm.pgd is initialized at boot (instead of compile time)
      to the physmem virtual mapping of init_level4_pgd.  The
      physical address changed.
      
      Except for the EMPTY_ZERO page, all of the remaining references
      to __pa_symbol appear to be during kernel initialization.  So this
      should reduce the cost of __pa in the common case, even on a relocated
      kernel.
      
      As this is technically a semantic change we need to be on the lookout
      for anything I missed.  But it works for me (tm).
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
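The cost difference between the two translations described above can be sketched as plain arithmetic. The base addresses and the exact role of `phys_base` are assumptions here (it is treated as an offset added after subtracting the kernel map base), and every `sketch_` name is hypothetical.

```c
#include <stdint.h>

/* Assumed bases of the kernel-image mapping and the direct physical map. */
#define SKETCH_START_KERNEL_MAP 0xffffffff80000000ULL
#define SKETCH_PAGE_OFFSET      0xffff810000000000ULL

/* Set once at boot on a relocated kernel; zero if the kernel was not moved. */
static uint64_t sketch_phys_base;

/* Kernel-symbol addresses: subtract the map base, then read a variable. */
static uint64_t sketch_pa_symbol(uint64_t vaddr)
{
    return vaddr - SKETCH_START_KERNEL_MAP + sketch_phys_base;
}

/* Direct-map pointers: a single subtraction of a constant. */
static uint64_t sketch_pa(uint64_t vaddr)
{
    return vaddr - SKETCH_PAGE_OFFSET;
}
```

Keeping the two address spaces separate means the common `sketch_pa` path never touches `sketch_phys_base`, which matches the cost argument in the commit message.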
    • [PATCH] x86-64: Remove the identity mapping as early as possible · cfd243d4
      Vivek Goyal committed
      With the rewrite of the SMP trampoline and the early page
      allocator there is nothing that needs identity mapped pages,
      once we start executing C code.
      
      So add zap_identity_mappings into head64.c and remove
      zap_low_mappings() from much later in the code.  The functions
      are subtly different, thus the name change.
      
      This also kills boot_level4_pgt which was from an earlier
      attempt to move the identity mappings as early as possible,
      and is now no longer needed.  Essentially I have replaced
      boot_level4_pgt with trampoline_level4_pgt in trampoline.S
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: wakeup.S rename registers to reflect right names · 7db681d7
      Vivek Goyal committed
      o Use appropriate names for 64bit registers.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: Add EFER to the register set saved by save_processor_state · 3c321bce
      Vivek Goyal committed
      EFER varies like %cr4 depending on the cpu capabilities, and which cpu
      capabilities we want to make use of.  So save/restore it to make certain
      we have the same EFER value when we are done.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: cleanup segments · 30f47289
      Vivek Goyal committed
      Move __KERNEL32_CS up into the unused gdt entry.  __KERNEL32_CS is
      used when entering the kernel so putting it first is useful when
      trying to keep boot gdt sizes to a minimum.
      
      Set the accessed bit on all gdt entries.  We don't care
      so there is no need for the cpu to burn the extra cycles,
      and it potentially allows the pages to be immutable.  Plus
      it is confusing when debugging and your gdt entries mysteriously
      change.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
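Pre-setting the accessed bit, as the segments commit describes, is a one-bit change per descriptor. In the 8-byte x86 segment descriptor the access byte occupies bits 40 to 47, and the accessed flag is its lowest bit. The helper name below is hypothetical and the descriptor value in the usage note is only an example.

```c
#include <stdint.h>

/* Accessed flag: bit 0 of the access byte, i.e. bit 40 of the descriptor. */
#define SKETCH_DESC_ACCESSED (1ULL << 40)

/* Return the descriptor with the accessed bit already set, so the CPU has
 * no reason to write to the GDT when the segment is first loaded. */
static uint64_t sketch_set_accessed(uint64_t desc)
{
    return desc | SKETCH_DESC_ACCESSED;
}
```

For example, a descriptor written as 0x00af9a000000ffff becomes 0x00af9b000000ffff: only the access byte changes, from 0x9a to 0x9b.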
    • [PATCH] x86-64: Clean up the early boot page table · 67dcbb6b
      Vivek Goyal committed
      - Merge physmem_pgt and ident_pgt, removing physmem_pgt.  The merge
        is broken as soon as mm/init.c:init_memory_mapping is run.
      - As physmem_pgt is gone don't export it in pgtable.h.
      - Use defines from pgtable.h for page permissions.
      - Fix the physical memory identity mapping so it is at the correct
        address.
      - Remove the physical memory mapping from wakeup_level4_pgt; it
        is at the wrong address so we can't possibly be using it.
      - Simplify NEXT_PAGE.  The work to calculate the phys_ alias
        of the labels was very cool.  Unfortunately it was a brittle
        special purpose hack that makes maintenance more difficult.
        Instead just use label - __START_KERNEL_map like we do
        everywhere else in assembly.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: Assembly safe page.h and pgtable.h · 9d291e78
      Vivek Goyal committed
      This patch makes pgtable.h and page.h safe to include
      in assembly files like head.S, allowing us to use
      symbolic constants instead of hard coded numbers when
      referring to the page tables.
      
      This patch copies asm-sparc64/const.h to asm-x86_64 to
      get a definition of _AC(), a very convenient macro that
      allows us to force the type when we are compiling the
      code in C and to drop all of the type information when
      we are using the constant in assembly.  Previously this
      was done with multiple definitions of the same constant.
      const.h was modified slightly so that it works when given
      CONFIG options as arguments.
      
      This patch adds #ifndef __ASSEMBLY__ ... #endif
      and _AC(1,UL) where appropriate so the assembler won't
      choke on the header files.  Otherwise nothing
      should have changed.
      
      AK: added const.h to exported headers to fix headers_check
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
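The behaviour the commit above attributes to _AC() can be sketched with a pair of macros. The names are prefixed with `SKETCH_` to avoid reserved identifiers and to make clear this is an illustration, not a copy of const.h; the page-size constants are likewise hypothetical.

```c
#ifdef __ASSEMBLY__
/* Assembler: drop the type suffix, leaving the bare number. */
#define SKETCH_AC(X, Y) X
#else
/* C: paste the suffix onto the value, e.g. 1 and UL become (1UL).
 * The extra level lets arguments that are themselves macros expand first. */
#define SKETCH_AC_PASTE(X, Y) (X##Y)
#define SKETCH_AC(X, Y)       SKETCH_AC_PASTE(X, Y)
#endif

/* One definition now serves both C and assembly sources. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (SKETCH_AC(1, UL) << SKETCH_PAGE_SHIFT)
```

In C the constant has type `unsigned long`, so shifts and comparisons behave as intended, while an assembly file that defines `__ASSEMBLY__` before including the header sees a plain `1`.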
    • [PATCH] x86-64: dma_ops as const · e6584504
      Stephen Hemminger committed
      The dma_ops structure can be const since it never changes
      after boot.
      Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: fix cpu MHz reporting on constant_tsc cpus · 6b37f5a2
      Joerg Roedel committed
      This patch fixes the reporting of cpu_mhz in /proc/cpuinfo on CPUs with
      a constant TSC rate and a kernel with disabled cpufreq.
      Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
      Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      
       arch/x86_64/kernel/apic.c     |    2 -
       arch/x86_64/kernel/time.c     |   58 +++++++++++++++++++++++++++++++++++++++---
       arch/x86_64/kernel/tsc.c      |   12 +++++---
       arch/x86_64/kernel/tsc_sync.c |    2 -
       include/asm-x86_64/proto.h    |    1
       5 files changed, 65 insertions(+), 10 deletions(-)
    • [PATCH] x86-64: Remove duplicated code for reading control registers · fbc16f2c
      Glauber de Oliveira Costa committed
      On Tue, Mar 13, 2007 at 05:33:09AM -0700, Randy.Dunlap wrote:
      > On Tue, 13 Mar 2007, Glauber de Oliveira Costa wrote:
      >
      > > Tiny cleanup:
      > >
      > > In x86_64, the same functions for reading cr3 and writing cr{3,4} are
      > > defined in tlbflush.h and system.h, with just a name change.
      > > The only difference is the clobbering of memory, which seems a safe, and
      > > even needed change for the write_cr4. This patch removes the duplicate.
      > > write_cr3() is moved to system.h for consistency.
      >
      > missing patch.....
      >
      thanks. Attached now
      
      --
      Glauber de Oliveira Costa
      Red Hat Inc.
      "Free as in Freedom"
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86-64: Remove unused set_seg_base · f9d09645
      Rusty Russell committed
      The set_seg_base function isn't used anywhere (2.6.21-rc3-git1)
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andi Kleen <ak@suse.de>