Commits · 562795fe5770d0c7ee7fe269890e37d9fd475fdf · openeuler / raspberrypi-kernel

17 Jan, 2006 · 2 commits

[PATCH] x86_64: Increase NR_IRQ_VECTORS to 32 * NR_CPUS · 5580ecee

Committed by Andi Kleen on Jan 16, 2006

This prevents running out of GSIs on large Unisys ES7000 machines.
Follows i386

Cc:  "Protasevich, Natalie" <Natalie.Protasevich@unisys.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Allow nesting of int3 by default for kprobes · 5f8efbb9

Committed by Andi Kleen on Jan 16, 2006

This unbreaks recursive kprobes which didn't work anymore
due to an earlier patch which converted the debug entry point
to use an IST.

This also allows nesting of the debug entry point.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

15 Jan, 2006 · 1 commit

[PATCH] mark several functions __always_inline · 652050ae

Committed by Ingo Molnar on Jan 14, 2006

      Arjan van de Ven <arjan@infradead.org>

Mark a number of functions as 'must inline'.  The functions affected by this
patch need to be inlined because they use knowledge that their arguments are
constant so that most of the function optimizes away.  At this point this
patch does not change behavior, it's for documentation only (and for future
patches in the inline series)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

13 Jan, 2006 · 5 commits

[PATCH] death of get_thread_info/put_thread_info · f5a61d0c

Committed by Al Viro on Jan 12, 2006

{get,put}_thread_info() were introduced in 2.5.4 and have never
been called by anything in the tree.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] amd64: task_pt_regs() · bb049232

Committed by Al Viro on Jan 12, 2006

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] amd64: task_thread_info() · e4f17c43

Committed by Al Viro on Jan 12, 2006

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] scheduler cache-hot-autodetect · 198e2f18

Committed by akpm@osdl.org on Jan 12, 2006

)

From: Ingo Molnar <mingo@elte.hu>

This is the latest version of the scheduler cache-hot-auto-tune patch.

The first problem was that detection time scaled with O(N^2), which is
unacceptable on larger SMP and NUMA systems. To solve this:

- I've added a 'domain distance' function, which is used to cache
  measurement results. Each distance is only measured once. This means
  that e.g. on NUMA distances of 0, 1 and 2 might be measured, on HT
  distances 0 and 1, and on SMP distance 0 is measured. The code walks
  the domain tree to determine the distance, so it automatically follows
  whatever hierarchy an architecture sets up. This cuts down on the boot
  time significantly and removes the O(N^2) limit. The only assumption
  is that migration costs can be expressed as a function of domain
  distance - this covers the overwhelming majority of existing systems,
  and is a good guess even for more asymmetric systems.

  [ People hacking systems that have asymmetries that break this
    assumption (e.g. different CPU speeds) should experiment a bit with
    the cpu_distance() function. Adding a ->migration_distance factor to
    the domain structure would be one possible solution - but let's first
    see the problem systems, if they exist at all. Let's not overdesign. ]
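
The "measure each distance only once" idea above can be sketched roughly as follows. This is an illustration, not the patch's code: cached_migration_cost(), fake_measure() and MAX_DISTANCE_SKETCH are hypothetical names, and the stand-in measurement just derives a number from the two CPU ids.

```c
#include <assert.h>

#define MAX_DISTANCE_SKETCH 32          /* hypothetical upper bound on domain distance */

typedef unsigned long long (*measure_fn)(int cpu1, int cpu2);

static unsigned long long cost_cache[MAX_DISTANCE_SKETCH];
static int cost_known[MAX_DISTANCE_SKETCH];

/* Counts how often the (expensive) measurement actually runs. */
int measure_calls;

/* Stand-in for the real measurement; returns a made-up cost. */
unsigned long long fake_measure(int cpu1, int cpu2)
{
    int d = cpu1 > cpu2 ? cpu1 - cpu2 : cpu2 - cpu1;

    measure_calls++;
    return 1000ULL * (unsigned long long)d;
}

/*
 * Return the migration cost for a given domain distance, measuring it
 * on the first request only and reusing the cached value afterwards.
 */
unsigned long long cached_migration_cost(int distance, int cpu1, int cpu2,
                                         measure_fn measure)
{
    assert(distance >= 0 && distance < MAX_DISTANCE_SKETCH);

    if (!cost_known[distance]) {
        cost_cache[distance] = measure(cpu1, cpu2);
        cost_known[distance] = 1;
    }
    return cost_cache[distance];
}
```

With this shape, the number of measurements grows with the number of distinct distances rather than with the number of CPU pairs.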

Another problem was that only a single cache-size was used for measuring
the cost of migration, and most architectures didn't set that variable
up. Furthermore, a single cache-size does not fit NUMA hierarchies with
L3 caches and does not fit HT setups, where different CPUs will often
have different 'effective cache sizes'. To solve this problem:

- Instead of relying on a single cache-size provided by the platform and
  sticking to it, the code now auto-detects the 'effective migration
  cost' between two measured CPUs, via iterating through a wide range of
  cachesizes. The code searches for the maximum migration cost, which
  occurs when the working set of the test-workload falls just below the
  'effective cache size'. I.e. real-life optimized search is done for
  the maximum migration cost, between two real CPUs.

  This, amongst other things, has the positive effect that if e.g. two
  CPUs share a L2/L3 cache, a different (and accurate) migration cost
  will be found than between two CPUs on the same system that don't share
  any caches.

(The reliable measurement of migration costs is tricky - see the source
for details.)

Furthermore, I've added various boot-time options to override/tune
migration behavior.

Firstly, there's a blanket override for autodetection:

	migration_cost=1000,2000,3000

will override the depth 0/1/2 values with 1msec/2msec/3msec values.
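
As a rough sketch of how a comma-separated value list like the one above could be split into per-depth entries (parse_migration_cost() is a hypothetical helper written for illustration; the kernel's actual option parsing is not shown in this log):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse "1000,2000,3000" into out[0..n-1] and return n.
 * Parsing stops at the first entry that is not a decimal number
 * or once max entries have been stored.
 */
size_t parse_migration_cost(const char *arg, unsigned long *out, size_t max)
{
    size_t n = 0;

    assert(arg && out);
    while (*arg && n < max) {
        char *end;
        unsigned long v = strtoul(arg, &end, 10);

        if (end == arg)         /* no digits consumed */
            break;
        out[n++] = v;
        if (*end != ',')        /* end of list */
            break;
        arg = end + 1;
    }
    return n;
}
```

Entry i of the result would then stand for the depth-i value, in the same unit as the option itself.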

Secondly, there's a global factor that can be used to increase (or
decrease) the autodetected values:

	migration_factor=120

will increase the autodetected values by 20%. This option is useful to
tune things in a workload-dependent way - e.g. if a workload is
cache-insensitive then CPU utilization can be maximized by specifying
migration_factor=0.
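
The arithmetic behind the factor can be sketched like this, assuming it is applied as a plain percentage with integer division (scale_migration_cost() is a hypothetical name; the exact rounding used by the patch is not stated here):

```c
#include <assert.h>

/* Scale an autodetected cost by a percentage factor: 100 leaves it unchanged. */
unsigned long long scale_migration_cost(unsigned long long cost,
                                        unsigned int factor_percent)
{
    return cost * factor_percent / 100;
}
```

Under that assumption, a factor of 120 turns a cost of 1000 into 1200, and a factor of 0 makes every migration look free.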

I've tested the autodetection code quite extensively on x86, on 3
P3/Xeon/2MB, and the autodetected values look pretty good:

Dual Celeron (128K L2 cache):

 ---------------------
 migration cost matrix (max_cache_size: 131072, cpu: 467 MHz):
 ---------------------
           [00]    [01]
 [00]:     -     1.7(1)
 [01]:   1.7(1)    -
 ---------------------
 cacheflush times [2]: 0.0 (0) 1.7 (1784008)
 ---------------------

Here the slow memory subsystem dominates system performance, and even
though caches are small, the migration cost is 1.7 msecs.

Dual HT P4 (512K L2 cache):

 ---------------------
 migration cost matrix (max_cache_size: 524288, cpu: 2379 MHz):
 ---------------------
           [00]    [01]    [02]    [03]
 [00]:     -     0.4(1)  0.0(0)  0.4(1)
 [01]:   0.4(1)    -     0.4(1)  0.0(0)
 [02]:   0.0(0)  0.4(1)    -     0.4(1)
 [03]:   0.4(1)  0.0(0)  0.4(1)    -
 ---------------------
 cacheflush times [2]: 0.0 (33900) 0.4 (448514)
 ---------------------

Here it can be seen that there is no migration cost between two HT
siblings (CPU#0/2 and CPU#1/3 are separate physical CPUs). A fast memory
system makes inter-physical-CPU migration pretty cheap: 0.4 msecs.

8-way P3/Xeon [2MB L2 cache]:

 ---------------------
 migration cost matrix (max_cache_size: 2097152, cpu: 700 MHz):
 ---------------------
           [00]    [01]    [02]    [03]    [04]    [05]    [06]    [07]
 [00]:     -    19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
 [01]:  19.2(1)    -    19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
 [02]:  19.2(1) 19.2(1)    -    19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)
 [03]:  19.2(1) 19.2(1) 19.2(1)    -    19.2(1) 19.2(1) 19.2(1) 19.2(1)
 [04]:  19.2(1) 19.2(1) 19.2(1) 19.2(1)    -    19.2(1) 19.2(1) 19.2(1)
 [05]:  19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)    -    19.2(1) 19.2(1)
 [06]:  19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)    -    19.2(1)
 [07]:  19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1)    -
 ---------------------
 cacheflush times [2]: 0.0 (0) 19.2 (19281756)
 ---------------------

This one has huge caches and a relatively slow memory subsystem - so the
migration cost is 19 msecs.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Cc: <wilder@us.ibm.com>
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] sched: add cacheflush() asm · 4dc7a0bb

Committed by Ingo Molnar on Jan 12, 2006

Add per-arch sched_cacheflush() which is a write-back cacheflush used by
the migration-cost calibration code at bootup time.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

12 Jan, 2006 · 32 commits

[PATCH] x86_64: Some housekeeping in local APIC code · 11a8e778

Committed by Andi Kleen on Jan 11, 2006

Remove support for obsolete hardware and cleanup.

- Remove checks for non integrated APICs
- Replace apic_write_around with apic_write.
- Remove apic_read_around
- Remove APIC version reads used by old workarounds
- Remove old workaround for Simics
- Fix indentation
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Display meaningful part of filename during BUG() · 5f1d189f

Committed by Jan Beulich on Jan 11, 2006

When building in a separate objtree, file names produced by BUG() & Co. can
get fairly long; printing only the first 50 characters may thus result in
(almost) no useful information. The following change prints the last
50 characters of the filename instead.
Signed-Off-By: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
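
A minimal sketch of the "print the tail, not the head" idea from the message above. bug_filename_tail() and BUG_FILENAME_MAX are hypothetical names; the 50-character limit is taken from the commit message, and the real change lives in the x86_64 BUG() reporting code, which is not reproduced here.

```c
#include <assert.h>
#include <string.h>

#define BUG_FILENAME_MAX 50     /* limit quoted in the commit message */

/*
 * Return a pointer to the last BUG_FILENAME_MAX characters of file,
 * or to the whole string if it is already short enough.
 */
const char *bug_filename_tail(const char *file)
{
    size_t len;

    assert(file);
    len = strlen(file);
    return len > BUG_FILENAME_MAX ? file + (len - BUG_FILENAME_MAX) : file;
}
```

For a long objtree path this keeps the directory and file name closest to the source file, which is usually the informative part.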

[PATCH] x86_64: Remove unused AMD K8 C stepping flag · dd52d642

Committed by Andi Kleen on Jan 11, 2006

X86_FEATURE_K8_C was a synthetic Linux CPUID flag that was used for some
code optimizations in Opteron C stepping or later. But support for pre C
stepping optimizations has been removed, so this isn't needed anymore.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: sparse warning cleanups · 77a75333

Committed by Stephen Hemminger on Jan 11, 2006

Fix some trivial sparse warnings in x86_64 code.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Move NUMA page_to_pfn/pfn_to_page functions out of line · cf050132

Committed by Andi Kleen on Jan 11, 2006

Saves about 18K of .text in defconfig.

There would be more optimization potential, but that's for later.

Suggestion originally from Bill Irwin.
Fix from Andy Whitcroft.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Remove unused segments · cdc4b9c0

Committed by Andi Kleen on Jan 11, 2006

They used to be used by the reboot code, but not anymore.

Noticed by Jan Beulich

Cc: JBeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Inclusion of ScaleMP vSMP architecture patches - vsmp_arch · 79f12614

Committed by Ravikiran G Thirumalai on Jan 11, 2006

Introduce vSMP arch to the kernel.

This patch:
1. Adds CONFIG_X86_VSMP
2. Adds machine specific macros for local_irq_disabled, local_irq_enabled
   and irqs_disabled
3. Writes to the vSMP CTL device to indicate kernel compiled with CONFIG_VSMP
Signed-off-by: Ravikiran Thirumalai <kiran@scalemp.com>
Signed-off-by: Shai Fultheim <shai@scalemp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Inclusion of ScaleMP vSMP architecture patches - vsmp_align · 5fd63b30

Committed by Ravikiran G Thirumalai on Jan 11, 2006

vSMP specific alignment patch to
1. Define INTERNODE_CACHE_SHIFT for vSMP
2. Use this for alignment of critical structures
3. Use INTERNODE_CACHE_SHIFT for ARCH_MIN_TASKALIGN,
   and let the slab align task_struct allocations to the internode cacheline size
4. Introduce and use ARCH_MIN_MMSTRUCT_ALIGN for mm_struct slab allocations.
Signed-off-by: Ravikiran Thirumalai <kiran@scalemp.com>
Signed-off-by: Shai Fultheim <shai@scalemp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Make sure BITS_PER_ATOMIC is defined in asm-generic/atomic.h · 99f7b77d

Committed by Andi Kleen on Jan 11, 2006

Fixes

  CC      fs/nfsctl.o
In file included from include2/asm/atomic.h:427,
                 from /home/lsrc/quilt/linux/include/linux/file.h:8,
                 from /home/lsrc/quilt/linux/fs/nfsctl.c:8:
/home/lsrc/quilt/linux/include/asm-generic/atomic.h:20:5: warning: "BITS_PER_LONG" is not defined
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: cleanup enter_lazy_tlb() · e4b5939a

Committed by Brian Gerst on Jan 11, 2006

Move the #ifdef into the function body.
Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Remove useless KDB vector · 915f34e2

Committed by Andi Kleen on Jan 11, 2006

It was set as an NMI, but the NMI bit always forces an interrupt
to end up at vector 2. So it was never used. Remove.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Don't claim too many vectors for TLB flushing · e080e9d6

Committed by Jason Uhlenkott on Jan 11, 2006

It looks like the new scalable TLB flush code for x86_64 is claiming
one more IRQ vector than it actually uses.
Signed-off-by: Jason Uhlenkott <jasonuhl@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Allocate PDAs in the local node · 365ba917

Committed by Ravikiran G Thirumalai on Jan 11, 2006

Patch uses a static PDA array early at boot and reallocates processor PDA
with node local memory when kmalloc is ready, just before pda_init.
The boot_cpu_pda is needed since the cpu_pda is used even before pda_init for
that cpu is called (to set the static per-cpu areas offset table etc)
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Node local pda take 2 -- cpu_pda preparation · df79efde

Committed by Ravikiran G Thirumalai on Jan 11, 2006

Helper patch to change cpu_pda users to use macros to access cpu_pda
instead of the cpu_pda[] array.
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Early initialization of cpu_to_node · 05b3cbd8

Committed by Ravikiran Thirumalai on Jan 11, 2006

Patch enables early initialization of cpu_to_node.
apicid_to_node is built by reading the SRAT table, from acpi_numa_init with
ACPI_NUMA and k8_scan_nodes with K8_NUMA.
x86_cpu_to_apicid is built by parsing the ACPI MADT table, from acpi_boot_init.
We combine these two tables and setup cpu_to_node.

Early initialization helps the static per_cpu_areas in getting pages from
correct node.

Change since last release:
Do not initialize early init_cpu_to_node for faking node cases.

Patch tested on TYAN dual core 4P board with K8 only, ACPI_NUMA.
Tested on EM64T NUMA. Also tested with numa=off, numa=fake, and  running
a kernel compiled with NUMA on a regular EM64 2 way SMP.
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
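
The "combine these two tables" step in the message above can be pictured with a small sketch. All names here (init_cpu_to_node_sketch(), NO_NODE_SKETCH) and the array shapes are assumptions made for illustration; they only mirror the roles of x86_cpu_to_apicid, apicid_to_node and cpu_to_node described in the message.

```c
#include <assert.h>
#include <stddef.h>

#define NO_NODE_SKETCH (-1)     /* hypothetical marker: APIC id has no known node */

/*
 * For each CPU, look up its APIC id, then the node of that APIC id,
 * and record the result. apicid_to_node must have 256 entries, one per
 * possible 8-bit APIC id. CPUs with an unknown node are left untouched.
 */
void init_cpu_to_node_sketch(const unsigned char *cpu_to_apicid,
                             const int *apicid_to_node,
                             int *cpu_to_node, size_t ncpus)
{
    size_t cpu;

    assert(cpu_to_apicid && apicid_to_node && cpu_to_node);
    for (cpu = 0; cpu < ncpus; cpu++) {
        int node = apicid_to_node[cpu_to_apicid[cpu]];

        if (node != NO_NODE_SKETCH)
            cpu_to_node[cpu] = node;
    }
}
```

Doing this join early is what lets later per-CPU allocations ask for memory on the right node.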

[PATCH] x86_64: On Intel CPUs don't do an additional CPU sync before RDTSC · c818a181

Committed by Andi Kleen on Jan 11, 2006

RDTSC serialization using cpuid is not needed for Intel platforms.
This increases gettimeofday performance.

Cc: vojtech@suse.cz
Cc: rohit.seth@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Support alternative() with a output argument. · 6e54d95f

Committed by Andi Kleen on Jan 11, 2006

Needed for follow on patches
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Don't try to synchronize the TSC over CPUs on Intel CPUs at boot. · 737c5c3b

Committed by Andi Kleen on Jan 11, 2006

They already do this in hardware and the Linux algorithm
actually adds errors.

Cc: mingo@elte.hu
Cc: rohit.seth@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Fix compile error with !CONFIG_COMPAT · 3c021751

Committed by Andi Kleen on Jan 11, 2006

cpumask.h wasn't included implicitly into proto.h in this case.
Just move it over to smp.h.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: x86_64 write apic id fix · b9d1e4bd

Committed by Vivek Goyal on Jan 11, 2006

o Apic id is in most significant 8 bits of APIC_ID register. Current code
  is trying to write apic id to least significant 8 bits. This patch fixes
  it.

o This fix enables booting uni kdump capture kernel on a cpu with non-zero
  apic id.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
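
The bit layout described in the message above (the APIC id occupying the most significant 8 bits of the 32-bit APIC_ID register) corresponds to a shift by 24. A small sketch with hypothetical helper names, not the kernel's own macros:

```c
#include <assert.h>
#include <stdint.h>

#define APIC_ID_SHIFT_SKETCH 24   /* id lives in bits 31..24 of the register */

/* Build the register value that carries the given 8-bit APIC id. */
uint32_t apic_id_to_reg(uint8_t id)
{
    return (uint32_t)id << APIC_ID_SHIFT_SKETCH;
}

/* Extract the 8-bit APIC id from a register value. */
uint8_t apic_reg_to_id(uint32_t reg)
{
    return (uint8_t)(reg >> APIC_ID_SHIFT_SKETCH);
}
```

Writing the id unshifted would put it in the low byte, which is the mistake the message describes.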

[PATCH] x86_64: Remove unused apic_write_atomic · 2d0db401

Committed by Andi Kleen on Jan 11, 2006

This function is never used for x86_64.
Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Use function pointers to call DMA mapping functions · 17a941d8

Committed by Muli Ben-Yehuda on Jan 11, 2006

AK: I hacked Muli's original patch a lot and there were a lot
of changes - all bugs are probably to blame on me now.
There were also some changes in the fall back behaviour
for swiotlb - in particular it doesn't try to use GFP_DMA
now anymore. Also all DMA mapping operations use the
same core dma_alloc_coherent code with proper fallbacks now.
And various other changes and cleanups.

Known problems: iommu=force swiotlb=force together breaks
                needs more testing.

This patch cleans up x86_64's DMA mapping dispatching code. Right now
we have three possible IOMMU types: AGP GART, swiotlb and nommu, and
in the future we will also have Xen's x86_64 swiotlb and other HW
IOMMUs for x86_64. In order to support all of them cleanly, this
patch:

- introduces a struct dma_mapping_ops with function pointers for each
  of the DMA mapping operations of gart (AMD HW IOMMU), swiotlb
  (software IOMMU) and nommu (no IOMMU).

- gets rid of:

  if (swiotlb)
      return swiotlb_xxx();

- PCI_DMA_BUS_IS_PHYS is now checked against the dma_ops being set
This makes swiotlb faster by avoiding double copying in some cases.
Signed-Off-By: Muli Ben-Yehuda <mulix@mulix.org>
Signed-Off-By: Jon D. Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
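
The dispatch-through-function-pointers pattern described in the message above can be sketched as follows. Everything here carries a _sketch suffix because it is a simplified stand-in: the real struct dma_mapping_ops has more operations and different signatures, and the bounce address is a made-up constant.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long dma_addr_sketch_t;

/* One table of operations per IOMMU flavour. */
struct dma_mapping_ops_sketch {
    const char *name;
    dma_addr_sketch_t (*map_single)(unsigned long phys, size_t size);
};

#define BOUNCE_BASE_SKETCH 0x100000UL   /* hypothetical bounce-buffer address */

/* nommu: the bus address is simply the physical address. */
static dma_addr_sketch_t nommu_map_single_sketch(unsigned long phys, size_t size)
{
    (void)size;
    return phys;
}

/* swiotlb-like: pretend the data was copied to a bounce buffer. */
static dma_addr_sketch_t swiotlb_map_single_sketch(unsigned long phys, size_t size)
{
    (void)phys;
    (void)size;
    return BOUNCE_BASE_SKETCH;
}

const struct dma_mapping_ops_sketch nommu_ops_sketch = {
    "nommu", nommu_map_single_sketch
};
const struct dma_mapping_ops_sketch swiotlb_ops_sketch = {
    "swiotlb", swiotlb_map_single_sketch
};

/* Selected once at boot; callers never test for a specific IOMMU type. */
const struct dma_mapping_ops_sketch *dma_ops_sketch = &nommu_ops_sketch;

dma_addr_sketch_t dma_map_single_sketch(unsigned long phys, size_t size)
{
    assert(dma_ops_sketch && dma_ops_sketch->map_single);
    return dma_ops_sketch->map_single(phys, size);
}
```

Adding another IOMMU type then means adding one more table rather than another `if` in every mapping function.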

[PATCH] x86_64: Add idle notifiers · 95833c83

Committed by Andi Kleen on Jan 11, 2006

This adds a new notifier chain that is called with IDLE_START
when a CPU goes idle and IDLE_END when it goes out of idle.
The context can be idle thread or interrupt context.

Since we cannot rely on MONITOR/MWAIT existing, the idle
end check currently has to be done in all interrupt
handlers.

They were originally inspired by the similar s390 implementation.

They have a variety of applications:
- They will be needed for CONFIG_NO_IDLE_HZ
- They can be used for oprofile to fix up the missing time
in idle when performance counters don't tick.
- They can be used for better C state management in ACPI
- They could be used for microstate accounting.

This is just infrastructure so far, no users.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Handle missing local APIC timer interrupts on C3 state · d25bf7e5

Committed by Venkatesh Pallipadi on Jan 11, 2006

Whenever we see that a CPU is capable of C3 (during ACPI cstate init), we
disable local APIC timer and switch to using a broadcast from external timer
interrupt (IRQ 0).

Patch below adds the code for x86_64.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: "extern inline" -> "static inline" in pgtable.h · 4839057c

Committed by Adrian Bunk on Jan 11, 2006

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Implement is_compat_task the right way · bf2fcc6f

Committed by Andi Kleen on Jan 11, 2006

By setting a flag during a 32bit system call only
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Align and pad x86_64 GDT on page boundary · c11efdf9

Committed by Ravikiran G Thirumalai on Jan 11, 2006

This patch is on the same lines as Zachary Amsden's i386 GDT page alignment
patch in -mm, but for x86_64.

Patch to align and pad x86_64 GDT on page boundaries.

[AK: some minor cleanups and fixed incorrect TLS initialization
in CPU init.]
Signed-off-by: Nippun Goel <nippung@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Fix 64bit FXSAVE encoding · 7180d4fb

Committed by Jan Beulich on Jan 11, 2006

The separation of the rex64 prefix (on fxsave/fxrstor) by way of using
a semicolon resulted in the prefix not always taking effect (because
when extended registers are needed for addressing, another rex prefix
would have been generated by the compiler), thus (depending on the
build) resulting in eventually getting 32-bit saves and/or restores.
Signed-Off-By: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Generalize DMI and enable for x86-64 · e9928674

Committed by Andi Kleen on Jan 11, 2006

Some people need it now on 64bit so reuse the i386 code for
x86-64. This will be also useful for future bug workarounds.

It is a bit simplified there because there is no need
to do it very early on x86-64. This means it doesn't need
early ioremap et al. We run it as a core initcall right now.

I hope it's not needed for early setup.

I added a general CONFIG_DMI symbol in case IA64 or someone
else wants to reuse the code later too.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: fls in asm for x86_64 · 636dd2b7

Committed by Stephen Hemminger on Jan 11, 2006

Use a single instruction to find the largest set bit on x86_64.

[Updated by Jan Beulich to fix wrong asm constraints in original
patch -AK]

Cc: jbeulich@novell.com
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
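
For reference, a portable sketch of what an fls-style helper computes, assuming the usual convention that the result is the 1-based position of the most significant set bit and 0 for an input of 0. fls_sketch() is a hypothetical name; the patch itself replaces a loop of this kind with inline assembly, which is not reproduced here.

```c
#include <assert.h>

/* Return the 1-based index of the highest set bit in x, or 0 if x == 0. */
int fls_sketch(unsigned int x)
{
    int r = 0;

    while (x) {
        r++;
        x >>= 1;
    }
    return r;
}
```

A bit-scan instruction gives the same answer in one step for non-zero inputs, which is where the saving comes from.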

[PATCH] x86_64: don't save eflags in x86-64 switch_to() · 60917a38

Committed by Benjamin LaHaise on Jan 11, 2006

As discussed, the flags register on x86-64 is saved and restored by the
assembly code which sets up struct pt_regs, so we do not need to save
and restore it in the inline assembler which already informs gcc that
we're clobbering the flags.  This patch has been sanity booted and works
okay here.
Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Move int 3 handler to debug stack and allow to increase it. · b556b35e

Committed by Jan Beulich on Jan 11, 2006

This
- switches the INT3 handler to run on an IST stack (to cope with
  breakpoints set by a kernel debugger on places where the kernel's
  %gs base hasn't been set up, yet); the IST stack used is shared with
  the INT1 handler's
[AK: this also allows setting a kprobe on the interrupt/exception entry
points]
- allows nesting of INT1/INT3 handlers so that one can, with a kernel
  debugger, debug (at least) the user-mode portions of the INT1/INT3
  handling; the nesting isn't actively enabled here since a kernel-
  debugger-free kernel doesn't need it
Signed-Off-By: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
