1. 26 June 2005, 11 commits
  2. 24 June 2005, 19 commits
• [PATCH] kprobes: Temporary disarming of reentrant probe for x86_64 · aa3d7e3d
Committed by Prasanna S Panchamukhi
      This patch includes x86_64 architecture specific changes to support temporary
      disarming on reentrancy of probes.
Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] Move kprobe [dis]arming into arch specific code · 7e1048b1
Committed by Rusty Lynch
The architecture independent code of the current kprobes implementation is
arming and disarming kprobes at registration time.  The problem is that the
code assumes that arming and disarming are just a simple write
of some magic value to an address.  This is problematic for ia64, where our
instructions look more like structures, and we cannot insert breakpoints
by just doing something like:
      
      *p->addr = BREAKPOINT_INSTRUCTION;
      
      The following patch to 2.6.12-rc4-mm2 adds two new architecture dependent
      functions:
      
           * void arch_arm_kprobe(struct kprobe *p)
           * void arch_disarm_kprobe(struct kprobe *p)
      
      and then adds the new functions for each of the architectures that already
implement kprobes (sparc64/ppc64/i386/x86_64).
      
      I thought arch_[dis]arm_kprobe was the most descriptive of what was really
      happening, but each of the architectures already had a disarm_kprobe()
      function that was really a "disarm and do some other clean-up items as
      needed when you stumble across a recursive kprobe." So...  I took the
      liberty of changing the code that was calling disarm_kprobe() to call
      arch_disarm_kprobe(), and then do the cleanup in the block of code dealing
      with the recursive kprobe case.
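
On breakpoint-based architectures the two new hooks can stay as trivial as the
write shown above.  A minimal sketch of what they might look like on
i386/x86_64 (assuming the existing kprobes fields p->addr and p->opcode, where
the original opcode is saved at registration time):

void arch_arm_kprobe(struct kprobe *p)
{
	*p->addr = BREAKPOINT_INSTRUCTION;
	flush_icache_range((unsigned long) p->addr,
			   (unsigned long) p->addr + sizeof(kprobe_opcode_t));
}

void arch_disarm_kprobe(struct kprobe *p)
{
	/* put back the original opcode that registration saved away */
	*p->addr = p->opcode;
	flush_icache_range((unsigned long) p->addr,
			   (unsigned long) p->addr + sizeof(kprobe_opcode_t));
}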
      
So far this patch has been tested on i386, x86_64, and ppc64, but it still
needs to be tested on sparc64.
Signed-off-by: Rusty Lynch <rusty.lynch@intel.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] x86_64 specific function return probes · 73649dab
Committed by Rusty Lynch
      The following patch adds the x86_64 architecture specific implementation
      for function return probes.
      
      Function return probes is a mechanism built on top of kprobes that allows
      a caller to register a handler to be called when a given function exits.
      For example, to instrument the return path of sys_mkdir:
      
      static int sys_mkdir_exit(struct kretprobe_instance *i, struct pt_regs *regs)
      {
      	printk("sys_mkdir exited\n");
      	return 0;
      }
      static struct kretprobe return_probe = {
      	.handler = sys_mkdir_exit,
      };
      
      <inside setup function>
      
      return_probe.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("sys_mkdir");
      if (register_kretprobe(&return_probe)) {
      	printk(KERN_DEBUG "Unable to register return probe!\n");
      	/* do error path */
      }
      
      <inside cleanup function>
      unregister_kretprobe(&return_probe);
      
      The way this works is that:
      
      * At system initialization time, kernel/kprobes.c installs a kprobe
        on a function called kretprobe_trampoline() that is implemented in
  arch/x86_64/kernel/kprobes.c (more on this later).
      
      * When a return probe is registered using register_kretprobe(),
        kernel/kprobes.c will install a kprobe on the first instruction of the
        targeted function with the pre handler set to arch_prepare_kretprobe()
        which is implemented in arch/x86_64/kernel/kprobes.c.
      
      * arch_prepare_kretprobe() will prepare a kretprobe instance that stores:
        - nodes for hanging this instance in an empty or free list
        - a pointer to the return probe
        - the original return address
        - a pointer to the stack address
      
        With all this stowed away, arch_prepare_kretprobe() then sets the return
        address for the targeted function to a special trampoline function called
        kretprobe_trampoline() implemented in arch/x86_64/kernel/kprobes.c
      
      * The kprobe completes as normal, with control passing back to the target
        function that executes as normal, and eventually returns to our trampoline
        function.
      
      * Since a kprobe was installed on kretprobe_trampoline() during system
        initialization, control passes back to kprobes via the architecture
  specific function trampoline_probe_handler(), which will look up the
  instance in a hlist maintained by kernel/kprobes.c, and then call
        the handler function.
      
      * When trampoline_probe_handler() is done, the kprobes infrastructure
  single-steps the original instruction (in this case just a nop), and
        then calls trampoline_post_handler().  trampoline_post_handler() then
        looks up the instance again, puts the instance back on the free list,
        and then makes a long jump back to the original return instruction.
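
A sketch of the return-address swap performed by arch_prepare_kretprobe(), as
described above; the free/used-list helpers are the kernel/kprobes.c ones
alluded to, and their names here are illustrative:

/* kretprobe_trampoline is the probed asm stub in arch/x86_64/kernel/kprobes.c */
void arch_prepare_kretprobe(struct kretprobe *rp, struct pt_regs *regs)
{
	struct kretprobe_instance *ri;

	if ((ri = get_free_rp_inst(rp)) != NULL) {
		ri->rp = rp;
		ri->stack_addr = (void *) regs->rsp;
		/* remember where the function really wants to return to ... */
		ri->ret_addr = (void *) *(unsigned long *) regs->rsp;
		add_rp_inst(ri);
		/* ... then make it return into the trampoline instead */
		*(unsigned long *) regs->rsp =
				(unsigned long) &kretprobe_trampoline;
	} else {
		rp->nmissed++;
	}
}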
      
      So to recap, to instrument the exit path of a function this implementation
      will cause four interruptions:
      
        - A breakpoint at the very beginning of the function allowing us to
          switch out the return address
        - A single step interruption to execute the original instruction that
          we replaced with the break instruction (normal kprobe flow)
        - A breakpoint in the trampoline function where our instrumented function
          returned to
        - A single step interruption to execute the original instruction that
          we replaced with the break instruction (normal kprobe flow)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] xen: x86_64: use more usermode macro · 76381fee
Committed by Vincent Hanquez
Make use of the user_mode macro where possible.  This is useful for Xen
because it will then only need to redefine the macro as a hypervisor call.
Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] xen: x86_64: Add macro for debugreg · e9129e56
Committed by Vincent Hanquez
Add two macros to set and get the debug registers (debugreg) on x86_64.  This
is useful for Xen because it will then only need to redefine each macro as a
hypervisor call.
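
A sketch of what such accessors can look like as plain inline assembly; the
point is that Xen only has to replace these two macros with hypercalls (the
exact names and operand forms in the patch may differ):

/* read/write debug register "register" (0-7) on x86_64 */
#define get_debugreg(var, register)			\
	__asm__("movq %%db" #register ", %0"		\
		: "=r" (var))
#define set_debugreg(value, register)			\
	__asm__("movq %0, %%db" #register		\
		: /* no output */			\
		: "r" (value))
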
Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] x86_64: avoid wasting IRQs · 701067c4
Committed by Natalie Protasevich
I suggest changing the way IRQs are handed out to PCI devices.
      
Currently, each I/O APIC pin gets associated with an IRQ, no matter if the
pin is used or not.  It is expected that each pin can potentially be
engaged by a device inserted into the corresponding PCI slot.  However,
this imposes a severe limitation on systems whose designs employ
many I/O APICs while utilizing only a couple of lines of each, such as the
P64H2 chipset.

It is used in the ES7000, and currently there is no way to boot the system
with more than 9 I/O APICs.
      
The simple change below allows booting a system with, say, 64 (or more) I/O
APICs, each providing one slot, which is otherwise impossible because of the
IRQ gaps created for unused lines on each I/O APIC.  It does not resolve the
problem of the number of devices exceeding the number of possible IRQs, but it
eases the pressure on IRQs on any large system with a potentially large
number of devices.
      
I only implemented this for ACPI boot, since if the system is this big
and uses newer chipsets it is probably (better be!) an ACPI based system
:).  The change is completely "mechanical" and does not alter any internal
structures or the interrupt model/implementation.  The patch works for both
the i386 and x86_64 archs.  It works with MSIs just fine, and should not
interfere with implementations like shared vectors, when they get worked
out and incorporated.
      
To illustrate, below is the interrupt distribution for a 2-cell ES7000 with
20 I/O APICs, and an Ethernet card in the last slot, which should be eth1
and which was not configured because its IRQ exceeded the allowable number (it
actually turned out huge: 480!):
      
      zorro-tb2:~ # cat /proc/interrupts
                 CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
        0:      65716      30012      30007      30002      30009      30010      30010      30010    IO-APIC-edge  timer
        4:        373          0        725        280          0          0          0          0    IO-APIC-edge  serial
        8:          0          0          0          0          0          0          0          0    IO-APIC-edge  rtc
        9:          0          0          0          0          0          0          0          0   IO-APIC-level  acpi
       14:         39          3          0          0          0          0          0          0    IO-APIC-edge  ide0
       16:        108         13          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb1
       18:          0          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb3
       19:         15          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb2
       23:          3          0          0          0          0          0          0          0   IO-APIC-level  ehci_hcd:usb4
       96:       4240        397         18          0          0          0          0          0   IO-APIC-level  aic7xxx
       97:         15          0          0          0          0          0          0          0   IO-APIC-level  aic7xxx
      192:        847          0          0          0          0          0          0          0   IO-APIC-level  eth0
      NMI:          0          0          0          0          0          0          0          0
      LOC:     273423     274528     272829     274228     274092     273761     273827     273694
      ERR:          7
      MIS:          0
      
Even though the system doesn't have that many devices, some don't get
enabled simply because of the IRQ numbering model.
      
      This is the IRQ picture after the patch was applied:
      
      zorro-tb2:~ # cat /proc/interrupts
                 CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
        0:      44169      10004      10004      10001      10004      10003      10004       6135    IO-APIC-edge  timer
        4:        345          0          0          0          0        244          0          0    IO-APIC-edge  serial
        8:          0          0          0          0          0          0          0          0    IO-APIC-edge  rtc
        9:          0          0          0          0          0          0          0          0   IO-APIC-level  acpi
       14:         39          0          3          0          0          0          0          0    IO-APIC-edge  ide0
       17:       4425          0          9          0          0          0          0          0   IO-APIC-level  aic7xxx
       18:         15          0          0          0          0          0          0          0   IO-APIC-level  aic7xxx, uhci_hcd:usb3
       21:        231          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb1
       22:         26          0          0          0          0          0          0          0   IO-APIC-level  uhci_hcd:usb2
       23:          3          0          0          0          0          0          0          0   IO-APIC-level  ehci_hcd:usb4
       24:        348          0          0          0          0          0          0          0   IO-APIC-level  eth0
       25:          6        192          0          0          0          0          0          0   IO-APIC-level  eth1
      NMI:          0          0          0          0          0          0          0          0
      LOC:     107981     107636     108899     108698     108489     108326     108331     108254
      ERR:          7
      MIS:          0
      
Not only do we see the card in the last I/O APIC, but we are not even close
to using up the available IRQs, since we didn't waste any.
Signed-off-by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] x86_64: never block forced SIGSEGV · 0928d6ef
Committed by Roland McGrath
      This is the x86_64 version of the signal fix I just posted for i386.
      
      This problem was first noticed on PPC and has already been fixed there.
      But the exact same issue applies to other platforms in the same way.  The
      signal blocking for sa_mask and the handled signal takes place after the
      handler setup.  When the stack is bogus, the handler setup forces a
      SIGSEGV.  But then this will be blocked, and returning to user mode will
      fault again and iterate.  This patch fixes the problem by checking whether
      signal handler setup failed, and not doing the signal-blocking if so.  This
      copies what was done in the ppc code.  I think all architectures' signal
      handler setup code follows this pattern and needs the change.
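
The resulting shape of handle_signal() is roughly as below; this is a sketch
following the description and the earlier ppc fix, not the exact x86-64 hunk:

	ret = setup_rt_frame(sig, ka, info, oldset, regs);

	if (ret == 0) {
		/* Block sa_mask and the handled signal only if frame setup
		 * succeeded; a failed setup already forced SIGSEGV, which
		 * must stay deliverable. */
		spin_lock_irq(&current->sighand->siglock);
		sigorsets(&current->blocked, &current->blocked,
			  &ka->sa.sa_mask);
		if (!(ka->sa.sa_flags & SA_NODEFER))
			sigaddset(&current->blocked, sig);
		recalc_sigpending();
		spin_unlock_irq(&current->sighand->siglock);
	}

	return ret;
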
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] x86_64: fix hpet for systems that don't support legacy replacement · a3a00751
Committed by john stultz
      Currently the x86-64 HPET code assumes the entire HPET implementation from
      the spec is present.  This breaks on boxes that do not implement the
      optional legacy timer replacement functionality portion of the spec.
      
      This patch fixes this issue, allowing x86-64 systems that cannot use the
      HPET for the timer interrupt and RTC to still use the HPET as a time
source.  I've tested this patch on systems without HPET, on systems with HPET
but without legacy timer replacement, and on systems with HPET with legacy
timer replacement.
      
      This version adds a minor check to cap the HPET counter value in
      gettimeoffset_hpet to avoid possible time inconsistencies.  Please ignore
      the A2 version I sent to you earlier.
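
The capping check amounts to something like the following in
do_gettimeoffset_hpet(); field names follow the x86-64 vxtime bookkeeping of
that era, so treat this as a sketch rather than the exact hunk:

static unsigned int do_gettimeoffset_hpet(void)
{
	/* never account for more than one tick worth of HPET counts,
	 * so a late read cannot produce an inconsistent offset */
	unsigned long t = hpet_readl(HPET_COUNTER) - vxtime.last;

	if (t > hpet_tick)
		t = hpet_tick;

	return (t * vxtime.quot) >> 32;
}
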
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] x86_64: i8259.c iso99 structure initialization · c0a88c98
Committed by Alexander Nyberg
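
"iso99" here refers to ISO C99 designated initializers replacing positional
struct initializers.  An illustrative before/after for the cascade IRQ action
in i8259.c (the field values are assumptions, not the exact hunk):

/* before: positional, fragile against struct irqaction field reordering */
static struct irqaction irq2 = { no_action, 0, CPU_MASK_NONE, "cascade", NULL, NULL };

/* after: C99 designated initializers name the fields explicitly */
static struct irqaction irq2 = {
	.handler = no_action,
	.mask    = CPU_MASK_NONE,
	.name    = "cascade",
};
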
      Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] i386: Selectable Frequency of the Timer Interrupt · 59121003
Committed by Christoph Lameter
      Make the timer frequency selectable. The timer interrupt may cause bus
      and memory contention in large NUMA systems since the interrupt occurs
      on each processor HZ times per second.
Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] allow early printk to use more than 25 lines · 799d19f6
Committed by Jan Beulich
      Allow early printk code to take advantage of the full size of the screen, not
      just the first 25 lines.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] x86/x86_64: pcibus_to_node · 8c5a0908
Committed by Christoph Lameter
      Define pcibus_to_node to be able to figure out which NUMA node contains a
      given PCI device.  This defines pcibus_to_node(bus) in
      include/linux/topology.h and adjusts the macros for i386 and x86_64 that
      already provided a way to determine the cpumask of a pci device.
      
x86_64 was changed to not build an array of cpumasks anymore.  Instead, an
array of nodes is built, which can be used to generate the cpumask via
node_to_cpumask.
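
The generic fallback and the cpumask derivation fit together roughly like this
(a sketch of the asm-generic side; the per-bus node array behind the x86_64
pcibus_to_node() is arch-specific and not shown):

#ifndef pcibus_to_node
#define pcibus_to_node(bus)	(-1)	/* node unknown */
#endif

#ifndef pcibus_to_cpumask
#define pcibus_to_cpumask(bus)					\
	(pcibus_to_node(bus) == -1 ?				\
		CPU_MASK_ALL :					\
		node_to_cpumask(pcibus_to_node(bus)))
#endif
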
Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] Platform SMIs and their interferance with tsc based delay calibration · 8a9e1b0f
Committed by Venkatesh Pallipadi
      Issue:
The current TSC-based delay calibration can result in significant errors in
the loops_per_jiffy count when platform events like SMIs
(System Management Interrupts, which are non-maskable) are present. This could
lead to a kernel panic(). This issue is becoming more visible with the 2.6
kernel (as the default HZ is 1000) and on platforms with higher SMI handling
latencies. During boot, SMIs are mostly used by the BIOS (for things
like legacy keyboard emulation).
      
      Description:
The pseudocode for the current TSC-based delay calibration looks like this:
      (0) Estimate a value for loops_per_jiffy
      (1) While (loops_per_jiffy estimate is accurate enough)
      (2)   wait for jiffy transition (jiffy1)
      (3)   Note down current tsc (tsc1)
      (4)   loop until tsc becomes tsc1 + loops_per_jiffy
      (5)   check whether jiffy changed since jiffy1 or not and refine
      loops_per_jiffy estimate
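
In C the loop looks roughly like this, with the two windows where an SMI skews
the estimate marked; accurate_enough() and refine() are stand-ins for the real
convergence logic, and lpj holds the current loops_per_jiffy estimate:

	while (!accurate_enough(lpj)) {			/* (1) */
		unsigned long j = jiffies;
		cycles_t tsc1;

		while (jiffies == j)			/* (2) wait for a jiffy edge */
			cpu_relax();
		tsc1 = get_cycles();			/* (3) SMI between (2) and (3): lpj too low */
		while (get_cycles() < tsc1 + lpj)	/* (4) SMI between (3) and (4): lpj too high */
			cpu_relax();
		lpj = refine(lpj, jiffies - j);		/* (5) */
	}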
      
      Consider the following cases
      Case 1:
If SMIs happen between (2) and (3) above, we can end up with a
loops_per_jiffy value that is too low. This results in shortened delays and
the kernel can panic() during boot (mostly at IOAPIC timer initialization,
in timer_irq_works(), as we don't have enough timer interrupts in the
specified interval).
      
      Case 2:
If SMIs happen between (3) and (4) above, then we can end up with a
loops_per_jiffy value that is too high. And with the current i386 code, a too
high lpj value (greater than 17M) can result in an overflow in
delay.c:__const_udelay(), again resulting in a shorter delay and a panic().
      
      Solution:
The patch below makes the calibration routine aware of asynchronous events
like SMIs. We increase the delay calibration time and also identify any
significant errors (greater than 12.5%) in the calibration and notify the
user.
      
      Patch below changes both i386 and x86-64 architectures to use this
      new and improved calibrate_delay_direct() routine.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] use ${CROSS_COMPILE}installkernel in arch/*/boot/install.sh · 0f8e2d62
Committed by Ian Campbell
      The attached patch causes the various arch specific install.sh scripts to
      look for ${CROSS_COMPILE}installkernel rather than just installkernel (in
      both /sbin/ and ~/bin/ where the script already did this).  This allows you
      to have e.g.  arm-linux-installkernel as a handy way to install on your
      cross target.  It also prevents the script picking up on the host
      /sbin/installkernel which causes the script to fall through and do the
      install itself (which is what I actually use myself, with $INSTALL_PATH
      set).
      
      I don't believe it causes back-compatibility problems since calling the
      host installkernel was never likely to work or be what you wanted when
      cross compiling anyway.  If $CROSS_COMPILE isn't set then nothing changes.
      
      I only use ARM and i386 myself but I figured it couldn't hurt to do the
      whole lot.  I've cc'd those who I hope are the arch maintainers for files
      that I've touched.
Signed-off-by: Ian Campbell <icampbell@arcom.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] add x86-64 specific support for sparsemem · bbfceef4
Committed by Matt Tolentino
      This patch adds in the necessary support for sparsemem such that x86-64
      kernels may use sparsemem as an alternative to discontigmem for NUMA
kernels.  Note that this does not preclude one from continuing to build NUMA
      kernels using discontigmem, but merely allows the option to build NUMA
      kernels with sparsemem.
      
      Interestingly, the use of sparsemem in lieu of discontigmem in NUMA kernels
      results in reduced text size for otherwise equivalent kernels as shown in
      the example builds below:
      
         text	   data	    bss	    dec	    hex	filename
      2371036	 765884	1237108	4374028	 42be0c	vmlinux.discontig
      2366549	 776484	1302772	4445805	 43d66d	vmlinux.sparse
Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] reorganize x86-64 NUMA and DISCONTIGMEM config options · 2b97690f
Committed by Matt Tolentino
In order to use the alternative sparsemem implementation for NUMA kernels,
      we need to reorganize the config options.  This patch effectively abstracts
      out the CONFIG_DISCONTIGMEM options to CONFIG_NUMA in most cases.  Thus,
      the discontigmem implementation may be employed as always, but the
      sparsemem implementation may be used alternatively.
Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] add x86-64 Kconfig options for sparsemem · 1035faf1
Committed by Matt Tolentino
      Add the requisite arch specific Kconfig options to enable the use of the
      sparsemem implementation for NUMA kernels on x86-64.
Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] remove direct ref to contig_page_data for x86-64 · 07332663
Committed by Matt Tolentino
      This patch pulls out all remaining direct references to contig_page_data
      from arch/x86-64, thus saving an ifdef in one case.
Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] make each arch use mm/Kconfig · 3f22ab27
Committed by Dave Hansen
      For all architectures, this just means that you'll see a "Memory Model"
      choice in your architecture menu.  For those that implement DISCONTIGMEM,
      you may eventually want to make your ARCH_DISCONTIGMEM_ENABLE a "def_bool
      y" and make your users select DISCONTIGMEM right out of the new choice
      menu.  The only disadvantage might be if you have some specific things that
      you need in your help option to explain something about DISCONTIGMEM.
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
3. 22 June 2005, 3 commits
• [PATCH] Avoiding mmap fragmentation · 1363c3cd
Committed by Wolfgang Wander
      Ingo recently introduced a great speedup for allocating new mmaps using the
      free_area_cache pointer which boosts the specweb SSL benchmark by 4-5% and
      causes huge performance increases in thread creation.
      
      The downside of this patch is that it does lead to fragmentation in the
      mmap-ed areas (visible via /proc/self/maps), such that some applications
      that work fine under 2.4 kernels quickly run out of memory on any 2.6
      kernel.
      
      The problem is twofold:
      
        1) the free_area_cache is used to continue a search for memory where
           the last search ended.  Before the change new areas were always
           searched from the base address on.
      
     So now new small areas are cluttering holes of all sizes
     throughout the whole mmap-able region, whereas before, small requests
     tended to fill holes near the base, leaving holes far from the base
     large and available for larger requests.
      
        2) the free_area_cache also is set to the location of the last
           munmap-ed area so in scenarios where we allocate e.g.  five regions of
           1K each, then free regions 4 2 3 in this order the next request for 1K
           will be placed in the position of the old region 3, whereas before we
           appended it to the still active region 1, placing it at the location
           of the old region 2.  Before we had 1 free region of 2K, now we only
           get two free regions of 1K -> fragmentation.
      
The patch addresses these issues by introducing yet another cache descriptor,
cached_hole_size, that contains the largest known hole size below the
current free_area_cache.  If a new request comes in, its size is compared
against the cached_hole_size, and if the request can be filled with a hole
below free_area_cache, the search is started from the base instead.
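
The decision boils down to a check at the top of the unmapped-area search,
simplified here from the generic arch_get_unmapped_area() rather than quoted
from the patch:

	if (len > mm->cached_hole_size) {
		/* no known hole below free_area_cache is big enough:
		 * continue searching upward from where we stopped last time */
		start_addr = addr = mm->free_area_cache;
	} else {
		/* a hole below the cache could fit this request:
		 * restart the search from the base */
		start_addr = addr = TASK_UNMAPPED_BASE;
		mm->cached_hole_size = 0;
	}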
      
      The results look promising: Whereas 2.6.12-rc4 fragments quickly and my
      (earlier posted) leakme.c test program terminates after 50000+ iterations
      with 96 distinct and fragmented maps in /proc/self/maps it performs nicely
      (as expected) with thread creation, Ingo's test_str02 with 20000 threads
      requires 0.7s system time.
      
Taking out Ingo's patch (un-patch available per request) by basically
deleting all mentions of free_area_cache from the kernel and starting the
search for new memory always at the respective bases, we observe: leakme
terminates successfully with 11 distinct, hardly fragmented areas in
/proc/self/maps, but thread creation is grindingly slow: 30+s(!) system
time for Ingo's test_str02 with 20000 threads.
      
      Now - drumroll ;-) the appended patch works fine with leakme: it ends with
      only 7 distinct areas in /proc/self/maps and also thread creation seems
      sufficiently fast with 0.71s for 20000 threads.
Signed-off-by: Wolfgang Wander <wwc@rentec.com>
Credit-to: "Richard Purdie" <rpurdie@rpsys.net>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu> (partly)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] smp_processor_id() cleanup · 39c715b7
Committed by Ingo Molnar
      This patch implements a number of smp_processor_id() cleanup ideas that
      Arjan van de Ven and I came up with.
      
The previous __smp_processor_id/_smp_processor_id/smp_processor_id API
spaghetti was hard to follow on both the implementation and the
usage side.
      
      Some of the complexity arose from picking wrong names, some of the
      complexity comes from the fact that not all architectures defined
      __smp_processor_id.
      
      In the new code, there are two externally visible symbols:
      
       - smp_processor_id(): debug variant.
      
       - raw_smp_processor_id(): nondebug variant. Replaces all existing
         uses of _smp_processor_id() and __smp_processor_id(). Defined
         by every SMP architecture in include/asm-*/smp.h.
      
      There is one new internal symbol, dependent on DEBUG_PREEMPT:
      
       - debug_smp_processor_id(): internal debug variant, mapped to
                                   smp_processor_id().
      
Also, I moved debug_smp_processor_id() from lib/kernel_lock.c into a new
      lib/smp_processor_id.c file.  All related comments got updated and/or
      clarified.
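
The resulting mapping between the three symbols is roughly the following
(a sketch of the include/linux/smp.h side under the names above):

#ifdef CONFIG_DEBUG_PREEMPT
  extern unsigned int debug_smp_processor_id(void);
# define smp_processor_id()	debug_smp_processor_id()
#else
# define smp_processor_id()	raw_smp_processor_id()
#endif
/* raw_smp_processor_id() itself comes from each architecture's asm/smp.h */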
      
      I have build/boot tested the following 8 .config combinations on x86:
      
       {SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT}
      
      I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT.  (Other
      architectures are untested, but should work just fine.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
• [PATCH] x86_64: TASK_SIZE fixes for compatibility mode processes · 84929801
Committed by Suresh Siddha
The appended patch will set up the compatibility mode TASK_SIZE properly.  This
will fix at least three known bugs that can be encountered while running
compatibility mode apps.
      
      a) A malicious 32bit app can have an elf section at 0xffffe000.  During
         exec of this app, we will have a memory leak as insert_vm_struct() is
         not checking for return value in syscall32_setup_pages() and thus not
         freeing the vma allocated for the vsyscall page.  And instead of exec
         failing (as it has addresses > TASK_SIZE), we were allowing it to
         succeed previously.
      
      b) With a 32bit app, hugetlb_get_unmapped_area/arch_get_unmapped_area
         may return addresses beyond 32bits, ultimately causing corruption
         because of wrap-around and resulting in SEGFAULT, instead of returning
         ENOMEM.
      
c) A 32bit app doing the mmap below will now fail.
      
        mmap((void *)(0xFFFFE000UL), 0x10000UL, PROT_READ|PROT_WRITE,
      	MAP_FIXED|MAP_PRIVATE|MAP_ANON, 0, 0);
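
The core of the fix is making TASK_SIZE depend on the thread's 32-bit flag.  A
sketch of the kind of definitions involved, assuming the TIF_IA32 flag and a
3GB/legacy-layout split (the exact constants may differ):

#define IA32_PAGE_OFFSET	((current->personality & ADDR_LIMIT_3GB) ? \
					0xc0000000UL : 0xFFFFe000UL)
#define TASK_SIZE		(test_thread_flag(TIF_IA32) ? \
					IA32_PAGE_OFFSET : TASK_SIZE64)
#define TASK_SIZE_OF(child)	(test_tsk_thread_flag(child, TIF_IA32) ? \
					IA32_PAGE_OFFSET : TASK_SIZE64)
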
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
4. 09 June 2005, 1 commit
  5. 01 June 2005, 2 commits
  6. 29 May 2005, 1 commit
  7. 27 May 2005, 1 commit
• [PATCH] Note on ACPI build fix · 8aadff7d
Committed by Alexander Nyberg
      Even after the previous fix you can still set CONFIG_ACPI_BOOT
      indirectly even without CONFIG_ACPI by choosing CONFIG_PCI and
      CONFIG_PCI_MMCONFIG.
      
      That doesn't build very well either.
      
      This makes PCI_MMCONFIG depend on ACPI, fixing that hole.
      
      [ I guess in theory Kconfig could follow the whole chain of dependencies
        for things that get selected, but that sounds insanely complicated, so
        we'll just fix up these things by hand.  --Linus ]
Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
8. 26 May 2005, 1 commit
  9. 21 May 2005, 1 commit