提交 · b407fc57b815b2016186220baabc76cc8264206e · openeuler / raspberrypi-kernel

30 3月, 2009 3 次提交

x86/paravirt: flush pending mmu updates on context switch · b407fc57

由 Jeremy Fitzhardinge 提交于 2月 17, 2009

Impact: allow preemption during lazy mmu updates

If we're in lazy mmu mode when context switching, leave
lazy mmu mode, but remember the task's state in
TIF_LAZY_MMU_UPDATES.  When we resume the task, check this
flag and re-enter lazy mmu mode if its set.

This sets things up for allowing lazy mmu mode while preemptible,
though that won't actually be active until the next change.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>

b407fc57

x86/pvops: replace arch_enter_lazy_cpu_mode with arch_start_context_switch · 7fd7d83d

由 Jeremy Fitzhardinge 提交于 2月 17, 2009

Impact: simplification, prepare for later changes

Make lazy cpu mode more specific to context switching, so that
it makes sense to do more context-switch specific things in
the callbacks.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>

7fd7d83d

x86/paravirt: remove lazy mode in interrupts · b8bcfe99

由 Jeremy Fitzhardinge 提交于 2月 17, 2009

Impact: simplification, robustness

Make paravirt_lazy_mode() always return PARAVIRT_LAZY_NONE
when in an interrupt.  This prevents interrupt code from
accidentally inheriting an outer lazy state, and instead
does everything synchronously.  Outer batched operations
are left deferred.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>

b8bcfe99

18 3月, 2009 1 次提交

prevent boosting kprobes on exception address · 30390880

由 Masami Hiramatsu 提交于 3月 16, 2009

Don't boost at the addresses which are listed on exception tables,
because major page fault will occur on those addresses.  In that case,
kprobes can not ensure that when instruction buffer can be freed since
some processes will sleep on the buffer.

kprobes-ia64 already has same check.
Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

30390880

17 3月, 2009 2 次提交

Fast TSC calibration: calculate proper frequency error bounds · 9e8912e0

由 Linus Torvalds 提交于 3月 17, 2009

In order for ntpd to correctly synchronize the clocks, the frequency of
the system clock must not be off by more than 500 ppm (or, put another
way, 1:2000), or ntpd will end up giving up on trying to synchronize
properly, and ends up reseting the clock in jumps instead.

The fast TSC PIT calibration sometimes failed this test - it was
assuming that the PIT reads always took about one microsecond each (2us
for the two reads to get a 16-bit timer), and that calibrating TSC to
the PIT over 15ms should thus be sufficient to get much closer than
500ppm (max 2us error on both sides giving 4us over 15ms: a 270 ppm
error value).

However, that assumption does not always hold: apparently some hardware
is either very much slower at reading the PIT registers, or there was
other noise causing at least one machine to get 700+ ppm errors.

So instead of using a fixed 15ms timing loop, this changes the fast PIT
calibration to read the TSC delta over the individual PIT timer reads,
and use the result to calculate the error bars on the PIT read timing
properly.  We then successfully calibrate the TSC only if the maximum
error bars fall below 500ppm.

In the process, we also relax the timing to allow up to 25ms for the
calibration, although it can happen much faster depending on hardware.
Reported-and-tested-by: NJesper Krogh <jesper@krogh.cc>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9e8912e0

Fix potential fast PIT TSC calibration startup glitch · a6a80e1d

由 Linus Torvalds 提交于 3月 17, 2009

During bootup, when we reprogram the PIT (programmable interval timer)
to start counting down from 0xffff in order to use it for the fast TSC
calibration, we should also make sure to delay a bit afterwards to allow
the PIT hardware to actually start counting with the new value.

That will happens at the next CLK pulse (1.193182 MHz), so the easiest
way to do that is to just wait at least one microsecond after
programming the new PIT counter value. We do that by just reading the
counter value back once - which will take about 2us on PC hardware.
Reported-and-tested-by: Njohn stultz <johnstul@us.ibm.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a6a80e1d

11 3月, 2009 1 次提交

x86, sched_clock(): mark variables read-mostly · f24ade3a

由 Ingo Molnar 提交于 3月 10, 2009

Impact: micro-optimization

There's a number of variables in the sched_clock() path that are
in .data/.bss - but not marked __read_mostly. This creates the
danger of accidental false cacheline sharing with some other,
write-often variable.

So mark them __read_mostly.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f24ade3a

10 3月, 2009 3 次提交

percpu: generalize embedding first chunk setup helper · 66c3a757

由 Tejun Heo 提交于 3月 10, 2009

Impact: code reorganization

Separate out embedding first chunk setup helper from x86 embedding
first chunk allocator and put it in mm/percpu.c.  This will be used by
the default percpu first chunk allocator and possibly by other archs.
Signed-off-by: NTejun Heo <tj@kernel.org>

66c3a757

percpu: more flexibility for @dyn_size of pcpu_setup_first_chunk() · 6074d5b0

由 Tejun Heo 提交于 3月 10, 2009

Impact: cleanup, more flexibility for first chunk init

Non-negative @dyn_size used to be allowed iff @unit_size wasn't auto.
This restriction stemmed from implementation detail and made things a
bit less intuitive.  This patch allows @dyn_size to be specified
regardless of @unit_size and swaps the positions of @dyn_size and
@unit_size so that the parameter order makes more sense (static,
reserved and dyn sizes followed by enclosing unit_size).

While at it, add @unit_size >= PCPU_MIN_UNIT_SIZE sanity check.
Signed-off-by: NTejun Heo <tj@kernel.org>

6074d5b0

Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod." · 129f8ae9

由 Dave Jones 提交于 3月 09, 2009

This reverts commit e088e4c9.

Removing the sysfs interface for p4-clockmod was flagged as a
regression in bug 12826.

Course of action:
 - Find out the remaining causes of overheating, and fix them
   if possible. ACPI should be doing the right thing automatically.
   If it isn't, we need to fix that.
 - mark p4-clockmod ui as deprecated
 - try again with the removal in six months.

It's not really feasible to printk about the deprecation, because
it needs to happen at all the sysfs entry points, which means adding
a lot of strcmp("p4-clockmod".. calls to the core, which.. bleuch.
Signed-off-by: NDave Jones <davej@redhat.com>

129f8ae9

08 3月, 2009 1 次提交

x86: UV: remove uv_flush_tlb_others() WARN_ON · 3a450de1

由 Cliff Wickman 提交于 3月 06, 2009

In uv_flush_tlb_others() (arch/x86/kernel/tlb_uv.c),
the "WARN_ON(!in_atomic())" fails if CONFIG_PREEMPT is not enabled.

And CONFIG_PREEMPT is not enabled by default in the distribution that
most UV owners will use.

We could #ifdef CONFIG_PREEMPT the warning, but that is not good form.
And there seems to be no suitable fix to in_atomic() when CONFIG_PREMPT
is not on.

As Ingo commented:

  > and we have no proper primitive to test for atomicity. (mainly
  > because we dont know about atomicity on a non-preempt kernel)

So we drop the WARN_ON.
Signed-off-by: NCliff Wickman <cpw@sgi.com>
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

3a450de1

06 3月, 2009 6 次提交

x86, pebs: correct qualifier passed to ds_write_config() from ds_request_pebs() · 73bf1b62

由 Markus Metzger 提交于 3月 05, 2009

ds_write_config() can write the BTS as well as the PEBS part of
the DS config. ds_request_pebs() passes the wrong qualifier, which
results in the wrong configuration to be written.
Reported-by: NStephane Eranian <eranian@googlemail.com>
Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
LKML-Reference: <20090305085721.A22550@sedona.ch.intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

73bf1b62

x86, bts: remove bad warning · 9ca0791d

由 Markus Metzger 提交于 3月 05, 2009

In case a ptraced task is reaped (while the tracer is still attached),
ds_exit_thread() is called before ptrace_exit(). The latter will
release the bts_tracer and remove the thread's ds_ctx.
The former will WARN() if the context is not NULL.

Oleg Nesterov submitted patches that move ptrace_exit() before
exit_thread() and thus reverse the order of the above calls.

Remove the bad warning. I will add it again when Oleg's changes are in.
Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
LKML-Reference: <20090305084954.A22000@sedona.ch.intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

9ca0791d

x86, percpu: setup reserved percpu area for x86_64 · 6b19b0c2

由 Tejun Heo 提交于 3月 06, 2009

Impact: fix relocation overflow during module load

x86_64 uses 32bit relocations for symbol access and static percpu
symbols whether in core or modules must be inside 2GB of the percpu
segement base which the dynamic percpu allocator doesn't guarantee.
This patch makes x86_64 reserve PERCPU_MODULE_RESERVE bytes in the
first chunk so that module percpu areas are always allocated from the
first chunk which is always inside the relocatable range.

This problem exists for any percpu allocator but is easily triggered
when using the embedding allocator because the second chunk is located
beyond 2GB on it.

This patch also changes the meaning of PERCPU_DYNAMIC_RESERVE such
that it only indicates the size of the area to reserve for dynamic
allocation as static and dynamic areas can be separate.  New
PERCPU_DYNAMIC_RESERVED is increased by 4k for both 32 and 64bits as
the reserved area separation eats away some allocatable space and
having slightly more headroom (currently between 4 and 8k after
minimal boot sans module area) makes sense for common case
performance.

x86_32 can address anywhere from anywhere and doesn't need reserving.

Mike Galbraith first reported the problem first and bisected it to the
embedding percpu allocator commit.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-by: NMike Galbraith <efault@gmx.de>
Reported-by: NJaswinder Singh Rajput <jaswinder@kernel.org>

6b19b0c2

percpu, module: implement reserved allocation and use it for module percpu variables · edcb4639

由 Tejun Heo 提交于 3月 06, 2009

Impact: add reserved allocation functionality and use it for module
	percpu variables

This patch implements reserved allocation from the first chunk.  When
setting up the first chunk, arch can ask to set aside certain number
of bytes right after the core static area which is available only
through a separate reserved allocator.  This will be used primarily
for module static percpu variables on architectures with limited
relocation range to ensure that the module perpcu symbols are inside
the relocatable range.

If reserved area is requested, the first chunk becomes reserved and
isn't available for regular allocation.  If the first chunk also
includes piggy-back dynamic allocation area, a separate chunk mapping
the same region is created to serve dynamic allocation.  The first one
is called static first chunk and the second dynamic first chunk.
Although they share the page map, their different area map
initializations guarantee they serve disjoint areas according to their
purposes.

If arch doesn't setup reserved area, reserved allocation is handled
like any other allocation.
Signed-off-by: NTejun Heo <tj@kernel.org>

edcb4639

x86: make embedding percpu allocator return excessive free space · 9a4f8a87

由 Tejun Heo 提交于 3月 06, 2009

Impact: reduce unnecessary memory usage on certain configurations

Embedding percpu allocator allocates unit_size *
smp_num_possible_cpus() bytes consecutively and use it for the first
chunk.  However, if the static area is small, this can result in
excessive prellocated free space in the first chunk due to
PCPU_MIN_UNIT_SIZE restriction.

This patch makes embedding percpu allocator preallocate only what's
necessary as described by PERPCU_DYNAMIC_RESERVE and return the
leftover to the bootmem allocator.
Signed-off-by: NTejun Heo <tj@kernel.org>

9a4f8a87

percpu: use negative for auto for pcpu_setup_first_chunk() arguments · cafe8816

由 Tejun Heo 提交于 3月 06, 2009

Impact: argument semantic cleanup

In pcpu_setup_first_chunk(), zero @unit_size and @dyn_size meant
auto-sizing.  It's okay for @unit_size as 0 doesn't make sense but 0
dynamic reserve size is valid.  Alos, if arch @dyn_size is calculated
from other parameters, it might end up passing in 0 @dyn_size and
malfunction when the size is automatically adjusted.

This patch makes both @unit_size and @dyn_size ssize_t and use -1 for
auto sizing.
Signed-off-by: NTejun Heo <tj@kernel.org>

cafe8816

05 3月, 2009 5 次提交

[CPUFREQ] Prevent p4-clockmod from auto-binding to the ondemand governor. · 36e8abf3

由 Dave Jones 提交于 3月 05, 2009

The latency of p4-clockmod sucks so hard that scaling on a regular
basis with ondemand is a really bad idea.
Signed-off-by: NMatthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: NDave Jones <davej@redhat.com>

36e8abf3

x86: add Dell XPS710 reboot quirk · dd4124a8

由 Leann Ogasawara 提交于 3月 04, 2009

Dell XPS710 will hang on reboot.  This is resolved by adding a quirk to
set bios reboot.
Signed-off-by: NLeann Ogasawara <leann.ogasawara@canonical.com>
Signed-off-by: NTim Gardner <tim.gardner@canonical.com>
Cc: "manoj.iyer" <manoj.iyer@canonical.com>
Cc: <stable@kernel.org>
LKML-Reference: <1236196380.3231.89.camel@emiko>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

dd4124a8

x86, math-emu: fix init_fpu for task != current · ab9e1858

由 Daniel Glöckner 提交于 3月 04, 2009

Impact: fix math-emu related crash while using GDB/ptrace

init_fpu() calls finit to initialize a task's xstate, while finit always
works on the current task. If we use PTRACE_GETFPREGS on another
process and both processes did not already use floating point, we get
a null pointer exception in finit.

This patch creates a new function finit_task that takes a task_struct
parameter. finit becomes a wrapper that simply calls finit_task with
current. On the plus side this avoids many calls to get_current which
would each resolve to an inline assembler mov instruction.

An empty finit_task has been added to i387.h to avoid linker errors in
case the compiler still emits the call in init_fpu when
CONFIG_MATH_EMULATION is not defined.

The declaration of finit in i387.h has been removed as the remaining
code using this function gets its prototype from fpu_proto.h.
Signed-off-by: NDaniel Glöckner <dg@emlix.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Pallipadi Venkatesh" <venkatesh.pallipadi@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Bill Metzenthen <billm@melbpc.org.au>
LKML-Reference: <E1Lew31-0004il-Fg@mailer.emlix.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

ab9e1858

x86: EFI: Back efi_ioremap with init_memory_mapping instead of FIX_MAP · dd39ecf5

由 Huang Ying 提交于 3月 04, 2009

Impact: Fix boot failure on EFI system with large runtime memory range

Brian Maly reported that some EFI system with large runtime memory
range can not boot. Because the FIX_MAP used to map runtime memory
range is smaller than run time memory range.

This patch fixes this issue by re-implement efi_ioremap() with
init_memory_mapping().
Reported-and-tested-by: NBrian Maly <bmaly@redhat.com>
Signed-off-by: NHuang Ying <ying.huang@intel.com>
Cc: Brian Maly <bmaly@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1236135513.6204.306.camel@yhuang-dev.sh.intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

dd39ecf5

x86: fix DMI on EFI · ff0c0874

由 Brian Maly 提交于 3月 03, 2009

Impact: reactivate DMI quirks on EFI hardware

DMI tables are loaded by EFI, so the dmi calls must happen after
efi_init() and not before.

Currently Apple hardware uses DMI to determine the framebuffer mappings
for efifb. Without DMI working you also have no video on MacBook Pro.

This patch resolves the DMI issue for EFI hardware (DMI is now properly
detected at boot), and additionally efifb now loads on Apple hardware
(i.e. video works).
Signed-off-by: NBrian Maly <bmaly@redhat>
Acked-by: NYinghai Lu <yinghai@kernel.org>
Cc: ying.huang@intel.com
LKML-Reference: <49ADEDA3.1030406@redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

 arch/x86/kernel/setup.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

ff0c0874

03 3月, 2009 2 次提交

x86, signals: fix xine & firefox bustage · 25051702

由 Hiroshi Shimamoto 提交于 3月 02, 2009

Impact: fix bad frame in rt_sigreturn on 64-bit

After commit 97286a2b some applications
fail to return from signal handler:

[  145.150133] firefox[3250] bad frame in rt_sigreturn frame:00007f902b44eb28 ip:352e80b307 sp:7f902b44ef70 orax:ffffffffffffffff in libpthread-2.9.so[352e800000+17000]
[  665.519017] firefox[5420] bad frame in rt_sigreturn frame:00007faa8deaeb28 ip:352e80b307 sp:7faa8deaef70 orax:ffffffffffffffff in libpthread-2.9.so[352e800000+17000]

The root cause is forgetting to keep 64 byte aligned value of
fpstate for next stack pointer calculation.
Reported-by: NJaswinder Singh Rajput <jaswinder@kernel.org>
Reported-by: NMike Galbraith <efault@gmx.de>
Signed-off-by: NHiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
LKML-Reference: <49AC85C1.7060600@ct.jp.nec.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

25051702

x86-64: syscall-audit: fix 32/64 syscall hole · ccbe495c

由 Roland McGrath 提交于 2月 27, 2009

On x86-64, a 32-bit process (TIF_IA32) can switch to 64-bit mode with
ljmp, and then use the "syscall" instruction to make a 64-bit system
call. A 64-bit process make a 32-bit system call with int $0x80.

In both these cases, audit_syscall_entry() will use the wrong system
call number table and the wrong system call argument registers. This
could be used to circumvent a syscall audit configuration that filters
based on the syscall numbers or argument details.
Signed-off-by: NRoland McGrath <roland@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ccbe495c

02 3月, 2009 9 次提交

x86: unify chunks of kernel/process*.c · 389d1fb1

由 Jeremy Fitzhardinge 提交于 2月 27, 2009

With x86-32 and -64 using the same mechanism for managing the
tss io permissions bitmap, large chunks of process*.c are
trivially unifyable, including:

 - exit_thread
 - flush_thread
 - __switch_to_xtra (along with tsc enable/disable)

and as bonus pickups:

 - sys_fork
 - sys_vfork

(Note: asmlinkage expands to empty on x86-64)
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

389d1fb1

x86-32: use non-lazy io bitmap context switching · db949bba

由 Jeremy Fitzhardinge 提交于 2月 27, 2009

Impact: remove 32-bit optimization to prepare unification

x86-32 and -64 differ in the way they context-switch tasks
with io permission bitmaps.  x86-64 simply copies the next
tasks io bitmap into place (if any) on context switch.  x86-32
invalidates the bitmap on context switch, so that the next
IO instruction will fault; at that point it installs the
appropriate IO bitmap.

This makes context switching IO-bitmap-using tasks a bit more
less expensive, at the cost of making the next IO instruction
slower due to the extra fault.  This tradeoff only makes sense
if IO-bitmap-using processes are relatively common, but they
don't actually use IO instructions very often.

However, in a typical desktop system, the only process likely
to be using IO bitmaps is the X server, and nothing at all on
a server.  Therefore the lazy context switch doesn't really win
all that much, and its just a gratuitious difference from
64-bit code.

This patch removes the lazy context switch, with a view to
unifying this code in a later change.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

db949bba

x86_32: apic/numaq_32, fix section mismatch · b6122b38

由 Jiri Slaby 提交于 3月 02, 2009

Remove __cpuinitdata section placement for translation_table
structure, since it is referenced from a functions within .text.
Signed-off-by: NJiri Slaby <jirislaby@gmail.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>

b6122b38

x86_32: apic/summit_32, fix section mismatch · 2fcb1f1f

由 Jiri Slaby 提交于 3月 02, 2009

Remove __init section placement for some functions/data, so that
we don't get section mismatch warnings.

Also make inline function instead of empty setup_summit macro.

[v2]
One of them was not caught by
DEBUG_SECTION_MISMATCH=y
magic. Fix it.
Signed-off-by: NJiri Slaby <jirislaby@gmail.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>

2fcb1f1f

x86_32: apic/es7000_32, fix section mismatch · 871d78c6

由 Jiri Slaby 提交于 3月 02, 2009

Remove __init section placement for some functions, so that we don't
get section mismatch warnings.

[v2]:
2 of them were not caught by
DEBUG_SECTION_MISMATCH=y
magic. Fix it.
Signed-off-by: NJiri Slaby <jirislaby@gmail.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>

871d78c6

x86_32: apic/summit_32, fix cpu_mask_to_apicid · fae176d6

由 Jiri Slaby 提交于 3月 02, 2009

Perform same-cluster checking even for masks with all (nr_cpu_ids)
bits set and report correct apicid on success instead.

While at it, convert it to for_each_cpu and newer cpumask api.
Signed-off-by: NJiri Slaby <jirislaby@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

fae176d6

x86_32: apic/es7000_32, fix cpu_mask_to_apicid · 0edc0b32

由 Jiri Slaby 提交于 3月 02, 2009

Perform same-cluster checking even for masks with all (nr_cpu_ids)
bits set and report BAD_APICID on failure.

While at it, convert it to for_each_cpu.
Signed-off-by: NJiri Slaby <jirislaby@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

0edc0b32

x86_32: apic/es7000_32, cpu_mask_to_apicid cleanup · c2b20cbd

由 Jiri Slaby 提交于 3月 02, 2009

Remove es7000_cpu_mask_to_apicid_cluster completely, because it's
almost the same as es7000_cpu_mask_to_apicid except 2 code paths.
One of them is about to be removed soon, the another should be
BAD_APICID (it's a fail path).

The _cluster one was not invoked on apic->cpu_mask_to_apicid_and
anyway, since there was no _cluster_and variant.

Also use newer cpumask functions.
Signed-off-by: NJiri Slaby <jirislaby@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

c2b20cbd

x86_32: apic/bigsmp_32, de-inline functions · 9694cd6c

由 Jiri Slaby 提交于 3月 02, 2009

The ones which go only into struct apic are de-inlined
by compiler anyway, so remove the inline specifier from them.

Afterwards, remove bigsmp_setup_portio_remap completely as it
is unused.
Signed-off-by: NJiri Slaby <jirislaby@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

9694cd6c

01 3月, 2009 1 次提交

x86: remove double copy of show_cpuinfo_core for 32 and 64 bit · 327f4387

由 Jaswinder Singh Rajput 提交于 2月 28, 2009

Impact: unification

show_cpuinfo_core is identical for 32 and 64 bit and can be unified,
and CONFIG_X86_HT inherently depends on CONFIG_X86_SMP.
Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

327f4387

28 2月, 2009 5 次提交

x86: signal: introduce helper align_sigframe() · 1fae0279

由 Hiroshi Shimamoto 提交于 2月 27, 2009

Impact: cleanup

Introduce helper align_sigframe() to align stack pointer for signal frame.
Signed-off-by: NHiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

1fae0279

x86: signal: unify get_sigframe() · 75779f05

由 Hiroshi Shimamoto 提交于 2月 27, 2009

Impact: cleanup

Unify get_sigframe().
Signed-off-by: NHiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

75779f05

x86: signal: use 16 bytes boundary for rt_sigframe · 36a45265

由 Hiroshi Shimamoto 提交于 2月 27, 2009

Impact: cleanup

Supporting xsave/xrestore introduces 64 bytes boundary for save_i387_xstate().
16 bytes boundary is OK for rt_sigframe.
Signed-off-by: NHiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

36a45265

x86: signal: intrroduce get_sigframe() and replace get_sigstack() · 97286a2b

由 Hiroshi Shimamoto 提交于 2月 27, 2009

Impact: cleanup

Introduce get_sigframe() like 32-bit to replace get_sigstack().
Move the i387 stuff into get_sigframe().
Signed-off-by: NHiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

97286a2b

x86: signal: add __user annotation · 144b0712

由 Hiroshi Shimamoto 提交于 2月 27, 2009

Impact: cleanup

Add missing __user annotation to the parameter of get_sigframe().
Also change cast type to void __user * of *fpstate.
Signed-off-by: NHiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

144b0712

27 2月, 2009 1 次提交

x86: set X86_FEATURE_TSC_RELIABLE · 83ce4009

由 Ingo Molnar 提交于 2月 26, 2009

If the TSC is constant and non-stop, also set it reliable.

(We will turn this off in DMI quirks for multi-chassis systems)

The performance number on a 16-way Nehalem system running
32 tasks that context-switch between each other is significant:

   sched_clock_stable=0		sched_clock_stable=1
   ....................         ....................
   22.456925 million/sec        24.306972 million/sec   [+8.2%]

lmbench's "lat_ctx -s 0 2" goes from 0.63 microseconds to
0.59 microseconds - a 6.7% increase in context-switching
performance.

Perfstat of 1 million pipe context switches between two tasks:

 Performance counter stats for './pipe-test-1m':

       [before]           [after]
   ............      ............
   37621.421089      36436.848378    task clock ticks     (msecs)

              0                 0    CPU migrations       (events)
        2000274           2000189    context switches     (events)
            194               193    pagefaults           (events)
     8433799643        8171016416    CPU cycles           (events) -3.21%
     8370133368        8180999694    instructions         (events) -2.31%
        4158565           3895941    cache references     (events) -6.74%
          44312             46264    cache misses         (events)

    2349.287976       2279.362465    wall-time            (msecs)  -3.06%

The speedup comes straight from the reduction in the instruction
count. sched_clock_cpu() got simpler and the whole workload thus
executes faster.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

83ce4009