1. 08 May 2007, 1 commit
    • SLUB core · 81819f0f
      Christoph Lameter authored
      This is a new slab allocator which was motivated by the complexity of the
      existing code in mm/slab.c. It attempts to address a variety of concerns
      with the existing implementation.
      
      A. Management of object queues
      
         A particular concern was the complex management of the numerous object
         queues in SLAB. SLUB has no such queues. Instead we dedicate a slab for
         each allocating CPU and use objects from a slab directly instead of
         queueing them up.
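
         A minimal sketch of the idea only (field and function names are made
         up for illustration and are not the code in this patch):

            /* Each CPU owns one active slab; allocation pops the next free
             * object off that slab's freelist.  The free pointer is stored
             * inside the free object itself, so no separate queue is needed. */
            struct cpu_slab {
                    void **freelist;        /* next free object in the active slab */
                    struct page *page;      /* slab page this CPU allocates from */
            };

            static inline void *alloc_from_cpu_slab(struct cpu_slab *c)
            {
                    void **object = c->freelist;

                    if (!object)
                            return NULL;    /* slab exhausted: take the slow path */
                    c->freelist = *object;  /* follow the embedded free pointer */
                    return object;
            }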
      
      B. Storage overhead of object queues
      
         SLAB object queues exist per node and per CPU. The alien cache queue even
         has a queue array that contains a queue for each processor on each
         node. For very large systems the number of queues and the number of
         objects that may be caught in those queues grows exponentially. On our
         systems with 1k nodes / processors we have several gigabytes just tied up
         for storing references to objects in those queues. This does not include
         the objects that could be on those queues. One fears that the whole
         memory of the machine could one day be consumed by those queues.
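
         A rough back-of-the-envelope illustration (all numbers below are
         assumed for the example, not measurements from those systems):

            #include <stdio.h>

            int main(void)
            {
                    unsigned long nodes = 1024, cpus = 1024; /* "1k nodes / processors" */
                    unsigned long depth  = 12;               /* assumed queue depth */
                    unsigned long caches = 50;               /* assumed slab caches */
                    unsigned long ptrs   = nodes * cpus * depth * caches;

                    printf("queued object pointers: %lu (~%llu GB of pointers)\n",
                           ptrs, (unsigned long long)ptrs * sizeof(void *) >> 30);
                    return 0;
            }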
      
      C. SLAB meta data overhead
      
         SLAB has overhead at the beginning of each slab. This means that data
         cannot be naturally aligned at the beginning of a slab block. SLUB keeps
         all metadata in the corresponding struct page. Objects can be naturally
         aligned in the slab. For example, a 128-byte object will be aligned at
         128-byte boundaries and can fit tightly into a 4k page with no bytes
         left over. SLAB cannot do this.
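
         For instance, assuming the usual 4k page size:

            /* Illustration only: 128-byte objects tile a 4k slab exactly. */
            #include <stdio.h>

            int main(void)
            {
                    const unsigned long page_size = 4096;  /* assumed 4k page */
                    const unsigned long object    = 128;

                    printf("objects per slab: %lu, bytes left over: %lu\n",
                           page_size / object, page_size % object); /* 32, 0 */
                    return 0;
            }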
      
      D. SLAB has a complex cache reaper
      
         SLUB does not need a cache reaper on UP systems. On SMP systems
         the per-CPU slab may be pushed back onto the partial list, but that
         operation is simple and does not require an iteration over a list
         of objects. SLAB expires per-CPU, shared and alien object queues
         during cache reaping, which may cause strange holdoffs.
      
      E. SLAB has complex NUMA policy layer support
      
         SLUB pushes NUMA policy handling into the page allocator. This means that
         allocation is coarser (SLUB does interleave on a page level) but that
         situation was also present before 2.6.13. SLAB's application of
         policies to individual slab objects is certainly a performance
         concern due to the frequent references to memory policies, which may
         lead to a sequence of objects coming from one node after another. SLUB
         will get a slab full of objects from one node and then will switch to
         the next.
      
      F. Reduction of the size of partial slab lists
      
         SLAB has per node partial lists. This means that over time a large
         number of partial slabs may accumulate on those lists. These can
         only be reused if allocations occur on specific nodes. SLUB has a global
         pool of partial slabs and will consume slabs from that pool to
         decrease fragmentation.
      
      G. Tunables
      
         SLAB has sophisticated tuning abilities for each slab cache. One can
         manipulate the queue sizes in detail. However, filling the queues still
         requires the use of a spin lock to check out slabs. SLUB has a global
         parameter (slub_min_order) for tuning. Increasing the minimum slab
         order can decrease the locking overhead. The bigger the slab order, the
         fewer page movements between the per-CPU and partial lists occur, and
         the better SLUB will scale.
      
      H. Slab merging

         We often have slab caches with similar parameters. SLUB detects those
         at boot and merges them into the corresponding general caches. This
         leads to more effective memory use. About 50% of all caches can
         be eliminated through slab merging. This will also decrease
         slab fragmentation because partially allocated slabs can be filled
         up again. Slab merging can be switched off by specifying
         slub_nomerge at boot.
      
         Note that merging can expose heretofore unknown bugs in the kernel
         because corrupted objects may now be placed differently and corrupt
         different neighboring objects. Enable sanity checks to find those.
      
      I. Diagnostics

         The current slab diagnostics are difficult to use and require a
         recompilation of the kernel. SLUB contains debugging code that
         is always available (but is kept out of the hot code paths).
         SLUB diagnostics can be enabled via the "slub_debug" option.
         Parameters can be specified to select a single slab cache or a
         group of slab caches for diagnostics. This means that the system
         runs with the usual performance and it is much more likely that
         race conditions can be reproduced.
      
      J. Resiliency

         If basic sanity checks are on then SLUB is capable of detecting
         common error conditions and recovering as well as possible to allow
         the system to continue.
      
      K. Tracing

         Tracing can be enabled via the slub_debug=T,<slabcache> option
         during boot. SLUB will then log all actions on that slab cache
         and dump the object contents on free.
      
      L. On-demand DMA cache creation

         Generally DMA caches are not needed. If kmalloc() is used with
         __GFP_DMA then only the single slab cache that is needed is created.
         For systems that have no ZONE_DMA requirement the support is
         completely eliminated.
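
         A sketch of the kind of caller that would trigger creation of the
         DMA kmalloc cache (the helper name is made up; kmalloc() and
         GFP_DMA are the existing interfaces):

            #include <linux/slab.h>
            #include <linux/gfp.h>

            /* Only a caller like this ever needs a DMA slab cache. */
            static void *alloc_isa_dma_buffer(size_t len)
            {
                    return kmalloc(len, GFP_KERNEL | GFP_DMA);
            }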
      
      M. Performance increase

         Some benchmarks have shown speed improvements on kernbench in the
         range of 5-10%. The locking overhead of SLUB is based on the
         underlying base allocation size. If we can reliably allocate
         larger order pages then it is possible to increase SLUB
         performance much further. The anti-fragmentation patches may
         enable further performance increases.
      
      Tested on:
      i386 UP + SMP, x86_64 UP + SMP + NUMA emulation, IA64 NUMA + Simulator
      
      SLUB Boot options
      
      slub_nomerge		Disable merging of slabs
      slub_min_order=x	Require a minimum order for slab caches. This
      			increases the managed chunk size and therefore
      			reduces metadata and locking overhead.
      slub_min_objects=x	Minimum objects per slab. Default is 8.
      slub_max_order=x	Avoid generating slabs larger than order specified.
      slub_debug		Enable all diagnostics for all caches
      slub_debug=<options>	Enable selective options for all caches
      slub_debug=<o>,<cache>	Enable selective options for a certain set of
      			caches
      
      Available Debug options
      F		Double Free checking, sanity and resiliency
      R		Red zoning
      P		Object / padding poisoning
      U		Track last free / alloc
      T		Trace all allocs / frees (only use for individual slabs).
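
      An example combining these options (the cache name "dentry" is only an
      illustration of the <cache> argument):

      	slub_min_order=1 slub_debug=FPU,dentry

      on the kernel command line would require order-1 slabs and enable
      double free checking, poisoning and alloc/free tracking for that one
      cache, while all other caches keep running at full speed.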
      
      To use SLUB: Apply this patch and then select SLUB as the default slab
      allocator.
      
      [hugh@veritas.com: fix an oops-causing locking error]
      [akpm@linux-foundation.org: various stupid cleanups and small fixes]
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 03 May 2007, 5 commits
  3. 13 March 2007, 1 commit
    • [PATCH] Fix VMI and COMPAT_VDSO for 2.6.21 · b6bc5d71
      Zachary Amsden authored
      VMI is broken under COMPAT_VDSO, as Xen and other non-hardware-assisted
      hypervisors will be.  I have been working on a fix for this which works
      for older glibcs that panic when the new relocatable VDSO is used.
      
      However, I believe at this time that the fix is going to be too radical
      to consider at this stage in the release of 2.6.21.  We don't expect
      this config option to be turned on by vendors for new distributions, so
      at this point we are willing to drop support for it when VMI is compiled
      in, and work on a patch for 2.6.22 which more fully addresses the
      problem.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 06 March 2007, 4 commits
  5. 05 March 2007, 1 commit
  6. 20 February 2007, 1 commit
  7. 17 February 2007, 4 commits
  8. 13 February 2007, 2 commits
    • [PATCH] i386: vMI timer patches · bbab4f3b
      Zachary Amsden authored
      VMI timer code.  It works by taking over the local APIC clock when the APIC
      is configured, which requires a couple of hooks into the APIC code.  The backend
      timer code could be commonized into the timer infrastructure, but there are
      some pieces missing (stolen time, in particular), and the exact semantics of
      when to do accounting for NO_IDLE need to be shared between different
      hypervisors as well.  So for now, VMI timer is a separate module.
      
      [Adrian Bunk: cleanups]
      
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
    • [PATCH] i386: vMI backend for paravirt-ops · 7ce0bcfd
      Zachary Amsden authored
      Fairly straightforward implementation of VMI backend for paravirt-ops.
      
      [Adrian Bunk: some cleanups]
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
  9. 12 February 2007, 1 commit
    • [PATCH] Set CONFIG_ZONE_DMA for arches with GENERIC_ISA_DMA · 5ac6da66
      Christoph Lameter authored
      As Andi pointed out: CONFIG_GENERIC_ISA_DMA only disables the ISA DMA
      channel management.  Other functionality may still expect GFP_DMA to
      provide memory below 16M.  So we need to make sure that CONFIG_ZONE_DMA is
      set independently of CONFIG_GENERIC_ISA_DMA.  Undo the modifications to
      mm/Kconfig where we made ZONE_DMA dependent on GENERIC_ISA_DMA and set
      these explicitly in each arch's Kconfig.
      
      Reviews must occur for each arch in order to determine if ZONE_DMA can be
      switched off.  It can only be switched off if we know that all devices
      supported by a platform are capable of performing DMA transfers to all of
      memory (Some arches already support this: uml, avr32, sh, sh64, parisc and
      IA64/Altix).
      
      In order to switch ZONE_DMA off conditionally, one would have to establish
      a scheme by which one can ensure that no drivers are enabled that are only
      capable of doing I/O to a part of memory, or one needs to provide an
      alternate means of performing an allocation from a specific range of memory
      (like that provided by alloc_pages_range()) and ensure that all drivers use
      that call.  In that case the arch's dma_alloc_coherent() may need to be
      modified to call alloc_pages_range() instead of relying on GFP_DMA.
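
      A sketch of the kind of allocation that still depends on ZONE_DMA even
      when ISA DMA channel management is compiled out (the helper is made up;
      alloc_pages() and GFP_DMA are the existing interfaces):

         #include <linux/gfp.h>

         /* A device limited to 24-bit DMA addressing must get its pages
          * from ZONE_DMA (below 16M on x86). */
         static struct page *alloc_low_pages(unsigned int order)
         {
                 return alloc_pages(GFP_KERNEL | GFP_DMA, order);
         }
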
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 06 January 2007, 1 commit
    • [PATCH] i386: Restore CONFIG_PHYSICAL_START option · dd0ec16f
      Vivek Goyal authored
      o Relocatable bzImage support got rid of the CONFIG_PHYSICAL_START option
        on the assumption that it was no longer required, since people can build
        the second kernel as relocatable and load it anywhere, so the need to
        compile the kernel for a custom address was gone. But Magnus uses vmlinux
        images for the second kernel in a Xen environment and wants to continue
        to do so.
      
      o Restoring the CONFIG_PHYSICAL_START option for the time being. I think
        down the line we can get rid of it.
      Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  11. 10 December 2006, 1 commit
    • [PATCH] x86-64: no paravirt for X86_VOYAGER or X86_VISWS · f0f32fcc
      Randy Dunlap authored
      Since Voyager and Visual WS already define ARCH_SETUP,
      it looks like PARAVIRT shouldn't be offered for them.
      
      In file included from arch/i386/kernel/setup.c:63:
      include/asm-i386/mach-visws/setup_arch.h:8:1: warning: "ARCH_SETUP" redefined
      In file included from include/asm/msr.h:5,
                       from include/asm/processor.h:17,
                       from include/asm/thread_info.h:16,
                       from include/linux/thread_info.h:21,
                       from include/linux/preempt.h:9,
                       from include/linux/spinlock.h:49,
                       from include/linux/capability.h:45,
                       from include/linux/sched.h:46,
                       from arch/i386/kernel/setup.c:26:
      include/asm/paravirt.h:163:1: warning: this is the location of the previous definition
      In file included from arch/i386/kernel/setup.c:63:
      include/asm-i386/mach-visws/setup_arch.h:8:1: warning: "ARCH_SETUP" redefined
      In file included from include/asm/msr.h:5,
                       from include/asm/processor.h:17,
                       from include/asm/thread_info.h:16,
                       from include/linux/thread_info.h:21,
                       from include/linux/preempt.h:9,
                       from include/linux/spinlock.h:49,
                       from include/linux/capability.h:45,
                       from include/linux/sched.h:46,
                       from arch/i386/kernel/setup.c:26:
      include/asm/paravirt.h:163:1: warning: this is the location of the previous definition
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
  12. 09 December 2006, 1 commit
    • [PATCH] Generic BUG for i386 · 91768d6c
      Jeremy Fitzhardinge authored
      This makes i386 use the generic BUG machinery.  There are no functional
      changes from the old i386 implementation.
      
      The main advantage in using the generic BUG machinery for i386 is that the
      inlined overhead of BUG is just the ud2a instruction; the file+line(+function)
      information is no longer inlined into the instruction stream.  This reduces
      cache pollution, and makes disassembly work properly.
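
      Callers are unchanged; a sketch of typical use (the surrounding function
      is made up for illustration):

         #include <linux/bug.h>

         static void consume_item(void *item)
         {
                 /* With generic BUG this inlines only the ud2a trap; the
                  * file/line record lives in a separate table section. */
                 BUG_ON(item == NULL);
                 /* ... use item ... */
         }
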
      Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  13. 07 December 2006, 6 commits
  14. 04 October 2006, 1 commit
    • Attack of "the the"s in arch · 4b3f686d
      Matt LaPlante authored
      The patch below corrects multiple occurrences of "the the"
      typos across several files, both in source comments and Kconfig files.
      There is no actual code changed, only text.  Note this only affects the /arch
      directory, and I believe I could find many more elsewhere. :)
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
  15. 02 October 2006, 1 commit
    • [PATCH] Kprobes: Make kprobe modules more portable · 3a872d89
      Ananth N Mavinakayanahalli authored
      In an effort to make kprobe modules more portable, here is a patch that:
      
      o Introduces the "symbol_name" field to struct kprobe (a usage sketch
        follows this list). The symbol->address resolution now happens in the
        kernel in an architecture-agnostic manner. 64-bit powerpc users no
        longer have to specify the ".symbols"
      o Introduces the "offset" field to struct kprobe to allow a user to
        specify an offset into a symbol.
      o The legacy mechanism of specifying the kprobe.addr is still supported.
        However, if both the kprobe.addr and kprobe.symbol_name are specified,
        probe registration fails with an -EINVAL.
      o The symbol resolution code uses kallsyms_lookup_name(). So
        CONFIG_KPROBES now depends on CONFIG_KALLSYMS
      o Apparently kprobe modules were the only legitimate out-of-tree user of
        the kallsyms_lookup_name() EXPORT. Now that the symbol resolution
        happens in-kernel, remove the EXPORT as suggested by Christoph Hellwig
      o Modify tcp_probe.c, which uses the kprobe interface, so as to make it
        work on multiple platforms (in its earlier form, the code wouldn't
        work, say, on powerpc)
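
      A sketch of the new usage (the probed symbol, module boilerplate and
      lack of error handling are for illustration only):

         #include <linux/module.h>
         #include <linux/kprobes.h>

         static int handler_pre(struct kprobe *p, struct pt_regs *regs)
         {
                 printk(KERN_INFO "kprobe hit: %s+0x%x\n",
                        p->symbol_name, p->offset);
                 return 0;
         }

         static struct kprobe kp = {
                 .symbol_name = "do_fork",   /* resolved in-kernel, no address needed */
                 .offset      = 0,           /* optional offset into the symbol */
                 .pre_handler = handler_pre,
         };

         static int __init kp_init(void)  { return register_kprobe(&kp); }
         static void __exit kp_exit(void) { unregister_kprobe(&kp); }

         module_init(kp_init);
         module_exit(kp_exit);
         MODULE_LICENSE("GPL");
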
      Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  16. 27 September 2006, 3 commits
  17. 26 September 2006, 5 commits
  18. 28 August 2006, 1 commit