提交 · d17abcd5417d84cfa8a225160481203a37dc81d4 · openeuler / raspberrypi-kernel

31 3月, 2009 1 次提交

x86: fix mismerge in arch/x86/include/asm/timer.h · 25c1a411

由 Stephen Rothwell 提交于 3月 30, 2009

Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

25c1a411

30 3月, 2009 2 次提交

cpumask: remove node_to_first_cpu · 0451fb2e

由 Rusty Russell 提交于 3月 30, 2009

Everyone defines it, and only one person uses it
(arch/mips/sgi-ip27/ip27-nmi.c).  So just open code it there.
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
Cc: linux-mips@linux-mips.org

0451fb2e

lguest: use KVM hypercalls · 4cd8b5e2

由 Matias Zabaljauregui 提交于 3月 14, 2009

Impact: cleanup

This patch allow us to use KVM hypercalls

Signed-off-by: Matias Zabaljauregui <zabaljauregui at gmail.com>
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>

4cd8b5e2

28 3月, 2009 1 次提交

generic compat_sys_ustat · 2b1c6bd7

由 Christoph Hellwig 提交于 11月 28, 2008

Due to a different size of ino_t ustat needs a compat handler, but
currently only x86 and mips provide one.  Add a generic compat_sys_ustat
and switch all architectures over to it.  Instead of doing various
user copy hacks compat_sys_ustat just reimplements sys_ustat as
it's trivial.  This was suggested by Arnd Bergmann.

Found by Eric Sandeen when running xfstests/017 on ppc64, which causes
stack smashing warnings on RHEL/Fedora due to the too large amount of
data writen by the syscall.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

2b1c6bd7

27 3月, 2009 1 次提交

x86: headers cleanup - setup.h · 17d14040

由 Cyrill Gorcunov 提交于 1月 14, 2009

Impact: cleanup

'make headers_check' warn us about leaking of kernel private
(mostly compile time vars) data to userspace in headers. Fix it.

Guard this one by __KERNEL__.
Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

17d14040

26 3月, 2009 1 次提交

Revert "x86: don't compile vsmp_64 for 32bit" · 70511134

由 Ravikiran G Thirumalai 提交于 3月 23, 2009

Partial revert of commit 129d8bc8
titled 'x86: don't compile vsmp_64 for 32bit'

Commit reverted to compile vsmp_64.c if CONFIG_X86_64 is defined,
since is_vsmp_box() needs to indicate that TSCs are not synchronized, and
hence, not a valid time source, even when CONFIG_X86_VSMP is not defined.
Signed-off-by: NRavikiran Thirumalai <kiran@scalex86.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: shai@scalex86.org
LKML-Reference: <20090324061429.GH7278@localdomain>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

70511134

25 3月, 2009 1 次提交

x86: use default_cpu_mask_to_apicid for 64bit · f56e5034

由 Yinghai Lu 提交于 3月 24, 2009

Impact: cleanup

Use online_mask directly on 64bit too.
Signed-off-by: NYinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <49C94DAE.9070300@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f56e5034

24 3月, 2009 17 次提交

KVM: Report IRQ injection status to userspace. · 4925663a

由 Gleb Natapov 提交于 2月 04, 2009

IRQ injection status is either -1 (if there was no CPU found
that should except the interrupt because IRQ was masked or
ioapic was misconfigured or ...) or >= 0 in that case the
number indicates to how many CPUs interrupt was injected.
If the value is 0 it means that the interrupt was coalesced
and probably should be reinjected.
Signed-off-by: NGleb Natapov <gleb@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

4925663a

x86: Add EFER descriptions for FFXSR · d2062693

由 Alexander Graf 提交于 2月 02, 2009

AMD k10 includes support for the FFXSR feature, which leaves out
XMM registers on FXSAVE/FXSAVE when the EFER_FFXSR bit is set in
EFER.

The CPUID feature bit exists already, but the EFER bit is missing
currently, so this patch adds it to the list of known EFER bits.
Signed-off-by: NAlexander Graf <agraf@suse.de>
CC: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

d2062693

KVM: Avoid using CONFIG_ in userspace visible headers · 91b2ae77

由 Avi Kivity 提交于 1月 19, 2009

Kconfig symbols are not available in userspace, and are not stripped by
headers-install.  Avoid their use by adding #defines in <asm/kvm.h> to
suit each architecture.
Signed-off-by: NAvi Kivity <avi@redhat.com>

91b2ae77

KVM: MMU: Rename "metaphysical" attribute to "direct" · f6e2c02b

由 Avi Kivity 提交于 1月 11, 2009

This actually describes what is going on, rather than alerting the reader
that something strange is going on.
Signed-off-by: NAvi Kivity <avi@redhat.com>

f6e2c02b

KVM: Move struct kvm_pio_request into x86 kvm_host.h · 1c08364c

由 Avi Kivity 提交于 1月 04, 2009

This is an x86 specific stucture and has no business living in common code.
Signed-off-by: NAvi Kivity <avi@redhat.com>

1c08364c

KVM: PIT: provide an option to disable interrupt reinjection · 52d939a0

由 Marcelo Tosatti 提交于 12月 30, 2008

Certain clocks (such as TSC) in older 2.6 guests overaccount for lost
ticks, causing severe time drift. Interrupt reinjection magnifies the
problem.

Provide an option to disable it.

[avi: allow room for expansion in case we want to disable reinjection
      of other timers]
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

52d939a0

KVM: introduce kvm_read_guest_virt, kvm_write_guest_virt · 77c2002e

由 Izik Eidus 提交于 12月 29, 2008

This commit change the name of emulator_read_std into kvm_read_guest_virt,
and add new function name kvm_write_guest_virt that allow writing into a
guest virtual address.
Signed-off-by: NIzik Eidus <ieidus@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

77c2002e

KVM: VMX: initialize TSC offset relative to vm creation time · 53f658b3

由 Marcelo Tosatti 提交于 12月 11, 2008

VMX initializes the TSC offset for each vcpu at different times, and
also reinitializes it for vcpus other than 0 on APIC SIPI message.

This bug causes the TSC's to appear unsynchronized in the guest, even if
the host is good.

Older Linux kernels don't handle the situation very well, so
gettimeofday is likely to go backwards in time:

http://www.mail-archive.com/kvm@vger.kernel.org/msg02955.html
http://sourceforge.net/tracker/index.php?func=detail&aid=2025534&group_id=180599&atid=893831

Fix it by initializating the offset of each vcpu relative to vm creation
time, and moving it from vmx_vcpu_reset to vmx_vcpu_setup, out of the
APIC MP init path.
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

53f658b3

KVM: MMU: Segregate mmu pages created with different cr4.pge settings · 2f0b3d60

由 Avi Kivity 提交于 12月 21, 2008

Don't allow a vcpu with cr4.pge cleared to use a shadow page created with
cr4.pge set; this might cause a cr3 switch not to sync ptes that have the
global bit set (the global bit has no effect if !cr4.pge).

This can only occur on smp with different cr4.pge settings for different
vcpus (since a cr4 change will resync the shadow ptes), but there's no
cost to being correct here.
Signed-off-by: NAvi Kivity <avi@redhat.com>

2f0b3d60

KVM: MMU: Inherit a shadow page's guest level count from vcpu setup · a770f6f2

由 Avi Kivity 提交于 12月 21, 2008

Instead of "calculating" it on every shadow page allocation, set it once
when switching modes, and copy it when allocating pages.

This doesn't buy us much, but sets up the stage for inheriting more
information related to the mmu setup.
Signed-off-by: NAvi Kivity <avi@redhat.com>

a770f6f2

KVM: x86: Virtualize debug registers · 42dbaa5a

由 Jan Kiszka 提交于 12月 15, 2008

So far KVM only had basic x86 debug register support, once introduced to
realize guest debugging that way. The guest itself was not able to use
those registers.

This patch now adds (almost) full support for guest self-debugging via
hardware registers. It refactors the code, moving generic parts out of
SVM (VMX was already cleaned up by the KVM_SET_GUEST_DEBUG patches), and
it ensures that the registers are properly switched between host and
guest.

This patch also prepares debug register usage by the host. The latter
will (once wired-up by the following patch) allow for hardware
breakpoints/watchpoints in guest code. If this is enabled, the guest
will only see faked debug registers without functionality, but with
content reflecting the guest's modifications.

Tested on Intel only, but SVM /should/ work as well, but who knows...

Known limitations: Trapping on tss switch won't work - most probably on
Intel.

Credits also go to Joerg Roedel - I used his once posted debugging
series as platform for this patch.
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

42dbaa5a

KVM: New guest debug interface · d0bfb940

由 Jan Kiszka 提交于 12月 15, 2008

This rips out the support for KVM_DEBUG_GUEST and introduces a new IOCTL
instead: KVM_SET_GUEST_DEBUG. The IOCTL payload consists of a generic
part, controlling the "main switch" and the single-step feature. The
arch specific part adds an x86 interface for intercepting both types of
debug exceptions separately and re-injecting them when the host was not
interested. Moveover, the foundation for guest debugging via debug
registers is layed.

To signal breakpoint events properly back to userland, an arch-specific
data block is now returned along KVM_EXIT_DEBUG. For x86, the arch block
contains the PC, the debug exception, and relevant debug registers to
tell debug events properly apart.

The availability of this new interface is signaled by
KVM_CAP_SET_GUEST_DEBUG. Empty stubs for not yet supported archs are
provided.

Note that both SVM and VTX are supported, but only the latter was tested
yet. Based on the experience with all those VTX corner case, I would be
fairly surprised if SVM will work out of the box.
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

d0bfb940

KVM: VMX: Support for injecting software exceptions · 8ab2d2e2

由 Jan Kiszka 提交于 12月 15, 2008

VMX differentiates between processor and software generated exceptions
when injecting them into the guest. Extend vmx_queue_exception
accordingly (and refactor related constants) so that we can use this
service reliably for the new guest debugging framework.
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: NAvi Kivity <avi@redhat.com>

8ab2d2e2

KVM: SVM: Add VMRUN handler · 3d6368ef

由 Alexander Graf 提交于 11月 25, 2008

This patch implements VMRUN. VMRUN enters a virtual CPU and runs that
in the same context as the normal guest CPU would run.
So basically it is implemented the same way, a normal CPU would do it.

We also prepare all intercepts that get OR'ed with the original
intercepts, as we do not allow a level 2 guest to be intercepted less
than the first level guest.

v2 implements the following improvements:

- fixes the CPL check
- does not allocate iopm when not used
- remembers the host's IF in the HIF bit in the hflags

v3:

- make use of the new permission checking
- add support for V_INTR_MASKING_MASK

v4:

- use host page backed hsave

v5:

- remove IOPM merging code

v6:

- save cr4 so PAE l1 guests work

v7:

- return 0 on vmrun so we check the MSRs too
- fix MSR check to use the correct variable
Acked-by: NJoerg Roedel <joro@8bytes.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NAvi Kivity <avi@redhat.com>

3d6368ef

KVM: SVM: Implement GIF, clgi and stgi · 1371d904

由 Alexander Graf 提交于 11月 25, 2008

This patch implements the GIF flag and the clgi and stgi instructions that
set this flag. Only if the flag is set (default), interrupts can be received by
the CPU.

To keep the information about that somewhere, this patch adds a new hidden
flags vector. that is used to store information that does not go into the
vmcb, but is SVM specific.

I tried to write some code to make -no-kvm-irqchip work too, but the first
level guest won't even boot with that atm, so I ditched it.

v2 moves the hflags to x86 generic code
v3 makes use of the new permission helper
v6 only enables interrupt_window if GIF=1
Acked-by: NJoerg Roedel <joro@8bytes.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NAvi Kivity <avi@redhat.com>

1371d904

KVM: SVM: Move EFER and MSR constants to generic x86 code · 9962d032

由 Alexander Graf 提交于 11月 25, 2008

MSR_EFER_SVME_MASK, MSR_VM_CR and MSR_VM_HSAVE_PA are set in KVM
specific headers. Linux does have nice header files to collect
EFER bits and MSR IDs, so IMHO we should put them there.

While at it, I also changed the naming scheme to match that
of the other defines.

(introduced in v6)
Acked-by: NJoerg Roedel <joro@8bytes.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NAvi Kivity <avi@redhat.com>

9962d032

x86/dmi: fix dmi_alloc() section mismatches · c8608d6b

由 Jeremy Fitzhardinge 提交于 3月 22, 2009

Impact: section mismatch fix

Ingo reports these warnings:
> WARNING: vmlinux.o(.text+0x6a288e): Section mismatch in reference from
> the function dmi_alloc() to the function .init.text:extend_brk()
> The function dmi_alloc() references
> the function __init extend_brk().
> This is often because dmi_alloc lacks a __init annotation or the
> annotation of extend_brk is wrong.

dmi_alloc() is a static inline, and so should be immune to this
kind of error.  But force it to be inlined and make it __init
anyway, just to be extra sure.

All of dmi_alloc()'s callers are already __init.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49C6B23C.2040308@goop.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

c8608d6b

23 3月, 2009 1 次提交

x86: e820 fix various signedness issues in setup.c and e820.c · ba639039

由 Jaswinder Singh Rajput 提交于 3月 23, 2009

Impact: cleanup

This fixed various signedness issues in setup.c and e820.c:
arch/x86/kernel/setup.c:455:53: warning: incorrect type in argument 3 (different signedness)
arch/x86/kernel/setup.c:455:53: expected int *pnr_map
arch/x86/kernel/setup.c:455:53: got unsigned int extern [toplevel] *<noident>
arch/x86/kernel/setup.c:639:53: warning: incorrect type in argument 3 (different signedness)
arch/x86/kernel/setup.c:639:53: expected int *pnr_map
arch/x86/kernel/setup.c:639:53: got unsigned int extern [toplevel] *<noident>
arch/x86/kernel/setup.c:820:54: warning: incorrect type in argument 3 (different signedness)
arch/x86/kernel/setup.c:820:54: expected int *pnr_map
arch/x86/kernel/setup.c:820:54: got unsigned int extern [toplevel] *<noident>

arch/x86/kernel/e820.c:670:53: warning: incorrect type in argument 3 (different signedness)
arch/x86/kernel/e820.c:670:53: expected int *pnr_map
arch/x86/kernel/e820.c:670:53: got unsigned int [toplevel] *<noident>
Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>

ba639039

20 3月, 2009 1 次提交

x86, CPA: Add set_pages_arrayuc and set_pages_array_wb · 0f350755

由 venkatesh.pallipadi@intel.com 提交于 3月 19, 2009

Add new interfaces:

  set_pages_array_uc()
  set_pages_array_wb()

that can be used change the page attribute for a bunch of pages with
flush etc done once at the end of all the changes. These interfaces
are similar to existing set_memory_array_uc() and set_memory_array_wc().
Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: arjan@infradead.org
Cc: eric@anholt.net
Cc: airlied@redhat.com
LKML-Reference: <20090319215358.901545000@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

0f350755

19 3月, 2009 1 次提交

x86: with the last user gone, remove set_pte_present · 71ff49d7

由 Jeremy Fitzhardinge 提交于 3月 18, 2009

Impact: cleanup

set_pte_present() is no longer used, directly or indirectly,
so remove it.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Xen-devel <xen-devel@lists.xensource.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
LKML-Reference: <1237406613-2929-2-git-send-email-jeremy@goop.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

71ff49d7

18 3月, 2009 6 次提交

x86: add x2apic_wrmsr_fence() to x2apic flush tlb paths · ce4e240c

由 Suresh Siddha 提交于 3月 17, 2009

Impact: optimize APIC IPI related barriers

Uncached MMIO accesses for xapic are inherently serializing and hence
we don't need explicit barriers for xapic IPI paths.

x2apic MSR writes/reads don't have serializing semantics and hence need
a serializing instruction or mfence, to make all the previous memory
stores globally visisble before the x2apic msr write for IPI.

Add x2apic_wrmsr_fence() in flush tlb path to x2apic specific paths.
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "steiner@sgi.com" <steiner@sgi.com>
Cc: Nick Piggin <npiggin@suse.de>
LKML-Reference: <1237313814.27006.203.camel@localhost.localdomain>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

ce4e240c

x86, ioapic: Fix non atomic allocation with interrupts disabled · 05c3dc2c

由 Suresh Siddha 提交于 3月 16, 2009

Impact: fix possible race

save_mask_IO_APIC_setup() was using non atomic memory allocation while getting
called with interrupts disabled. Fix this by splitting this into two different
function. Allocation part save_IO_APIC_setup() now happens before
disabling interrupts.
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

05c3dc2c

x86, x2apic: cleanup the IO-APIC level migration with interrupt-remapping · 0280f7c4

由 Suresh Siddha 提交于 3月 16, 2009

Impact: simplification

In the current code, for level triggered migration, we need to modify the
io-apic RTE with the update vector information, along with modifying interrupt
remapping table entry(IRTE) with vector and destination. This is to ensure that
remote IRR bit inthe IOAPIC RTE gets cleared when the cpu does EOI.

With this patch, for level triggered, we eliminate the io-apic RTE modification
(with the updated vector information), by using a virtual vector (io-apic pin
number). Real vector that is used for interrupting cpu will be coming from
the interrupt-remapping table entry. Trigger mode in the IRTE will always be
edge, and the actual level or edge trigger will be setup in the IO-APIC RTE.
So a level triggered interrupt will appear as an edge to the local apic
cpu but still as level to the IO-APIC.

With this change, level irq migration can be done by simply modifying
the interrupt-remapping table entry with out changing the io-apic RTE.
And as the interrupt appears as edge at the cpu, in addition to do the
local apic EOI, we need to do IO-APIC directed EOI to clear the remote
IRR bit in the IO-APIC RTE.

This simplies the irq migration in the presence of interrupt-remapping.
Idea-by: NRajesh Sankaran <rajesh.sankaran@intel.com>
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

0280f7c4

x86, x2apic: fix clear_local_APIC() in the presence of x2apic · cf6567fe

由 Suresh Siddha 提交于 3月 16, 2009

Impact: cleanup, paranoia

We were not clearing the local APIC in clear_local_APIC() in the
presence of x2apic. Fix it.
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

cf6567fe

x86, x2apic: enable fault handling for intr-remapping · 9d783ba0

由 Suresh Siddha 提交于 3月 16, 2009

Impact: interface augmentation (not yet used)

Enable fault handling flow for intr-remapping aswell. Fault handling
code now shared by both dma-remapping and intr-remapping.
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

9d783ba0

x86/brk: make the brk reservation symbols inaccessible from C · 0b1c723d

由 Jeremy Fitzhardinge 提交于 3月 14, 2009

Impact: bulletproofing, clarification

The brk reservation symbols are just there to document the amount
of space reserved by brk users in the final vmlinux file.  Their
addresses are irrelevent, and using their addresses will cause
certain havok.  Name them ".brk.NAME", which is a valid asm symbol
but C can't reference it; it also highlights their special
role in the symbol table.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

0b1c723d

17 3月, 2009 2 次提交

dma-debug: x86 architecture bindings · 2118d0c5

由 Joerg Roedel 提交于 1月 09, 2009

Impact: make use of DMA-API debugging code in x86
Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>

2118d0c5

x86, paravirt: prevent gcc from generating the wrong addressing mode · 42854dc0

由 Jeremy Fitzhardinge 提交于 3月 16, 2009

Impact: fix crash on VMI (VMware)

When we generate a call sequence for calling a paravirtualized
function, we presume that the generated code is "call *0xXXXXX",
which is a 6 byte opcode; this is larger than a normal
direct call, and so we can patch a direct call over it.

At the moment, however we give gcc enough rope to hang us by
putting the address in a register and generating a two byte
indirect-via-register call.  Prevent this by explicitly
dereferencing the function pointer and passing it into the
asm as a constant.

This prevents crashes in VMI, as it cannot handle unpatchable
callsites.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Alok Kataria <akataria@vmware.com>
LKML-Reference: <49BEEDC2.2070809@goop.org>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

42854dc0

15 3月, 2009 5 次提交

x86: allow extend_brk users to reserve brk space · 796216a5

由 Jeremy Fitzhardinge 提交于 3月 12, 2009

Impact: new interface; remove hard-coded limit

Add RESERVE_BRK(name, size) macro to reserve space in the brk
area.  This should be a conservative (ie, larger) estimate of
how much space might possibly be required from the brk area.
Any unused space will be freed, so there's no real downside
on making the reservation too large (within limits).

The name should be unique within a given file, and somewhat
descriptive.

The C definition of RESERVE_BRK() ends up being more complex than
one would expect to work around a cluster of gcc infelicities:

  The first attempt was to simply try putting __section(.brk_reservation)
  on a variable.  This doesn't work because it ends up making it a
  @progbits section, which gets actual space allocated in the vmlinux
  executable.

  The second attempt was to emit the space into a section using asm,
  but gcc doesn't allow arguments to be passed to file-level asm()
  statements, making it hard to pass in the size.

  The final attempt is to wrap the asm() in a function to allow
  it to have arguments, and put the function itself into the
  .discard section, which vmlinux*.lds drops entirely from the
  emitted vmlinux.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

796216a5

x86-32: compute initial mapping size more accurately · 7543c1de

由 Yinghai Lu 提交于 3月 12, 2009

Impact: simplification

We only need to map the kernel in head_32.S, not the whole of
lowmem.  We use 512MB as a reasonable (but arbitrary) limit on
the maximum size of the kernel image.
Signed-off-by: NYinghai Lu <yinghai@kernel.org>
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

7543c1de

x86: use brk allocation for DMI · 6de6cb44

由 Jeremy Fitzhardinge 提交于 2月 27, 2009

Impact: use new interface instead of previous ad hoc implementation

Use extend_brk() to allocate memory for DMI rather than having an
ad-hoc allocator.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

6de6cb44

x86-32: use brk segment for allocating initial kernel pagetable · ccf3fe02

由 Jeremy Fitzhardinge 提交于 2月 27, 2009

Impact: use new interface instead of previous ad hoc implementation

Rather than having special purpose init_pg_table_start/end variables
to delimit the kernel pagetable built by head_32.S, just use the brk
mechanism to extend the bss for the new pagetable.

This patch removes init_pg_table_start/end and pg0, defines __brk_base
(which is page-aligned and immediately follows _end), initializes
the brk region to start there, and uses it for the 32-bit pagetable.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

ccf3fe02

x86: add brk allocation for very, very early allocations · 93dbda7c

由 Jeremy Fitzhardinge 提交于 2月 26, 2009

Impact: new interface

Add a brk()-like allocator which effectively extends the bss in order
to allow very early code to do dynamic allocations.  This is better than
using statically allocated arrays for data in subsystems which may never
get used.

The space for brk allocations is in the bss ELF segment, so that the
space is mapped properly by the code which maps the kernel, and so
that bootloaders keep the space free rather than putting a ramdisk or
something into it.

The bss itself, delimited by __bss_stop, ends before the brk area
(__brk_base to __brk_limit).  The kernel text, data and bss is reserved
up to __bss_stop.

Any brk-allocated data is reserved separately just before the kernel
pagetable is built, as that code allocates from unreserved spaces
in the e820 map, potentially allocating from any unused brk memory.
Ultimately any unused memory in the brk area is used in the general
kernel memory pool.

Initially the brk space is set to 1MB, which is probably much larger
than any user needs (the largest current user is i386 head_32.S's code
to build the pagetables to map the kernel, which can get fairly large
with a big kernel image and no PSE support).  So long as the system
has sufficient memory for the bootloader to reserve the kernel+1MB brk,
there are no bad effects resulting from an over-large brk.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

93dbda7c