Commits · 42cbd8efb0746b55112de45173219f76c54390da · openeuler / raspberrypi-kernel

18 Dec, 2010 2 commits

x86, kexec: Limit the crashkernel address appropriately · 7f8595bf

Committed by H. Peter Anvin on Dec 16, 2010

Keep the crash kernel address below 512 MiB for 32 bits and 896 MiB
for 64 bits.  For 32 bits, this retains compatibility with earlier
kernel releases, and makes it work even if the vmalloc= setting is
adjusted.

For 64 bits, we should be able to increase this substantially once a
hard-coded limit in kexec-tools is fixed.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20101217195035.GE14502@redhat.com>

7f8595bf
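The limits named in 7f8595bf can be illustrated with a small range check. This is only a sketch, not the kernel's code: `crash_addr_max()` and `crash_range_ok()` are hypothetical helpers, and it assumes the limit applies to the end of the reserved range rather than only to its start.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MIB(n) ((uint64_t)(n) << 20)

/* Upper bounds quoted in the commit message:
 * 512 MiB for 32-bit kernels, 896 MiB for 64-bit kernels. */
static uint64_t crash_addr_max(bool is_64bit)
{
    return is_64bit ? MIB(896) : MIB(512);
}

/* Hypothetical helper: true if [base, base + size) ends at or below
 * the limit. Written to avoid overflow in base + size. */
static bool crash_range_ok(uint64_t base, uint64_t size, bool is_64bit)
{
    uint64_t limit = crash_addr_max(is_64bit);

    assert(size > 0);
    return base <= limit && size <= limit - base;
}
```

With these assumptions, a `crashkernel=128M@448M` style request would be rejected on 32 bits (it ends at 576 MiB) but accepted on 64 bits.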

Revert "x86: allocate space within a region top-down" · 5e52f1c5

Committed by Bjorn Helgaas on Dec 16, 2010

This reverts commit 1af3c2e4.

Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

5e52f1c5

18 Nov, 2010 1 commit

x86, amd-nb: Complete the rename of AMD NB and related code · eec1d4fa

Committed by Hans Rosenfeld on Oct 29, 2010

Not only the naming of the files was confusing, it was even more so for
the function and variable names.

Renamed the K8 NB and NUMA stuff that is also used on other AMD
platforms. This also renames the CONFIG_K8_NUMA option to
CONFIG_AMD_NUMA and the related file k8topology_64.c to
amdtopology_64.c. No functional changes intended.

Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

eec1d4fa

27 Oct, 2010 2 commits

x86: allocate space within a region top-down · 1af3c2e4

Committed by Bjorn Helgaas on Oct 26, 2010

Request that allocate_resource() use available space from high addresses
first, rather than the default of using low addresses first.

The most common place this makes a difference is when we move or assign
new PCI device resources. Low addresses are generally scarce, so it's
better to use high addresses when possible. This follows Windows practice
for PCI allocation.

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228#c42
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

1af3c2e4
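The difference between bottom-up and top-down placement described in 1af3c2e4 can be sketched for a single free window. This is not `allocate_resource()` itself: `struct window` and `place()` are hypothetical names, and the sketch assumes one contiguous free window with inclusive bounds and a power-of-two alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Free window [start, end], inclusive at both ends. */
struct window { uint64_t start, end; };

/* Place `size` bytes, aligned to `align` (a power of two), inside the
 * window. Returns false if the request does not fit. */
static bool place(struct window w, uint64_t size, uint64_t align,
                  bool top_down, uint64_t *out)
{
    assert(size > 0 && align > 0 && (align & (align - 1)) == 0);

    if (w.end < w.start || w.end - w.start < size - 1)
        return false;

    uint64_t lo = (w.start + align - 1) & ~(align - 1); /* lowest aligned start */
    uint64_t hi = (w.end - size + 1) & ~(align - 1);    /* highest aligned start */

    if (lo < w.start || lo > hi) /* rounding overflowed, or no aligned slot */
        return false;

    *out = top_down ? hi : lo;
    return true;
}
```

For a window of 0xC0000000-0xFEBFFFFF and a 1 MiB request, the bottom-up choice is 0xC0000000 and the top-down choice is 0xFEB00000, which leaves the scarce low addresses untouched.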

x86: update iomem_resource end based on CPU physical address capabilities · 419afdf5

Committed by Bjorn Helgaas on Oct 26, 2010

The iomem_resource map reflects the available physical address space.
We statically initialize the end to -1, i.e., 0xffffffff_ffffffff, but
of course we can only use as much as the CPU can address.

This patch updates the end based on the CPU capabilities, so we don't
mistakenly allocate space that isn't usable, as we're likely to do when
allocating from the top-down.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

419afdf5
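The arithmetic behind 419afdf5 is small enough to show directly: with N physical address bits, the highest usable address is 2^N - 1. The helper name `phys_addr_end()` is hypothetical, and how the kernel obtains the bit count is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Highest physical address a CPU with `phys_bits` address bits can
 * generate. */
static uint64_t phys_addr_end(unsigned int phys_bits)
{
    assert(phys_bits >= 1 && phys_bits <= 64);

    if (phys_bits == 64)
        return UINT64_MAX; /* shifting a 64-bit value by 64 is undefined */

    return (UINT64_C(1) << phys_bits) - 1;
}
```

A 36-bit CPU therefore ends at 0xFFFFFFFFF, far below the static default of 0xffffffff_ffffffff, which is why a top-down allocator needs the corrected end.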

21 Oct, 2010 1 commit

x86-32, mm: Add an initial page table for core bootstrapping · b40827fa

Committed by Borislav Petkov on Aug 28, 2010

This patch adds an initial page table with low mappings used exclusively
for booting APs/resuming after ACPI suspend/machine restart. After this,
there's no need to add low mappings to swapper_pg_dir and zap them later
or create own swsusp PGD page solely for ACPI sleep needs - we have
initial_page_table for that.

Signed-off-by: Borislav Petkov <bp@alien8.de>
LKML-Reference: <20101020070526.GA9588@liondog.tnic>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

b40827fa

14 Oct, 2010 1 commit

x86-64: Only set max_pfn_mapped to 512 MiB if we enter via head_64.S · 67e87f0a

Committed by Jeremy Fitzhardinge on Oct 13, 2010

head_64.S maps up to 512 MiB, but that is not necessarily true for
other entry paths, such as Xen.

Thus, co-locate the setting of max_pfn_mapped with the code to
actually set up the page tables in head_64.S.  The 32-bit code is
already so co-located.  (The Xen code already sets max_pfn_mapped
correctly for its own use case.)

-v2:

 Yinghai fixed the following bug in this patch:

 |
 | max_pfn_mapped is in .bss section, so we need to set that
 | after bss get cleared. Without that we crash on bootup.
 |
 | That is safe because Xen does not call x86_64_start_kernel().
 |

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Fixed-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
LKML-Reference: <4CB6AB24.9020504@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

67e87f0a
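As a worked figure for 67e87f0a: max_pfn_mapped counts page frames, so the 512 MiB mapped by head_64.S corresponds to 512 MiB divided by the page size. The sketch below assumes 4 KiB pages; the helper names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12u /* assumption: 4 KiB pages */

/* Convert a byte count to a page frame count, rounding down. */
static uint64_t bytes_to_pfn(uint64_t bytes)
{
    return bytes >> PAGE_SHIFT_4K;
}

/* Page frame limit covered by the early 512 MiB mapping. */
static uint64_t early_max_pfn_mapped(void)
{
    uint64_t mapped_bytes = UINT64_C(512) << 20; /* 512 MiB, per the commit message */
    uint64_t pfn = bytes_to_pfn(mapped_bytes);

    assert((pfn << PAGE_SHIFT_4K) == mapped_bytes); /* 512 MiB is page aligned */
    return pfn;
}
```

Under that assumption the value is 131072 page frames (0x20000).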

06 Oct, 2010 1 commit

x86, memblock: Fix crashkernel allocation · 9f4c1396

Committed by Yinghai Lu on Oct 05, 2010

Cai Qian found crashkernel is broken with the x86 memblock changes.

1. crashkernel=128M@32M always reported that the range is used, even if
   the first kernel is small and does not use that range

2. we always got the following report when using "kexec -p"
	Could not find a free area of memory of a000 bytes...
	locate_hole failed

The root cause is that generic memblock_find_in_range() will try to
allocate from the top of the range, whereas the kexec code was written
assuming that allocation was always near the bottom and that it could
blindly extend memory upward.  Unfortunately the kexec code doesn't
have a system for requesting the range that it really needs, so this
is subject to probabilistic failures.

This patch hacks around the problem by limiting the target range
heuristically to below the traditional bzImage max range.  This number
is arbitrary and not always correct, and a much better result would be
obtained by having kexec communicate this number based on the kernel
header information and any appropriate command line options.

Reported-and-Bisected-by: CAI Qian <caiqian@redhat.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4CABAF2A.5090501@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

9f4c1396

21 Sep, 2010 2 commits

jump label: Make dynamic no-op selection available outside of ftrace · f49aa448

Committed by Jason Baron on Sep 17, 2010

Move Steve's code for finding the best 5-byte no-op from ftrace.c to
alternative.c. The idea is that other consumers (in this case jump label)
want to make use of that code.

Signed-off-by: Jason Baron <jbaron@redhat.com>
LKML-Reference: <96259ae74172dcac99c0020c249743c523a92e18.1284733808.git.jbaron@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

f49aa448

x86, k8: Rename k8.[ch] to amd_nb.[ch] and CONFIG_K8_NB to CONFIG_AMD_NB · 23ac4ae8

Committed by Andreas Herrmann on Sep 17, 2010

The file names are somewhat misleading as the code is not specific to
AMD K8 CPUs anymore. The files accommodate code for other AMD CPU
northbridges as well.

The same is true for the config option, which is valid for AMD CPU
northbridges in general and not specific to K8.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20100917160343.GD4958@loge.amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

23ac4ae8

28 Aug, 2010 4 commits

x86: Remove old bootmem code · 774ea0bc

Committed by Yinghai Lu on Aug 25, 2010

Requested by Ingo, Thomas and HPA.

The old bootmem code is no longer necessary, and the transition is
complete.  Remove it.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

774ea0bc

x86, memblock: Use memblock_memory_size()/memblock_free_memory_size() to get correct dma_reserve · 6f2a7536

Committed by Yinghai Lu on Aug 25, 2010

memblock_memory_size() returns the memory size in memblock.memory.region.
memblock_free_memory_size() returns the free memory size in memblock.memory.region.

So we can get the exact reserved size in a specified range.

Set the size right after initmem_init(), because later the bootmem API will
get areas above 16M (except for some fallbacks).

Later, after we remove bootmem, we could call that just before paging_init().

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

6f2a7536

x86, memblock: Replace e820_/_early string with memblock_ · a9ce6bc1

Committed by Yinghai Lu on Aug 25, 2010

1. Include linux/memblock.h directly, so that references to e820.h can be reduced later.
2. This patch was done mainly by sed scripts.

-v2: use MEMBLOCK_ERROR instead of -1ULL or -1UL

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

a9ce6bc1

x86: Use memblock to replace early_res · 72d7c3b3

Committed by Yinghai Lu on Aug 25, 2010

1. replace find_e820_area with memblock_find_in_range
2. replace reserve_early with memblock_x86_reserve_range
3. replace free_early with memblock_x86_free_range.
4. NO_BOOTMEM will switch to use memblock too.
5. use _e820, _early wrappers in this patch; a following patch will
   replace them all
6. because memblock_x86_free_range() supports partial free, we can remove some special care
7. Need to make sure that memblock_find_in_range() is called after memblock_x86_fill()
   so adjust some calls to be later in setup.c::setup_arch()
   -- corruption_check and mptable_update

-v2: Move reserve_brk() early
    Before fill_memblock_area, to avoid overlap between brk and memblock_find_in_range()
    that could happen when we have more than 128 RAM entries in the E820 tables, and
    memblock_x86_fill() could use memblock_find_in_range() to find a new place for
    the memblock.memory.region array.
    We also don't need to use extend_brk() after fill_memblock_area(),
    so move reserve_brk() early, before fill_memblock_area().
-v3: Move find_smp_config early
    To make sure memblock_find_in_range() does not find a wrong place if the BIOS
    doesn't put the mptable in the right place.
-v4: Treat RESERVED_KERN as RAM in memblock.memory; those ranges are already in
    memblock.reserved.
    use __NOT_KEEP_MEMBLOCK to make sure memblock related code could be freed later.
-v5: Generic version __memblock_find_in_range() is going from high to low, and
    active_region for 32bit does include high pages
    need to replace the limit with memblock.default_alloc_limit, aka get_max_mapped()
-v6: Use current_limit instead
-v7: check with MEMBLOCK_ERROR instead of -1ULL or -1L
-v8: Set memblock_can_resize early to handle EFI with more RAM entries
-v9: update after kmemleak changes in mainline

Suggested-by: David S. Miller <davem@davemloft.net>
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

72d7c3b3

26 Aug, 2010 1 commit

x86, bios: Make the x86 early memory reservation a kernel option · 9ea77bdb

Committed by H. Peter Anvin on Aug 25, 2010

Add a kernel command-line option so the x86 early memory reservation
size can be adjusted at runtime instead of only at compile time.

Suggested-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <tip-d0cd7425@git.kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

9ea77bdb

25 Aug, 2010 1 commit

x86, bios: By default, reserve the low 64K for all BIOSes · d0cd7425

Committed by H. Peter Anvin on Aug 24, 2010

The laundry list of BIOSes that need the low 64K reserved is getting
very long, so make it the default across all BIOSes.  This also allows
the code to be simplified and unified with the reservation code for
the first 4K.

This resolves kernel bugzilla 16661 and who knows what else...

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
LKML-Reference: <tip-*@git.kernel.org>

d0cd7425

24 Aug, 2010 1 commit

x86, vmware: Remove deprecated VMI kernel support · 9863c90f

Committed by Alok Kataria on Aug 23, 2010

With the recent innovations in CPU hardware acceleration technologies
from Intel and AMD, VMware ran a few experiments to compare these
techniques to guest paravirtualization technique on VMware's platform.
These hardware assisted virtualization techniques have outperformed the
performance benefits provided by VMI in most of the workloads. VMware
expects that these hardware features will be ubiquitous in a couple of
years, as a result, VMware has started a phased retirement of this
feature from the hypervisor.

Please note that VMI has always been an optimization and non-VMI kernels
still work fine on VMware's platform.
The latest versions of VMware's products which support VMI are
Workstation 7.0 and VSphere 4.0 on the ESX side; future maintenance
releases for these products will continue supporting VMI.

For more details about the VMI retirement, see:
http://blogs.vmware.com/guestosguide/2009/09/vmi-retirement.html

This feature removal was scheduled for 2.6.37 back in September 2009.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
LKML-Reference: <1282600151.19396.22.camel@ank32.eng.vmware.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

9863c90f

19 Aug, 2010 1 commit

x86-32: Separate 1:1 pagetables from swapper_pg_dir · fd89a137

Committed by Joerg Roedel on Aug 16, 2010

This patch fixes machine crashes which occur when heavily exercising the
CPU hotplug codepaths on a 32-bit kernel. These crashes are caused by
AMD Erratum 383 and result in a fatal machine check exception. Here's
the scenario:

1. On 32-bit, the swapper_pg_dir page table is used as the initial page
table for booting a secondary CPU.

2. To make this work, swapper_pg_dir needs a direct mapping of physical
memory in it (the low mappings). By adding those low, large page (2M)
mappings (PAE kernel), we create the necessary conditions for Erratum
383 to occur.

3. Other CPUs which do not participate in the off- and onlining game may
use swapper_pg_dir while the low mappings are present (when leave_mm is
called). For all steps below, the CPU referred to is a CPU that is using
swapper_pg_dir, and not the CPU which is being onlined.

4. The presence of the low mappings in swapper_pg_dir can result
in TLB entries for addresses below __PAGE_OFFSET to be established
speculatively. These TLB entries are marked global and large.

5. When the CPU with such TLB entry switches to another page table, this
TLB entry remains because it is global.

6. The process then generates an access to an address covered by the
above TLB entry but there is a permission mismatch - the TLB entry
covers a large global page not accessible to userspace.

7. Due to this permission mismatch a new 4kb, user TLB entry gets
established. Further, Erratum 383 provides for a small window of time
where both TLB entries are present. This results in an uncorrectable
machine check exception signalling a TLB multimatch which panics the
machine.

There are two ways to fix this issue:

        1. Always do a global TLB flush when a new cr3 is loaded and the
        old page table was swapper_pg_dir. I consider this a hack that is
        hard to understand and has performance implications.

        2. Do not use swapper_pg_dir to boot secondary CPUs like 64-bit
        does.

This patch implements solution 2. It introduces a trampoline_pg_dir
which has the same layout as swapper_pg_dir with low_mappings. This page
table is used as the initial page table of the booting CPU. Later in the
bringup process, it switches to swapper_pg_dir and does a global TLB
flush. This fixes the crashes in our test cases.

-v2: switch to swapper_pg_dir right after entering start_secondary() so
that we are able to access percpu data which might not be mapped in the
trampoline page table.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
LKML-Reference: <20100816123833.GB28147@aftab>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

fd89a137

13 Aug, 2010 1 commit

x86, cleanup: Remove obsolete boot_cpu_id variable · f6e9456c

Committed by Robert Richter on Jul 21, 2010

boot_cpu_id is there for historical reasons and was renamed to
boot_cpu_physical_apicid in patch:

 c70dcb74 x86: change boot_cpu_id to boot_cpu_physical_apicid

However, there are some remaining occurrences of boot_cpu_id that are
never touched in the kernel and thus its value is always 0.

This patch removes boot_cpu_id completely.

Signed-off-by: Robert Richter <robert.richter@amd.com>
LKML-Reference: <1279731838-1522-8-git-send-email-robert.richter@amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

f6e9456c

19 Jun, 2010 1 commit

x86, olpc: Add support for calling into OpenFirmware · fd699c76

Committed by Andres Salomon on Jun 18, 2010

Add support for saving OFW's cif, and later calling into it to run OFW
commands.  OFW remains resident in memory, living within virtual range
0xff800000 - 0xffc00000.  A single page directory entry points to the
pgdir that OFW actually uses, so rather than saving the entire page
table, we grab and install that one entry permanently in the kernel's
page table.

This is currently only used by the OLPC XO.  Note that this particular
calling convention breaks PAE and PAT, and so cannot be used on newer
x86 hardware.

Signed-off-by: Andres Salomon <dilinger@queued.net>
LKML-Reference: <20100618174653.7755a39a@dev.queued.net>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

fd699c76

25 May, 2010 1 commit

x86, setup: Phoenix BIOS fixup is needed on Dell Inspiron Mini 1012 · 3d6e77a3

Committed by Gabor Gombas on May 24, 2010

The low-memory corruption checker triggers during suspend/resume, so we
need to reserve the low 64k.  Don't be fooled that the BIOS identifies
itself as "Dell Inc.", it's still Phoenix BIOS.

[ hpa: I think we blacklist almost every BIOS in existence.  We should
either change this to a whitelist or just make it unconditional. ]

Signed-off-by: Gabor Gombas <gombasg@digikabel.hu>
LKML-Reference: <201005241913.o4OJDIMM010877@imap1.linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@kernel.org>

3d6e77a3

21 May, 2010 1 commit

x86, kgdb: early trap init for early debug · 29c84391

Committed by Jan Kiszka on May 20, 2010

Allow the x86 arch to have early exception processing for the purpose
of debugging via kgdb.

Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>

29c84391

02 Apr, 2010 1 commit

ibft, x86: Change reserve_ibft_region() to find_ibft_region() · 042be38e

Committed by Yinghai Lu on Apr 01, 2010

This allows arch code to decide how to reserve the ibft.

And we should reserve the ibft as early as possible, instead of at the BOOTMEM
stage, in case the table is in a RAM range and is not reserved by the BIOS
(this will often be the case.)

Move to just after find_smp_config().

Also, when CONFIG_NO_BOOTMEM=y, we will not have reserve_bootmem() anymore.

-v2: fix typo about ibft pointed out by Konrad Rzeszutek Wilk <konrad@darnok.org>

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4BB510FB.80601@kernel.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Jones <pjones@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad@kernel.org>
CC: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

042be38e

30 Mar, 2010 2 commits

include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6

Committed by Tejun Heo on Mar 24, 2010

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability.  As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and tries to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   widely available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

5a0e3ad6

x86: Make sure free_init_pages() frees pages on page boundary · c967da6a

Committed by Yinghai Lu on Mar 28, 2010

When CONFIG_NO_BOOTMEM=y, it could use memory more efficiently, or
in a more compact fashion.

Example:

 Allocated new RAMDISK: 00ec2000 - 0248ce57
 Move RAMDISK from 000000002ea04000 - 000000002ffcee56 to 00ec2000 - 0248ce56

The new RAMDISK's end is not page aligned.
Last page could be shared with other users.

When free_init_pages() is called for initrd or .init, the page
could be freed and we could corrupt other data.

code segment in free_init_pages():

 |        for (; addr < end; addr += PAGE_SIZE) {
 |                ClearPageReserved(virt_to_page(addr));
 |                init_page_count(virt_to_page(addr));
 |                memset((void *)(addr & ~(PAGE_SIZE-1)),
 |                        POISON_FREE_INITMEM, PAGE_SIZE);
 |                free_page(addr);
 |                totalram_pages++;
 |        }

The last half page could be used as one whole free page.

So page align the boundaries.

-v2: make the original initramdisk aligned, according to
     Johannes, otherwise we have the chance to lose one page.
     we still need to keep initrd_end not aligned, otherwise it could
     confuse the decompressor.
-v3: change to WARN_ON instead, suggested by Johannes.
-v4: use PAGE_ALIGN, suggested by Johannes.
     We may fix that macro name later to PAGE_ALIGN_UP, and PAGE_ALIGN_DOWN
     Add comments about assuming ramdisk start is aligned
     in relocate_initrd(), change to re-get ramdisk_image instead of saving it
     to make the diff smaller. Add warning for wrong range, suggested by Johannes.
-v6: remove one WARN()
     We need to align beginning in free_init_pages()
     do not copy more than ramdisk_size, noticed by Johannes

Reported-by: Stanislaw Gruszka <sgruszka@redhat.com>
Tested-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Miller <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1269830604-26214-3-git-send-email-yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

c967da6a
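The rounding involved in c967da6a can be checked against the RAMDISK addresses quoted in its message. The sketch below shows only the alignment arithmetic, not the kernel's `free_init_pages()`; it assumes 4 KiB pages, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES UINT64_C(4096) /* assumption: 4 KiB pages */

/* Round an address down to the start of its page. */
static uint64_t page_align_down(uint64_t addr)
{
    return addr & ~(PAGE_SIZE_BYTES - 1);
}

/* Round an address up to the next page boundary (no-op if aligned). */
static uint64_t page_align_up(uint64_t addr)
{
    assert(addr <= UINT64_MAX - (PAGE_SIZE_BYTES - 1)); /* no wrap-around */
    return page_align_down(addr + PAGE_SIZE_BYTES - 1);
}
```

The relocated RAMDISK ends at 0x0248ce57, which lies inside the page starting at 0x0248c000; rounding up gives 0x0248d000. The unaligned end means the last page is only partly owned by the RAMDISK, which is the situation the commit guards against.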

25 Feb, 2010 1 commit

x86: Do not reserve brk for DMI if it's not going to be used · e808bae2

Committed by Thadeu Lima de Souza Cascardo on Feb 09, 2010

This will save 64K bytes of memory when loading Linux if DMI is
disabled, which is good for embedded systems.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
LKML-Reference: <1265758732-19320-1-git-send-email-cascardo@holoscopio.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

e808bae2

13 Feb, 2010 1 commit

x86: Make 64 bit use early_res instead of bootmem before slab · 08677214

Committed by Yinghai Lu on Feb 10, 2010

Finally we can use early_res to replace bootmem for x86_64 now.

CONFIG_NO_BOOTMEM can still be used to enable it or not.

-v2: fix 32bit compiling about MAX_DMA32_PFN
-v3: folded bug fix from LKML message below

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4B747239.4070907@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

08677214

11 Feb, 2010 2 commits

x86: Only call dma32_reserve_bootmem 64bit !CONFIG_NUMA · c252a5bb

Committed by Yinghai Lu on Feb 10, 2010

64bit NUMA already makes enough space under 4G with the new early_node_mem.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1265793639-15071-16-git-send-email-yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

c252a5bb

x86: Call early_res_to_bootmem one time · 1842f90c

Committed by Yinghai Lu on Feb 10, 2010

Simplify setup_node_mem: don't use bootmem from other node, instead
just find_e820_area in early_node_mem.

This keeps the boundary between early_res and boot mem more clear, and
lets us only call early_res_to_bootmem() one time instead of for all
nodes.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1265793639-15071-12-git-send-email-yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

1842f90c

02 Feb, 2010 1 commit

x86: Remove BIOS data range from e820 · 1b5576e6

Committed by Yinghai Lu on Jan 22, 2010

In preparation for moving to the generic page_is_ram(), make explicit
what we expect to be reserved and not reserved.

Tested-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20100122033004.335813103@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

1b5576e6

30 Jan, 2010 1 commit

x86: Add quirk for Intel DG45FC board to avoid low memory corruption · 7c099ce1

Committed by David Härdeman on Jan 28, 2010

Commit 6aa542a6 added a quirk for the
Intel DG45ID board due to low memory corruption. The Intel DG45FC
shares the same BIOS (and the same bug) as noted in:

  http://bugzilla.kernel.org/show_bug.cgi?id=13736

Signed-off-by: David Härdeman <david@hardeman.nu>
LKML-Reference: <20100128200254.GA9134@hardeman.nu>
Cc: <stable@kernel.org>
Cc: Alexey Fisher <bug-track@fisher-privat.net>
Cc: ykzhao <yakui.zhao@intel.com>
Cc: Tony Bones <aabonesml@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

7c099ce1

11 Dec, 2009 1 commit

x86: Use find_e820() instead of hard coded trampoline address · 893f38d1

Committed by Yinghai Lu on Dec 10, 2009

Jens found the following crash/regression:

[    0.000000] found SMP MP-table at [ffff8800000fdd80] fdd80
[    0.000000] Kernel panic - not syncing: Overlapping early reservations 12-f011 MP-table mpc to 0-fff BIOS data page

and

[    0.000000] Kernel panic - not syncing: Overlapping early reservations 12-f011 MP-table mpc to 6000-7fff TRAMPOLINE

and bisected it to b24c2a92 ("x86: Move find_smp_config()
earlier and avoid bootmem usage").

It turns out the BIOS is using the first 64k for mptable,
without reserving it.

So try to find a good range for the real-mode trampoline instead of
hard coding it, in case some BIOS tries to use that range for something.

Reported-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Jens Axboe <jens.axboe@oracle.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
LKML-Reference: <4B21630A.6000308@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

893f38d1
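The two panics quoted in 893f38d1 are both plain interval overlaps, which a short check makes visible. The sketch assumes the ranges printed in the panic message are inclusive at both ends; `struct range` and `ranges_overlap()` are illustrative names, not the early_res implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Address range [start, end], assumed inclusive at both ends. */
struct range { uint64_t start, end; };

/* Two inclusive ranges overlap when each one starts no later than
 * the other one ends. */
static bool ranges_overlap(struct range a, struct range b)
{
    assert(a.start <= a.end && b.start <= b.end);
    return a.start <= b.end && b.start <= a.end;
}
```

With the MP-table at 0x12-0xf011, both the BIOS data page (0x0-0xfff) and the hard-coded trampoline (0x6000-0x7fff) collide with it, so a fixed trampoline address cannot work on that machine.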

01 Dec, 2009 1 commit

x86: Fix a section mismatch in arch/x86/kernel/setup.c · 9eaa192d

Committed by Helight.Xu on Nov 30, 2009

copy_edd() should be __init.
warning msg:
WARNING: vmlinux.o(.text+0x7759): Section mismatch in reference from the
function copy_edd() to the variable .init.data:boot_params
The function copy_edd() references
the variable __initdata boot_params.
This is often because copy_edd lacks a __initdata
annotation or the annotation of boot_params is wrong.

Signed-off-by: ZhenwenXu <helight.xu@gmail.com>
LKML-Reference: <4B139F8F.4000907@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

9eaa192d

24 Nov, 2009 1 commit

x86: Move find_smp_config() earlier and avoid bootmem usage · b24c2a92

Committed by Yinghai Lu on Nov 24, 2009

Move the find_smp_config() call to before bootmem is initialized.
Use reserve_early() instead of reserve_bootmem() in it.

This simplifies the code, we only need to call find_smp_config()
once and can remove the now unneeded reserve parameter from
x86_init_mpparse::find_smp_config.

We thus also reduce x86's dependency on bootmem allocations.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4B0BB9F2.70907@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

b24c2a92

23 Nov, 2009 1 commit

x86: Change crash kernel to reserve via reserve_early() · 44280733

Committed by Yinghai Lu on Nov 22, 2009

Use find_e820_area()/reserve_early() instead.

-v2: address Eric's request to restore the original semantics:
     fail if the provided address cannot be used.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <4B09E2F9.7040403@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

44280733

17 Nov, 2009 2 commits

x86, mm: Report state of NX protections during boot · 4b0f3b81

Committed by Kees Cook on Nov 13, 2009

It is possible for x86_64 systems to lack the NX bit either due to the
hardware lacking support or the BIOS having turned off the CPU capability,
so NX status should be reported. Additionally, anyone booting NX-capable
CPUs in 32bit mode without PAE will lack NX functionality, so this change
provides feedback for that case as well.

Signed-off-by: Kees Cook <kees.cook@canonical.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <1258154897-6770-6-git-send-email-hpa@zytor.com>

4b0f3b81

x86, mm: Clean up and simplify NX enablement · 4763ed4d

Committed by H. Peter Anvin on Nov 13, 2009

The 32- and 64-bit code used very different mechanisms for enabling
NX, but even the 32-bit code was enabling NX in head_32.S if it is
available.  Furthermore, we had a bewildering collection of tests for
the availability of NX.

This patch:

a) merges the 32-bit set_nx() and the 64-bit check_efer() function
   into a single x86_configure_nx() function.  EFER control is left
   to the head code.

b) eliminates the nx_enabled variable entirely.  Things that need to
   test for NX enablement can verify __supported_pte_mask directly,
   and cpu_has_nx gives the supported status of NX.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Vegard Nossum <vegardno@ifi.uio.no>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Chris Wright <chrisw@sous-sol.org>
LKML-Reference: <1258154897-6770-5-git-send-email-hpa@zytor.com>
Acked-by: Kees Cook <kees.cook@canonical.com>

4763ed4d

12 Nov, 2009 1 commit

x86: Make sure wakeup trampoline code is below 1MB · 196cf0d6

Committed by Yinghai Lu on Nov 10, 2009

Instead of using bootmem, try find_e820_area()/reserve_early(),
and call acpi_reserve_memory() early, to allocate the wakeup
trampoline code area below 1M.

This is more reliable, and it also removes a dependency on
bootmem.

-v2: change function name to acpi_reserve_wakeup_memory(),
     as suggested by Rafael.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: pm list <linux-pm@lists.linux-foundation.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4AFA210B.3020207@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

196cf0d6

10 Nov, 2009 1 commit

x86: Under BIOS control, restore AP's APIC_LVTTHMR to the BSP value · a2202aa2

Committed by Yong Wang on Nov 10, 2009

On platforms where the BIOS handles the thermal monitor interrupt,
APIC_LVTTHMR on each logical CPU is programmed to generate an SMI
and the OS must not touch it.

Unfortunately the AP bringup sequence using INIT-SIPI-SIPI clears all
the LVT entries except the mask bit. Essentially this results in
all LVT entries, including the thermal monitoring interrupt, being set
to masked (clearing the BIOS-programmed value for APIC_LVTTHMR).

This leads to the kernel taking over the thermal monitoring
interrupt on APs but not on the BSP (leaving the BIOS-programmed
value only on the BSP).

As a result of this, we have seen system hangs when the thermal
monitoring interrupt is generated.

Fix this by reading the initial value of the thermal LVT entry on
the BSP and, if the BIOS has taken over control, programming the
same value on all APs and leaving the thermal monitoring
interrupt control on all the logical CPUs to the BIOS.

Signed-off-by: Yong Wang <yong.y.wang@intel.com>
Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Arjan van de Ven <arjan@infradead.org>
LKML-Reference: <20091110013824.GA24940@ywang-moblin2.bj.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: stable@kernel.org

a2202aa2
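The decision described in a2202aa2 can be sketched as a check on the LVT entry's delivery-mode field. This is a simplified illustration, not the patch itself: the macro and function names are hypothetical, and it assumes the usual local APIC LVT layout (delivery mode in bits 8-10, with SMI encoded as 010b, and the mask bit at bit 16).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LVT_DELIVERY_MODE_MASK 0x700u      /* bits 8-10 of an LVT entry */
#define LVT_DELIVERY_MODE_SMI  0x200u      /* delivery mode 010b */
#define LVT_MASKED             (1u << 16)  /* interrupt mask bit */

/* True if the LVT value routes the interrupt to the BIOS as an SMI. */
static bool lvt_is_bios_smi(uint32_t lvt)
{
    return (lvt & LVT_DELIVERY_MODE_MASK) == LVT_DELIVERY_MODE_SMI;
}

/* Value an AP should hold in its thermal LVT: the BSP's boot-time
 * value if the BIOS owns the interrupt, otherwise the AP's own. */
static uint32_t ap_thermal_lvt(uint32_t bsp_initial, uint32_t ap_current)
{
    return lvt_is_bios_smi(bsp_initial) ? bsp_initial : ap_current;
}
```

After INIT-SIPI-SIPI an AP would typically hold only the mask bit, so under these assumptions `ap_thermal_lvt(0x200, LVT_MASKED)` restores the BIOS value, while a BSP value in fixed delivery mode leaves the AP's entry alone.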

07 Nov, 2009 1 commit

x86: Add Phoenix/MSC BIOSes to lowmem corruption list · f1b291d4

Committed by Simon Kagstrom on Nov 06, 2009

We have a board with a Phoenix/MSC BIOS which also corrupts the low
64KB of RAM, so add an entry to the table.

Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
LKML-Reference: <20091106154404.002648d9@marrow.netinsight.se>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

f1b291d4