提交 · ea0be473a1f0ee89024a24d8ea4b05fbf6efcee3 · OpenHarmony / kernel_linux

15 11月, 2005 19 次提交

A
[PATCH] x86_64: Allow modular build of ia32 aout loader · ea0be473
由 Andi Kleen 提交于 11月 05, 2005
```
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
```
ea0be473

[PATCH] x86_64: Force correct address space size for MTRR on some 64bit Intel Xeons · af9c142d

由 Shaohua Li 提交于 11月 05, 2005

They report 40bit, but only have 36bits of physical address space.
This caused problems with setting up the correct masks for MTRR.

CPUID workaround for steppings 0F33h(supporting x86) and 0F34h(supporting x86
and EM64T). Detail info can be found at:
http://download.intel.com/design/Xeon/specupdt/30240216.pdf
http://download.intel.com/design/Pentium4/specupdt/30235221.pdf

Signed-off-by: Shaohua Li<shaohua.li@intel.com>
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

af9c142d

[PATCH] x86_64: Optimize NUMA node hash function · 529a3404

由 Eric Dumazet 提交于 11月 05, 2005

Compute the highest possible value for memnode_shift, in order to reduce
footprint of memnodemap[] to the minimum, thus making all users
(phys_to_nid(), kfree()), more cache friendly.

Before the patch :

 Node 0 MemBase 0000000000000000 Limit 00000001ffffffff
 Node 1 MemBase 0000000200000000 Limit 00000003ffffffff
 Using 23 for the hash shift. Max adder is 3ffffffff

After the patch :

 Node 0 MemBase 0000000000000000 Limit 00000001ffffffff
 Node 1 MemBase 0000000200000000 Limit 00000003ffffffff
 Using 33 for the hash shift.

In this case, only 2 bytes of memnodemap[] are used, instead of 2048
Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

529a3404

[PATCH] x86_64: Save/restore CS in 64bit signal handlers and force __USER_CS for CS · e4e5d324

由 Bryan Ford 提交于 11月 05, 2005

This allows to run 64bit signal handlers in 64bit processes that run small
code snippets in compat mode.
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

e4e5d324

[PATCH] x86_64: New heuristics to find out hotpluggable CPUs. · 420f8f68

由 Andi Kleen 提交于 11月 05, 2005

With a NR_CPUS==128 kernel with CPU hotplug enabled we would waste 4MB
on per CPU data of all possible CPUs.  The reason was that HOTPLUG
always set up possible map to NR_CPUS cpus and then we need to allocate
that much (each per CPU data is roughly ~32k now)

The underlying problem is that ACPI didn't tell us how many hotplug CPUs
the platform supports.  So the old code just assumed all, which would
lead to this memory wastage.

This implements some new heuristics:

 - If the BIOS specified disabled CPUs in the ACPI/mptables assume they
   can be enabled later (this is bending the ACPI specification a bit,
   but seems like a obvious extension)
 - The user can overwrite it with a new additionals_cpus=NUM option
 - Otherwise use half of the available CPUs or 2, whatever is more.

Cc: ashok.raj@intel.com
Cc: len.brown@intel.com
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

420f8f68

[PATCH] x86_64: Replace swiotlb extern with include · 59170891

由 Andi Kleen 提交于 11月 05, 2005

Minor victory on the continuous quest against all stray extern.
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

59170891

[PATCH] x86_64: Replace cpu_pda extern with include · 4d74dbd7

由 Andi Kleen 提交于 11月 05, 2005

Minor cleanup - remove obsolete extern
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

4d74dbd7

[PATCH] x86_64: Only use asm/sections.h to declare section symbols · 2bc0414e

由 Andi Kleen 提交于 11月 05, 2005

Adding __initdata_* to asm-generic/sections.h
Replaces a lot of open coded externs in arch/x86_64/*
I had to change __bss_end to __bss_stop to match the other architectures.
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

2bc0414e

[PATCH] x86_64: Unmap NULL during early bootup · f6c2e333

由 Siddha, Suresh B 提交于 11月 05, 2005

We should zap the low mappings, as soon as possible, so that we can catch
kernel bugs more effectively. Previously early boot had NULL mapped
and didn't trap on NULL references.

This patch introduces boot_level4_pgt, which will always have low identity
addresses mapped.  Druing boot, all the processors will use this as their
level4 pgt.  On BP, we will switch to init_level4_pgt as soon as we enter C
code and zap the low mappings as soon as we are done with the usage of
identity low mapped addresses.  On AP's we will zap the low mappings as
soon as we jump to C code.
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: NAshok Raj <ashok.raj@intel.com>
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

f6c2e333

[PATCH] x86_64: Speed up numa_node_id by putting it directly into the PDA · 69d81fcd

由 Andi Kleen 提交于 11月 05, 2005

Not go from the CPU number to an mapping array.
Mode number is often used now in fast paths.

This also adds a generic numa_node_id to all the topology includes

Suggested by Eric Dumazet
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

69d81fcd

[PATCH] x86_64: Fix gcc 4 warning in aperture.c · 50895c5d

由 Andi Kleen 提交于 11月 05, 2005

Fix

  arch/x86_64/kernel/aperture.c: In function #iommu_hole_init#:
  arch/x86_64/kernel/aperture.c:199: warning: #aper_order# may be used uninitialized in this function
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

50895c5d

[PATCH] x86-64/i386: Fix CPU model for family 6 · f5f786d0

由 Suresh Siddha 提交于 11月 05, 2005

According to cpuid instruction in IA32 SDM-Vol2, when computing cpu model,
we need to consider extended model ID for family 0x6 also.

AK: Also added fixes/simplifcation from Petr Vandrovec
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

f5f786d0

[PATCH] x86_64: Remove duplicate __cpuinit define · e9b59d83

由 Ashok Raj 提交于 11月 05, 2005

Remove duplicate __cpuinit in smp.c. Already defined in init.h which is
already included.
Signed-off-by: NAshok Raj <ashok.raj@intel.com>
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

e9b59d83

A
[PATCH] x86_64: Use the DMA32 zone for dma_alloc_coherent()/pci_alloc_consistent · 47492d36
由 Andi Kleen 提交于 11月 05, 2005
```
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
```
47492d36

[PATCH] i386/x86-64: Share interrupt vectors when there is a large number of interrupt sources · 6004e1b7

由 James Cleverdon 提交于 11月 05, 2005

Here's a patch that builds on Natalie Protasevich's IRQ compression
patch and tries to work for MPS boots as well as ACPI. It is meant for
a 4-node IBM x460 NUMA box, which was dying because it had interrupt
pins with GSI numbers > NR_IRQS and thus overflowed irq_desc.

The problem is that this system has 270 GSIs (which are 1:1 mapped with
I/O APIC RTEs) and an 8-node box would have 540. This is much bigger
than NR_IRQS (224 for both i386 and x86_64). Also, there aren't enough
vectors to go around. There are about 190 usable vectors, not counting
the reserved ones and the unused vectors at 0x20 to 0x2F. So, my patch
attempts to compress the GSI range and share vectors by sharing IRQs.

Cc: "Protasevich, Natalie" <Natalie.Protasevich@unisys.com>
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

6004e1b7

[PATCH] x86_64: Support for AMD specific MCE Threshold. · 89b831ef

由 Jacob Shin 提交于 11月 05, 2005

MC4_MISC - DRAM Errors Threshold Register realized under AMD K8 Rev F.
This register is used to count correctable and uncorrectable ECC errors that occur during DRAM read operations.
The user may interface through sysfs files in order to change the threshold configuration.

bank%d/error_count - reads current error count, write to clear.
bank%d/interrupt_enable - set/clear interrupt enable.
bank%d/threshold_limit - read/write the threshold limit.

APIC vector 0xF9 in hw_irq.h.
5 software defined bank ids in mce.h.
new apic.c function to setup threshold apic lvt.
defaults to interrupt off, count enabled, and threshold limit max.
sysfs interface created on /sys/devices/system/threshold.

AK: added some ifdefs to make it compile on UP
Signed-off-by: NJacob Shin <jacob.shin@amd.com>
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

89b831ef

[PATCH] x86_64: Account mem_map in VM holes accounting · e18c6874

由 Andi Kleen 提交于 11月 05, 2005

The VM needs to know about lost memory in zones to accurately
balance dirty pages. This patch accounts mem_map in there too,
which fixes a constant errror of a few percent. Also some
other misc mappings and the kernel text itself are accounted
too.
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

e18c6874

[PATCH] x86_64: Add 4GB DMA32 zone · a2f1b424

由 Andi Kleen 提交于 11月 05, 2005

Add a new 4GB GFP_DMA32 zone between the GFP_DMA and GFP_NORMAL zones.

As a bit of historical background: when the x86-64 port
was originally designed we had some discussion if we should
use a 16MB DMA zone like i386 or a 4GB DMA zone like IA64 or
both. Both was ruled out at this point because it was in early
2.4 when VM is still quite shakey and had bad troubles even
dealing with one DMA zone.  We settled on the 16MB DMA zone mainly
because we worried about older soundcards and the floppy.

But this has always caused problems since then because
device drivers had trouble getting enough DMA able memory. These days
the VM works much better and the wide use of NUMA has proven
it can deal with many zones successfully.

So this patch adds both zones.

This helps drivers who need a lot of memory below 4GB because
their hardware is not accessing more (graphic drivers - proprietary
and free ones, video frame buffer drivers, sound drivers etc.).
Previously they could only use IOMMU+16MB GFP_DMA, which
was not enough memory.

Another common problem is that hardware who has full memory
addressing for >4GB misses it for some control structures in memory
(like transmit rings or other metadata).  They tended to allocate memory
in the 16MB GFP_DMA or the IOMMU/swiotlb then using pci_alloc_consistent,
but that can tie up a lot of precious 16MB GFPDMA/IOMMU/swiotlb memory
(even on AMD systems the IOMMU tends to be quite small) especially if you have
many devices.  With the new zone pci_alloc_consistent can just put
this stuff into memory below 4GB which works better.

One argument was still if the zone should be 4GB or 2GB. The main
motivation for 2GB would be an unnamed not so unpopular hardware
raid controller (mostly found in older machines from a particular four letter
company) who has a strange 2GB restriction in firmware. But
that one works ok with swiotlb/IOMMU anyways, so it doesn't really
need GFP_DMA32. I chose 4GB to be compatible with IA64 and because
it seems to be the most common restriction.

The new zone is so far added only for x86-64.

For other architectures who don't set up this
new zone nothing changes. Architectures can set a compatibility
define in Kconfig CONFIG_DMA_IS_DMA32 that will define GFP_DMA32
as GFP_DMA. Otherwise it's a nop because on 32bit architectures
it's normally not needed because GFP_NORMAL (=0) is DMA able
enough.

One problem is still that GFP_DMA means different things on different
architectures. e.g. some drivers used to have #ifdef ia64  use GFP_DMA
(trusting it to be 4GB) #elif __x86_64__ (use other hacks like
the swiotlb because 16MB is not enough) ... . This was quite
ugly and is now obsolete.

These should be now converted to use GFP_DMA32 unconditionally. I haven't done
this yet. Or best only use pci_alloc_consistent/dma_alloc_coherent
which will use GFP_DMA32 transparently.
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

a2f1b424

[PATCH] x86_64: Update defconfig · 56720367

由 Andi Kleen 提交于 11月 05, 2005

Rerun and enable autofs 4, relayfs and softdog
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

56720367

03 11月, 2005 1 次提交

[PATCH] x86-64: bitops fix for -Os · 06024f21

由 Alexandre Oliva 提交于 10月 31, 2005

This fixes the x86-64 find_[first|next]_zero_bit() function for the
end-of-range case.  It didn't test for a zero size, and the "rep scas"
would do entirely the wrong thing.
Signed-off-by: NAlexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

06024f21

31 10月, 2005 11 次提交

[PATCH] hpet-RTC: cache the comparator register · 7811fb8f

由 Clemens Ladisch 提交于 10月 30, 2005

Reads from an HPET register require a round trip to the south bridge and are
almost as slow as PCI reads.  By caching the last value we've written to the
comparator register, we can eliminate all HPET reads from the fast path in the
emulated RTC interrupt handler.
Signed-off-by: NClemens Ladisch <clemens@ladisch.de>
Acked-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

7811fb8f

[PATCH] hpet-RTC: fix timer config register accesses · 5f819949

由 Clemens Ladisch 提交于 10月 30, 2005

Make sure that the RTC timer is in non-periodic mode; some stupid BIOS might
have initialized it to periodic mode.

Furthermore, don't set the SETVAL bit in the config register.  This wouldn't
have any effect unless the timer was in period mode (which it isn't), and then
the actual timer frequency would be half that of the desired one because
incrementing the comparator in the interrupt handler would be done after the
hardware has already incremented it itself.
Signed-off-by: NClemens Ladisch <clemens@ladisch.de>
Acked-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

5f819949

[PATCH] hpet-RTC: disable interrupt when no longer needed · f00c96f3

由 Clemens Ladisch 提交于 10月 30, 2005

When the emulated RTC interrupt is no longer needed, we better disable it;
otherwise, we get a spurious interrupt whenever the timer has rolled over and
reaches the same comparator value.

Having a superfluous interrupt every five minutes doesn't hurt much, but it's
bad style anyway.  ;-)
Signed-off-by: NClemens Ladisch <clemens@ladisch.de>
Acked-by: N"Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

f00c96f3

[PATCH] jiffies_64 cleanup · ecea8d19

由 Thomas Gleixner 提交于 10月 30, 2005

Define jiffies_64 in kernel/timer.c rather than having 24 duplicated
defines in each architecture.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

ecea8d19

[PATCH] Remove orphaned TIOCGDEV compat ioctl · 371e8c25

由 Brian Gerst 提交于 10月 30, 2005

This ioctl doesn't exist for native i386.
Signed-off-by: NBrian Gerst <bgerst@didntduck.org>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

371e8c25

[PATCH] introduce setup_timer() helper · a8db2db1

由 Oleg Nesterov 提交于 10月 30, 2005

Every user of init_timer() also needs to initialize ->function and ->data
fields.  This patch adds a simple setup_timer() helper for that.

The schedule_timeout() is patched as an example of usage.
Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

a8db2db1

[PATCH] swsusp: rework memory freeing on resume · 2c1b4a5c

由 Rafael J. Wysocki 提交于 10月 30, 2005

The following patch makes swsusp use the PG_nosave and PG_nosave_free flags to
mark pages that should be freed in case of an error during resume.

This allows us to simplify the code and to use swsusp_free() in all of the
swsusp's resume error paths, which makes them actually work.
Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

2c1b4a5c

[PATCH] Clean up mtrr compat ioctl code · c5311781

由 Brian Gerst 提交于 10月 30, 2005

Handle 32-bit mtrr ioctls in the mtrr driver instead of the ia32
compatability layer.
Signed-off-by: NBrian Gerst <bgerst@didntduck.org>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

c5311781

[PATCH] x86: vmx cpu feature detection · daedb82d

由 Kamble, Nitin A 提交于 10月 30, 2005

If VMX feature is available in the CPU, this patch will make it visible in
the /proc/cpuinfo with the cpuid detection.
Signed-Off-By: NNitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

daedb82d

[PATCH] FPU context corrupted after resume · 08967f94

由 Shaohua Li 提交于 10月 30, 2005

mxcsr_feature_mask_init isn't needed in suspend/resume time (we can use
boot time mask).  And actually it's harmful, as it clear task's saved
fxsave in resume.  This bug is widely seen by users using zsh.

(akpm: my eyes.  Fixed some surrounding whitespace mess)

Signed-off-by: Shaohua Li<shaohua.li@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

08967f94

[PATCH] i386 and x86_64 TSC set_cyc2ns_scale imprecision · dacb16b1

由 Mathieu Desnoyers 提交于 10月 30, 2005

I just found out that some precision is unnecessarily lost in the
arch/i386/kernel/timers/timer_tsc.c:set_cyc2ns_scale function.  It uses a
cpu_mhz parameter when it could use a cpu_khz.  In the specific case of an
Intel P4 running at 3001.171 Mhz, the truncation to 3001 Mhz leads to an
imprecision of 19 microseconds per second : this is very sad for a timer with
nearly nanosecond accuracy.

Fix the x86_64 architecture too.

Cc: george anzinger <george@mvista.com>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

dacb16b1

30 10月, 2005 2 次提交

[PATCH] mm: init_mm without ptlock · 872fec16

由 Hugh Dickins 提交于 10月 29, 2005

First step in pushing down the page_table_lock. init_mm.page_table_lock has
been used throughout the architectures (usually for ioremap): not to serialize
kernel address space allocation (that's usually vmlist_lock), but because
pud_alloc,pmd_alloc,pte_alloc_kernel expect caller holds it.

Reverse that: don't lock or unlock init_mm.page_table_lock in any of the
architectures; instead rely on pud_alloc,pmd_alloc,pte_alloc_kernel to take
and drop it when allocating a new one, to check lest a racing task already
did. Similarly no page_table_lock in vmalloc's map_vm_area.

Some temporary ugliness in __pud_alloc and __pmd_alloc: since they also handle
user mms, which are converted only by a later patch, for now they have to lock
differently according to whether or not it's init_mm.

If sources get muddled, there's a danger that an arch source taking
init_mm.page_table_lock will be mixed with common source also taking it (or
neither take it). So break the rules and make another change, which should
break the build for such a mismatch: remove the redundant mm arg from
pte_alloc_kernel (ppc64 scrapped its distinct ioremap_mm in 2.6.13).

Exceptions: arm26 used pte_alloc_kernel on user mm, now pte_alloc_map; ia64
used pte_alloc_map on init_mm, now pte_alloc_kernel; parisc had bad args to
pmd_alloc and pte_alloc_kernel in unused USE_HPPA_IOREMAP code; ppc64
map_io_page forgot to unlock on failure; ppc mmu_mapin_ram and ppc64 im_free
took page_table_lock for no good reason.
Signed-off-by: NHugh Dickins <hugh@veritas.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

872fec16

[PATCH] mm: mm_init set_mm_counters · 404351e6

由 Hugh Dickins 提交于 10月 29, 2005

How is anon_rss initialized?  In dup_mmap, and by mm_alloc's memset; but
that's not so good if an mm_counter_t is a special type.  And how is rss
initialized?  By set_mm_counter, all over the place.  Come on, we just need to
initialize them both at once by set_mm_counter in mm_init (which follows the
memcpy when forking).
Signed-off-by: NHugh Dickins <hugh@veritas.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

404351e6

28 10月, 2005 1 次提交

[PATCH] gfp_t: dma-mapping (amd64) · f80aabb0

由 Al Viro 提交于 10月 21, 2005

Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

f80aabb0

11 10月, 2005 2 次提交

[PATCH] x86_64: Allocate cpu local data for all possible CPUs · 421c7ce6

由 Andi Kleen 提交于 10月 10, 2005

CPU hotplug fills up the possible map to NR_CPUs, but it did that after
setting up per CPU data. This lead to CPU data not getting allocated
for all possible CPUs, which lead to various side effects.
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

421c7ce6

[PATCH] x86_64: Fix change_page_attr cache flushing · 094804c5

由 Andi Kleen 提交于 10月 11, 2005

Noticed by Terence Ripperda

Undo wrong change in global_flush_tlb. We need to flush the caches in all
cases, not just when pages were reverted. This was a bogus optimization
added earlier, but it was wrong.
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

094804c5

10 10月, 2005 2 次提交

[PATCH] i386: fix stack alignment for signal handlers · d347f372

由 Markus F.X.J. Oberhumer 提交于 10月 09, 2005

This fixes the setup of the alignment of the signal frame, so that all
signal handlers are run with a properly aligned stack frame.

The current code "over-aligns" the stack pointer so that the stack frame
is effectively always mis-aligned by 4 bytes. But what we really want
is that on function entry ((sp + 4) & 15) == 0, which matches what would
happen if the stack were aligned before a "call" instruction.
Signed-off-by: NMarkus F.X.J. Oberhumer <markus@oberhumer.com>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

d347f372

[PATCH] x86_64: Set up safe page tables during resume · 3dd08325

由 Rafael J. Wysocki 提交于 10月 09, 2005

The following patch makes swsusp avoid the possible temporary corruption
of page translation tables during resume on x86-64.  This is achieved by
creating a copy of the relevant page tables that will not be modified by
swsusp and can be safely used by it on resume.

The problem is that during resume on x86-64 swsusp may temporarily
corrupt the page tables used for the direct mapping of RAM.  If that
happens, a page fault occurs and cannot be handled properly, which leads
to the solid hang of the affected system.  This leads to the loss of the
system's state from before suspend and may result in the loss of data or
the corruption of filesystems, so it is a serious issue.  Also, it
appears to happen quite often (for me, as often as 50% of the time).

The problem is related to the fact that (at least) one of the PMD
entries used in the direct memory mapping (starting at PAGE_OFFSET)
points to a page table the physical address of which is much greater
than the physical address of the PMD entry itself.  Moreover,
unfortunately, the physical address of the page table before suspend
(i.e.  the one stored in the suspend image) happens to be different to
the physical address of the corresponding page table used during resume
(i.e.  the one that is valid right before swsusp_arch_resume() in
arch/x86_64/kernel/suspend_asm.S is executed).  Thus while the image is
restored, the "offending" PMD entry gets overwritten, so it does not
point to the right physical address any more (i.e.  there's no page
table at the address pointed to by it, because it points to the address
the page table has been at during suspend).  Consequently, if the PMD
entry is used later on, and it _is_ used in the process of copying the
image pages, a page fault occurs, but it cannot be handled in the normal
way and the system hangs.

In principle we can call create_resume_mapping() from
swsusp_arch_resume() (ie.  from suspend_asm.S), but then the memory
allocations in create_resume_mapping(), resume_pud_mapping(), and
resume_pmd_mapping() must be made carefully so that we use _only_
NosaveFree pages in them (the other pages are overwritten by the loop in
swsusp_arch_resume()).  Additionally, we are in atomic context at that
time, so we cannot use GFP_KERNEL.  Moreover, if one of the allocations
fails, we should free all of the allocated pages, so we need to trace
them somehow.

All of this is done in the appended patch, except that the functions
populating the page tables are located in arch/x86_64/kernel/suspend.c
rather than in init.c.  It may be done in a more elegan way in the
future, with the help of some swsusp patches that are in the works now.

[AK: move some externs into headers, renamed a function]
Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

3dd08325

05 10月, 2005 1 次提交

[PATCH] x86_64: Drop global bit from early low mappings · 944d2647

由 Andi Kleen 提交于 10月 05, 2005

Drop global bit from early low mappings

Suggested by Linus, originally also proposed by Suresh.

This fixes a race condition with early start of udev, originally
tracked down by Suresh B. Siddha. The problem was that switching
to the user space VM would not clear the global low mappings
for the beginning of memory, which lead to memory corruption.

Drop the global bits.

The kernel mapping stays global because it should stay constant.
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

944d2647

04 10月, 2005 1 次提交

[PATCH] x86_64: Fix numa node topology detection for srat based x86_64 boxes · ddea7be0

由 Ravikiran G Thirumalai 提交于 10月 03, 2005

2.6.14-rc2 does not assign cpus to proper nodeids on our em64t numa boxen.
Our boxes use acpi srat for parsing the numa information.

srat_detect_node() used phys_proc_id[] to get to the cpu's local apic id,
but phys_proc_id[] represents the cpu<->initial_apic_id mapping.  The
following patch fixes this problem.  Now apicid_to_node[] is properly
indexed with the local apic id.
Signed-off-by: NRavikiran Thirumalai <kiran@scalex86.org>
Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

ddea7be0

OpenHarmony / kernel_linux 上一次同步 3 年多

OpenHarmony / kernel_linux
上一次同步 3 年多