提交 · e405d067298b2b960bf20318e91ed842157c65bc · openeuler / raspberrypi-kernel

10 4月, 2006 8 次提交

[PATCH] x86_64: Proper null pointer check in powernow_k8_get · 4211a303

由 Jacob Shin 提交于 4月 07, 2006

This prevents crashes on dual core system when enough ticks are lost.

Replaces earlier patch by me.

Cc: Dave Jones <davej@redhat.com>
Signed-off-by: NThomas Renninger <trenn@suse.de>
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

4211a303

A
[PATCH] x86_64: Revert earlier powernow-k8 change · d7fa706c
由 Andi Kleen 提交于 4月 07, 2006
```
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
```
d7fa706c

[PATCH] i386: Consolidate modern APIC handling · 95d769aa

由 Andi Kleen 提交于 4月 07, 2006

AMD systems have a modern APIC that supports 8 bit IDs, but
don't have a XAPIC version number.  Add a new "modern_apic"
subfunction that handles this correctly and use it (nearly)
everywhere where XAPIC is tested for.

I removed one wart: the code specified that external APICs
would use an 8bit APIC ID. But I checked a real 82093 data sheet
and it says clearly that they only use 4bit. So I removed
this special case since it would a bit awkward to implement now.

I removed the valid APIC tests in mptable parsing completely. On any modern
system they only check against the full field width (8bit) anyways
and are no-ops. This also fixes them doing the wrong thing
on >8 core Opterons.

This makes i386 boot again on 16 core Opterons.

Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

95d769aa

[PATCH] x86-64/i386: Don't process APICs/IO-APICs in ACPI when APIC is disabled. · d3b6a349

由 Andi Kleen 提交于 4月 07, 2006

When nolapic was passed or the local APIC was disabled
for another reason ACPI would still parse the IO-APICs
until these were explicitely disabled with noapic.

Usually this resulted in a non booting configuration unless
"nolapic noapic" was used.

I also disabled the local APIC parsing in this case, although
that's only cosmetic (suppresses a few printks)

This hopefully makes nolapic work in all cases.
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

d3b6a349

[PATCH] x86_64: Don't sanity check Type 1 PCI bus access on newer systems · ec0f08ee

由 Andi Kleen 提交于 4月 07, 2006

Horus systems don't have anything on bus 0 which makes
the Type 1 sanity checks fail.  Use the DMI BIOS year to
check for newer systems and always assume Type 1 works on them.
I used 2001 as an pretty arbitary cutoff year.

Cc: gregkh@suse.de
Cc: Navin Boppuri <navin.boppuri@newisys.com>
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

ec0f08ee

[PATCH] i386/x86-64: Check that MCFG points to an e820 reserved area · 946f2ee5

由 Arjan van de Ven 提交于 4月 07, 2006

This patch introduces a user for the e820_all_mapped function:

There have been several machines that don't have a working MMCONFIG,
often because of a buggy MCFG table in the ACPI bios. This patch adds a
simple sanity check that detects a whole bunch of these cases, and when
it detects it, linux now boots rather than crash-and-burns.

The accuracy of this detection can in principle be improved if there was
a "is this entire range in e820 with THIS attribute", but no such
function exist and the complexity needed for this is not really worth
it; this simple check already catches most cases anyway.
Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

946f2ee5

[PATCH] x86_64: Introduce e820_all_mapped · 95222368

由 Arjan van de Ven 提交于 4月 07, 2006

Introduce a e820_all_mapped() function which checks if the entire range
<start,end> is mapped with type.

This is done by moving the local start variable to the end of each
known-good region; if at the end of the function the start address is
still before end, there must be a part that's not of the correct type;
otherwise it's a good region.
Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

95222368

[PATCH] x86_64: Support memory hotadd without sparsemem · 9d99aaa3

由 Andi Kleen 提交于 4月 07, 2006

Memory hotadd doesn't need SPARSEMEM, but can be handled by just preallocating
mem_maps. This only needs some untangling of ifdefs to enable the necessary
code even without SPARSEMEM.

Originally from Keith Mannthey, hacked by AK.
Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

9d99aaa3

01 4月, 2006 6 次提交

kexec: grammar fix for crash_save_this_cpu() · 36a891b6

由 Horms 提交于 4月 01, 2006

kexec: grammar fix for crash_save_this_cpu()
Signed-Off-By: NHorms <horms@verge.net.au>
Signed-off-by: NAdrian Bunk <bunk@stusta.de>

36a891b6

[PATCH] unexport get_wchan · 0cb3463f

由 Adrian Bunk 提交于 3月 31, 2006

The only user of get_wchan is the proc fs - and proc can't be built modular.
Signed-off-by: NAdrian Bunk <bunk@stusta.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

0cb3463f

[PATCH] sys_sync_file_range() · f79e2abb

由 Andrew Morton 提交于 3月 31, 2006

Remove the recently-added LINUX_FADV_ASYNC_WRITE and LINUX_FADV_WRITE_WAIT
fadvise() additions, do it in a new sys_sync_file_range() syscall instead.
Reasons:

- It's more flexible.  Things which would require two or three syscalls with
  fadvise() can be done in a single syscall.

- Using fadvise() in this manner is something not covered by POSIX.

The patch wires up the syscall for x86.

The sycall is implemented in the new fs/sync.c.  The intention is that we can
move sys_fsync(), sys_fdatasync() and perhaps sys_sync() into there later.

Documentation for the syscall is in fs/sync.c.

A test app (sync_file_range.c) is in
http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz.

The available-to-GPL-modules do_sync_file_range() is for knfsd: "A COMMIT can
say NFS_DATA_SYNC or NFS_FILE_SYNC.  I can skip the ->fsync call for
NFS_DATA_SYNC which is hopefully the more common."

Note: the `async' writeout mode SYNC_FILE_RANGE_WRITE will turn synchronous if
the queue is congested.  This is trivial to fix: add a new flag bit, set
wbc->nonblocking.  But I'm not sure that we want to expose implementation
details down to that level.

Note: it's notable that we can sync an fd which wasn't opened for writing.
Same with fsync() and fdatasync()).

Note: the code takes some care to handle attempts to sync file contents
outside the 16TB offset on 32-bit machines.  It makes such attempts appear to
succeed, for best 32-bit/64-bit compatibility.  Perhaps it should make such
requests fail...

Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

f79e2abb

[PATCH] Don't pass boot parameters to argv_init[] · 9b41046c

由 OGAWA Hirofumi 提交于 3月 31, 2006

The boot cmdline is parsed in parse_early_param() and
parse_args(,unknown_bootoption).

And __setup() is used in obsolete_checksetup().

	start_kernel()
		-> parse_args()
			-> unknown_bootoption()
				-> obsolete_checksetup()

If __setup()'s callback (->setup_func()) returns 1 in
obsolete_checksetup(), obsolete_checksetup() thinks a parameter was
handled.

If ->setup_func() returns 0, obsolete_checksetup() tries other
->setup_func().  If all ->setup_func() that matched a parameter returns 0,
a parameter is seted to argv_init[].

Then, when runing /sbin/init or init=app, argv_init[] is passed to the app.
If the app doesn't ignore those arguments, it will warning and exit.

This patch fixes a wrong usage of it, however fixes obvious one only.
Signed-off-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

9b41046c

[PATCH] Mark unwind info for signal trampolines in vDSOs · da2e9e1f

由 Jakub Jelinek 提交于 3月 31, 2006

Mark unwind info for signal trampolines using the new S augmentation flag
introduced in: http://gcc.gnu.org/PR26208.

GCC 4.2 (or patched earlier GCC) will be able to special case unwinding
through frames right above signal trampolines.  As the augmentations start
with z flag and S is at the very end of the augmentation string, older GCCs
will just skip the S flag as unknown (that's why an augmentation flag was
chosen over say a new CFA opcode).
Signed-off-by: NJakub Jelinek <jakub@redhat.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

da2e9e1f

[PATCH] i386 kdump timer vector lockup fix · 1a75a3f0

由 Vivek Goyal 提交于 3月 31, 2006

Porting the patch I posted for x86_64 to i386.

http://marc.theaimsgroup.com/?l=linux-kernel&m=114178139610707&w=2

o While using kdump, after a system crash when second kernel boots, timer
  vector gets (0x31) locked and CPU does not see timer interrupts
  travelling from IOAPIC to APIC.  Currently it does not lead to boot
  failure in second kernel as timer interrupts continues to come as ExtInt
  through LAPIC directly, but fixing it is good in case some boards do not
  support the other mode.

o After a system crash, it is not safe to service interrupts any more,
  hence interrupts are disabled.  This leads to pending interrupts at
  LAPIC.  LAPIC sends these interrupts to the CPU during early boot of
  second kernel.  Other pending interrupts are discarded saying unexpected
  trap but timer interrupt is serviced and CPU does not issue an LAPIC EOI
  because it think this interrupt came from i8259 and sends ack to 8259.
  This leads to vector 0x31 locking as LAPIC does not clear respective ISR
  and keeps on waiting for EOI.

o This patch issues extra EOI for the pending interrupts who have ISR set.

o Though today only timer seems to be the special case because in early
  boot it thinks interrupts are coming from i8259 and uses
  mask_and_ack_8259A() as ack handler and does not issue LAPIC EOI.  But
  probably doing it in generic manner for all vectors makes sense.
Signed-off-by: NVivek Goyal <vgoyal@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

1a75a3f0

31 3月, 2006 1 次提交

[PATCH] Introduce sys_splice() system call · 5274f052

由 Jens Axboe 提交于 3月 30, 2006

This adds support for the sys_splice system call. Using a pipe as a
transport, it can connect to files or sockets (latter as output only).

From the splice.c comments:

   "splice": joining two ropes together by interweaving their strands.

   This is the "extended pipe" functionality, where a pipe is used as
   an arbitrary in-memory buffer. Think of a pipe as a small kernel
   buffer that you can use to transfer data from one end to the other.

   The traditional unix read/write is extended with a "splice()" operation
   that transfers data buffers to or from a pipe buffer.

   Named by Larry McVoy, original implementation from Linus, extended by
   Jens to support splicing to files and fixing the initial implementation
   bugs.
Signed-off-by: NJens Axboe <axboe@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

5274f052

29 3月, 2006 3 次提交

[PATCH] fix signed vs unsigned in nmi watchdog · b791ccef

由 Jesper Juhl 提交于 3月 28, 2006

Fix "signed vs unsigned" in nmi_watchdog_tick.
Signed-off-by: NJesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

b791ccef

[PATCH] arch/i386/kernel/microcode.c: remove the obsolete microcode_ioctl · f45e4656

由 Adrian Bunk 提交于 3月 28, 2006

Nowadays, even Debian stable ships a microcode_ctl utility recent enough to no
longer use this ioctl.
Signed-off-by: NAdrian Bunk <bunk@stusta.de>
Acked-by: NTigran Aivazian <tigran_aivazian@symantec.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

f45e4656

[PATCH] for_each_possible_cpu: i386 · c8912599

由 KAMEZAWA Hiroyuki 提交于 3月 28, 2006

This patch replaces for_each_cpu with for_each_possible_cpu.

under arch/i386.
Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

c8912599

28 3月, 2006 11 次提交

[CPUFREQ] powernow: remove private for_each_cpu_mask() · 64840e27

由 Andrew Morton 提交于 3月 25, 2006

It is unneeded and wrong.
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NDave Jones <davej@redhat.com>

64840e27

[CPUFREQ] hotplug cpu fix for powernow-k8 · eef5167e

由 shin, jacob 提交于 3月 27, 2006

Andi's previous fix to initialise powernow_data on all siblings
will not work properly with CPU Hotplug.
Signed-off-by: NJacob Shin <jacob.shin@amd.com>
Signed-off-by: NDave Jones <davej@redhat.com>

eef5167e

[PATCH] fbdev: Make BIOS EDID reading configurable · 59153f7d

由 Antonino A. Daplas 提交于 3月 27, 2006

DDC reading via the Video BIOS may take several tens of seconds with some
combination of display cards and monitors.

Make this option configurable.  It defaults to `y' to minimise disruption.
Signed-off-by: NAntonino Daplas <adaplas@pol.net>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

59153f7d

[PATCH] Notifier chain update: API changes · e041c683

由 Alan Stern 提交于 3月 27, 2006

The kernel's implementation of notifier chains is unsafe.  There is no
protection against entries being added to or removed from a chain while the
chain is in use.  The issues were discussed in this thread:

    http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2

We noticed that notifier chains in the kernel fall into two basic usage
classes:

	"Blocking" chains are always called from a process context
	and the callout routines are allowed to sleep;

	"Atomic" chains can be called from an atomic context and
	the callout routines are not allowed to sleep.

We decided to codify this distinction and make it part of the API.  Therefore
this set of patches introduces three new, parallel APIs: one for blocking
notifiers, one for atomic notifiers, and one for "raw" notifiers (which is
really just the old API under a new name).  New kinds of data structures are
used for the heads of the chains, and new routines are defined for
registration, unregistration, and calling a chain.  The three APIs are
explained in include/linux/notifier.h and their implementation is in
kernel/sys.c.

With atomic and blocking chains, the implementation guarantees that the chain
links will not be corrupted and that chain callers will not get messed up by
entries being added or removed.  For raw chains the implementation provides no
guarantees at all; users of this API must provide their own protections.  (The
idea was that situations may come up where the assumptions of the atomic and
blocking APIs are not appropriate, so it should be possible for users to
handle these things in their own way.)

There are some limitations, which should not be too hard to live with.  For
atomic/blocking chains, registration and unregistration must always be done in
a process context since the chain is protected by a mutex/rwsem.  Also, a
callout routine for a non-raw chain must not try to register or unregister
entries on its own chain.  (This did happen in a couple of places and the code
had to be changed to avoid it.)

Since atomic chains may be called from within an NMI handler, they cannot use
spinlocks for synchronization.  Instead we use RCU.  The overhead falls almost
entirely in the unregister routine, which is okay since unregistration is much
less frequent that calling a chain.

Here is the list of chains that we adjusted and their classifications.  None
of them use the raw API, so for the moment it is only a placeholder.

  ATOMIC CHAINS
  -------------
arch/i386/kernel/traps.c:		i386die_chain
arch/ia64/kernel/traps.c:		ia64die_chain
arch/powerpc/kernel/traps.c:		powerpc_die_chain
arch/sparc64/kernel/traps.c:		sparc64die_chain
arch/x86_64/kernel/traps.c:		die_chain
drivers/char/ipmi/ipmi_si_intf.c:	xaction_notifier_list
kernel/panic.c:				panic_notifier_list
kernel/profile.c:			task_free_notifier
net/bluetooth/hci_core.c:		hci_notifier
net/ipv4/netfilter/ip_conntrack_core.c:	ip_conntrack_chain
net/ipv4/netfilter/ip_conntrack_core.c:	ip_conntrack_expect_chain
net/ipv6/addrconf.c:			inet6addr_chain
net/netfilter/nf_conntrack_core.c:	nf_conntrack_chain
net/netfilter/nf_conntrack_core.c:	nf_conntrack_expect_chain
net/netlink/af_netlink.c:		netlink_chain

  BLOCKING CHAINS
  ---------------
arch/powerpc/platforms/pseries/reconfig.c:	pSeries_reconfig_chain
arch/s390/kernel/process.c:		idle_chain
arch/x86_64/kernel/process.c		idle_notifier
drivers/base/memory.c:			memory_chain
drivers/cpufreq/cpufreq.c		cpufreq_policy_notifier_list
drivers/cpufreq/cpufreq.c		cpufreq_transition_notifier_list
drivers/macintosh/adb.c:		adb_client_list
drivers/macintosh/via-pmu.c		sleep_notifier_list
drivers/macintosh/via-pmu68k.c		sleep_notifier_list
drivers/macintosh/windfarm_core.c	wf_client_list
drivers/usb/core/notify.c		usb_notifier_list
drivers/video/fbmem.c			fb_notifier_list
kernel/cpu.c				cpu_chain
kernel/module.c				module_notify_list
kernel/profile.c			munmap_notifier
kernel/profile.c			task_exit_notifier
kernel/sys.c				reboot_notifier_list
net/core/dev.c				netdev_chain
net/decnet/dn_dev.c:			dnaddr_chain
net/ipv4/devinet.c:			inetaddr_chain

It's possible that some of these classifications are wrong.  If they are,
please let us know or submit a patch to fix them.  Note that any chain that
gets called very frequently should be atomic, because the rwsem read-locking
used for blocking chains is very likely to incur cache misses on SMP systems.
(However, if the chain's callout routines may sleep then the chain cannot be
atomic.)

The patch set was written by Alan Stern and Chandra Seetharaman, incorporating
material written by Keith Owens and suggestions from Paul McKenney and Andrew
Morton.

[jes@sgi.com: restructure the notifier chain initialization macros]
Signed-off-by: NAlan Stern <stern@rowland.harvard.edu>
Signed-off-by: NChandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: NJes Sorensen <jes@sgi.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

e041c683

[PATCH] lightweight robust futexes: i386 · dfd4e3ec

由 Ingo Molnar 提交于 3月 27, 2006

i386: add the futex_atomic_cmpxchg_inuser() assembly implementation, and wire
up the new syscalls.
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArjan van de Ven <arjan@infradead.org>
Acked-by: NUlrich Drepper <drepper@redhat.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

dfd4e3ec

[PATCH] unify PFN_* macros · 22a9835c

由 Dave Hansen 提交于 3月 27, 2006

Just about every architecture defines some macros to do operations on pfns.
 They're all virtually identical.  This patch consolidates all of them.

One minor glitch is that at least i386 uses them in a very skeletal header
file.  To keep away from #include dependency hell, I stuck the new
definitions in a new, isolated header.

Of all of the implementations, sh64 is the only one that varied by a bit.
It used some masks to ensure that any sign-extension got ripped away before
the arithmetic is done.  This has been posted to that sh64 maintainers and
the development list.

Compiles on x86, x86_64, ia64 and ppc64.
Signed-off-by: NDave Hansen <haveblue@us.ibm.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

22a9835c

[PATCH] for_each_online_pgdat: remove sorting pgdat · 3571761f

由 KAMEZAWA Hiroyuki 提交于 3月 27, 2006

Because pgdat_list was linked to pgdat_list in *reverse* order, (By default)
some of arch has to sort it by themselves.

for_each_pgdat has gone..for_each_online_pgdat() uses node_online_map, which
doesn't need to be sorted.

This patch removes codes for sorting pgdat.
Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

3571761f

[PATCH] for_each_online_pgdat: renaming for_each_pgdat · ec936fc5

由 KAMEZAWA Hiroyuki 提交于 3月 27, 2006

Replace for_each_pgdat() with for_each_online_pgdat().
Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

ec936fc5

[PATCH] x86: don't use cpuid.2 to determine cache info if cpuid.4 is supported · b06be912

由 Shaohua Li 提交于 3月 27, 2006

Don't use cpuid.2 to determine cache info if cpuid.4 is supported.  The
exception is P4 trace cache.  We always use cpuid.2 to get trace cache
under P4.
Signed-off-by: NShaohua Li <shaohua.li@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

b06be912

[PATCH] sched: new sched domain for representing multi-core · 1e9f28fa

由 Siddha, Suresh B 提交于 3月 27, 2006

Add a new sched domain for representing multi-core with shared caches
between cores. Consider a dual package system, each package containing two
cores and with last level cache shared between cores with in a package. If
there are two runnable processes, with this appended patch those two
processes will be scheduled on different packages.

On such systems, with this patch we have observed 8% perf improvement with
specJBB(2 warehouse) benchmark and 35% improvement with CFP2000 rate(with 2
users).

This new domain will come into play only on multi-core systems with shared
caches. On other systems, this sched domain will be removed by domain
degeneration code. This new domain can be also used for implementing power
savings policy (see OLS 2005 CMP kernel scheduler paper for more details..
I will post another patch for power savings policy soon)

Most of the arch/* file changes are for cpu_coregroup_map() implementation.
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

1e9f28fa

[PATCH] PM-Timer: don't use workaround if chipset is not buggy · dbffa471

由 OGAWA Hirofumi 提交于 3月 27, 2006

Current timer_pm.c reads I/O port triple times, in order to avoid the bug
of chipset.  But I/O port is slow.

2.6.16 (pmtmr)
Simple gettimeofday: 3.6532 microseconds

2.6.16+patch (pmtmr)
Simple gettimeofday: 1.4582 microseconds

[if chip is buggy, probably it will be 7us or more in 4.2% of probability.]

This patch adds blacklist of buggy chip, and if chip is not buggy, this
uses fast normal version instead of slow workaround version.

If chip is buggy, warnings "pmtmr is slow".  But sounds like there is gray
zone.  I found the PIIX4 errata, but I couldn't find the ICH4 errata.  But
some motherboard seems to have problem.

So, if we found a ICH4, generate warnings, and use a workaround version.
If user's ICH4 is good, the user can specify the "pmtmr_good" boot
parameter to use fast version.
Acked-by: NJohn Stultz <johnstul@us.ibm.com>
Signed-off-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

dbffa471

27 3月, 2006 11 次提交

[PATCH] bitops: i386: use generic bitops · 1cc2b994

由 Akinobu Mita 提交于 3月 26, 2006

- remove generic_fls64()
- remove sched_find_first_bit()
- remove generic_hweight{32,16,8}()
- remove ext2_{set,clear,test,find_first_zero,find_next_zero}_bit()
- remove minix_{test,set,test_and_clear,test,find_first_zero}_bit()
Signed-off-by: NAkinobu Mita <mita@miraclelinux.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

1cc2b994

[PATCH] kprobes: fix broken fault handling for i386 · b4026513

由 Prasanna S Panchamukhi 提交于 3月 26, 2006

Provide proper kprobes fault handling, if a user-specified pre/post handlers
tries to access user address space, through copy_from_user(), get_user() etc.

The user-specified fault handler gets called only if the fault occurs while
executing user-specified handlers. In such a case user-specified handler is
allowed to fix it first, later if the user-specifed fault handler does not fix
it, we try to fix it by calling fix_exception().

The user-specified handler will not be called if the fault happens when single
stepping the original instruction, instead we reset the current probe and
allow the system page fault handler to fix it up.
Signed-off-by: NPrasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

b4026513

[PATCH] kprobe handler: discard user space trap · 2326c770

由 bibo,mao 提交于 3月 26, 2006

Currently kprobe handler traps only happen in kernel space, so function
kprobe_exceptions_notify should skip traps which happen in user space.
This patch modifies this, and it is based on 2.6.16-rc4.
Signed-off-by: Nbibo mao <bibo.mao@intel.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: "Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>
Cc: <hiramatu@sdl.hitachi.co.jp>
Signed-off-by: NPrasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

2326c770

[PATCH] kretprobe instance recycled by parent process · c6fd91f0

由 bibo mao 提交于 3月 26, 2006

When kretprobe probes the schedule() function, if the probed process exits
then schedule() will never return, so some kretprobe instances will never
be recycled.

In this patch the parent process will recycle retprobe instances of the
probed function and there will be no memory leak of kretprobe instances.
Signed-off-by: Nbibo mao <bibo.mao@intel.com>
Cc: Masami Hiramatsu <hiramatu@sdl.hitachi.co.jp>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

c6fd91f0

[PATCH] kretprobe: kretprobe-booster · c9becf58

由 Masami Hiramatsu 提交于 3月 26, 2006

In normal operation, kretprobe makes a target function return to trampoline
code.  A kprobe (called trampoline_probe) has been inserted in the trampoline
code.  When the kernel hits this kprobe, it calls kretprobe's handler and it
returns to the original return address.

Kretprobe-booster removes the trampoline_probe.  It allows the trampoline code
to call kretprobe's handler directly instead of invoking kprobe.  The
trampoline code returns to the original return address.

(changelog from Chuck Ebbert <76306.1226@compuserve.com> - thanks ;))
Signed-off-by: NMasami Hiramatsu <hiramatu@sdl.hitachi.co.jp>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

c9becf58

[PATCH] x86: kprobes-booster · 311ac88f

由 Masami Hiramatsu 提交于 3月 26, 2006

Current kprobe copies the original instruction at the probe point and replaces
it with a breakpoint instruction (int3).  When the kernel hits the probe
point, kprobe handler is invoked.  And the copied instruction is single-step
executed on the copied buffer (not on the original address) by kprobe.  After
that, the kprobe checks registers and modify it (if need) as if the
instructions was executed on the original address.

My proposal is based on the fact there are many instructions which do NOT
require the register modification after the single-step execution.  When the
copied instruction is a kind of them, kprobe just jumps back to the next
instruction after single-step execution.  If so, why don't we execute those
instructions directly?

With kprobe-booster patch, kprobes will execute a copied instruction directly
and (if need) jump back to original code.  This direct execution is executed
when the kprobe don't have both post_handler and break_handler, and the copied
instruction can be executed directly.

I sorted instructions which can be executed directly or not;

- Call instructions are NG(can not be executed directly).
  We should correct the return address pushed into top of stack.
- Indirect instructions except for absolute indirect-jumps
  are NG. Those instructions changes EIP randomly. We should
  check EIP and correct it.
- Instructions that change EIP beyond the range of the
  instruction buffer are NG.
- Instructions that change EIP to tail 5 bytes of the
  instruction buffer (it is the size of a jump instruction).
  We must write a jump instruction which backs to original
  kernel code in the instruction buffer.
- Break point instruction is NG. We should not touch EIP and
  pass to other handlers.
- Absolute direct/indirect jumps are OK.- Conditional Jumps are NG.
- Halt and software-interruptions are NG. Because it will stay on
  the instruction buffer of kprobes.
- Prefixes are NG.
- Unknown/reserved opcode is NG.
- Other 1 byte instructions are OK. But those instructions need a
  jump back code.
- 2 bytes instructions are mapped sparsely. So, in this release,
  this patch don't boost those instructions.

>From Intel's IA-32 opcode map described in IA-32 Intel Architecture Software
Developer's Manual Vol.2 B, I determined that following opcodes are not
boostable.

- 0FH (2byte escape)
- 70H - 7FH (Jump on condition)
- 9AH (Call) and 9CH (Pushf)
- C0H-C1H (Grp 2: includes reserved opcode)
- C6H-C7H (Grp11: includes reserved opcode)
- CCH-CEH (Software-interrupt)
- D0H-D3H (Grp2: includes reserved opcode)
- D6H (Reserved)
- D8H-DFH (Coprocessor)
- E0H-E3H (loop/conditional jump)
- E8H (Call)
- F0H-F3H (Prefixes and reserved)
- F4H (Halt)
- F6H-F7H (Grp3: includes reserved opcode)
- FEH-FFH(Grp4,5: includes reserved opcode)

Kprobe-booster checks whether target instruction can be boosted (can be
executed directly) at arch_copy_kprobe() function.  If the target instruction
can be boosted, it clears "boostable" flag.  If not, it sets "boostable" flag
-1.  This is disabled status.  In resume_execution() function, If "boostable"
flag is cleared, kprobe-booster measures the size of the target instruction
and sets "boostable" flag 1.

In kprobe_handler(), kprobe checks the "boostable" flag.  If the flag is 1, it
resets current kprobe and executes instruction buffer directly instead of
single stepping.

When unregistering a boosted kprobe, it calls synchronize_sched()
after "int3" is removed. So we can ensure followings after
the synchronize_sched() called.
- interrupt handlers are finished on all CPUs.
- instruction buffer is not executed on all CPUs.
And we can release the boosted kprobe safely.

And also, on preemptible kernel, the booster is not enabled where the kernel
preemption is enabled.  So, there are no preempted threads on the instruction
buffer.

The description of kretprobe-booster:
====================================

In the normal operation, kretprobe make a target function return to trampoline
code.  And a kprobe (called trampoline_probe) have been inserted at the
trampoline code.  When the kernel hits this kprobe, it calls kretprobe's
handler and it returns to original return address.

Kretprobe-booster patch removes the trampoline_probe.  It allows the
trampoline code to call kretprobe's handler directly instead of invoking
kprobe.  And tranpoline code returns to original return address.

This new trampoline code stores and restores registers, so the kretprobe
handler is still able to access those registers.

Current kprobe has about 1.3 usec/probe(*) overhead, and kprobe-booster patch
reduces it to 0.6 usec/probe(*).  Also current kretprobe has about 2.0
usec/probe(*) overhead.  Kprobe-booster patch reduces it to 1.3 usec/probe(*),
and the combination of both kprobe-booster patch and kretprobe-booster patch
reduces it to 0.9 usec/probe(*).

I expect the combination of both patches can reduce half of a probing
overhead.

Performance numbers strongly depend on the processor model.

Andrew Morton wrote:
> These preempt tricks look rather nasty.  Can you please describe what the
> problem is, precisely?  And how this code avoids it?  Perhaps we can find
> something cleaner.

The problem is how to remove the copied instructions of the
kprobe *safely* on the preemptable kernel (CONFIG_PREEMPT=y).

Kprobes basically executes the following actions;

(1)int3
(2)preempt_disable()
(3)kprobe_prehandler()
(4)copied instructioin(single step)
(5)kprobe_posthandler()
(6)preempt_enable()
(7)return to the original code

During the execution of copied instruction, preemption is
disabled (from step (2) to (6)).
When unregistering the probes, Kprobe waits for RCU
quiescent state by using synchronize_sched() after removing
int3 instruction.
Thus we can ensure the copied instruction is not executed.

On the other hand, kprobe-booster executes the following actions;

(1)int3
(2)preempt_disable()
(3)kprobe_prehandler()
(4)preempt_enable()             <-- this one is added by my patch
(5)copied instruction(direct execution)
(6)jmp back to the original code

The problem is that we have no way to prevent preemption on
step (5) or (6). We cannot call preempt_disable() after step (6),
because there are no rooms to do that. Thus, some other
processes may be preempted at step(5) or (6) on preemptable kernel.
And I couldn't find the easy way to ensure that other processes'
stack do *not* have the address of them. (I thought some way
to do that, but those are very costly.)

So currently, I simply boost the kprobe only when the probe
point is already preemption disabled.

> Also, the patch adds a preempt_enable() but I don't see a corresponding
> preempt_disable().  Am I missing something?

It is corresponding to the preempt_disable() in the top of
kprobe_handler().
I copied the code of kprobe_handler() here:

static int __kprobes kprobe_handler(struct pt_regs *regs)
{
        struct kprobe *p;
        int ret = 0;
        kprobe_opcode_t *addr = NULL;
        unsigned long *lp;
        struct kprobe_ctlblk *kcb;

        /*
         * We don't want to be preempted for the entire
         * duration of kprobe processing
         */
        preempt_disable();             <-- HERE
        kcb = get_kprobe_ctlblk();
Signed-off-by: NMasami Hiramatsu <hiramatu@sdl.hitachi.co.jp>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

311ac88f

[PATCH] kprobes: clean up resume_execute() · b50ea74c

由 Masami Hiramatsu 提交于 3月 26, 2006

Clean up kprobe's resume_execute() for i386 arch.
Signed-off-by: NMasami Hiramatsu <hiramatu@sdl.hitachi.co.jp>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

b50ea74c

[PATCH] fix array overrun in efi.c · d6d21dfd

由 Darren Jenkins 提交于 3月 26, 2006

Coverity found an over-run @ line 364 of efi.c

This is due to the loop checking the size correctly, then adding a '\0'
after possibly hitting the end of the array.

Ensure the loop exits with one space left in the array.
Signed-off-by: NDarren Jenkins <darrenrjenkins@gmail.com>
Signed-off-by: NAdrian Bunk <bunk@stusta.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

d6d21dfd

[PATCH] sem2mutex: misc static one-file mutexes · 14cc3e2b

由 Ingo Molnar 提交于 3月 26, 2006

Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jens Axboe <axboe@suse.de>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Acked-by: NAlasdair G Kergon <agk@redhat.com>
Cc: Greg KH <greg@kroah.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Adam Belay <ambx1@neo.rr.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

14cc3e2b

[PATCH] EFI fixes · 23dd842c

由 Tolentino, Matthew E 提交于 3月 26, 2006

Here's a patch that fixes EFI boot for x86 on 2.6.16-rc5-mm3.  The
off-by-one is admittedly my fault, but the other two fix up the rest.

Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: "Tolentino, Matthew E" <matthew.e.tolentino@intel.com>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

23dd842c

[PATCH] EFI: keep physical table addresses in efi structure · b2c99e3c

由 Bjorn Helgaas 提交于 3月 26, 2006

Almost all users of the table addresses from the EFI system table want
physical addresses.  So rather than doing the pa->va->pa conversion, just keep
physical addresses in struct efi.

This fixes a DMI bug: the efi structure contained the physical SMBIOS address
on x86 but the virtual address on ia64, so dmi_scan_machine() used ioremap()
on a virtual address on ia64.

This is essentially the same as an earlier patch by Matt Tolentino:
	http://marc.theaimsgroup.com/?l=linux-kernel&m=112130292316281&w=2
except that this changes all table addresses, not just ACPI addresses.

Matt's original patch was backed out because it caused MCAs on HP sx1000
systems.  That problem is resolved by the ioremap() attribute checking added
for ia64.
Signed-off-by: NBjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: "Tolentino, Matthew E" <matthew.e.tolentino@intel.com>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: N"Luck, Tony" <tony.luck@intel.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

b2c99e3c