Commits · d040c1614c24162adc3fe106b182596999264e26 · openeuler / raspberrypi-kernel

12 Feb, 2009 · 2 commits

syscall define: fix uml compile bug · 6c597963

Committed by Heiko Carstens on Feb 11, 2009

With the new system call defines we get this on uml:

arch/um/sys-i386/built-in.o: In function `sys_call_table':
(.rodata+0x308): undefined reference to `sys_sigprocmask'

Reason for this is that uml passes the preprocessor option
-Dsigprocmask=kernel_sigprocmask to gcc when compiling the kernel.
This causes SYSCALL_DEFINE3(sigprocmask, ...) to be expanded to
SYSCALL_DEFINEx(3, kernel_sigprocmask, ...) and finally to a system
call named sys_kernel_sigprocmask.  However sys_sigprocmask is missing
because of this.

To avoid macro expansion for the system call name just concatenate the
name at first define instead of carrying it through several levels.
This was pointed out by Al Viro.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: WANG Cong <wangcong@zeuux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
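
Illustration (not part of the commit): the expansion problem described above can be reproduced in user space. This is a simplified sketch; the `OLD_*`/`NEW_*` macros and the `STR` helper are hypothetical stand-ins for the real SYSCALL_DEFINE machinery, and the `#define sigprocmask` line stands in for uml's `-Dsigprocmask=kernel_sigprocmask` option.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for uml's -Dsigprocmask=kernel_sigprocmask command-line define. */
#define sigprocmask kernel_sigprocmask

/* Two-level stringification so the result of a macro can be inspected. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* Old scheme (simplified): the bare name travels through one macro level,
 * so the preprocessor expands it before it is pasted onto "sys_". */
#define OLD_DEFINEx(x, name) sys_##name
#define OLD_DEFINE3(name)    OLD_DEFINEx(3, name)

/* New scheme (simplified): "_" is pasted onto the name at the first define.
 * Operands of ## are not macro-expanded, so the name survives unchanged. */
#define NEW_DEFINEx(x, sname) sys##sname
#define NEW_DEFINE3(name)     NEW_DEFINEx(3, _##name)

const char *old_symbol(void) { return STR(OLD_DEFINE3(sigprocmask)); }
const char *new_symbol(void) { return STR(NEW_DEFINE3(sigprocmask)); }
```

With the old scheme the generated symbol picks up the uml rename; with the new scheme it stays `sys_sigprocmask`, which is what the system call table references.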

6c597963

cgroups: fix lockdep subclasses overflow · cfebe563

Committed by Li Zefan on Feb 11, 2009

I enabled all cgroup subsystems when compiling the kernel, and then:
 # mount -t cgroup -o net_cls xxx /mnt
 # mkdir /mnt/0

This showed up immediately:
 BUG: MAX_LOCKDEP_SUBCLASSES too low!
 turning off the locking correctness validator.

It's caused by the cgroup hierarchy lock:
	for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
		struct cgroup_subsys *ss = subsys[i];
		if (ss->root == root)
			mutex_lock_nested(&ss->hierarchy_mutex, i);
	}

Now we have 9 cgroup subsystems, and the above 'i' for net_cls is 8, but
MAX_LOCKDEP_SUBCLASSES is 8.

This patch uses different lockdep keys for different subsystems.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
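
Illustration (not part of the commit): the overflow is plain index arithmetic. The sketch below uses the two numbers quoted in the message; the helper names are hypothetical, and it assumes valid subclass indices run from 0 to MAX_LOCKDEP_SUBCLASSES - 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Values quoted in the commit message above. */
enum { MAX_LOCKDEP_SUBCLASSES = 8, CGROUP_SUBSYS_COUNT = 9 };

/* Assumption: valid subclass indices are 0 .. MAX_LOCKDEP_SUBCLASSES - 1. */
static bool subclass_fits(int subclass)
{
	return subclass >= 0 && subclass < MAX_LOCKDEP_SUBCLASSES;
}

/* Returns the first subsystem index that cannot be used as a lockdep
 * subclass, or -1 if every index fits. */
int first_overflowing_subsys(void)
{
	for (int i = 0; i < CGROUP_SUBSYS_COUNT; i++)
		if (!subclass_fits(i))
			return i;
	return -1;
}
```

Using the subsystem index as the subclass stops working as soon as a ninth subsystem exists, which is why a separate lockdep key per subsystem avoids the limit.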

cfebe563

11 Feb, 2009 · 5 commits

x86, ptrace, mm: fix double-free on race · 9f339e70

Committed by Markus Metzger on Feb 11, 2009

ptrace_detach() races with __ptrace_unlink() if the traced task is
reaped while detaching. This might cause a double-free of the BTS
buffer.

Change the ptrace_detach() path to only do the memory accounting in
ptrace_bts_detach() and leave freeing the buffer to ptrace_bts_untrace(),
which will be called from __ptrace_unlink().

The fix follows a proposal from Oleg Nesterov.
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

9f339e70

timers: fix TIMER_ABSTIME for process wide cpu timers · 4da94d49

Committed by Peter Zijlstra on Feb 11, 2009

The POSIX timer interface allows for absolute time expiry values through the
TIMER_ABSTIME flag, therefore we have to synchronize the timer to the clock
every time we start it.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

4da94d49

timers: split process wide cpu clocks/timers, fix · 3fccfd67

Committed by Peter Zijlstra on Feb 10, 2009

To decrease the chance of a missed enable, always enable the timer when we
sample it; we'll always disable it when we find that there are no active timers
in the jiffy tick.

This fixes a flood of warnings reported by Mike Galbraith.
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3fccfd67

hugetlbfs: fix build failure with !CONFIG_HUGETLBFS · 1db8508c

Committed by Stefan Richter on Feb 10, 2009

Fix regression due to 5a6fe125,
"Do not account for the address space used by hugetlbfs using VM_ACCOUNT"
which added an argument to the function hugetlb_file_setup() but not to
the macro hugetlb_file_setup().
Reported-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
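
Illustration (not part of the commit): a function and its config-disabled macro stub have to agree on arity, or call sites break as soon as the option is turned off. The sketch below is a user-space model; the parameter names, the `demo_*` helpers and the error-pointer encoding are assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct file; /* opaque in this sketch */

/* Hypothetical user-space stand-ins for the kernel's error-pointer helpers. */
static inline void *demo_err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long demo_ptr_err(const void *p) { return (long)(intptr_t)p; }

#ifdef CONFIG_HUGETLBFS
/* Real implementation: takes three arguments (names are assumptions). */
struct file *hugetlb_file_setup(const char *name, size_t size, int acctflag);
#else
/* The stub must accept the same three arguments. A two-parameter macro
 * would make every three-argument call site fail to compile. */
#define hugetlb_file_setup(name, size, acctflag) \
	((struct file *)demo_err_ptr(-ENOSYS))
#endif

/* Hypothetical caller: compiles the same way with or without the option. */
long demo_try_setup(void)
{
	struct file *f = hugetlb_file_setup("demo", 4096, 0);
	return demo_ptr_err(f);
}
```

Because CONFIG_HUGETLBFS is not defined in this sketch, the stub branch is taken and the caller sees `-ENOSYS`.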

1db8508c

Do not account for the address space used by hugetlbfs using VM_ACCOUNT · 5a6fe125

Committed by Mel Gorman on Feb 10, 2009

When overcommit is disabled, the core VM accounts for pages used by anonymous
shared, private mappings and special mappings. It keeps track of VMAs that
should be accounted for with VM_ACCOUNT and VMAs that never had a reserve
with VM_NORESERVE.

Overcommit for hugetlbfs is much riskier than overcommit for base pages
due to contiguity requirements. Hugetlbfs avoids overcommitting on both shared
and private mappings using reservation counters that are checked and updated
during mmap(). This ensures (within limits) that hugepages exist in the
future when faults occur; otherwise it is too easy for applications to be SIGKILLed.

As hugetlbfs makes its own reservations of a different unit to the base page
size, VM_ACCOUNT should never be set. Even if the units were correct, we would
double account for the usage in the core VM and hugetlbfs. VM_NORESERVE may
be set because an application can request no reserves be made for hugetlbfs
at the risk of getting killed later.

With commit fc8744ad, VM_NORESERVE and VM_ACCOUNT are getting unconditionally
set for hugetlbfs-backed mappings. This breaks the accounting for both the
core VM and hugetlbfs, and can trigger an OOM storm when hugepage pools are
too small, or lockups and corrupted counters otherwise. This patch brings
hugetlbfs more in line with how the core VM treats VM_NORESERVE but prevents
VM_ACCOUNT being set.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

5a6fe125

10 Feb, 2009 · 2 commits

elf: add ELF_CORE_COPY_KERNEL_REGS() · 6cd61c0b

Committed by Tejun Heo on Feb 09, 2009

ELF core dump is used for both user land core dump and kernel crash
dump.  Depending on architecture, registers might need to be accessed
differently for userland and kernel.  Allow architectures to define
ELF_CORE_COPY_KERNEL_REGS() and use a different operation for kernel
register dump.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
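
Illustration (not part of the commit): an overridable hook of this kind is usually wired up with an `#ifndef` fallback. The sketch below assumes that an architecture which does not define ELF_CORE_COPY_KERNEL_REGS() gets the userland copy operation; `struct demo_regs` and the simplified ELF_CORE_COPY_REGS() are hypothetical.

```c
#include <assert.h>

/* Hypothetical, much simplified register set. */
struct demo_regs { long ip; long sp; };

/* Simplified stand-in for the existing userland copy operation. */
#define ELF_CORE_COPY_REGS(dst, src) ((dst) = *(src))

/* An architecture may define its own kernel variant before this point.
 * Assumption: without an override, the userland operation is reused. */
#ifndef ELF_CORE_COPY_KERNEL_REGS
#define ELF_CORE_COPY_KERNEL_REGS(dst, src) ELF_CORE_COPY_REGS(dst, src)
#endif

/* Hypothetical kernel-side dump path using the overridable hook. */
struct demo_regs demo_dump_kernel_regs(const struct demo_regs *regs)
{
	struct demo_regs out;
	ELF_CORE_COPY_KERNEL_REGS(out, regs);
	return out;
}
```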

6cd61c0b

x86: spinlocks: define dummy __raw_spin_is_contended · a5ef7ca0

Committed by Kyle McMartin on Feb 08, 2009

Architectures other than mips and x86 are not using ticket spinlocks.
Therefore, the contention on the lock is meaningless, since there is
nobody known to be waiting on it (arguably /fairly/ unfair locks).

Dummy it out to return 0 on other architectures.
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
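
Illustration (not part of the commit): a ticket lock knows how many tickets have been handed out, so it can tell whether someone is queued behind the holder. A lock without tickets has no such record, which is why a constant 0 is the only sensible answer there. The sketch below uses C11 atomics; all names are hypothetical and this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct ticketlock {
	atomic_uint next;  /* next ticket to hand out */
	atomic_uint owner; /* ticket currently being served */
};

void ticketlock_init(struct ticketlock *l)
{
	atomic_init(&l->next, 0);
	atomic_init(&l->owner, 0);
}

/* Draw a ticket; the caller owns the lock once 'owner' reaches it. */
unsigned int ticketlock_take(struct ticketlock *l)
{
	return atomic_fetch_add(&l->next, 1);
}

void ticketlock_acquire(struct ticketlock *l)
{
	unsigned int ticket = ticketlock_take(l);
	while (atomic_load(&l->owner) != ticket)
		; /* spin until our ticket is served */
}

void ticketlock_release(struct ticketlock *l)
{
	atomic_fetch_add(&l->owner, 1);
}

/* More than one outstanding ticket means somebody besides the holder waits. */
bool ticketlock_is_contended(struct ticketlock *l)
{
	return atomic_load(&l->next) - atomic_load(&l->owner) > 1;
}

/* Non-ticket locks keep no record of waiters, so report "not contended". */
bool unfair_lock_is_contended(void)
{
	return false;
}
```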

a5ef7ca0

09 Feb, 2009 · 3 commits

acpi/x86: introduce __acpi_map_table, v4 · 7d97277b

Committed by Yinghai Lu on Feb 07, 2009

This prevents wrongly overwriting a fixmap entry that is still in use.

ACPI used to rely on low mappings being all linearly mapped and
grew a habit: it never really unmapped certain kinds of tables
after use.

This can cause problems - for example the hypothetical case
when some spurious access still references it.

v2: remove prev_map and prev_size in __acpi_map_table
v3: let acpi_os_unmap_memory() call early_iounmap too, so remove the extra call to
early_acpi_os_unmap_memory
v4: fix typo in one acpi_get_table_with_size call
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Acked-by: Len Brown <len.brown@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

7d97277b

percpu: make PER_CPU_BASE_SECTION overridable by arches · d3770449

Committed by Brian Gerst on Feb 08, 2009

Impact: bug fix

IA-64 needs to put percpu data in the separate section even on UP.
Fixes regression caused by "percpu: refactor percpu.h"
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

d3770449

async: Rename _special -> _domain for clarity. · 766ccb9e

Committed by Cornelia Huck on Jan 20, 2009

Rename the async_*_special() functions to async_*_domain(), which
describes the purpose of these functions much better.
[Broke up long lines to silence checkpatch]
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>

766ccb9e

08 Feb, 2009 · 2 commits

drm/i915: add fence register management to execbuf · 0f973f27

Committed by Jesse Barnes on Jan 26, 2009

Adds code to set up fence registers at execbuf time on pre-965 chips as
necessary.  Also fixes up a few bugs in the pre-965 tile register support
(get_order != ffs).  The number of fences available to the kernel defaults
to the hw limit minus 3 (for legacy X front/back/depth), but a new parameter
allows userspace to override that as needed.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>

0f973f27

module: remove over-zealous check in __module_get() · 7f9a50a5

Committed by Rusty Russell on Feb 07, 2009

Impact: fix spurious BUG_ON() triggered under load

module_refcount() isn't reliable outside stop_machine(). As demonstrated
by Karsten Keil <kkeil@suse.de>, networking can trigger it under load
(an inc on one cpu and dec on another while module_refcount() is tallying
can give false results, for example).

Almost no one should be using __module_get, but that's another issue.

Cc: Karsten Keil <kkeil@suse.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7f9a50a5

07 Feb, 2009 · 2 commits

net_dma: call dmaengine_get only if NET_DMA enabled · b4bd07c2

Committed by David S. Miller on Feb 06, 2009

Based upon a patch from Atsushi Nemoto <anemo@mba.ocn.ne.jp>

--------------------
The commit 649274d9 ("net_dma: acquire/release dma channels on
ifup/ifdown") added an unconditional call of dmaengine_get() to
net_dma.  The API should be called only if NET_DMA was enabled.
--------------------
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>

b4bd07c2

ACPI: Enable bit 11 in _PDC to advertise hw coord · d96f94c6

Committed by Pallipadi, Venkatesh on Feb 02, 2009

Bit 11 in Intel PDC definitions is meant for OS capability to handle
hardware coordination of P-states. In Linux we have always supported
hardware coordination of P-states. Just let the BIOSes know that we
support it, by setting this bit.

Some BIOSes use this bit to choose between hardware or software coordination
and, without this change, BIOSes switch to software coordination, which
is not very optimal in terms of power consumption and extra wakeups from idle.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
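
Illustration (not part of the commit): advertising the capability amounts to OR-ing one bit into the capabilities word handed to the firmware. The macro and function names below are hypothetical, and the sketch assumes bits are numbered from 0, so bit 11 is the mask 0x0800.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name; assumes bit numbering starts at 0 (bit 11 == 0x0800). */
#define DEMO_PDC_HW_COORD_BIT 11

/* Returns the capabilities word with the hardware-coordination bit set,
 * leaving every other capability bit untouched. */
uint32_t demo_pdc_advertise_hw_coord(uint32_t caps)
{
	return caps | (UINT32_C(1) << DEMO_PDC_HW_COORD_BIT);
}
```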

d96f94c6

06 Feb, 2009 · 7 commits

timers: split process wide cpu clocks/timers, remove spurious warning · 7d8e23df

Committed by Ingo Molnar on Feb 06, 2009

Mike Galbraith reported that the new warning in thread_group_cputimer()
triggers en masse with Amarok running.

Oleg Nesterov observed:

  Can't fastpath_timer_check()->thread_group_cputimer() have the
  false warning too? Suppose we had the timer, then posix_cpu_timer_del()
  removes this timer, but task_cputime_zero(&sig->cputime_expires) still
  not true.

Remove the spurious debug warning.
Reported-by: Mike Galbraith <efault@gmx.de>
Explained-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

7d8e23df

smp, generic: introduce arch_disable_smp_support(), build fix · a146649b

Committed by Ingo Molnar on Jan 31, 2009

This function should be provided on UP too.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

a146649b

smp, generic: introduce arch_disable_smp_support() instead of disable_ioapic_setup() · 65a4e574

Committed by Ingo Molnar on Jan 31, 2009

Impact: cleanup

disable_ioapic_setup() in init/main.c is ugly as the function is
x86-specific. The #ifdef inline prototype there is ugly too.

Replace it with a generic arch_disable_smp_support() function - which
has a weak alias for non-x86 architectures and for non-ioapic x86 builds.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
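
Illustration (not part of the commit): a weak default lets generic code call a hook unconditionally, while an architecture that needs real work supplies a normal (strong) definition that the linker prefers. The sketch below relies on the GCC/Clang `weak` attribute; `demo_handle_nosmp()` is a hypothetical caller.

```c
#include <assert.h>

/* Generic default: does nothing. A strong definition of the same function
 * in another object file would replace it at link time. */
__attribute__((weak)) void arch_disable_smp_support(void)
{
}

/* Hypothetical caller standing in for generic boot code. It needs no
 * #ifdef, because some definition of the hook always exists. */
int demo_handle_nosmp(void)
{
	arch_disable_smp_support();
	return 0;
}
```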

65a4e574

atyfb: fix CONFIG_ namespace violations · fe86175b

Committed by Randy Dunlap on Feb 04, 2009

Fix namespace violations by changing non-kconfig CONFIG_ names to CNFG_*.

Fixes breakage in staging/, which adds a real CONFIG_PANEL.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fe86175b

wait: prevent exclusive waiter starvation · 777c6c5f

Committed by Johannes Weiner on Feb 04, 2009

With exclusive waiters, every process woken up through the wait queue must
ensure that the next waiter down the line is woken when it has finished.

Interruptible waiters don't do that when aborting due to a signal.  And if
an aborting waiter is concurrently woken up through the waitqueue, no one
will ever wake up the next waiter.

This has been observed with __wait_on_bit_lock() used by
lock_page_killable(): the first contender on the queue was aborting when
the actual lock holder woke it up concurrently.  The aborted contender
didn't acquire the lock and therefore never did an unlock followed by
waking up the next waiter.

Add abort_exclusive_wait() which removes the process' wait descriptor from
the waitqueue, iff still queued, or wakes up the next waiter otherwise.
It does so under the waitqueue lock.  Racing with a wake up means the
aborting process is either already woken (removed from the queue) and will
wake up the next waiter, or it will remove itself from the queue and the
concurrent wake up will apply to the next waiter after it.

Use abort_exclusive_wait() in __wait_event_interruptible_exclusive() and
__wait_on_bit_lock() when they were interrupted by other means than a wake
up through the queue.

[akpm@linux-foundation.org: coding-style fixes]
Reported-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Mentored-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chuck Lever <cel@citi.umich.edu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>		["after some testing"]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

777c6c5f

fbmem: don't call copy_from/to_user() with mutex held · 1f5e31d7

Committed by Andrea Righi on Feb 04, 2009

Avoid calling copy_from/to_user() with fb_info->lock mutex held in fbmem
ioctl().

fb_mmap() is called with mm->mmap_sem (A) held and then acquires
fb_info->lock (B); fb_ioctl() takes fb_info->lock (B) and does
copy_from/to_user(), which might acquire mm->mmap_sem (A), causing a
deadlock.

NOTE: it doesn't push down the fb_info->lock into each driver's own
fb_ioctl(), so there are still potential deadlocks elsewhere.
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Cc: Dave Jones <davej@redhat.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Johannes Weiner <hannes@saeurebad.de>
Cc: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1f5e31d7

generic swap(): don't return a value from swap() · ac7b9004

Committed by Peter Zijlstra on Feb 04, 2009

The swap() macro is accidentally returning the value of its first argument.
Change it into a doesn't-return-anything macro before someone goes and
relies upon this behaviour.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wu Fengguang <wfg@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
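
Illustration (not part of the commit): wrapping the body in `do { } while (0)` makes the macro a statement with no value, so code such as `x = swap(a, b);` no longer compiles. The sketch below uses the GCC/Clang `__typeof__` extension; `demo_sort2()` and the temporary's name are hypothetical.

```c
#include <assert.h>

/* A do/while(0) body is a statement, not an expression, so the macro
 * cannot be used for its value. */
#define swap(a, b) \
	do { __typeof__(a) swap_tmp_ = (a); (a) = (b); (b) = swap_tmp_; } while (0)

/* Hypothetical user: put two integers into ascending order. */
void demo_sort2(int *lo, int *hi)
{
	if (*lo > *hi)
		swap(*lo, *hi);
}
```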

ac7b9004

05 Feb, 2009 · 5 commits

timers: split process wide cpu clocks/timers · 4cd4c1b4

Committed by Peter Zijlstra on Feb 05, 2009

Change the process wide cpu timers/clocks so that we:

 1) don't mess up the kernel with too many threads,
 2) don't have a per-cpu allocation for each process,
 3) have no impact when not used.

In order to accomplish this we're going to split it into two parts:

 - clocks; which can take all the time they want since they run
           from user context -- ie. sys_clock_gettime(CLOCK_PROCESS_CPUTIME_ID)

 - timers; which need constant time sampling but since they're
           explicitly used, the user can pay the overhead.

The clock readout will go back to a full sum of the thread group, while the
timers will run off a global 'clock' that only runs when needed, so only
programs that make use of the facility pay the price.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

4cd4c1b4

signal: re-add dead task accumulation stats. · 32bd671d

Committed by Peter Zijlstra on Feb 05, 2009

We're going to split the process wide cpu accounting into two parts:

 - clocks; which can take all the time they want since they run
           from user context.

 - timers; which need constant time tracing but can afford the overhead
           because they're default off -- and rare.

The clock readout will go back to a full sum of the thread group, for this
we need to re-add the exit stats that were removed in the initial itimer
rework (f06febc9: timers: fix itimer/many thread hang).

Furthermore, since that full sum can be rather slow for large thread groups
and we have the complete dead task stats, revert the do_notify_parent time
computation.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

32bd671d

crypto: shash - Fix tfm destruction · 412e87ae

Committed by Herbert Xu on Feb 05, 2009

We were freeing an offset into the slab object instead of the
start.  This patch fixes it by calling crypto_destroy_tfm which
allows the correct address to be given.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

412e87ae

crypto: api - Fix zeroing on free · 7b2cd92a

Committed by Herbert Xu on Feb 05, 2009

Geert Uytterhoeven pointed out that we're not zeroing all the
memory when freeing a transform.  This patch fixes it by calling
ksize to ensure that we zero everything in sight.
Reported-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

7b2cd92a

PCI: return error on failure to read PCI ROMs · 97c44836

Committed by Timothy S. Nelson on Jan 30, 2009

This patch makes the ROM reading code return an error to user space if
the size of the ROM read is equal to 0.

The patch also emits a warning if the contents of the ROM are invalid,
and documents the effects of the "enable" file on ROM reading.
Signed-off-by: Timothy S. Nelson <wayland@wayland.id.au>
Acked-by: Alex Villacis-Lasso <a_villacis@palosanto.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

97c44836

03 Feb, 2009 · 6 commits

sched: add missing kernel-doc in sched.h · 35626129

Committed by Randy Dunlap on Feb 02, 2009

Add kernel-doc notation for @lock:

include/linux/sched.h:457: No description found for parameter 'lock'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

35626129

libata: implement HORKAGE_1_5_GBPS and apply it to WD My Book · 9062712f

Committed by Tejun Heo on Jan 29, 2009

3Gbps is often much more prone to transmission failures. It's usually
okay to let EH handle speed down after transmission failures but some
WD My Book drives completely shut down after certain transmission
failures, and after that only power cycling can revive them. Combined
with the fact that external drives often end up with a cable assembly
which is longer than usual and more likely to have an intervening gender
changer, this makes these drives very likely to shut down under certain
configurations, virtually rendering them unusable.

This patch implements HORKAGE_1_5_GBPS and applies it to WD My Book
such that 1.5Gbps is forced once the device is identified.

Please take a look at the following bz for related reports.

http://bugzilla.kernel.org/show_bug.cgi?id=9913
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

9062712f

libata: clear dev->ering in smarter way · 99cf610a

Committed by Tejun Heo on Jan 29, 2009

dev->ering used to be cleared together with the rest of ata_device in
ata_dev_init() which is called whenever a probing event occurs.
dev->ering is about to be used to track probing failures so it needs
to remain persistent over multiple probing events.  This patch
achieves this by doing the following.

* Instead of CLEAR_OFFSET, define CLEAR_BEGIN and CLEAR_END and only
  clear between BEGIN and END.  ering is moved after END.  The split
  of persistent area is to allow hotter items to remain at the head.

* ering is explicitly cleared on ata_dev_disable() and when device
  attach succeeds.  So, ering is persistent through a device's life
  time (unless explicitly cleared of course) and also through periods
  in between disablement of an attached device and successful detection
  of the next one.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
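
Illustration (not part of the commit): clearing only the bytes between two offsets keeps the fields before and after that range intact. The sketch below shows the idea with `offsetof()`; `struct demo_dev`, its fields and the `DEMO_CLEAR_*` macros are hypothetical and much smaller than the real ata_device.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_dev {
	int devno;     /* persistent, hot: stays at the head */
	/* ---- cleared on every (re)initialisation ---- */
	int dev_class;
	int n_sectors;
	/* ---- end of cleared region ---- */
	int ering[4];  /* persistent: survives probing events */
};

#define DEMO_CLEAR_BEGIN offsetof(struct demo_dev, dev_class)
#define DEMO_CLEAR_END   offsetof(struct demo_dev, ering)

/* Zero only the region between BEGIN and END. */
void demo_dev_init(struct demo_dev *dev)
{
	memset((char *)dev + DEMO_CLEAR_BEGIN, 0,
	       DEMO_CLEAR_END - DEMO_CLEAR_BEGIN);
}

/* The persistent tail is wiped only on explicit request. */
void demo_ering_clear(struct demo_dev *dev)
{
	memset(dev->ering, 0, sizeof(dev->ering));
}
```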

99cf610a

ide/libata: fix ata_id_is_cfa() (take 4) · 2999b58b

Committed by Sergei Shtylyov on Feb 01, 2009

When checking for the CFA feature set support, ata_id_is_cfa() tests bit 2 in
word 82 of the identify data instead of word 83; it also checks the ATA/PI
version support in word 80 (which the CompactFlash specifications have as
reserved), so it has not the slightest chance of working on modern CF cards
that don't have 0x848A in word 0...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
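
Illustration (not part of the commit): a check along the lines the message describes looks at word 0 for the CompactFlash signature and otherwise at bit 2 of word 83. The function name is hypothetical, and the extra validity test on bits 15:14 of word 83 is an assumption of this sketch, not something the message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 'id' points at the 256-word IDENTIFY data. */
bool demo_id_is_cfa(const uint16_t *id)
{
	/* CompactFlash signature in word 0. */
	if (id[0] == 0x848A)
		return true;

	/* Word 83, bit 2: CFA feature set supported.
	 * Assumption: bits 15:14 must read 01 for the word to be valid. */
	return (id[83] & 0xC004) == 0x4004;
}
```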

2999b58b

modules: Use a better scheme for refcounting · 720eba31

Committed by Eric Dumazet on Feb 03, 2009

Current refcounting for modules (done if CONFIG_MODULE_UNLOAD=y) is
using a lot of memory.

Each 'struct module' contains an [NR_CPUS] array of full cache lines.

This patch uses existing infrastructure (percpu_modalloc() &
percpu_modfree()) to allocate percpu space for the refcount storage.

Instead of wasting NR_CPUS*128 bytes (on i386), we now use
nr_cpu_ids*sizeof(local_t) bytes.

On a typical distro, where NR_CPUS=8, shipping 2000 modules, we reduce
size of module files by about 2 Mbytes (1Kb per module).

Instead of having all refcounters in the same memory node - with TLB misses
because of vmalloc() - this new implementation permits better
NUMA properties, since each CPU will use storage on its preferred node,
thanks to percpu storage.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
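
Illustration (not part of the commit): the savings quoted above follow from simple arithmetic. The sketch below plugs in the figures from the message (NR_CPUS=8, 128 bytes per CPU, 2000 modules); the function names are hypothetical, and the new per-CPU counter size is left as a parameter because it depends on the architecture.

```c
#include <assert.h>
#include <stddef.h>

/* Figures quoted in the commit message above. */
enum { NR_CPUS = 8, BYTES_PER_CPU_SLOT = 128, MODULES = 2000 };

/* Old scheme: one full 128-byte slot per possible CPU in every module. */
size_t demo_old_bytes_per_module(void)
{
	return (size_t)NR_CPUS * BYTES_PER_CPU_SLOT;
}

size_t demo_old_bytes_total(void)
{
	return (size_t)MODULES * demo_old_bytes_per_module();
}

/* New scheme: one small counter per CPU id actually in use. */
size_t demo_new_bytes_per_module(size_t nr_cpu_ids, size_t counter_size)
{
	return nr_cpu_ids * counter_size;
}
```

The old layout costs 8 * 128 = 1024 bytes per module, or 2,048,000 bytes across 2000 modules, which matches the "about 2 Mbytes" figure.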

720eba31

net: Fix userland breakage wrt. linux/if_tunnel.h · 0afd4a21

Committed by David S. Miller on Feb 02, 2009

Reported by Andrew Walrond <andrew@walrond.org>

Changeset c19e654d ("gre: Add netlink interface") added an include
of linux/ip.h to linux/if_tunnel.h

We can't really let that get exposed to userspace
because this conflicts with types defined in netinet/ip.h
which userland is almost certainly going to have included
either explicitly or implicitly.

So guard this include with a __KERNEL__ ifdef.
Signed-off-by: David S. Miller <davem@davemloft.net>

0afd4a21

02 Feb, 2009 · 3 commits

bio.h: If they MUST be inlined, then use __always_inline · c52440a6

Committed by Alberto Bertogli on Feb 02, 2009

bvec_kmap_irq() and bvec_kunmap_irq() comments say they MUST be inlined,
so mark them as __always_inline.
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
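
Illustration (not part of the commit): plain `inline` is only a hint, so the compiler may still emit an out-of-line call. The GCC/Clang `always_inline` attribute turns the hint into a requirement. The macro and function names below are hypothetical; a separate macro name is used so the sketch does not clash with any `__always_inline` the C library headers may define.

```c
#include <assert.h>

/* Hypothetical equivalent of a "must be inlined" marker. */
#define DEMO_ALWAYS_INLINE inline __attribute__((always_inline))

/* The compiler must inline this at every call site or report an error. */
static DEMO_ALWAYS_INLINE int demo_add_offset(int base, int offset)
{
	return base + offset;
}

/* Hypothetical caller; the helper's body is expanded in place here. */
int demo_caller(int base)
{
	return demo_add_offset(base, 16);
}
```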

c52440a6

Fix misleading comment in bio.h · 20b636bf

Committed by Alberto Bertogli on Feb 02, 2009

The comment says "remember to add offset!", but the function already adds
it.
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

20b636bf

block: fix inconsistent parenthesisation of QUEUE_FLAG_DEFAULT · 0648e10d

Committed by Jens Axboe on Feb 02, 2009

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

0648e10d

31 Jan, 2009 · 3 commits

linker script: use separate simpler definition for PERCPU() · 3ac6cffe

Committed by Tejun Heo on Jan 30, 2009

Impact: fix linker screwup on x86_32

Recent x86_64 zerobased patches introduced PERCPU_VADDR() to put
.data.percpu to a predefined address and re-defined PERCPU() in terms
of it.  The new macro defined one extra symbol, __per_cpu_load, for
LMA of the section so that the init data could be accessed.  This new
symbol introduced the following problems to x86_32.

1. If __per_cpu_load is defined outside of .data.percpu as an absolute
   symbol, relocation generation for relocatable kernel fails due to
   absolute relocation.

2. If __per_cpu_load is put inside .data.percpu with absolute address
   assignment to work around #1, linker gets confused and under
   certain configurations ends up relocating the symbol against
   .data.percpu such that the load address gets added on top of
   already set load address.

As x86_32 doesn't use predefined address for .data.percpu, there's no
need for it to care about the possibility of __per_cpu_load being
different from __per_cpu_start.

This patch defines PERCPU() separately so that __per_cpu_load is
defined inside .data.percpu so that everything is ordinary
linking-wise.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3ac6cffe

hrtimers: allow the hot-unplugging of all cpus · 94df7de0

Committed by Sebastien Dugue on Dec 01, 2008

Impact: fix CPU hotplug hang on Power6 testbox

On architectures that support offlining all cpus (at least powerpc/pseries),
hot-unplugging the tick_do_timer_cpu can result in a system hang.

This comes from the fact that if the cpu going down happens to be the
cpu doing the tick, then as the tick_do_timer_cpu handover happens after the
cpu is dead (via the CPU_DEAD notification), we're left without ticks,
jiffies are frozen and any task relying on timers (msleep, ...) is stuck.
That's particularly the case for the cpu looping in __cpu_die() waiting
for the dying cpu to be dead.

This patch addresses this by having the tick_do_timer_cpu handover happen
earlier during the CPU_DYING notification. For this, a new clockevent
notification type is introduced (CLOCK_EVT_NOTIFY_CPU_DYING) which is triggered
in hrtimer_cpu_notify().
Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

94df7de0

headers_check fix: linux/rtnetlink.h · 541c94f1

Committed by Jaswinder Singh Rajput on Jan 30, 2009

Fix the following 'make headers_check' warning:

usr/include/linux/rtnetlink.h:328: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>

541c94f1