Commits · fcb43042ef55d2f46b0efa5d7746967cef38f056 · xiphi1978 / linux

30 Jun, 2008 1 commit

Committed by Zhang, Yanmin on Jun 24, 2008

Vegard Nossum reported crashes during cpu hotplug tests:

  http://marc.info/?l=linux-kernel&m=121413950227884&w=4

In function _cpu_up, the panic happens on the second call to
__raw_notifier_call_chain. The kernel doesn't panic on the first
call. Simply attributing this to nr_cpu_ids is not right.

By checking the source code, I found that function do_boot_cpu is the culprit.
Consider below call chain:
 _cpu_up=>__cpu_up=>smp_ops.cpu_up=>native_cpu_up=>do_boot_cpu.

So do_boot_cpu is called in the end. In do_boot_cpu, if
boot_error==true, cpu_clear(cpu, cpu_possible_map) is executed. Later
on, when _cpu_up calls __raw_notifier_call_chain a second time to
report CPU_UP_CANCELED, get_cpu_sysdev returns NULL because this cpu
has already been cleared from cpu_possible_map.

Many resources are related to cpu_possible_map, so it's better not to
change it.

The patch below, against 2.6.26-rc7, fixes it by removing the bit
clearing in cpu_possible_map.
Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

fcb43042

26 Jun, 2008 2 commits

x86: section/warning fixes · 0b1faeef

Committed by Daniel J Blueman on Jun 15, 2008

WARNING: arch/x86/mm/built-in.o(.text+0x3a1): Section mismatch in
reference from the function set_pte_phys() to the function
.init.text:spp_getpage()
The function set_pte_phys() references
the function __init spp_getpage().
This is often because set_pte_phys lacks a __init
annotation or the annotation of spp_getpage is wrong.

arch/x86/mm/init_64.c: In function 'early_memtest':
arch/x86/mm/init_64.c:520: warning: passing argument 2 of
'find_e820_area_size' from incompatible pointer type
Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Cc: "Linus Torvalds" <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

0b1faeef

x86: shift bits the right way in native_read_tscp · 41aefdcc

Committed by Max Asbock on Jun 25, 2008

native_read_tscp shifts the bits in the high order value in the
wrong direction. The attached patch fixes that.
Signed-off-by: Max Asbock <masbock@linux.vnet.ibm.com>
Acked-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

41aefdcc
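
The shift direction matters because the counter is delivered as two 32-bit halves that have to be joined into one 64-bit value. A minimal sketch of that join follows; `combine_tsc` is a hypothetical helper name used for illustration, not the kernel's function.

```c
#include <stdint.h>

/* Hypothetical helper (not the kernel's code): join the low and high
 * 32-bit halves of a timestamp counter into one 64-bit value. */
static uint64_t combine_tsc(uint32_t low, uint32_t high)
{
	/* The high half has to move UP by 32 bits. Shifting it right
	 * instead would throw it away. Widen to 64 bits before shifting,
	 * because shifting a 32-bit value by 32 is undefined in C. */
	return (uint64_t)low | ((uint64_t)high << 32);
}
```

For example, `combine_tsc(0x89abcdef, 0x01234567)` yields `0x0123456789abcdef`.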

24 Jun, 2008 1 commit

xen: remove support for non-PAE 32-bit · 28499143

Committed by Jeremy Fitzhardinge on May 09, 2008

Non-PAE operation has been deprecated in Xen for a while, and is
rarely tested or used.  xen-unstable has now officially dropped
non-PAE support.  Since Xen/pvops' non-PAE support has also been
broken for a while, we may as well completely drop it altogether.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

28499143

20 Jun, 2008 4 commits

xen: don't drop NX bit · ebb9cfe2

Committed by Jeremy Fitzhardinge on Jun 16, 2008

Because NX is now enforced properly, we must put the hypercall page
into the .text segment so that it is executable.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: the arch/x86 maintainers <x86@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ebb9cfe2

xen: mask unwanted pte bits in __supported_pte_mask · 05345b0f

Committed by Jeremy Fitzhardinge on Jun 16, 2008

[ Stable: this isn't a bugfix in itself, but it's a prerequisite
  for "xen: don't drop NX bit" ]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: the arch/x86 maintainers <x86@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

05345b0f

xen: Use wmb instead of rmb in xen_evtchn_do_upcall(). · 46539383

Committed by Isaku Yamahata on Jun 16, 2008

This patch is ported from changeset 534:77db69c38249 of linux-2.6.18-xen.hg.
Use wmb instead of rmb to enforce ordering between
evtchn_upcall_pending and evtchn_pending_sel stores
in xen_evtchn_do_upcall().

Cc: Samuel Thibault <samuel.thibault@eu.citrix.com>
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: the arch/x86 maintainers <x86@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

46539383

x86: fix NULL pointer deref in __switch_to · 54481cf8

Committed by Suresh Siddha on Jun 19, 2008

I am able to reproduce the oops reported by Simon in __switch_to() with
lguest.

My debug showed that there is at least one lguest specific
issue (which should be present in 2.6.25 and before as well) and it got
exposed with a kernel oops with the recent fpu dynamic allocation patches.

In addition to the previous possible scenario (with fpu_counter), in the
presence of lguest, it is possible that the cpu's TS bit is still set and the
lguest launcher task's thread_info has TS_USEDFPU still set.

This is because of the way the lguest launcher handles the guest's TS bit.
(look at lguest_set_ts() in lguest_arch_run_guest()). This can result
in a DNA fault while doing unlazy_fpu() in __switch_to(). This will
end up causing a DNA fault in the context of new process that's
getting context switched in (as opposed to handling DNA fault in the context
of lguest launcher/helper process).

This is wrong in both pre and post 2.6.25 kernels. In the recent
2.6.26-rc series, this is showing up as NULL pointer dereferences or
sleeping function called from atomic context(__switch_to()), as
we free and dynamically allocate the FPU context for the newly
created threads. Older kernels might show some FPU corruption for processes
running inside of lguest.

With the appended patch, my test system is running for more than 50 mins
now. So at least some of your oops (hopefully all!) should get fixed.
Please give it a try. I will spend more time with this fix tomorrow.
Reported-by: Simon Holm Thøgersen <odie@cs.aau.dk>
Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

54481cf8

19 Jun, 2008 22 commits

x86, geode: add a VSA2 ID for General Software · ffe6e1da

Committed by Jordan Crouse on Jun 18, 2008

General Software writes their own VSA2 module for their version
of the Geode BIOS, which returns a different ID than the standard
VSA2.  This was causing the framebuffer driver to break for most
GSW boards.
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Cc: tglx@linutronix.de
Cc: linux-geode@lists.infradead.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ffe6e1da

x86: use BOOTMEM_EXCLUSIVE on 32-bit · d3942cff

Committed by Bernhard Walle on Jun 08, 2008

This patch uses BOOTMEM_EXCLUSIVE for crashkernel reservation also for
i386 and prints an error message on failure.

The patch is still for 2.6.26 since it is only bug fixing. The unification
of reserve_crashkernel() between i386 and x86_64 should be done for 2.6.27.
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>

d3942cff

x86, 32-bit: fix boot failure on TSC-less processors · df17b1d9

Committed by Mikael Pettersson on Jun 15, 2008

Booting 2.6.26-rc6 on my 486 DX/4 fails with a "BUG: Int 6"
(invalid opcode) and a kernel halt immediately after the
kernel has been uncompressed. The BUG shows EIP pointing
to an rdtsc instruction in native_read_tsc(), invoked from
native_sched_clock().

(This error occurs so early that not even the serial console
can capture it.)

A bisection showed that this bug first occurs in 2.6.26-rc3-git7,
via commit 9ccc906c:

>x86: distangle user disabled TSC from unstable
>
>tsc_enabled is set to 0 from the command line switch "notsc" and from
>the mark_tsc_unstable code. Seperate those functionalities and replace
>tsc_enable with tsc_disable. This makes also the native_sched_clock()
>decision when to use TSC understandable.
>
>Preparatory patch to solve the sched_clock() issue on 32 bit.
>
>Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

The core reason for this bug is that native_sched_clock() gets
called before tsc_init().

Before the commit above, tsc_32.c used a "tsc_enabled" variable
which defaulted to 0 == disabled, and which only got enabled late
in tsc_init(). Thus early calls to native_sched_clock() would skip
the TSC and use jiffies instead.

After the commit above, tsc_32.c uses a "tsc_disabled" variable
which defaults to 0, meaning that the TSC is Ok to use. Early calls
to native_sched_clock() now erroneously try to use the TSC on
!cpu_has_tsc processors, leading to invalid opcode exceptions.

My proposed fix is to initialise tsc_disabled to a "soft disabled"
state distinct from the hard disabled state set up by the "notsc"
kernel option. This fixes the native_sched_clock() problem. It also
allows tsc_init() to be simplified: instead of setting tsc_disabled = 1
on every error return, we just set tsc_disabled = 0 once when all
checks have succeeded.

I've verified that this lets my 486 boot again. I've also verified
that a Core2 machine still uses the TSC as clocksource after the patch.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

df17b1d9
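
The "soft disabled" idea in the fix above can be modelled as a three-state flag. The sketch below is a user-space illustration under stated assumptions: the concrete values (-1, 0, 1) and the names `notsc_setup`, `tsc_init_model` and `sched_clock_may_use_tsc` are hypothetical and only mirror the behaviour the commit message describes.

```c
#include <stdbool.h>

/* Hypothetical model of the three states described in the commit message:
 *  -1: soft disabled (the default until initialisation succeeds)
 *   0: TSC is usable
 *   1: hard disabled by the "notsc" kernel option */
static int tsc_disabled = -1;

/* Models the "notsc" command line switch. */
static void notsc_setup(void)
{
	tsc_disabled = 1;
}

/* Models tsc_init(): every error path simply returns, and the flag is
 * cleared exactly once, after all checks have succeeded. */
static void tsc_init_model(bool cpu_has_tsc, bool calibration_ok)
{
	if (!cpu_has_tsc || tsc_disabled > 0)
		return;
	if (!calibration_ok)
		return;
	tsc_disabled = 0;
}

/* Models the decision made by an early sched_clock() call: anything
 * other than "usable" falls back to jiffies. */
static bool sched_clock_may_use_tsc(void)
{
	return tsc_disabled == 0;
}
```

With this layout, a call made before initialisation sees -1 and stays off the TSC, which is the case that previously hit the invalid opcode on TSC-less processors.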

x86: fix NULL pointer deref in __switch_to · 75118a82

Committed by Suresh Siddha on Jun 13, 2008

Patrick McHardy reported a crash:

> > I get this oops once a day, its apparently triggered by something
> > run by cron, but the process is a different one each time.
> >
> > Kernel is -git from yesterday shortly before the -rc6 release
> > (last commit is the usb-2.6 merge, the x86 patches are missing),
> > .config is attached.
> >
> > I'll retry with current -git, but the patches that have gone in
> > since I last updated don't look related.
> >
> > [62060.043009] BUG: unable to handle kernel NULL pointer dereference at
> > 000001ff
> > [62060.043009] IP: [<c0102a9b>] __switch_to+0x2f/0x118
> > [62060.043009] *pde = 00000000
> > [62060.043009] Oops: 0002 [#1] PREEMPT

Vegard Nossum analyzed it:

> This decodes to
>
>    0:   0f ae 00                fxsave (%eax)
>
> so it's related to the floating-point context. This is the exact
> location of the crash:
>
> $ addr2line -e arch/x86/kernel/process_32.o -i ab0
> include/asm/i387.h:232
> include/asm/i387.h:262
> arch/x86/kernel/process_32.c:595
>
> ...so it looks like prev_task->thread.xstate->fxsave has become NULL.
> Or maybe it never had any other value.

Somehow (as described below) TS_USEDFPU is set but the fpu is not
allocated or freed.

Another possible FPU pre-emption issue with the sleazy FPU optimization
which was benign before but not so anymore, with the dynamic FPU allocation
patch.

A new task is getting exec'd and it is preempted at the point below.

flush_thread() {
	...
	/*
	* Forget coprocessor state..
	*/
	clear_fpu(tsk);
		<----- Preemption point
	clear_used_math();
	...
}

Now when it context switches in again, as the used_math() is still set
and fpu_counter can be > 5, we will do a math_state_restore() which sets
the task's TS_USEDFPU. After it continues from the above preemption point
it does clear_used_math() and much later free_thread_xstate().

Now, at the next context switch, it is quite possible that xstate is
null, used_math() is not set and TS_USEDFPU is still set. This will
trigger unlazy_fpu() causing kernel oops.

Fix this by clearing tsk's fpu_counter before clearing task's fpu.
Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

75118a82

x86: set PAE PHYSICAL_MASK_SHIFT to 44 bits. · ad524d46

Committed by Jeremy Fitzhardinge on Jun 06, 2008

When a 64-bit x86 processor runs in 32-bit PAE mode, a pte can
potentially have the same number of physical address bits as the
64-bit host ("Enhanced Legacy PAE Paging").  This means, in theory,
we could have up to 52 bits of physical address in a pte.

The 32-bit kernel uses a 32-bit unsigned long to represent a pfn.
This means that it can only represent physical addresses up to 32+12=44
bits wide.  Rather than widening pfns everywhere, just set 2^44 as the
Linux x86_32-PAE architectural limit for physical address size.

This is a bugfix for two cases:
1. running a 32-bit PAE kernel on a machine with
  more than 64GB RAM.
2. running a 32-bit PAE Xen guest on a host machine with
  more than 64GB RAM

In both cases, a pte could need to have more than 36 bits of physical,
and masking it to 36-bits will cause fairly severe havoc.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ad524d46
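
The 44-bit limit in the commit above follows from simple arithmetic on the pfn width and the page size. A small sketch of that arithmetic, and of what masking to 36 bits does to an address at the 64 GB boundary; the helper names are illustrative, not kernel symbols.

```c
#include <stdint.h>

#define PAGE_SHIFT 12   /* 4 KiB pages */
#define PFN_BITS   32   /* pfn held in a 32-bit unsigned long */

/* Widest physical address, in bits, that a PFN_BITS-wide pfn can name. */
static unsigned int max_phys_bits(void)
{
	return PFN_BITS + PAGE_SHIFT;
}

/* Mask that keeps only the low `bits` bits of a physical address. */
static uint64_t phys_mask(unsigned int bits)
{
	return ((uint64_t)1 << bits) - 1;
}
```

An address of exactly 64 GB is `1 << 36`. ANDing it with `phys_mask(36)` gives 0, so the pte would silently point at physical address 0, while `phys_mask(44)` leaves it intact.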

agp: brown paper bag patch - put back two lines that got lost · 9bedbcb2

Committed by Dave Airlie on Jun 19, 2008

Commit 62c96b9d ("agp/intel: cleanup
some serious whitespace badness") didn't just fix whitespace.  It also
lost two lines.

Noticed by Linus. No more whitespace diffs for me.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9bedbcb2

Merge branch 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6 · 3506ba7b

Committed by Linus Torvalds on Jun 18, 2008

* 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
  agp/intel: cleanup some serious whitespace badness
  [AGP] intel_agp: Add support for Intel 4 series chipsets
  [AGP] intel_agp: extra stolen mem size available for IGD_GM chipset
  agp: more boolean conversions.
  drivers/char/agp - use bool
  agp: two-stage page destruction issue
  agp/via: fixup pci ids

3506ba7b

agp/intel: cleanup some serious whitespace badness · 62c96b9d

Committed by Dave Airlie on Jun 19, 2008

Signed-off-by: Dave Airlie <airlied@redhat.com>

62c96b9d

[AGP] intel_agp: Add support for Intel 4 series chipsets · 25ce77ab

Committed by Zhenyu Wang on Jun 19, 2008

Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

25ce77ab

[AGP] intel_agp: extra stolen mem size available for IGD_GM chipset · 598d1448

Committed by Zhenyu Wang on Jun 19, 2008

This adds the missing stolen memory size detection for IGD_GM, so that
we detect the same size as the current X intel driver (2.3.2), which
already handles this.
Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

598d1448

agp: more boolean conversions. · 9516b030

Committed by Dave Airlie on Jun 19, 2008

Signed-off-by: Dave Airlie <airlied@redhat.com>

9516b030

drivers/char/agp - use bool · c7258012

Committed by Joe Perches on Mar 26, 2008

Use boolean in AGP instead of having own TRUE/FALSE

--
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

c7258012

agp: two-stage page destruction issue · da503fa6

Committed by Jan Beulich on Jun 18, 2008

besides it apparently being useful only in 2.6.24 (the changes in 2.6.25
really mean that it could be converted back to a single-stage mechanism),
I'm seeing an issue in Xen Dom0 kernels, which is caused by the calling
of gart_to_virt() in the second stage invocations of the destroy function.
I think that besides this being a real issue with Xen (where
unmap_page_from_agp() is not just a page table attribute change), this
also is invalid from a theoretical perspective: One should not assume that
gart_to_virt() is still valid after unmapping a page. So minimally (keeping
the 2-stage mechanism) a patch like the one below would be needed.

Jan
Signed-off-by: Dave Airlie <airlied@redhat.com>

da503fa6

agp/via: fixup pci ids · dcd981a7

Committed by Greg KH on Jun 19, 2008

add a new PCI ID and remove an old dodgy one, include the explanation
in the commented code so nobody re-adds it later.

(davej also sent the pci id addition).
Signed-off-by: Dave Airlie <airlied@redhat.com>

dcd981a7

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband · f9d1c6ca

Committed by Linus Torvalds on Jun 18, 2008

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/uverbs: Fix check of is_closed flag check in ib_uverbs_async_handler()
  RDMA/nes: Fix off-by-one in nes_reg_user_mr() error path

f9d1c6ca

IB/uverbs: Fix check of is_closed flag check in ib_uverbs_async_handler() · fb77bcef

Committed by Jack Morgenstein on Jun 18, 2008

Commit 1ae5c187 ("IB/uverbs: Don't store struct file * for event
files") changed the way that closed files are handled in the uverbs
code.  However, after the conversion, the is_closed flag is checked
incorrectly in ib_uverbs_async_handler().  As a result, no async
events are ever passed to applications.

Found by: Ronni Zimmerman <ronniz@mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

fb77bcef

Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog · a8051fde

Committed by Linus Torvalds on Jun 18, 2008

* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  Revert "[WATCHDOG] hpwdt: Fix NMI handling."
  [WATCHDOG] hpwdt: Add CFLAGS to get driver working
  Revert "[WATCHDOG] make watchdog/hpwdt.c:asminline_call() static"

a8051fde

Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6 · 5dfd0621

Committed by Linus Torvalds on Jun 18, 2008

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
  [SCSI] dpt_i2o: Add PROC_IA64 define
  [SCSI] scsi_host regression: fix scsi host leak
  [SCSI] sr: fix corrupt CD data after media change and delay

5dfd0621

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc · f32c23f5

Committed by Linus Torvalds on Jun 18, 2008

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  [POWERPC] Clear sub-page HPTE present bits when demoting page size
  [POWERPC] 4xx: Clear new TLB cache attribute bits in Data Storage vector

f32c23f5

Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6 · e8995364

Committed by Linus Torvalds on Jun 18, 2008

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
  udf: restore UDFFS_DEBUG to being undefined by default

e8995364

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · d83b14c0

Committed by Linus Torvalds on Jun 18, 2008

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (43 commits)
  netlink: genl: fix circular locking
  Revert "mac80211: Use skb_header_cloned() on TX path."
  af_unix: fix 'poll for write'/ connected DGRAM sockets
  tun: Proper handling of IPv6 header in tun driver when TUN_NO_PI is set
  atl1: relax eeprom mac address error check
  net/enc28j60: low power mode
  net/enc28j60: section fix
  sky2: 88E8040T pci device id
  netxen: download firmware in pci probe
  netxen: cleanup debug messages
  netxen: remove global physical_port array
  netxen: fix portnum for hp mezz cards
  ibm_newemac: select CRC32 in Kconfig
  xfrm: fix fragmentation for ipv4 xfrm tunnel
  netfilter: nf_conntrack_h323: fix module unload crash
  netfilter: nf_conntrack_h323: fix memory leak in module initialization error path
  netfilter: nf_nat: fix RCU races
  atm: [he] send idle cells instead of unassigned when in SDH mode
  atm: [he] limit queries to the device's register space
  atm: [br2864] fix routed vcmux support
  ...

d83b14c0

Revert "[WATCHDOG] hpwdt: Fix NMI handling." · fdf7be6f

Committed by Wim Van Sebroeck on Jun 18, 2008

The old setup works better.
Signed-off-by: Thomas Mingarelli <Thomas.Mingarelli@hp.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

fdf7be6f

18 Jun, 2008 10 commits

[POWERPC] Clear sub-page HPTE present bits when demoting page size · 65ba6cdc

Committed by Paul Mackerras on Jun 18, 2008

When we demote a slice from 64k to 4k, and we are about to insert an
HPTE for a 4k subpage and we notice that there is an existing 64k
HPTE, we first invalidate that HPTE before inserting the new 4k
subpage HPTE. Since the bits that encode which hash bucket the old
HPTE was in overlap with the bits that encode which of the 16 subpages
have HPTEs, we need to clear out the subpage HPTE-present bits before
starting to insert HPTEs for the 4k subpages. If we don't do that, we
can erroneously think that a subpage already has an HPTE when it
doesn't.

That in itself wouldn't be such a problem except that when we go to
update the HPTE that we think is present on machines with a
hypervisor, the hypervisor can tell us that the HPTE we think is there
is actually there even though it isn't, which can lead to a process
getting stuck in a loop, continually faulting. The reason for the
confusion is that the AVPN (abbreviated virtual page number) we are
looking for in the HPTE for a 4k subpage can actually match the AVPN
in a stale HPTE for another 64k page. For example, the HPTE for
the 4k subpage at 0x84000f000 will be in the same hash bucket and have
the same AVPN as the HPTE for the 64k page at 0x8400f0000.

This fixes the code to clear out the subpage HPTE-present bits.
Signed-off-by: Paul Mackerras <paulus@samba.org>

65ba6cdc
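
The address pair quoted in the commit above can be checked by hand. The sketch below assumes 256 MB segments (a 28-bit segment offset) and assumes the hash bucket is chosen from the segment's identity combined with the page index inside the segment; both are assumptions made for illustration, and the helper names are hypothetical. Under them, the 4k subpage and the 64k page land in the same segment with the same page index, so they compete for the same bucket.

```c
#include <stdint.h>

#define SEGMENT_SHIFT 28                 /* assumed: 256 MB segments */
#define SEGMENT_MASK  0x0fffffffULL      /* offset within such a segment */

/* Page index inside the segment for a given page shift
 * (12 for 4k pages, 16 for 64k pages). */
static uint64_t page_index(uint64_t ea, unsigned int page_shift)
{
	return (ea & SEGMENT_MASK) >> page_shift;
}

/* Segment number of an address under the same assumption. */
static uint64_t segment_number(uint64_t ea)
{
	return ea >> SEGMENT_SHIFT;
}
```

Here `page_index(0x84000f000, 12)` and `page_index(0x8400f0000, 16)` are both `0xf`, and both addresses have segment number `0x84`.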

[POWERPC] 4xx: Clear new TLB cache attribute bits in Data Storage vector · b17879f7

Committed by Josh Boyer on Jun 18, 2008

A recent commit added support for the new 440x6 and 464 cores that have the
added WL1, IL1I, IL1D, IL2I, and IL2D bits for the caching attributes in the
TLBs. The new bits were cleared in the finish_tlb_load function, however a
similar bit of code was missed in the DataStorage interrupt vector.
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>

b17879f7

netlink: genl: fix circular locking · 6d1a3fb5

Committed by Patrick McHardy on Jun 18, 2008

genetlink has a circular locking dependency when dumping the registered
families:

- dump start:
genl_rcv()            : take genl_mutex
genl_rcv_msg()        : call netlink_dump_start() while holding genl_mutex
netlink_dump_start(),
netlink_dump()        : take nlk->cb_mutex
ctrl_dumpfamily()     : try to detect this case and not take genl_mutex a
                        second time

- dump continuance:
netlink_rcv()         : call netlink_dump
netlink_dump          : take nlk->cb_mutex
ctrl_dumpfamily()     : take genl_mutex

Register genl_lock as callback mutex with netlink to fix this. This slightly
widens an already existing module unload race: the genl ops used during the
dump might go away when the module is unloaded. Thomas Graf is working on a
separate fix for this.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

6d1a3fb5

Revert "mac80211: Use skb_header_cloned() on TX path." · 3a5be7d4

Committed by David S. Miller on Jun 18, 2008

This reverts commit 608961a5.

The problem is that the mac80211 stack not only needs to be able to
muck with the link-level headers, it also might need to mangle all of
the packet data if doing sw wireless encryption.

This fixes kernel bugzilla #10903.  Thanks to Didier Raboud (for the
bugzilla report), Andrew Prince (for bisecting), Johannes Berg (for
bringing this bisection analysis to my attention), and Ilpo (for
trying to analyze this purely from the TCP side).

In 2.6.27 we can take another stab at this, by using something like
skb_cow_data() when the TX path of mac80211 ends up with a non-NULL
tx->key.  The ESP protocol code in the IPSEC stack can be used as a
model for implementation.
Signed-off-by: David S. Miller <davem@davemloft.net>

3a5be7d4

af_unix: fix 'poll for write'/ connected DGRAM sockets · 3c73419c

Committed by Rainer Weikusat on Jun 17, 2008

The unix_dgram_sendmsg routine implements a (somewhat crude)
form of receiver-imposed flow control by comparing the length of the
receive queue of the 'peer socket' with the max_ack_backlog value
stored in the corresponding sock structure, either blocking
the thread which caused the send-routine to be called or returning
EAGAIN. This routine is used by both SOCK_DGRAM and SOCK_SEQPACKET
sockets. The poll-implementation for these socket types is
datagram_poll from core/datagram.c. A socket is deemed to be writeable
by this routine when the memory presently consumed by datagrams
owned by it is less than the configured socket send buffer size. This
is always wrong for connected PF_UNIX non-stream sockets when the
abovementioned receive queue is currently considered to be full.
'poll' will then return, indicating that the socket is writeable, but
a subsequent write results in EAGAIN, effectively causing an
(usual) application to 'poll for writeability by repeated send request
with O_NONBLOCK set' until it has consumed its time quantum.

The change below uses a suitably modified variant of the datagram_poll
routines for both types of PF_UNIX sockets, which tests if the
recv-queue of the peer a socket is connected to is presently
considered to be 'full' as part of the 'is this socket
writeable'-checking code. The socket being polled is additionally
put onto the peer_wait wait queue associated with its peer, because the
unix_dgram_sendmsg routine does a wake up on this queue after a
datagram was received and the 'other wakeup call' is done implicitly
as part of skb destruction, meaning, a process blocked in poll
because of a full peer receive queue could otherwise sleep forever
if no datagram owned by its socket was already sitting on this queue.
Among this change is a small (inline) helper routine named
'unix_recvq_full', which consolidates the actual testing code (in three
different places) into a single location.
Signed-off-by: Rainer Weikusat <rweikusat@mssgmbh.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3c73419c
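
The writeability rule described in the commit above can be sketched in user space. This is a simplified model, not the kernel code: `struct model_sock` and `dgram_writable` are hypothetical, and treating "full" as "queue length greater than max_ack_backlog" is an assumption about the comparison. Only the name of the helper, `unix_recvq_full`, comes from the commit message; the sketch calls its stand-in `recvq_full`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the socket fields this check needs. */
struct model_sock {
	size_t recv_queue_len;     /* datagrams waiting in the receive queue */
	size_t max_ack_backlog;    /* limit imposed by the receiver */
	size_t wmem_used;          /* bytes of queued datagrams owned by the sender */
	size_t sndbuf;             /* configured send buffer size */
	struct model_sock *peer;   /* connected peer, or NULL if unconnected */
};

/* One place for the "is the receive queue full" test. */
static bool recvq_full(const struct model_sock *sk)
{
	return sk->recv_queue_len > sk->max_ack_backlog;
}

/* Writeable only if the sender has buffer space left AND, for a
 * connected socket, the peer's receive queue is not full. */
static bool dgram_writable(const struct model_sock *sk)
{
	if (sk->wmem_used >= sk->sndbuf)
		return false;
	if (sk->peer && recvq_full(sk->peer))
		return false;
	return true;
}
```

Without the second test, a poll loop would keep reporting "writeable" while every send fails with EAGAIN, which is the busy loop the commit message describes.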

Merge branch 'davem-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 · 4552e119

Committed by David S. Miller on Jun 17, 2008

4552e119

tun: Proper handling of IPv6 header in tun driver when TUN_NO_PI is set · f09f7ee2

Committed by Ang Way Chuang on Jun 17, 2008

By default, tun.c running in TUN_TUN_DEV mode will set the protocol of
packet to IPv4 if TUN_NO_PI is set. My program failed to work when I
assumed that the driver will check the first nibble of packet,
determine IP version and set the appropriate protocol.
Signed-off-by: Ang Way Chuang <wcang@nav6.org>
Acked-by: Max Krasnyansky <maxk@qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f09f7ee2
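
Checking the first nibble, as the tun commit above describes, works because both IPv4 and IPv6 headers begin with a 4-bit version field. A minimal sketch follows; `proto_from_first_nibble` is a hypothetical helper, and returning 0 for an empty or unrecognised packet is a choice made for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_P_IP   0x0800  /* EtherType for IPv4 */
#define ETH_P_IPV6 0x86DD  /* EtherType for IPv6 */

/* Hypothetical helper: pick the protocol from the IP version field,
 * which is the upper four bits of the first byte of the packet.
 * Returns 0 when the packet is empty or the version is unknown. */
static uint16_t proto_from_first_nibble(const uint8_t *pkt, size_t len)
{
	if (len == 0)
		return 0;

	switch (pkt[0] >> 4) {
	case 4:
		return ETH_P_IP;
	case 6:
		return ETH_P_IPV6;
	default:
		return 0;
	}
}
```

A typical IPv4 header starts with `0x45` (version 4, header length 5 words), and an IPv6 header starts with a byte whose upper nibble is 6, such as `0x60`.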

atl1: relax eeprom mac address error check · 58c7821c

Committed by Radu Cristescu on Jun 12, 2008

The atl1 driver tries to determine the MAC address thusly:

	- If an EEPROM exists, read the MAC address from EEPROM and
	  validate it.
	- If an EEPROM doesn't exist, try to read a MAC address from
	  SPI flash.
	- If that fails, try to read a MAC address directly from the
	  MAC Station Address register.
	- If that fails, assign a random MAC address provided by the
	  kernel.

We now have a report of a system fitted with an EEPROM containing all
zeros where we expect the MAC address to be, and we currently handle
this as an error condition.  Turns out, on this system the BIOS writes
a valid MAC address to the NIC's MAC Station Address register, but we
never try to read it because we return an error when we find the all-
zeros address in EEPROM.

This patch relaxes the error check and continues looking for a MAC
address even if it finds an illegal one in EEPROM.
Signed-off-by: Radu Cristescu <advantis@gmx.net>
Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

58c7821c
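
The relaxed check in the atl1 commit above amounts to "an invalid address from one source means keep looking, not give up". A user-space sketch of that fallback order; `struct mac_source`, `mac_is_valid` and `pick_mac` are hypothetical names, and the validity rule used here (not all zeros, not multicast) is an assumption for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical record of one place a MAC address was read from. */
struct mac_source {
	const char *name;
	uint8_t addr[MAC_LEN];
};

/* Assumed validity rule: reject all-zero addresses and multicast
 * addresses (low bit of the first octet set). */
static bool mac_is_valid(const uint8_t mac[MAC_LEN])
{
	static const uint8_t zero[MAC_LEN];

	return memcmp(mac, zero, MAC_LEN) != 0 && !(mac[0] & 0x01);
}

/* Walk the sources in priority order and copy the first valid address
 * into `out`. Returns the index of the source used, or -1 if none is
 * valid, in which case the caller would assign a random address. */
static int pick_mac(const struct mac_source *src, int n, uint8_t out[MAC_LEN])
{
	for (int i = 0; i < n; i++) {
		if (mac_is_valid(src[i].addr)) {
			memcpy(out, src[i].addr, MAC_LEN);
			return i;
		}
	}
	return -1;
}
```

With an all-zero EEPROM entry first and a good station-register entry second, `pick_mac` returns index 1 instead of failing on the first entry.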

net/enc28j60: low power mode · 7dac6f8d

Committed by David Brownell on Jun 12, 2008

Keep enc28j60 chips in low-power mode when they're not in use.
At typically 120 mA, these chips run hot even when idle; this
low power mode cuts that power usage by a factor of around 100.

This version provides a generic routine to poll a register until
its masked value equals some value ... e.g. bit set or cleared.
It's basically what the previous wait_phy_ready() did, but this
version is generalized to support the handshaking needed to
enter and exit low power mode.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Claudio Lanconelli <lanconelli.claudio@eptar.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

7dac6f8d
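
The generic "poll a register until its masked value equals some value" routine mentioned in the enc28j60 commit above could look roughly like the sketch below. It is not the driver's code: `poll_ready`, `read_reg_fn` and the simulated `fake_dev` are hypothetical, and a bounded retry count stands in for whatever timeout the driver really uses.

```c
#include <stdint.h>

/* Hypothetical register accessor: returns the current value of `reg`. */
typedef uint8_t (*read_reg_fn)(void *ctx, uint8_t reg);

/* Poll `reg` until (value & mask) == val. Returns 0 on success and -1
 * if the condition was not met within `max_tries` reads. */
static int poll_ready(read_reg_fn read_reg, void *ctx, uint8_t reg,
		      uint8_t mask, uint8_t val, unsigned int max_tries)
{
	for (unsigned int i = 0; i < max_tries; i++) {
		if ((read_reg(ctx, reg) & mask) == val)
			return 0;
	}
	return -1;
}

/* Simulated device for demonstration: bit 0 of every register reads as
 * set once `reads_until_ready` reads have been performed. */
struct fake_dev {
	unsigned int reads;
	unsigned int reads_until_ready;
};

static uint8_t fake_read(void *ctx, uint8_t reg)
{
	struct fake_dev *dev = ctx;

	(void)reg;  /* the simulation ignores the register number */
	dev->reads++;
	return dev->reads >= dev->reads_until_ready ? 0x01 : 0x00;
}
```

Passing the mask and the expected value separately lets the same routine wait for a bit to be set (`val == mask`) or cleared (`val == 0`), which is what makes it reusable for both entering and leaving low power mode.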

net/enc28j60: section fix · 6fd65882

Committed by David Brownell on Jun 12, 2008

Minor bugfixes to the enc28j60 driver ... wrong section marking,
indentation, and bogus use of spi_bus_type.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Claudio Lanconelli <lanconelli.claudio@eptar.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

6fd65882