提交 · 75e1056f5c57050415b64cb761a3acc35d91f013 · openeuler / raspberrypi-kernel

19 10月, 2010 1 次提交

sched: Fix softirq time accounting · 75e1056f

由 Venkatesh Pallipadi 提交于 10月 04, 2010

Peter Zijlstra found a bug in the way softirq time is accounted in
VIRT_CPU_ACCOUNTING on this thread:

http://lkml.indiana.edu/hypermail//linux/kernel/1009.2/01366.html

The problem is, softirq processing uses local_bh_disable internally. There
is no way, later in the flow, to differentiate between whether softirq is
being processed or is it just that bh has been disabled. So, a hardirq when bh
is disabled results in time being wrongly accounted as softirq.

Looking at the code a bit more, the problem exists in !VIRT_CPU_ACCOUNTING
as well. As account_system_time() in normal tick based accouting also uses
softirq_count, which will be set even when not in softirq with bh disabled.

Peter also suggested solution of using 2*SOFTIRQ_OFFSET as irq count
for local_bh_{disable,enable} and using just SOFTIRQ_OFFSET while softirq
processing. The patch below does that and adds API in_serving_softirq() which
returns whether we are currently processing softirq or not.

Also changes one of the usages of softirq_count in net/sched/cls_cgroup.c
to in_serving_softirq.

Looks like many usages of in_softirq really want in_serving_softirq. Those
changes can be made individually on a case by case basis.
Signed-off-by: NVenkatesh Pallipadi <venki@google.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286237003-12406-2-git-send-email-venki@google.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

75e1056f

09 10月, 2010 1 次提交

net: clear heap allocation for ETHTOOL_GRXCLSRLALL · ae6df5f9

由 Kees Cook 提交于 10月 07, 2010

Calling ETHTOOL_GRXCLSRLALL with a large rule_cnt will allocate kernel
heap without clearing it. For the one driver (niu) that implements it,
it will leave the unused portion of heap unchanged and copy the full
contents back to userspace.
Signed-off-by: NKees Cook <kees.cook@canonical.com>
Acked-by: NBen Hutchings <bhutchings@solarflare.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ae6df5f9

07 10月, 2010 2 次提交

Revert "mac80211: use netif_receive_skb in ieee80211_tx_status callpath" · 4efe7f51

由 John W. Linville 提交于 10月 07, 2010

This reverts commit 5ed3bc72.

It turns-out that not all drivers are calling ieee80211_tx_status from a
compatible context.  Revert this for now and try again later...
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

4efe7f51

mac80211: delete AddBA response timer · 44271488

由 Johannes Berg 提交于 10月 05, 2010

We never delete the addBA response timer, which
is typically fine, but if the station it belongs
to is deleted very quickly after starting the BA
session, before the peer had a chance to reply,
the timer may fire after the station struct has
been freed already. Therefore, we need to delete
the timer in a suitable spot -- best when the
session is being stopped (which will happen even
then) in which case the delete will be a no-op
most of the time.

I've reproduced the scenario and tested the fix.

This fixes the crash reported at
http://mid.gmane.org/4CAB6F96.6090701@candelatech.com

Cc: stable@kernel.org
Reported-by: NBen Greear <greearb@candelatech.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

44271488

06 10月, 2010 1 次提交

caif: fix two caif_connect() bugs · 79315068

由 Eric Dumazet 提交于 10月 04, 2010

caif_connect() might dereference a netdevice after dev_put() it.

It also doesnt check dev_get_by_index() return value and could
dereference a NULL pointer.

Fix it, using RCU to avoid taking a reference.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
CC: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

79315068

05 10月, 2010 2 次提交

cls_u32: signedness bug · 4e18b3ed

由 Dan Carpenter 提交于 10月 04, 2010

skb_headroom() is unsigned so "skb_headroom(skb) + toff" is also
unsigned and can't be less than zero.  This test was added in 66d50d25:
"u32: negative offset fix"  It was supposed to fix a regression.
Signed-off-by: NDan Carpenter <error27@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4e18b3ed

Bluetooth: Disallow to change L2CAP_OPTIONS values when connected · eaa71b31

由 Gustavo F. Padovan 提交于 10月 04, 2010

L2CAP doesn't permit change like MTU, FCS, TxWindow values while the
connection is alive, we can only set that before the
connection/configuration process. That can lead to bugs in the L2CAP
operation.
Signed-off-by: NGustavo F. Padovan <padovan@profusion.mobi>

eaa71b31

04 10月, 2010 6 次提交

sctp: Fix out-of-bounds reading in sctp_asoc_get_hmac() · 51e97a12

由 Dan Rosenberg 提交于 10月 01, 2010

The sctp_asoc_get_hmac() function iterates through a peer's hmac_ids
array and attempts to ensure that only a supported hmac entry is
returned. The current code fails to do this properly - if the last id
in the array is out of range (greater than SCTP_AUTH_HMAC_ID_MAX), the
id integer remains set after exiting the loop, and the address of an
out-of-bounds entry will be returned and subsequently used in the parent
function, causing potentially ugly memory corruption. This patch resets
the id integer to 0 on encountering an invalid id so that NULL will be
returned after finishing the loop if no valid ids are found.
Signed-off-by: NDan Rosenberg <drosenberg@vsecurity.com>
Acked-by: NVlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

51e97a12

sctp: prevent reading out-of-bounds memory · d7e0d19a

由 Dan Rosenberg 提交于 10月 01, 2010

Two user-controlled allocations in SCTP are subsequently dereferenced as
sockaddr structs, without checking if the dereferenced struct members fall
beyond the end of the allocated chunk.  There doesn't appear to be any
information leakage here based on how these members are used and
additional checking, but it's still worth fixing.

[akpm@linux-foundation.org: remove unfashionable newlines, fix gmail tab->space conversion]
Signed-off-by: NDan Rosenberg <dan.j.rosenberg@gmail.com>
Acked-by: NVlad Yasevich <vladislav.yasevich@hp.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d7e0d19a

ipv4: correct IGMP behavior on v3 query during v2-compatibility mode · 5b7c8406

由 David Stevens 提交于 9月 30, 2010

A recent patch to allow IGMPv2 responses to IGMPv3 queries
bypasses length checks for valid query lengths, incorrectly
resets the v2_seen timer, and does not support IGMPv1.

The following patch responds with a v2 report as required
by IGMPv2 while correcting the other problems introduced
by the patch.
Signed-Off-By: NDavid L Stevens <dlstevens@us.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5b7c8406

Revert "ipv4: Make INET_LRO a bool instead of tristate." · c5d35571

由 Ben Hutchings 提交于 10月 03, 2010

This reverts commit e81963b1.

LRO is now deprecated in favour of GRO, and only a few drivers use it,
so it is desirable to build it as a module in distribution kernels.

The original change to prevent building it as a module was made in an
attempt to avoid the case where some dependents are set to y and some
to m, and INET_LRO can be set to m rather than y.  However, the
Kconfig system will reliably set INET_LRO=y in this case.
Signed-off-by: NBen Hutchings <ben@decadent.org.uk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c5d35571

net: Fix the condition passed to sk_wait_event() · 482964e5

由 Nagendra Tomar 提交于 10月 02, 2010

This patch fixes the condition (3rd arg) passed to sk_wait_event() in
sk_stream_wait_memory(). The incorrect check in sk_stream_wait_memory()
causes the following soft lockup in tcp_sendmsg() when the global tcp
memory pool has exhausted.

>>> snip <<<

localhost kernel: BUG: soft lockup - CPU#3 stuck for 11s! [sshd:6429]
localhost kernel: CPU 3:
localhost kernel: RIP: 0010:[sk_stream_wait_memory+0xcd/0x200]  [sk_stream_wait_memory+0xcd/0x200] sk_stream_wait_memory+0xcd/0x200
localhost kernel:
localhost kernel: Call Trace:
localhost kernel:  [sk_stream_wait_memory+0x1b1/0x200] sk_stream_wait_memory+0x1b1/0x200
localhost kernel:  [<ffffffff802557c0>] autoremove_wake_function+0x0/0x40
localhost kernel:  [ipv6:tcp_sendmsg+0x6e6/0xe90] tcp_sendmsg+0x6e6/0xce0
localhost kernel:  [sock_aio_write+0x126/0x140] sock_aio_write+0x126/0x140
localhost kernel:  [xfs:do_sync_write+0xf1/0x130] do_sync_write+0xf1/0x130
localhost kernel:  [<ffffffff802557c0>] autoremove_wake_function+0x0/0x40
localhost kernel:  [hrtimer_start+0xe3/0x170] hrtimer_start+0xe3/0x170
localhost kernel:  [vfs_write+0x185/0x190] vfs_write+0x185/0x190
localhost kernel:  [sys_write+0x50/0x90] sys_write+0x50/0x90
localhost kernel:  [system_call+0x7e/0x83] system_call+0x7e/0x83

>>> snip <<<

What is happening is, that the sk_wait_event() condition passed from
sk_stream_wait_memory() evaluates to true for the case of tcp global memory
exhaustion. This is because both sk_stream_memory_free() and vm_wait are true
which causes sk_wait_event() to *not* call schedule_timeout().
Hence sk_stream_wait_memory() returns immediately to the caller w/o sleeping.
This causes the caller to again try allocation, which again fails and again
calls sk_stream_wait_memory(), and so on.

[ Bug introduced by commit c1cbe4b7
  ("[NET]: Avoid atomic xchg() for non-error case") -DaveM ]
Signed-off-by: NNagendra Singh Tomar <tomer_iisc@yahoo.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

482964e5

net: Fix IPv6 PMTU disc. w/ asymmetric routes · ae878ae2

由 Maciej Żenczykowski 提交于 10月 03, 2010

Signed-off-by: NMaciej Żenczykowski <maze@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ae878ae2

01 10月, 2010 1 次提交

vlan: dont drop packets from unknown vlans in promiscuous mode · 173e79fb

由 Eric Dumazet 提交于 9月 30, 2010

Roger Luethi noticed packets for unknown VLANs getting silently dropped
even in promiscuous mode.

Check for promiscuous mode in __vlan_hwaccel_rx() and vlan_gro_common()
before drops.

As suggested by Patrick, mark such packets to have skb->pkt_type set to
PACKET_OTHERHOST to make sure they are dropped by IP stack.
Reported-by: NRoger Luethi <rl@hellgate.ch>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

173e79fb

30 9月, 2010 6 次提交

Revert "Bluetooth: Don't accept ConfigReq if we aren't in the BT_CONFIG state" · b0239c80

由 Gustavo F. Padovan 提交于 9月 08, 2010

This reverts commit 8cb8e6f1.

That commit introduced a regression with the Bluetooth Profile Tuning
Suite(PTS), Reverting this make sure that L2CAP is in a qualificable
state.
Signed-off-by: NGustavo F. Padovan <padovan@profusion.mobi>

b0239c80

Bluetooth: Fix inconsistent lock state with RFCOMM · fad003b6

由 Gustavo F. Padovan 提交于 8月 14, 2010

When receiving a rfcomm connection with the old dund deamon a
inconsistent lock state happens. That's because interrupts were already
disabled by l2cap_conn_start() when rfcomm_sk_state_change() try to lock
the spin_lock.

As result we may have a inconsistent lock state for l2cap_conn_start()
after rfcomm_sk_state_change() calls bh_lock_sock() and disable interrupts
as well.

[ 2833.151999]
[ 2833.151999] =================================
[ 2833.151999] [ INFO: inconsistent lock state ]
[ 2833.151999] 2.6.36-rc3 #2
[ 2833.151999] ---------------------------------
[ 2833.151999] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[ 2833.151999] krfcommd/2306 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 2833.151999]  (slock-AF_BLUETOOTH){+.?...}, at: [<ffffffffa00bcb56>] rfcomm_sk_state_change+0x46/0x170 [rfcomm]
[ 2833.151999] {IN-SOFTIRQ-W} state was registered at:
[ 2833.151999]   [<ffffffff81094346>] __lock_acquire+0x5b6/0x1560
[ 2833.151999]   [<ffffffff8109534a>] lock_acquire+0x5a/0x70
[ 2833.151999]   [<ffffffff81392b6c>] _raw_spin_lock+0x2c/0x40
[ 2833.151999]   [<ffffffffa00a5092>] l2cap_conn_start+0x92/0x640 [l2cap]
[ 2833.151999]   [<ffffffffa00a6a3f>] l2cap_sig_channel+0x6bf/0x1320 [l2cap]
[ 2833.151999]   [<ffffffffa00a9173>] l2cap_recv_frame+0x133/0x770 [l2cap]
[ 2833.151999]   [<ffffffffa00a997b>] l2cap_recv_acldata+0x1cb/0x390 [l2cap]
[ 2833.151999]   [<ffffffffa000db4b>] hci_rx_task+0x2ab/0x450 [bluetooth]
[ 2833.151999]   [<ffffffff8106b22b>] tasklet_action+0xcb/0xe0
[ 2833.151999]   [<ffffffff8106b91e>] __do_softirq+0xae/0x150
[ 2833.151999]   [<ffffffff8102bc0c>] call_softirq+0x1c/0x30
[ 2833.151999]   [<ffffffff8102ddb5>] do_softirq+0x75/0xb0
[ 2833.151999]   [<ffffffff8106b56d>] irq_exit+0x8d/0xa0
[ 2833.151999]   [<ffffffff8104484b>] smp_apic_timer_interrupt+0x6b/0xa0
[ 2833.151999]   [<ffffffff8102b6d3>] apic_timer_interrupt+0x13/0x20
[ 2833.151999]   [<ffffffff81029dfa>] cpu_idle+0x5a/0xb0
[ 2833.151999]   [<ffffffff81381ded>] rest_init+0xad/0xc0
[ 2833.151999]   [<ffffffff817ebc4d>] start_kernel+0x2dd/0x2e8
[ 2833.151999]   [<ffffffff817eb2e6>] x86_64_start_reservations+0xf6/0xfa
[ 2833.151999]   [<ffffffff817eb3ce>] x86_64_start_kernel+0xe4/0xeb
[ 2833.151999] irq event stamp: 731
[ 2833.151999] hardirqs last  enabled at (731): [<ffffffff8106b762>] local_bh_enable_ip+0x82/0xe0
[ 2833.151999] hardirqs last disabled at (729): [<ffffffff8106b93e>] __do_softirq+0xce/0x150
[ 2833.151999] softirqs last  enabled at (730): [<ffffffff8106b96e>] __do_softirq+0xfe/0x150
[ 2833.151999] softirqs last disabled at (711): [<ffffffff8102bc0c>] call_softirq+0x1c/0x30
[ 2833.151999]
[ 2833.151999] other info that might help us debug this:
[ 2833.151999] 2 locks held by krfcommd/2306:
[ 2833.151999]  #0:  (rfcomm_mutex){+.+.+.}, at: [<ffffffffa00bb744>] rfcomm_run+0x174/0xb20 [rfcomm]
[ 2833.151999]  #1:  (&(&d->lock)->rlock){+.+...}, at: [<ffffffffa00b9223>] rfcomm_dlc_accept+0x53/0x100 [rfcomm]
[ 2833.151999]
[ 2833.151999] stack backtrace:
[ 2833.151999] Pid: 2306, comm: krfcommd Tainted: G        W   2.6.36-rc3 #2
[ 2833.151999] Call Trace:
[ 2833.151999]  [<ffffffff810928e1>] print_usage_bug+0x171/0x180
[ 2833.151999]  [<ffffffff810936c3>] mark_lock+0x333/0x400
[ 2833.151999]  [<ffffffff810943ca>] __lock_acquire+0x63a/0x1560
[ 2833.151999]  [<ffffffff810948b5>] ? __lock_acquire+0xb25/0x1560
[ 2833.151999]  [<ffffffff8109534a>] lock_acquire+0x5a/0x70
[ 2833.151999]  [<ffffffffa00bcb56>] ? rfcomm_sk_state_change+0x46/0x170 [rfcomm]
[ 2833.151999]  [<ffffffff81392b6c>] _raw_spin_lock+0x2c/0x40
[ 2833.151999]  [<ffffffffa00bcb56>] ? rfcomm_sk_state_change+0x46/0x170 [rfcomm]
[ 2833.151999]  [<ffffffffa00bcb56>] rfcomm_sk_state_change+0x46/0x170 [rfcomm]
[ 2833.151999]  [<ffffffffa00b9239>] rfcomm_dlc_accept+0x69/0x100 [rfcomm]
[ 2833.151999]  [<ffffffffa00b9a49>] rfcomm_check_accept+0x59/0xd0 [rfcomm]
[ 2833.151999]  [<ffffffffa00bacab>] rfcomm_recv_frame+0x9fb/0x1320 [rfcomm]
[ 2833.151999]  [<ffffffff813932bb>] ? _raw_spin_unlock_irqrestore+0x3b/0x60
[ 2833.151999]  [<ffffffff81093acd>] ? trace_hardirqs_on_caller+0x13d/0x180
[ 2833.151999]  [<ffffffff81093b1d>] ? trace_hardirqs_on+0xd/0x10
[ 2833.151999]  [<ffffffffa00bb7f1>] rfcomm_run+0x221/0xb20 [rfcomm]
[ 2833.151999]  [<ffffffff813905e7>] ? schedule+0x287/0x780
[ 2833.151999]  [<ffffffffa00bb5d0>] ? rfcomm_run+0x0/0xb20 [rfcomm]
[ 2833.151999]  [<ffffffff81081026>] kthread+0x96/0xa0
[ 2833.151999]  [<ffffffff8102bb14>] kernel_thread_helper+0x4/0x10
[ 2833.151999]  [<ffffffff813936bc>] ? restore_args+0x0/0x30
[ 2833.151999]  [<ffffffff81080f90>] ? kthread+0x0/0xa0
[ 2833.151999]  [<ffffffff8102bb10>] ? kernel_thread_helper+0x0/0x10
Signed-off-by: NGustavo F. Padovan <padovan@profusion.mobi>

fad003b6

Bluetooth: Simplify L2CAP Streaming mode sending · ccbb84af

由 Gustavo F. Padovan 提交于 8月 30, 2010

As we don't have any error control on the Streaming mode, i.e., we don't
need to keep a copy of the skb for later resending we don't need to
call skb_clone() on it.
Then we can go one further here, and dequeue the skb before sending it,
that also means we don't need to look to sk->sk_send_head anymore.

The patch saves memory and time when sending Streaming mode data, so
it is good to mainline.
Signed-off-by: NGustavo F. Padovan <padovan@profusion.mobi>

ccbb84af

Bluetooth: fix MTU L2CAP configuration parameter · 8183b775

由 Andrei Emeltchenko 提交于 9月 01, 2010

When receiving L2CAP negative configuration response with respect
to MTU parameter we modify wrong field. MTU here means proposed
value of MTU that the remote device intends to transmit. So for local
L2CAP socket it is pi->imtu.
Signed-off-by: NAndrei Emeltchenko <andrei.emeltchenko@nokia.com>
Acked-by: NVille Tervo <ville.tervo@nokia.com>
Signed-off-by: NGustavo F. Padovan <padovan@profusion.mobi>

8183b775

Bluetooth: Only enable L2CAP FCS for ERTM or streaming · 8c462b60

由 Mat Martineau 提交于 8月 24, 2010

This fixes a bug which caused the FCS setting to show L2CAP_FCS_CRC16
with L2CAP modes other than ERTM or streaming. At present, this only
affects the FCS value shown with getsockopt() for basic mode.
Signed-off-by: NMat Martineau <mathewm@codeaurora.org>
Signed-off-by: NGustavo F. Padovan <padovan@profusion.mobi>

8c462b60

Phonet: Correct header retrieval after pskb_may_pull · a91e7d47

由 Kumar Sanghvi 提交于 9月 27, 2010

Retrieve the header after doing pskb_may_pull since, pskb_may_pull
could change the buffer structure.

This is based on the comment given by Eric Dumazet on Phonet
Pipe controller patch for a similar problem.
Signed-off-by: NKumar Sanghvi <kumar.sanghvi@stericsson.com>
Acked-by: NLinus Walleij <linus.walleij@stericsson.com>
Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
Acked-by: NRémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a91e7d47

29 9月, 2010 2 次提交

ip_gre: Fix dependencies wrt. ipv6. · 68c1f3a9

由 David S. Miller 提交于 9月 28, 2010

The GRE tunnel driver needs to invoke icmpv6 helpers in the
ipv6 stack when ipv6 support is enabled.

Therefore if IPV6 is enabled, we have to enforce that GRE's
enabling (modular or static) matches that of ipv6.
Reported-by: NPatrick McHardy <kaber@trash.net>
Reported-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

68c1f3a9

net-2.6: SYN retransmits: Add new parameter to retransmits_timed_out() · 4d22f7d3

由 Damian Lukowski 提交于 9月 28, 2010

Fixes kernel Bugzilla Bug 18952

This patch adds a syn_set parameter to the retransmits_timed_out()
routine and updates its callers. If not set, TCP_RTO_MIN is taken
as the calculation basis as before. If set, TCP_TIMEOUT_INIT is
used instead, so that sysctl_syn_retries represents the actual
amount of SYN retransmissions in case no SYNACKs are received when
establishing a new connection.
Signed-off-by: NDamian Lukowski <damian@tvk.rwth-aachen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4d22f7d3

28 9月, 2010 4 次提交

tcp: Fix >4GB writes on 64-bit. · 01db403c

由 David S. Miller 提交于 9月 27, 2010

Fixes kernel bugzilla #16603

tcp_sendmsg() truncates iov_len to an 'int' which a 4GB write to write
zero bytes, for example.

There is also the problem higher up of how verify_iovec() works.  It
wants to prevent the total length from looking like an error return
value.

However it does this using 'int', but syscalls return 'long' (and
thus signed 64-bit on 64-bit machines).  So it could trigger
false-positives on 64-bit as written.  So fix it to use 'long'.
Reported-by: NOlaf Bonorden <bono@onlinehome.de>
Reported-by: NDaniel Büse <dbuese@gmx.de>
Reported-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

01db403c

net/9p: Mount only matching virtio channels · 0b20406c

由 Sven Eckelmann 提交于 9月 27, 2010

p9_virtio_create will only compare the the channel's tag characters
against the device name till the end of the channel's tag but not till
the end of the device name. This means that if a user defines channels
with the tags foo and foobar then he would mount foo when he requested
foonot and may mount foo when he requested foobar.

Thus it is necessary to check both string lengths against each other in
case of a successful partial string match.
Signed-off-by: NSven Eckelmann <sven.eckelmann@gmx.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0b20406c

ipv6: add IPv6 to neighbour table overflow warning · 7e1b33e5

由 Ulrich Weber 提交于 9月 27, 2010

IPv4 and IPv6 have separate neighbour tables, so
the warning messages should be distinguishable.

[ Add a suitable message prefix on the ipv4 side as well -DaveM ]
Signed-off-by: NUlrich Weber <uweber@astaro.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7e1b33e5

tcp: fix TSO FACK loss marking in tcp_mark_head_lost · b3de7559

由 Yuchung Cheng 提交于 9月 24, 2010

When TCP uses FACK algorithm to mark lost packets in
tcp_mark_head_lost(), if the number of packets in the (TSO) skb is
greater than the number of packets that should be marked lost, TCP
incorrectly exits the loop and marks no packets lost in the skb. This
underestimates tp->lost_out and affects the recovery/retransmission.
This patch fargments the skb and marks the correct amount of packets
lost.
Signed-off-by: NYuchung Cheng <ycheng@google.com>
Acked-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b3de7559

27 9月, 2010 3 次提交

net/9p: fix memory handling/allocation in rdma_request() · 1d6400c7

由 Davidlohr Bueso 提交于 9月 13, 2010

Return -ENOMEM when erroring on kmalloc and fix memory leaks when returning on error.
Signed-off-by: NDavidlohr Bueso <dave@gnu.org>
Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NEric Van Hensbergen <ericvh@gmail.com>

1d6400c7

ipv6: add a missing unregister_pernet_subsys call · 2cc6d2bf

由 Neil Horman 提交于 9月 24, 2010

Clean up a missing exit path in the ipv6 module init routines.  In
addrconf_init we call ipv6_addr_label_init which calls register_pernet_subsys
for the ipv6_addr_label_ops structure.  But if module loading fails, or if the
ipv6 module is removed, there is no corresponding unregister_pernet_subsys call,
which leaves a now-bogus address on the pernet_list, leading to oopses in
subsequent registrations.  This patch cleans up both the failed load path and
the unload path.  Tested by myself with good results.
Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>

 include/net/addrconf.h |    1 +
 net/ipv6/addrconf.c    |   11 ++++++++---
 net/ipv6/addrlabel.c   |    5 +++++
 3 files changed, 14 insertions(+), 3 deletions(-)
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2cc6d2bf

br2684: fix scheduling while atomic · a3d6713f

由 Karl Hiramoto 提交于 9月 23, 2010

You can't call atomic_notifier_chain_unregister() while in atomic context.

Fix, call un/register_atmdevice_notifier in module __init and __exit.

Bug report:
http://comments.gmane.org/gmane.linux.network/172603Reported-by: NMikko Vinni <mmvinni@yahoo.com>
Tested-by: NMikko Vinni <mmvinni@yahoo.com>
Signed-off-by: NKarl Hiramoto <karl@hiramoto.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a3d6713f

25 9月, 2010 2 次提交

net: fix a lockdep splat · f064af1e

由 Eric Dumazet 提交于 9月 22, 2010

We have for each socket :

One spinlock (sk_slock.slock)
One rwlock (sk_callback_lock)

Possible scenarios are :

(A) (this is used in net/sunrpc/xprtsock.c)
read_lock(&sk->sk_callback_lock) (without blocking BH)
<BH>
spin_lock(&sk->sk_slock.slock);
...
read_lock(&sk->sk_callback_lock);
...

(B)
write_lock_bh(&sk->sk_callback_lock)
stuff
write_unlock_bh(&sk->sk_callback_lock)

(C)
spin_lock_bh(&sk->sk_slock)
...
write_lock_bh(&sk->sk_callback_lock)
stuff
write_unlock_bh(&sk->sk_callback_lock)
spin_unlock_bh(&sk->sk_slock)

This (C) case conflicts with (A) :

CPU1 [A]                         CPU2 [C]
read_lock(callback_lock)
<BH>                             spin_lock_bh(slock)
<wait to spin_lock(slock)>
                                 <wait to write_lock_bh(callback_lock)>

We have one problematic (C) use case in inet_csk_listen_stop() :

local_bh_disable();
bh_lock_sock(child); // spin_lock_bh(&sk->sk_slock)
WARN_ON(sock_owned_by_user(child));
...
sock_orphan(child); // write_lock_bh(&sk->sk_callback_lock)

lockdep is not happy with this, as reported by Tetsuo Handa

It seems only way to deal with this is to use read_lock_bh(callbacklock)
everywhere.

Thanks to Jarek for pointing a bug in my first attempt and suggesting
this solution.
Reported-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
CC: Jarek Poplawski <jarkao2@gmail.com>
Tested-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f064af1e

mac80211: fix use-after-free · cd87a2d3

由 Johannes Berg 提交于 9月 24, 2010

commit 8c0c709e
Author: Johannes Berg <johannes@sipsolutions.net>
Date:   Wed Nov 25 17:46:15 2009 +0100

    mac80211: move cmntr flag out of rx flags

moved the CMTR flag into the skb's status, and
in doing so introduced a use-after-free -- when
the skb has been handed to cooked monitors the
status setting will touch now invalid memory.

Additionally, moving it there has effectively
discarded the optimisation -- since the bit is
only ever set on freed SKBs, and those were a
copy, it could never be checked.

For the current release, fixing this properly
is a bit too involved, so let's just remove the
problematic code and leave userspace with one
copy of each frame for each virtual interface.

Cc: stable@kernel.org [2.6.33+]
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

cd87a2d3

23 9月, 2010 7 次提交

xfrm4: strip ECN bits from tos field · 94e22389

由 Ulrich Weber 提交于 9月 22, 2010

otherwise ECT(1) bit will get interpreted as RTO_ONLINK
and routing will fail with XfrmOutBundleGenError.
Signed-off-by: NUlrich Weber <uweber@astaro.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

94e22389

netfilter: nf_conntrack_defrag: check socket type before touching nodefrag flag · cbdd769a

由 Jiri Olsa 提交于 9月 21, 2010

we need to check proper socket type within ipv4_conntrack_defrag
function before referencing the nodefrag flag.

For example the tun driver receive path produces skbs with
AF_UNSPEC socket type, and so current code is causing unwanted
fragmented packets going out.
Signed-off-by: NJiri Olsa <jolsa@redhat.com>
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cbdd769a

netfilter: nf_nat_snmp: fix checksum calculation (v4) · d6120b8a

由 Patrick McHardy 提交于 9月 21, 2010

Fix checksum calculation in nf_nat_snmp_basic.

Based on patches by Clark Wang <wtweeker@163.com> and
Stephen Hemminger <shemminger@vyatta.com>.

https://bugzilla.kernel.org/show_bug.cgi?id=17622Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d6120b8a

netfilter: fix a race in nf_ct_ext_create() · 15cdeada

由 Eric Dumazet 提交于 9月 21, 2010

As soon as rcu_read_unlock() is called, there is no guarantee current
thread can safely derefence t pointer, rcu protected.

Fix is to copy t->alloc_size in a temporary variable.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

15cdeada

netfilter: fix ipt_REJECT TCP RST routing for indev == outdev · b46ffb85

由 Changli Gao 提交于 9月 21, 2010

ip_route_me_harder can't create the route cache when the outdev is the same
with the indev for the skbs whichout a valid protocol set.

__mkroute_input functions has this check:
1998         if (skb->protocol != htons(ETH_P_IP)) {
1999                 /* Not IP (i.e. ARP). Do not create route, if it is
2000                  * invalid for proxy arp. DNAT routes are always valid.
2001                  *
2002                  * Proxy arp feature have been extended to allow, ARP
2003                  * replies back to the same interface, to support
2004                  * Private VLAN switch technologies. See arp.c.
2005                  */
2006                 if (out_dev == in_dev &&
2007                     IN_DEV_PROXY_ARP_PVLAN(in_dev) == 0) {
2008                         err = -EINVAL;
2009                         goto cleanup;
2010                 }
2011         }

This patch gives the new skb a valid protocol to bypass this check. In order
to make ipt_REJECT work with bridges, you also need to enable ip_forward.

This patch also fixes a regression. When we used skb_copy_expand(), we
didn't have this issue stated above, as the protocol was properly set.
Signed-off-by: NChangli Gao <xiaosuo@gmail.com>
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b46ffb85

netfilter: nf_ct_sip: default to NF_ACCEPT in sip_help_tcp() · 7874896a

由 Simon Horman 提交于 9月 21, 2010

I initially noticed this because of the compiler warning below, but it
does seem to be a valid concern in the case where ct_sip_get_header()
returns 0 in the first iteration of the while loop.

net/netfilter/nf_conntrack_sip.c: In function 'sip_help_tcp':
net/netfilter/nf_conntrack_sip.c:1379: warning: 'ret' may be used uninitialized in this function
Signed-off-by: NSimon Horman <horms@verge.net.au>
[Patrick: changed NF_DROP to NF_ACCEPT]
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7874896a

netfilter: tproxy: nf_tproxy_assign_sock() can handle tw sockets · d485d500

由 Eric Dumazet 提交于 9月 21, 2010

transparent field of a socket is either inet_twsk(sk)->tw_transparent
for timewait sockets, or inet_sk(sk)->transparent for other sockets
(TCP/UDP).
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d485d500

22 9月, 2010 1 次提交

ip: fix truesize mismatch in ip fragmentation · 3d13008e

由 Eric Dumazet 提交于 9月 21, 2010

Special care should be taken when slow path is hit in ip_fragment() :

When walking through frags, we transfert truesize ownership from skb to
frags. Then if we hit a slow_path condition, we must undo this or risk
uncharging frags->truesize twice, and in the end, having negative socket
sk_wmem_alloc counter, or even freeing socket sooner than expected.

Many thanks to Nick Bowler, who provided a very clean bug report and
test program.

Thanks to Jarek for reviewing my first patch and providing a V2

While Nick bisection pointed to commit 2b85a34e (net: No more
expensive sock_hold()/sock_put() on each tx), underlying bug is older
(2.6.12-rc5)

A side effect is to extend work done in commit b2722b1c
(ip_fragment: also adjust skb->truesize for packets not owned by a
socket) to ipv6 as well.
Reported-and-bisected-by: NNick Bowler <nbowler@elliptictech.com>
Tested-by: NNick Bowler <nbowler@elliptictech.com>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
CC: Jarek Poplawski <jarkao2@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3d13008e

21 9月, 2010 1 次提交

tcp: Fix race in tcp_poll · a4d25803

由 Tom Marshall 提交于 9月 20, 2010

If a RST comes in immediately after checking sk->sk_err, tcp_poll will
return POLLIN but not POLLOUT.  Fix this by checking sk->sk_err at the end
of tcp_poll.  Additionally, ensure the correct order of operations on SMP
machines with memory barriers.
Signed-off-by: NTom Marshall <tdm.code@gmail.com>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a4d25803