Commits · 345056af41feeda506a8993474b9cbb2c66bc9fb · openeuler / raspberrypi-kernel

28 Oct, 2009 2 commits

sfc: Set ip_summed correctly for page buffers passed to GRO · 345056af

Authored by Ben Hutchings on Oct 28, 2009

Page buffers containing packets with an incorrect checksum or using a
protocol not handled by hardware checksum offload were previously not
passed to LRO.  The conversion to GRO changed this, but did not set
the ip_summed value accordingly.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cnic: Fix L2CTX_STATUSB_NUM offset in context memory. · d0549382

Authored by Michael Chan on Oct 28, 2009

The BNX2_L2CTX_STATUSB_NUM definition needs to be changed to match
the recent firmware update:

commit 078b0735
bnx2: Update firmware to 5.0.0.j3.

Without the fix, bnx2 can crash intermittently in bnx2_rx_int() when
iSCSI is enabled.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

27 Oct, 2009 10 commits

sh_eth: Add asm/cacheflush.h · f568a926

Authored by Nobuhiro Iwamatsu on Oct 26, 2009

Include asm/cacheflush.h, because the declaration of __flush_purge_region
moved to asm/cacheflush.h.
Signed-off-by: Nobuhiro Iwamatsu <iwamatsu@nigauri.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

PPPoE: Fix flush/close races. · fb64bb56

Authored by Michal Ostrowski on Oct 26, 2009

Be more careful about the state of pointers during tear-down.
The "pppoe_dev" field can only be looked at safely while holding socket locks.
This subsequently allows for the flush_lock to be killed.

We depend on the PPPOX_CONNECTED state to tell us that those fields are
valid, so whoever clears that state (pppox_unbind_sock()) is responsible for
the dev_put() call.

We also have to ensure that we delete_item() on all sockets before they are
cleaned up.

The need for these changes was exposed by scenarios in which the namespace
bindings of ethernet devices change while PPPoE sessions are ongoing, which
resulted in oopses due to unusual socket connection termination paths.
Signed-off-by: Michal Ostrowski <mostrows@gmail.com>
Reviewed-by: Cyril Gorcunov <gorcunov@gmail.com>
Reported-by: Denys Fedoryschenko <denys@visp.net.lb>
Tested-by: Denys Fedoryschenko <denys@visp.net.lb>

e1000e: allow for swflag to be held over consecutive PHY accesses · 5ccdcecb

Authored by Bruce Allan on Oct 26, 2009

PCH-based parts (82577/82578) and some ICH8-based parts (82566) need to
hold the swflag (sw/fw/hw hardware semaphore) over consecutive PHY accesses
in order to perform sw-driven PHY configuration during initialization to
work around known hardware issues (see follow-on patch). This patch
provides new PHY read/write functions (and function pointers) that will
allow accessing the PHY registers assuming the swflag has already been
acquired. The actual PHY register access code has moved into helper
functions that are called with a flag indicating whether or not the swflag
has already been acquired and acquires/releases it if not.

The functions called from within the updated PHY access functions had to be
updated to assume the swflag was already acquired, and other functions that
called those functions were also updated to acquire/release the swflag.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5ccdcecb
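
The locked/unlocked split described in this commit can be modeled in user space. The sketch below is not the e1000e code: `swflag_held`, `phy_regs`, and every function name are hypothetical stand-ins, and the semaphore is reduced to a boolean so that the ownership rules are easy to follow.

```c
#include <stdbool.h>
#include <stdint.h>

#define PHY_REG_COUNT 32

static uint16_t phy_regs[PHY_REG_COUNT]; /* stand-in for the PHY register file */
static bool swflag_held;                 /* stand-in for the sw/fw/hw semaphore */

/* Take the semaphore; fails instead of spinning if it is already owned. */
static int acquire_swflag(void)
{
	if (swflag_held)
		return -1;
	swflag_held = true;
	return 0;
}

static void release_swflag(void)
{
	swflag_held = false;
}

/*
 * Shared helper. 'locked' says whether the caller already owns the
 * semaphore; if not, the helper acquires and releases it around the access.
 */
static int phy_access(unsigned int reg, uint16_t *val, bool write, bool locked)
{
	if (reg >= PHY_REG_COUNT)
		return -1;
	if (locked && !swflag_held)
		return -1;	/* caller claimed ownership it does not have */
	if (!locked && acquire_swflag() != 0)
		return -1;

	if (write)
		phy_regs[reg] = *val;
	else
		*val = phy_regs[reg];

	if (!locked)
		release_swflag();
	return 0;
}

/* Standalone accessors: take the semaphore for a single access. */
static int phy_read(unsigned int reg, uint16_t *val)
{
	return phy_access(reg, val, false, false);
}

static int phy_write(unsigned int reg, uint16_t val)
{
	return phy_access(reg, &val, true, false);
}

/* "_locked" accessors: the caller holds the semaphore across several accesses. */
static int phy_read_locked(unsigned int reg, uint16_t *val)
{
	return phy_access(reg, val, false, true);
}

static int phy_write_locked(unsigned int reg, uint16_t val)
{
	return phy_access(reg, &val, true, true);
}

/* Read-modify-write performed under a single acquisition of the semaphore. */
static int phy_set_bits(unsigned int reg, uint16_t bits)
{
	uint16_t v;
	int ret;

	if (acquire_swflag() != 0)
		return -1;
	ret = phy_read_locked(reg, &v);
	if (ret == 0)
		ret = phy_write_locked(reg, v | bits);
	release_swflag();
	return ret;
}
```

In this model, calling `phy_read()` while the flag is already held fails, which is the situation the `_locked` variants exist to avoid.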

e1000e: separate mutex usage between NVM and PHY/CSR register for ICHx/PCH · ca15df58

Authored by Bruce Allan on Oct 26, 2009

Accesses to NVM and PHY/CSR registers on ICHx/PCH-based parts are protected
from concurrent accesses with a mutex that is acquired when the access is
initiated and released when the access has completed. However, the two
types of accesses should not be protected by the same mutex because the
driver may have to access the NVM while already holding the mutex over
several consecutive PHY/CSR accesses, which would result in livelock.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1000e: 82577/82578 requires a different method to configure LPLU · fa2ce13c

Authored by Bruce Allan on Oct 26, 2009

Unlike previous ICHx-based parts, the PCH-based parts (82577/82578) require
LPLU (Low Power Link Up, or "reverse auto-negotiation") to be configured in
the PHY rather than the MAC.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1000e: increase swflag acquisition timeout for ICHx/PCH · 53ac5a88

Authored by Bruce Allan on Oct 26, 2009

Under some conditions (e.g. when AMT is enabled on the system), the driver
can take an extended period of time to acquire the sw/fw/hw hardware
semaphore used to protect against concurrent access to a shared resource
(e.g. PHY registers). This could leave the PHY registers improperly
configured, resulting in link issues.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1000e: clear PHY wakeup bit after LCD reset on 82577/82578 · db2932ec

Authored by Bruce Allan on Oct 26, 2009

Performing a dummy read of the PHY Wakeup Control (WUC) register clears the
wakeup enable bit set by a PHY reset.  If this bit remains set, link
problems may occur.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

igbvf: fix memory leak when ring size changed while interface down · 39305965

Authored by Alexander Duyck on Oct 26, 2009

This patch resolves a memory leak which occurs while changing the ring size
while the interface is down.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbe: fix memory leak when resizing rings while interface is down · 759884b4

Authored by Alexander Duyck on Oct 26, 2009

This patch resolves a memory leak that occurs when you resize the rings via
the ethtool -G option while the interface is down.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

igb: fix memory leak when setting ring size while interface is down · 6d9f4fc4

Authored by Alexander Duyck on Oct 26, 2009

Changing ring sizes while the interface was down was causing a double
allocation of the receive and transmit rings. This issue is amplified when
there are multiple rings enabled. To prevent this we need to add an
additional check which will just update the ring counts when the interface
is not up and skip the allocation steps.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6d9f4fc4
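
The check described above (update only the ring counts while the interface is down, and skip allocation) can be illustrated with a small user-space model. This is a sketch under assumptions, not the igb code: `struct adapter`, `set_ring_size()`, and the `live_allocs` counter are hypothetical names, and `calloc()` stands in for descriptor ring allocation.

```c
#include <stdbool.h>
#include <stdlib.h>

struct ring {
	void *desc;		/* descriptor memory, NULL while the interface is down */
	unsigned int count;	/* number of descriptors */
};

struct adapter {
	bool running;
	struct ring tx, rx;
};

static int live_allocs;		/* outstanding ring allocations, to make leaks visible */

static int ring_alloc(struct ring *r)
{
	r->desc = calloc(r->count, 16);
	if (!r->desc)
		return -1;
	live_allocs++;
	return 0;
}

static void ring_free(struct ring *r)
{
	if (r->desc) {
		free(r->desc);
		r->desc = NULL;
		live_allocs--;
	}
}

/* Bringing the interface up is the only place rings get allocated. */
static int adapter_open(struct adapter *a)
{
	if (ring_alloc(&a->tx))
		return -1;
	if (ring_alloc(&a->rx)) {
		ring_free(&a->tx);
		return -1;
	}
	a->running = true;
	return 0;
}

static void adapter_close(struct adapter *a)
{
	ring_free(&a->tx);
	ring_free(&a->rx);
	a->running = false;
}

static int set_ring_size(struct adapter *a, unsigned int tx, unsigned int rx)
{
	if (!a->running) {
		/* Interface down: record the new counts, allocate nothing. */
		a->tx.count = tx;
		a->rx.count = rx;
		return 0;
	}
	/* Interface up: tear down, resize, and bring it back up. */
	adapter_close(a);
	a->tx.count = tx;
	a->rx.count = rx;
	return adapter_open(a);
}
```

Because allocation happens only in `adapter_open()`, resizing a stopped interface cannot leave a second set of rings behind.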

24 Oct, 2009 4 commits

bonding: Modify hash transmit policies to use the packet's source MAC address · d3da6831

Authored by Jasper Spaans on Oct 23, 2009

Modify bonding hash transmit policies to use the source MAC address of
the packet instead of the MAC address configured for the bonding device.

The old situation conflicts with the documentation.
Signed-off-by: Jasper Spaans <spaans@fox-it.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d3da6831
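
To make the difference concrete, here is a minimal sketch of a layer 2 style hash (source MAC XOR destination MAC, modulo the slave count). The struct and function names are hypothetical, and hashing only the last byte of each address is an assumption made for brevity. `slave_count` must be non-zero.

```c
#include <stdint.h>

#define MAC_LEN 6

struct eth_header {
	uint8_t dst[MAC_LEN];
	uint8_t src[MAC_LEN];
};

/* New behaviour: hash on the MAC addresses carried in the frame itself. */
static unsigned int hash_l2_packet_src(const struct eth_header *h,
				       unsigned int slave_count)
{
	return (unsigned int)(h->src[MAC_LEN - 1] ^ h->dst[MAC_LEN - 1]) % slave_count;
}

/* Old behaviour: the bond's own MAC stood in for the source address. */
static unsigned int hash_l2_bond_mac(const uint8_t bond_mac[MAC_LEN],
				     const struct eth_header *h,
				     unsigned int slave_count)
{
	return (unsigned int)(bond_mac[MAC_LEN - 1] ^ h->dst[MAC_LEN - 1]) % slave_count;
}
```

The two functions differ only for frames whose source MAC is not the bond's own address, for example traffic forwarded on behalf of another host. In that case the old form effectively hashes on the destination alone.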

pktgen: Dont leak kernel memory · 66ed1e5e

Authored by Eric Dumazet on Oct 24, 2009

While playing with pktgen, I realized IP ID was not filled and a
random value was taken, possibly leaking 2 bytes of kernel memory.

We can use an increasing ID; this can help diagnostics anyway.

Also clear packet payload, instead of leaking kernel memory.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

DM9000: Fix revision ID for DM9000B · 62e20a62

Authored by Ben Dooks on Oct 24, 2009

The DM9000B revision ID is 0x1A, not 0x1B as set in the current
dm9000.h header.

Fix bug reported by Paolo Zebelloni.
Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: Simtec Linux Team <linux@simtec.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: fix Ethernet Hangup for RTL8110SC rev d · 05af2142

Authored by Simon Wunderlich on Oct 24, 2009

The 8110SC rev d chip on our board shows a regression which the 8110SB chip
did not have. When inbound traffic overflows the receive descriptor queue,
"holes" in the ring buffer may occur which lead to a hangup until the buffer
is filled again. The packets are then completely processed, but the ring
remains porous and no packets are processed until the next overflow. Setting
the interface down and up can fix the problem temporarily from userspace.

For some reason we don't know, this behaviour does not occur if the RxVlan
bit for hardware VLAN untagging is set. There is another "Work around for
AMD plateform" in the current code which checks the VLAN status
word in receive descriptors, but it never takes effect when hardware
VLAN support is enabled. We assume that this is a bug in the chip.

The following patch fixes the problem. Without the patch we could reproduce
the hang within minutes (given other devices also generating lots of
interrupts); with it, we could not reproduce the hang within a few days of
long term testing.

This version contains minor style adjustments and is sent with mutt which
will hopefully not destroy the formatting again.
Signed-off-by: Bernhard Schmidt <bernhard.schmidt@saxnet.de>
Signed-off-by: Simon Wunderlich <simon.wunderlich@saxnet.de>
Acked-by: Francois Romieu <romieu@zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

23 Oct, 2009 9 commits

ifb: should not use __dev_get_by_index() without locks · db519144

Authored by Eric Dumazet on Oct 20, 2009

At this point (ri_tasklet()), neither RTNL nor dev_base_lock is held, so
we must use dev_get_by_index() instead of __dev_get_by_index().
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: au1000_eth: add missing capability.h · bc36b428

Authored by Manuel Lauss on Oct 17, 2009

Fixes the following build failure:
  CC      drivers/net/au1000_eth.o
/drivers/net/au1000_eth.c: In function 'au1000_set_settings':
/drivers/net/au1000_eth.c:623: error: implicit declaration of function 'capable'
/drivers/net/au1000_eth.c:623: error: 'CAP_NET_ADMIN' undeclared (first use in this function)
/drivers/net/au1000_eth.c:623: error: (Each undeclared identifier is reported only once
/drivers/net/au1000_eth.c:623: error: for each function it appears in.
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

myri10ge: improve port type reporting in ethtool · 196f17eb

Authored by Brice Goglin on Oct 22, 2009

Improve the reporting of myri10ge port type in ethtool,
and update for new boards.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: use WARN() for the WARN_ON in commit · c62f4c45

Authored by Arjan van de Ven on Oct 22, 2009

Commit b6b39e8f (tcp: Try to catch MSG_PEEK bug) added a printk()
to the WARN_ON() that's in tcp.c. This patch changes this combination
to WARN(); the advantage of WARN() is that the printk message shows up
inside the message, so that kerneloops.org will collect the message.

In addition, this gets rid of an extra if() statement.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c62f4c45
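
The pattern change described in this commit can be sketched in user space. The macros below are simplified stand-ins for the kernel's `WARN_ON()` and `WARN()`, not their real definitions; `warn_report()` and both `check_*` functions are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

/* Print the location and the caller's message as one report. */
static int warn_report(const char *file, int line, const char *fmt, ...)
{
	va_list ap;

	fprintf(stderr, "WARNING: at %s:%d\n", file, line);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	return 1;
}

/* Both macros evaluate to 1 when the condition fired and 0 otherwise. */
#define WARN_ON(cond)   ((cond) ? warn_report(__FILE__, __LINE__, "") : 0)
#define WARN(cond, ...) ((cond) ? warn_report(__FILE__, __LINE__, __VA_ARGS__) : 0)

/* Before: message and warning are separate, tied together by an explicit if(). */
static int check_old_style(int seq, int expected)
{
	if (seq < expected) {
		fprintf(stderr, "bug: seq %d expected %d\n", seq, expected);
		(void)WARN_ON(1);
		return 1;
	}
	return 0;
}

/* After: WARN() tests the condition and carries the message itself. */
static int check_new_style(int seq, int expected)
{
	return WARN(seq < expected, "bug: seq %d expected %d\n", seq, expected);
}
```

In the second form the message is emitted as part of the warning report itself, which is the property the commit message relies on for kerneloops.org.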

e1000e: reset the PHY on 82577/82578 when going to Sx · 74eee2e8

Authored by Bruce Allan on Oct 22, 2009

The PHY on 82577/82578 parts needs a soft reset when transitioning to Sx
state in order for the PHY write which disables gigabit speed to take
effect. Gigabit speed must be disabled in order for the PHY writes to
registers on page 800 (the wakeup control registers) to work as expected;
otherwise the system might not wake via WoL.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

isdn: fix possible circular locking dependency · 2bd9af04

Authored by Xiaotian Feng on Oct 21, 2009

There's a circular locking dependency:

---> isdn_net_get_locked_lp
    --->lock &nd->queue_lock
    --->lock &nd->queue->xmit_lock
    .....................
    ---->unlock &nd->queue_lock

---> isdn_net_writebuf_skb (called with &nd->queue->xmit_lock locked)
    ---->isdn_net_inc_frame_cnt
         ---->isdn_net_device_busy
              ----> lock &nd->queue_lock

This will trigger lockdep warnings:

 =======================================================
 [ INFO: possible circular locking dependency detected ]
 2.6.32-rc4-testing #7
 -------------------------------------------------------
 ipppd/28379 is trying to acquire lock:
 (&netdev->queue_lock){......}, at: [<e62ad0fd>] isdn_net_device_busy+0x2c/0x74 [isdn]

 but task is already holding lock:
 (&netdev->local->xmit_lock){+.....}, at: [<e62aefc2>] isdn_net_write_super+0x3f/0x6e [isdn]

 which lock already depends on the new lock.
 .......

We don't need to hold nd->queue->xmit_lock to protect a single
isdn_net_lp_busy() call. Dropping it fixes the lockdep warnings above.
Reported-and-tested-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: Xiaotian Feng <xtfeng@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netxen: avoid undue board config check · 0dc6d9cb

Authored by Dhananjay Phadke on Oct 21, 2009

Old code assumed the board config version in the flash to be 1.
When tools change this version, the driver simply refuses to
attach. This is unnecessary since the driver does not have to
parse the board config structure directly (it is maintained by firmware).
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netxen: fix tx timeout handling on firmware hang · ff8a306d

Authored by Amit Kumar Salecha on Oct 21, 2009

Clear NX_RESETING bit in netxen_tx_timeout_task() so that
the firmware watchdog task can catch need_reset request
from tx timeout.
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netxen: fix i2c init · 8bee0a91

Authored by Dhananjay Phadke on Oct 21, 2009

Avoid resetting subsys ID in i2c block. Also remove duplicate
check for address translation error.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

22 Oct, 2009 1 commit

niu: VLAN_ETH_HLEN should be used to make sure that the whole MAC header was... · 845de8af

Authored by Joyce Yu on Oct 21, 2009

niu: VLAN_ETH_HLEN should be used to make sure that the whole MAC header was copied to the head buffer in the Vlan packets case
Signed-off-by: Joyce Yu <joyce.yu@sun.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

845de8af
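
For context, a plain Ethernet header is 14 bytes, while an 802.1Q-tagged frame carries a 4-byte tag that brings the MAC header to 18 bytes. The helper below is a hypothetical user-space illustration of that length check, not the niu code. The constant names are local to the sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HEADER_LEN       14		/* dst MAC + src MAC + EtherType */
#define VLAN_ETH_HEADER_LEN  18		/* the above plus a 4-byte 802.1Q tag */
#define ETHERTYPE_8021Q      0x8100

/*
 * Number of leading bytes that make up the MAC header, or 0 if the
 * buffer is too short to hold it.
 */
static size_t mac_header_len(const uint8_t *frame, size_t len)
{
	uint16_t type;
	size_t need;

	if (len < ETH_HEADER_LEN)
		return 0;
	type = (uint16_t)((frame[12] << 8) | frame[13]);
	need = (type == ETHERTYPE_8021Q) ? VLAN_ETH_HEADER_LEN : ETH_HEADER_LEN;
	return len >= need ? need : 0;
}
```

Copying only 14 bytes of a tagged frame into the head buffer would leave the last 4 bytes of the MAC header behind, which is the case the commit title describes.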

21 Oct, 2009 4 commits

KS8851: Fix ks8851_set_rx_mode() for IFF_MULTICAST · b6a71bfa

Authored by Ben Dooks on Oct 19, 2009

In ks8851_set_rx_mode() the case handling IFF_MULTICAST was also setting
the RXCR1_AE bit by accident. This meant that all unicast frames were
being accepted by the device. Remove RXCR1_AE from this case.

Note, RXCR1_AE was also masking a problem with setting the MAC address
properly, so needs to be applied after fixing the MAC write order.

Fixes a bug reported by Doong, Ping of Micrel. This version of the
patch avoids setting RXCR1_ME for all cases.
Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

KS8851: Fix MAC address write order · 160d0fad

Authored by Ben Dooks on Oct 19, 2009

The MAC address register was being written in the wrong order, so add
a new address macro to convert mac-address byte to register address and
a ks8851_wrreg8() function to write each byte without having to worry
about any difficult byte swapping.

Fixes a bug reported by Doong, Ping of Micrel.
Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

160d0fad
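
The approach described above (an address macro that maps each MAC byte to its register, plus a byte-wide write) might look roughly like the following model. The register base `0x10`, the reversed byte layout, and all names here are assumptions for illustration. The real driver's macro and its `ks8851_wrreg8()` talk to hardware over SPI, which this sketch replaces with an array.

```c
#include <stdint.h>

#define MAC_LEN       6
#define REG_SPACE     0x20
#define MAC_REG_BASE  0x10	/* assumed address of the lowest MAC register byte */

/* Assumed layout: MAC byte 0 lives at the highest address, byte 5 at the base. */
#define MAC_REG(i)    (MAC_REG_BASE + (MAC_LEN - 1) - (i))

static uint8_t regs[REG_SPACE];	/* stand-in for the device register file */

/* Byte-wide register write; ignores out-of-range addresses. */
static void wrreg8(unsigned int reg, uint8_t val)
{
	if (reg < REG_SPACE)
		regs[reg] = val;
}

/* Write the address one byte at a time, so no 16-bit swapping is needed. */
static void write_mac_addr(const uint8_t mac[MAC_LEN])
{
	for (unsigned int i = 0; i < MAC_LEN; i++)
		wrreg8(MAC_REG(i), mac[i]);
}
```

Keeping the byte-to-register mapping in one macro means the ordering decision is made in a single place rather than in each caller.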

KS8851: Add soft reset at probe time · 57dada68

Authored by Ben Dooks on Oct 19, 2009

Issue a full soft reset at probe time.

This was reported by Doong Ping of Micrel, but with no explanation of why
it is necessary or what bug it fixes. Add it as it does not seem to hurt
the current driver and ensures that the device is in a known state when we
start setting it up.
Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fix section mismatch in fec.c · 78abcb13

Authored by Steven King on Oct 20, 2009

fec_enet_init is called by both fec_probe and fec_resume, so it
shouldn't be marked as __init.
Signed-off-by: Steven King <sfking@fdwdc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20 Oct, 2009 9 commits

net: Fix struct inet_timewait_sock bitfield annotation · abf90cca

Authored by Eric Dumazet on Oct 18, 2009

Commit 9e337b0f (net: annotate inet_timewait_sock bitfields)
added 4/8 bytes to struct inet_timewait_sock.

Fix this by declaring tw_ipv6_offset in the 'flags' bitfield.
The 14-bit hole is named tw_pad to make it clearly apparent.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: Try to catch MSG_PEEK bug · b6b39e8f

Authored by Herbert Xu on Oct 19, 2009

This patch tries to print out more information when we hit the
MSG_PEEK bug in tcp_recvmsg.  It's been around since at least
2005 and it's about time that we finally fix it.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Fix IP_MULTICAST_IF · 55b80503

Authored by Eric Dumazet on Oct 19, 2009

ipv4/ipv6 setsockopt(IP_MULTICAST_IF) have dubious __dev_get_by_index() calls.

This function should be called only with RTNL or dev_base_lock held; otherwise
a reader could see a corrupt hash chain and eventually enter an endless loop.

The fix is to call dev_get_by_index()/dev_put().

If this happens to be performance critical, we could define a new dev_exist_by_index()
function to avoid touching dev refcount.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

55b80503
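
The distinction between the two lookup styles can be modeled in user space. In the sketch below, `lookup_locked()` plays the role of `__dev_get_by_index()` (caller must hold the lock, no reference taken) and `lookup_get()` plays the role of `dev_get_by_index()` (locks internally and returns a counted reference). Everything here is a hypothetical stand-in: the table, the boolean "lock", and the function names are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEVS 8

struct net_dev {
	int ifindex;
	int refcnt;
};

static struct net_dev *dev_table[MAX_DEVS];
static bool table_locked;	/* stand-in for RTNL / dev_base_lock */

/* Unlocked-style lookup: only valid while the table lock is held. */
static struct net_dev *lookup_locked(int ifindex)
{
	assert(table_locked);
	for (size_t i = 0; i < MAX_DEVS; i++)
		if (dev_table[i] && dev_table[i]->ifindex == ifindex)
			return dev_table[i];
	return NULL;
}

/* Safe lookup: takes the lock itself and returns a counted reference. */
static struct net_dev *lookup_get(int ifindex)
{
	struct net_dev *dev;

	table_locked = true;
	dev = lookup_locked(ifindex);
	if (dev)
		dev->refcnt++;
	table_locked = false;
	return dev;
}

/* Drop a reference obtained from lookup_get(). */
static void put_dev(struct net_dev *dev)
{
	dev->refcnt--;
}

/* Existence check: get a reference, then drop it again immediately. */
static bool ifindex_exists(int ifindex)
{
	struct net_dev *dev = lookup_get(ifindex);

	if (!dev)
		return false;
	put_dev(dev);
	return true;
}
```

The get/put pair in `ifindex_exists()` is the refcount traffic that the proposed `dev_exist_by_index()` helper would avoid.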

bluetooth: static lock key fix · 45054dc1

Authored by Dave Young on Oct 18, 2009

When shutting down a ppp connection, lockdep warns about a non-static key.
This happens because the lock is not properly initialized at that point.

Fix this by adjusting the lock/skb_queue_head init order.

[   94.339261] INFO: trying to register non-static key.
[   94.342509] the code is fine but needs lockdep annotation.
[   94.342509] turning off the locking correctness validator.
[   94.342509] Pid: 0, comm: swapper Not tainted 2.6.31-mm1 #2
[   94.342509] Call Trace:
[   94.342509]  [<c0248fbe>] register_lock_class+0x58/0x241
[   94.342509]  [<c024b5df>] ? __lock_acquire+0xb57/0xb73
[   94.342509]  [<c024ab34>] __lock_acquire+0xac/0xb73
[   94.342509]  [<c024b7fa>] ? lock_release_non_nested+0x17b/0x1de
[   94.342509]  [<c024b662>] lock_acquire+0x67/0x84
[   94.342509]  [<c04cd1eb>] ? skb_dequeue+0x15/0x41
[   94.342509]  [<c054a857>] _spin_lock_irqsave+0x2f/0x3f
[   94.342509]  [<c04cd1eb>] ? skb_dequeue+0x15/0x41
[   94.342509]  [<c04cd1eb>] skb_dequeue+0x15/0x41
[   94.342509]  [<c054a648>] ? _read_unlock+0x1d/0x20
[   94.342509]  [<c04cd641>] skb_queue_purge+0x14/0x1b
[   94.342509]  [<fab94fdc>] l2cap_recv_frame+0xea1/0x115a [l2cap]
[   94.342509]  [<c024b5df>] ? __lock_acquire+0xb57/0xb73
[   94.342509]  [<c0249c04>] ? mark_lock+0x1e/0x1c7
[   94.342509]  [<f8364963>] ? hci_rx_task+0xd2/0x1bc [bluetooth]
[   94.342509]  [<fab95346>] l2cap_recv_acldata+0xb1/0x1c6 [l2cap]
[   94.342509]  [<f8364997>] hci_rx_task+0x106/0x1bc [bluetooth]
[   94.342509]  [<fab95295>] ? l2cap_recv_acldata+0x0/0x1c6 [l2cap]
[   94.342509]  [<c02302c4>] tasklet_action+0x69/0xc1
[   94.342509]  [<c022fbef>] __do_softirq+0x94/0x11e
[   94.342509]  [<c022fcaf>] do_softirq+0x36/0x5a
[   94.342509]  [<c022fe14>] irq_exit+0x35/0x68
[   94.342509]  [<c0204ced>] do_IRQ+0x72/0x89
[   94.342509]  [<c02038ee>] common_interrupt+0x2e/0x34
[   94.342509]  [<c024007b>] ? pm_qos_add_requirement+0x63/0x9d
[   94.342509]  [<c038e8a5>] ? acpi_idle_enter_bm+0x209/0x238
[   94.342509]  [<c049d238>] cpuidle_idle_call+0x5c/0x94
[   94.342509]  [<c02023f8>] cpu_idle+0x4e/0x6f
[   94.342509]  [<c0534153>] rest_init+0x53/0x55
[   94.342509]  [<c0781894>] start_kernel+0x2f0/0x2f5
[   94.342509]  [<c0781091>] i386_start_kernel+0x91/0x96
Reported-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Tested-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

bluetooth: scheduling while atomic bug fix · f74c77cb

Authored by Dave Young on Oct 18, 2009

Due to driver core changes, dev_set_drvdata() now calls kzalloc(), which may
sleep, but hci_conn_add() is called in atomic context.

As was done for dev_set_name(), move dev_set_drvdata() to the work queue function.

The resulting oops:

Oct 2 17:41:59 darkstar kernel: [ 438.001341] BUG: sleeping function called from invalid context at mm/slqb.c:1546
Oct 2 17:41:59 darkstar kernel: [ 438.001345] in_atomic(): 1, irqs_disabled(): 0, pid: 2133, name: sdptool
Oct 2 17:41:59 darkstar kernel: [ 438.001348] 2 locks held by sdptool/2133:
Oct 2 17:41:59 darkstar kernel: [ 438.001350] #0: (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.+.}, at: [<faa1d2f5>] lock_sock+0xa/0xc [l2cap]
Oct 2 17:41:59 darkstar kernel: [ 438.001360] #1: (&hdev->lock){+.-.+.}, at: [<faa20e16>] l2cap_sock_connect+0x103/0x26b [l2cap]
Oct 2 17:41:59 darkstar kernel: [ 438.001371] Pid: 2133, comm: sdptool Not tainted 2.6.31-mm1 #2
Oct 2 17:41:59 darkstar kernel: [ 438.001373] Call Trace:
Oct 2 17:41:59 darkstar kernel: [ 438.001381] [<c022433f>] __might_sleep+0xde/0xe5
Oct 2 17:41:59 darkstar kernel: [ 438.001386] [<c0298843>] __kmalloc+0x4a/0x15a
Oct 2 17:41:59 darkstar kernel: [ 438.001392] [<c03f0065>] ? kzalloc+0xb/0xd
Oct 2 17:41:59 darkstar kernel: [ 438.001396] [<c03f0065>] kzalloc+0xb/0xd
Oct 2 17:41:59 darkstar kernel: [ 438.001400] [<c03f04ff>] device_private_init+0x15/0x3d
Oct 2 17:41:59 darkstar kernel: [ 438.001405] [<c03f24c5>] dev_set_drvdata+0x18/0x26
Oct 2 17:41:59 darkstar kernel: [ 438.001414] [<fa51fff7>] hci_conn_init_sysfs+0x40/0xd9 [bluetooth]
Oct 2 17:41:59 darkstar kernel: [ 438.001422] [<fa51cdc0>] ? hci_conn_add+0x128/0x186 [bluetooth]
Oct 2 17:41:59 darkstar kernel: [ 438.001429] [<fa51ce0f>] hci_conn_add+0x177/0x186 [bluetooth]
Oct 2 17:41:59 darkstar kernel: [ 438.001437] [<fa51cf8a>] hci_connect+0x3c/0xfb [bluetooth]
Oct 2 17:41:59 darkstar kernel: [ 438.001442] [<faa20e87>] l2cap_sock_connect+0x174/0x26b [l2cap]
Oct 2 17:41:59 darkstar kernel: [ 438.001448] [<c04c8df5>] sys_connect+0x60/0x7a
Oct 2 17:41:59 darkstar kernel: [ 438.001453] [<c024b703>] ? lock_release_non_nested+0x84/0x1de
Oct 2 17:41:59 darkstar kernel: [ 438.001458] [<c028804b>] ? might_fault+0x47/0x81
Oct 2 17:41:59 darkstar kernel: [ 438.001462] [<c028804b>] ? might_fault+0x47/0x81
Oct 2 17:41:59 darkstar kernel: [ 438.001468] [<c033361f>] ? __copy_from_user_ll+0x11/0xce
Oct 2 17:41:59 darkstar kernel: [ 438.001472] [<c04c9419>] sys_socketcall+0x82/0x17b
Oct 2 17:41:59 darkstar kernel: [ 438.001477] [<c020329d>] syscall_call+0x7/0xb
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fix TCP_DEFER_ACCEPT retrans calculation · b103cf34

Authored by Julian Anastasov on Oct 19, 2009

Fix TCP_DEFER_ACCEPT conversion between seconds and
retransmission to match the TCP SYN-ACK retransmission periods
because the time is converted to such retransmissions. The old
algorithm selects one more retransmission in some cases. Allow
up to 255 retransmissions.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b103cf34
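
The conversion described above can be sketched as follows, assuming SYN-ACK retransmission periods that start at an initial timeout and double up to a cap. The parameter values used in the examples (3 seconds initial, 120 seconds maximum) are assumptions, and the code is modeled on the commit description rather than copied from the patch.

```c
#include <stdint.h>

/* Number of SYN-ACK transmissions whose back-off periods cover 'seconds'. */
static uint8_t secs_to_retrans(int seconds, int timeout, int rto_max)
{
	uint8_t res = 0;

	if (seconds > 0) {
		int period = timeout;	/* time covered so far */

		res = 1;
		while (seconds > period && res < 255) {
			res++;
			timeout <<= 1;	/* exponential back-off */
			if (timeout > rto_max)
				timeout = rto_max;
			period += timeout;
		}
	}
	return res;
}

/* Inverse: total seconds spanned by 'retrans' back-off periods. */
static int retrans_to_secs(uint8_t retrans, int timeout, int rto_max)
{
	int period = 0;

	if (retrans > 0) {
		period = timeout;
		while (--retrans) {
			timeout <<= 1;
			if (timeout > rto_max)
				timeout = rto_max;
			period += timeout;
		}
	}
	return period;
}
```

With a 3 second initial timeout the periods end at 3, 9, 21, ... seconds. A request of 4 to 9 seconds therefore maps to two retransmissions, and converting two retransmissions back yields 9 seconds.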

tcp: reduce SYN-ACK retrans for TCP_DEFER_ACCEPT · 0c3d79bc

Authored by Julian Anastasov on Oct 19, 2009

Change SYN-ACK retransmitting code for the TCP_DEFER_ACCEPT
users to not retransmit SYN-ACKs during the deferring period if
ACK from client was received. The goal is to reduce traffic
during the deferring period. When the period is finished
we continue with sending SYN-ACKs (at least one) but this time
any traffic from client will change the request to established
socket allowing application to terminate it properly.
Also, do not drop acked request if sending of SYN-ACK fails.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: accept socket after TCP_DEFER_ACCEPT period · d1b99ba4

Authored by Julian Anastasov on Oct 19, 2009

Willy Tarreau and many other folks in recent years
were concerned about what happens when the TCP_DEFER_ACCEPT period
expires for clients which sent ACK packet. They prefer clients
that actively resend ACK on our SYN-ACK retransmissions to be
converted from open requests to sockets and queued to the
listener for accepting after the deferring period is finished.
Then application server can decide to wait longer for data
or to properly terminate the connection with FIN if read()
returns EAGAIN which is an indication for accepting after
the deferring period. This change still can have side effects
for applications that expect always to see data on the accepted
socket. Others can be prepared to work in both modes (with or
without TCP_DEFER_ACCEPT period) and their data processing can
ignore the read=EAGAIN notification and allocate resources for
clients which proved to have no data to send during the deferring
period. OTOH, servers that use TCP_DEFER_ACCEPT=1 as flag (not
as a timeout) to wait for data will notice clients that didn't
send data for 3 seconds but that still resend ACKs.
Thanks to Willy Tarreau for the initial idea and to
Eric Dumazet for the review and testing the change.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "tcp: fix tcp_defer_accept to consider the timeout" · a1a2ad91

Authored by David S. Miller on Oct 19, 2009

This reverts commit 6d01a026.

Julian Anastasov, Willy Tarreau and Eric Dumazet have come up
with a more correct way to deal with this.
Signed-off-by: David S. Miller <davem@davemloft.net>

19 Oct, 2009 1 commit

AF_UNIX: Fix deadlock on connecting to shutdown socket · 77238f2b

Authored by Tomoki Sekiyama on Oct 18, 2009

I found a deadlock bug in UNIX domain sockets that allows non-root users
to mount a DoS attack against the local machine.

How to reproduce:
 1. Make a listening AF_UNIX/SOCK_STREAM socket in the abstract
    namespace(*), and shutdown(2) it.
 2. Repeatedly connect(2) to the listening socket from other sockets
    until the connection backlog is full.
 3. connect(2) takes the CPU forever. If every core is taken, the
    system hangs.

PoC code: (Run as many times as cores on SMP machines.)

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

int main(void)
{
	int ret;
	int csd;
	int lsd;
	struct sockaddr_un sun;

	/* make an abstract name address (*) */
	memset(&sun, 0, sizeof(sun));
	sun.sun_family = PF_UNIX;
	sprintf(&sun.sun_path[1], "%d", getpid());

	/* create the listening socket and shutdown */
	lsd = socket(AF_UNIX, SOCK_STREAM, 0);
	bind(lsd, (struct sockaddr *)&sun, sizeof(sun));
	listen(lsd, 1);
	shutdown(lsd, SHUT_RDWR);

	/* connect loop */
	alarm(15); /* forcibly exit the loop after 15 sec */
	for (;;) {
		csd = socket(AF_UNIX, SOCK_STREAM, 0);
		ret = connect(csd, (struct sockaddr *)&sun, sizeof(sun));
		if (-1 == ret) {
			perror("connect()");
			break;
		}
		puts("Connection OK");
	}
	return 0;
}

(*) Make sun_path[0] = 0 to use the abstract namespace.
    If a file-based socket is used, the system doesn't deadlock because
    of context switches in the file system layer.

Why this happens:
 Error checks between unix_socket_connect() and unix_wait_for_peer() are
 inconsistent. The former calls the latter to wait until the backlog is
 processed. Although the latter returns without doing anything when the
 socket is shut down, the former doesn't check the shutdown state and
 just retries calling the latter forever.

Patch:
 The patch below adds a shutdown check to unix_socket_connect(), so
 connect(2) to the shutdown socket will return -ECONNREFUSED.
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com>
Signed-off-by: Masanori Yoshida <masanori.yoshida.tv@hitachi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

77238f2b
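
The retry loop explained under "Why this happens" can be reduced to a small, bounded user-space model. This is a sketch of the control flow only: `struct listener`, `wait_for_peer()`, `model_connect()`, and the `SPIN_LIMIT_HIT` sentinel are hypothetical, and the spin limit exists solely so the buggy path terminates in the model.

```c
#include <errno.h>
#include <stdbool.h>

#define SPIN_LIMIT_HIT 1	/* sentinel: the model gave up; the real bug never did */

struct listener {
	bool shutdown;
	int backlog;
	int max_backlog;
};

/* Waits for backlog space, but returns immediately if the peer is shut down. */
static void wait_for_peer(struct listener *l)
{
	if (l->shutdown)
		return;
	if (l->backlog > 0)
		l->backlog--;	/* pretend the server accepted one connection */
}

/* 'check_shutdown' selects between the fixed and the original behaviour. */
static int model_connect(struct listener *l, bool check_shutdown, int max_spins)
{
	for (int spin = 0; spin < max_spins; spin++) {
		if (check_shutdown && l->shutdown)
			return -ECONNREFUSED;	/* the added check */
		if (l->backlog < l->max_backlog) {
			l->backlog++;
			return 0;
		}
		wait_for_peer(l);	/* backlog full: wait, then retry */
	}
	return SPIN_LIMIT_HIT;
}
```

Without the check, a listener that is shut down and has a full backlog makes no progress on any iteration. With it, the caller gets `-ECONNREFUSED` on the first pass.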