Commits · 93ccb3910ae3dbff6d224aecd22d8eece3d70ce9 · openeuler / raspberrypi-kernel

11 Jan, 2013 1 commit

nfs: fix sunrpc/clnt.c kernel-doc warnings · 7144bca6

Committed by Randy Dunlap on Jan 09, 2013

Fix new kernel-doc warnings in clnt.c:

  Warning(net/sunrpc/clnt.c:561): No description found for parameter 'flavor'
  Warning(net/sunrpc/clnt.c:561): Excess function parameter 'auth' description in 'rpc_clone_client_set_auth'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7144bca6

09 Jan, 2013 1 commit

SUNRPC: Ensure we release the socket write lock if the rpc_task exits early · 87ed5003

Committed by Trond Myklebust on Jan 07, 2013

If the rpc_task exits while holding the socket write lock before it has
allocated an rpc slot, then the usual mechanism for releasing the write
lock in xprt_release() is defeated.

The problem occurs if the call to xprt_lock_write() initially fails, so
that the rpc_task is put on the xprt->sending wait queue. If the task
exits after being assigned the lock by __xprt_lock_write_func, but
before it has retried the call to xprt_lock_and_alloc_slot(), then
it calls xprt_release() while holding the write lock, but will
immediately exit due to the test for task->tk_rqstp != NULL.
Reported-by: Chris Perl <chris.perl@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [>= 3.1]

87ed5003

08 Jan, 2013 2 commits

s390/irq: remove split irq fields from /proc/stat · 420f42ec

Committed by Heiko Carstens on Jan 02, 2013

Now that irq sum accounting for /proc/stat's "intr" line works again we
have the oddity that the sum field (first field) contains only the sum
of the second (external irqs) and third field (I/O interrupts).
The reason is that these two fields are already sums of all the other
fields, so summing up everything would count every interrupt
twice.
This has been broken since the split interrupt accounting was merged two
years ago: 052ff461 "[S390] irq: have detailed
statistics for interrupt types".
To fix this remove the split interrupt fields from /proc/stat's "intr"
line again and only have them in /proc/interrupts.

This restores the old behaviour, seems to be the only sane fix and mimics
a behaviour from other architectures where /proc/interrupts also contains
more than /proc/stat's "intr" line does.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

420f42ec
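The double counting described in 420f42ec can be illustrated with a small user-space sketch. The struct and function names below are hypothetical and are not the s390 accounting code; the only assumption is that the external and I/O totals are each already the sum of their per-type counters.

```c
#include <stddef.h>

/* Hypothetical model: two aggregate counters plus the per-type
 * counters that the aggregates already include. */
struct irq_counts {
    unsigned long ext_total;      /* sum of ext_detail[] */
    unsigned long io_total;       /* sum of io_detail[] */
    unsigned long ext_detail[2];
    unsigned long io_detail[2];
};

/* Correct "sum" field: add only the two aggregates. */
static inline unsigned long irq_sum_aggregates(const struct irq_counts *c)
{
    return c->ext_total + c->io_total;
}

/* Naive sum over every field: each interrupt is counted twice. */
static inline unsigned long irq_sum_all_fields(const struct irq_counts *c)
{
    unsigned long sum = c->ext_total + c->io_total;
    size_t i;

    for (i = 0; i < 2; i++)
        sum += c->ext_detail[i] + c->io_detail[i];
    return sum;
}
```

With 5 external and 7 I/O interrupts, the first helper reports 12 while the naive one reports 24, which is why keeping the detailed fields out of the "intr" line avoids the ambiguity.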

sctp: fix Kconfig bug in default cookie hmac selection · 36a25de2

Committed by Alex Elder on Jan 07, 2013

Commit 0d0863b0 ("sctp: Change defaults on cookie hmac selection")
added a "choice" to the sctp Kconfig file.  It introduced a bug which
led to an infinite loop while running "make oldconfig".

The problem is that the wrong symbol was defined as the default value
for the choice.  Using the correct value gets rid of the infinite loop.

Note:  if CONFIG_SCTP_COOKIE_HMAC_SHA1=y was present in the input
config file, both that and CONFIG_SCTP_COOKIE_HMAC_MD5=y will be present
in the generated config file.
Signed-off-by: Alex Elder <elder@inktank.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

36a25de2

07 Jan, 2013 1 commit

ipv4: fix NULL checking in devinet_ioctl() · c7e2e1d7

Committed by Xi Wang on Jan 05, 2013

The NULL pointer check `!ifa' should come before its first use.

[ Bug origin : commit fd23c3b3
  (ipv4: Add hash table of interface addresses) in linux-2.6.39 ]
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c7e2e1d7

05 Jan, 2013 7 commits

net/ipv4/ipconfig: really display the BOOTP/DHCP server's address. · 9dd4a13a

Committed by Philippe De Muyter on Jan 03, 2013

Up to now, the debug and info messages from the ipconfig subsystem
claim to display the IP address of the DHCP/BOOTP server but
instead display the IP address of the bootserver.  Fix that.
Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Signed-off-by: David S. Miller <davem@davemloft.net>

9dd4a13a

mac802154: fix NOHZ local_softirq_pending 08 warning · 5ff3fec6

Committed by Alexander Aring on Jan 02, 2013

When using nanosleep() in a userspace application, we get the
rate-limited warning

NOHZ: local_softirq_pending 08

10 times.

This patch replaces netif_rx() with netif_rx_ni(), which has
to be used from process/softirq context.
This code path is called in process/softirq context from the fakelb driver.

See linux-kernel commit 481a8199 for a similar fix.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5ff3fec6

netfilter: xt_recent: avoid high order page allocations · 2727de76

Committed by Eric Dumazet on Jan 03, 2013

xt_recent can try high order page allocations and this can fail.

iptables: page allocation failure: order:9, mode:0xc0d0

It also wastes about half the allocated space because of kmalloc()
power-of-two roundups and struct recent_table layout.

Use vmalloc() instead to save space and be less prone to allocation
errors when memory is fragmented.
Reported-by: Miroslav Kratochvil <exa.exa@gmail.com>
Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Harald Reindl <h.reindl@thelounge.net>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

2727de76
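The space argument in 2727de76 can be made concrete with a simplified model. The sketch below assumes that a large kmalloc() request is served from the next power-of-two size class; the helper names are hypothetical and this is not the allocator's actual code.

```c
#include <stddef.h>

/* Round n up to the next power of two (simplified model of a
 * power-of-two size class; assumes n is well below SIZE_MAX / 2). */
static inline size_t roundup_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/* Bytes allocated but never used by the caller under this model. */
static inline size_t wasted_bytes(size_t request)
{
    return roundup_pow2(request) - request;
}
```

A request just over a power of two, such as 4097 bytes, would occupy 8192 bytes under this model, so nearly half of the allocation is unused. vmalloc() allocates in whole pages instead, so the rounding loss is bounded by one page.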

netfilter: fix missing dependencies for the NOTRACK target · 757ae316

Committed by Pablo Neira Ayuso on Jan 02, 2013

warning: (NETFILTER_XT_TARGET_NOTRACK) selects NETFILTER_XT_TARGET_CT which has unmet direct dependencies (NET && INET && NETFILTER && NETFILTER_XTABLES && NF_CONNTRACK && (IP_NF_RAW || IP6_NF_RAW) && NETFILTER_ADVANCED)
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

757ae316

netfilter: ip6t_NPT: fix IPv6 NTP checksum calculation · 429da4c0

Committed by Ulrich Weber on Jan 02, 2013

csum16_add() has broken carry detection; it should be:
sum += sum < (__force u16)b;

Instead of fixing csum16_add, remove the custom checksum
functions and use the generic csum_add/csum_sub ones.
Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

429da4c0
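The carry rule quoted in 429da4c0 (`sum += sum < b`) is the end-around carry of one's-complement addition. The following user-space sketch shows that rule next to a 32-bit reference; it uses plain `uint16_t` instead of the kernel's `__sum16`, and both function names are illustrative rather than the code that the commit removed.

```c
#include <stdint.h>

/* One's-complement 16-bit addition with end-around carry. */
static inline uint16_t csum16_add_sketch(uint16_t a, uint16_t b)
{
    uint16_t sum = (uint16_t)(a + b);   /* truncate to 16 bits */

    sum += sum < b;                     /* wrapped around: add the carry back */
    return sum;
}

/* Reference: accumulate in 32 bits, then fold the carry into the low half. */
static inline uint16_t csum16_add_ref(uint16_t a, uint16_t b)
{
    uint32_t sum = (uint32_t)a + b;

    return (uint16_t)((sum & 0xffffu) + (sum >> 16));
}
```

If the 16-bit result is smaller than either operand, the addition wrapped, so comparing the truncated sum against `b` detects the carry reliably.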

SUNRPC: Partial revert of commit · 360e1a53

Committed by Trond Myklebust on Jan 04, 2013

Partially revert commit (SUNRPC: add WARN_ON_ONCE for potential deadlock).
The looping behaviour has been tracked down to a known issue with
workqueues, and a workaround has now been implemented.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Weston Andros Adamson <dros@netapp.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Bruce Fields <bfields@fieldses.org>
Cc: stable@vger.kernel.org [>= 3.7]

360e1a53

SUNRPC: Ensure that we free the rpc_task after cleanups are done · c6567ed1

Committed by Trond Myklebust on Jan 04, 2013

This patch ensures that we free the rpc_task after the cleanup callbacks
are done in order to avoid a deadlock problem that can be triggered if
the callback needs to wait for another workqueue item to complete.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Weston Andros Adamson <dros@netapp.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Bruce Fields <bfields@fieldses.org>
Cc: stable@vger.kernel.org

c6567ed1

03 Jan, 2013 2 commits

bridge: add empty br_mdb_init() and br_mdb_uninit() definitions. · fdb184d1

Committed by Rami Rosen on Jan 03, 2013

This patch adds empty br_mdb_init() and br_mdb_uninit() definitions in
br_private.h to avoid a build failure when CONFIG_BRIDGE_IGMP_SNOOPING is not set.
These functions were moved from br_multicast.c to br_netlink.c by
commit 3ec8e9f0.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fdb184d1
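The pattern used in fdb184d1, empty inline stubs for a feature that is compiled out, looks roughly like the sketch below. The macro `CONFIG_SKETCH_SNOOPING` and the `sketch_*` names are hypothetical stand-ins, not the contents of br_private.h.

```c
/* Define CONFIG_SKETCH_SNOOPING to select the real implementation,
 * which would then have to be provided by another source file. */
#ifdef CONFIG_SKETCH_SNOOPING
void sketch_mdb_init(void);
void sketch_mdb_uninit(void);
#else
/* Feature disabled: empty stubs keep the callers free of #ifdefs. */
static inline void sketch_mdb_init(void) { }
static inline void sketch_mdb_uninit(void) { }
#endif

/* The caller builds unchanged in both configurations. */
static inline int sketch_module_init(void)
{
    sketch_mdb_init();
    return 0;
}

static inline void sketch_module_exit(void)
{
    sketch_mdb_uninit();
}
```

Because the stubs are `static inline` and empty, the compiler can drop the calls entirely when the option is off, while the build no longer fails on undefined symbols.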

bridge: Correctly unregister MDB rtnetlink handlers · 3ec8e9f0

Committed by Vlad Yasevich on Jan 02, 2013

Commit 63233159:
    bridge: Do not unregister all PF_BRIDGE rtnl operations
introduced a bug where a removal of a single bridge from a
multi-bridge system would remove MDB netlink handlers.
The handlers should only be removed once all bridges are gone, but
since we don't keep track of the number of bridge interfaces, it's
simpler to do it when the bridge module is unloaded.  To make it
consistent, move the registration code into module initialization
code path.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3ec8e9f0

28 Dec, 2012 4 commits

libceph: fix protocol feature mismatch failure path · 0fa6ebc6

Committed by Sage Weil on Dec 27, 2012

We should not set con->state to CLOSED here; that happens in
ceph_fault() in the caller, where it first asserts that the state
is not yet CLOSED.  Avoids a BUG when the features don't match.

Since fail_protocol() has become a trivial wrapper, replace
calls to it with direct calls to reset_connection().
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

0fa6ebc6

libceph: WARN, don't BUG on unexpected connection states · 122070a2

Committed by Alex Elder on Dec 26, 2012

A number of assertions in the ceph messenger are implemented with
BUG_ON(), killing the system if a connection's state doesn't match
what's expected.  At this point our state model is (evidently) not
well understood enough for these assertions to trigger a BUG().
Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...)
so we learn about these issues without killing the machine.

We now recognize that a connection fault can occur due to a socket
closure at any time, regardless of the state of the connection.  So
there is really nothing we can assert about the state of the
connection at that point, so eliminate that assertion.
Reported-by: Ugis <ugis22@gmail.com>
Tested-by: Ugis <ugis22@gmail.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

122070a2

libceph: always reset osds when kicking · e6d50f67

Committed by Alex Elder on Dec 26, 2012

When ceph_osdc_handle_map() is called to process a new osd map,
kick_requests() is called to ensure all affected requests are
updated if necessary to reflect changes in the osd map.  This
happens in two cases:  whenever an incremental map update is
processed; and when a full map update (or the last one if there is
more than one) gets processed.

In the former case, the kick_requests() call is followed immediately
by a call to reset_changed_osds() to ensure any connections to osds
affected by the map change are reset.  But for full map updates
this isn't done.

Both cases should be doing this osd reset.

Rather than duplicating the reset_changed_osds() call, move it into
the end of kick_requests().
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

e6d50f67

libceph: move linger requests sooner in kick_requests() · ab60b16d

Committed by Alex Elder on Dec 19, 2012

The kick_requests() function is called by ceph_osdc_handle_map()
when an osd map change has been indicated.  Its purpose is to
re-queue any request whose target osd is different from what it
was when it was originally sent.

It is structured as two loops, one for incomplete but registered
requests, and a second for handling completed linger requests.
As a special case, in the first loop if a request marked to linger
has not yet completed, it is moved from the request list to the
linger list.  This is a quick and dirty way to have the second
loop handle sending the request along with all the other linger
requests.

Because of the way it's done now, however, this quick and dirty
solution can result in these incomplete linger requests never
getting re-sent as desired.  The problem lies in the fact that
the second loop only arranges for a linger request to be sent
if it appears its target osd has changed.  This is the proper
handling for *completed* linger requests (it avoids issuing
the same linger request twice to the same osd).

But although the linger requests added to the list in the first loop
may have been sent, they have not yet completed, so they need to be
re-sent regardless of whether their target osd has changed.

The first required fix is we need to avoid calling __map_request()
on any incomplete linger request.  Otherwise the subsequent
__map_request() call in the second loop will find the target osd
has not changed and will therefore not re-send the request.

Second, we need to be sure that a sent but incomplete linger request
gets re-sent.  If the target osd is the same with the new osd map as
it was when the request was originally sent, this won't happen.
This can be fixed through careful handling when we move these
requests from the request list to the linger list, by unregistering
the request *before* it is registered as a linger request.  This
works because a side-effect of unregistering the request is to make
the request's r_osd pointer be NULL, and *that* will ensure the
second loop actually re-sends the linger request.

Processing of such a request is done at that point, so continue with
the next one once it's been moved.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

ab60b16d

27 Dec, 2012 10 commits

ipv6/ip6_gre: set transport header correctly · ae782bb1

Committed by Isaku Yamahata on Dec 24, 2012

ip6gre_xmit2() incorrectly sets the transport header to the inner payload
instead of the GRE header. It seems to have been copied and pasted from ipip.c.
Set the transport header to the GRE header.
(In the ipip case the transport header is the inner IP header, so that's
correct.)

Found by inspection. In practice the incorrect transport header
doesn't matter because the skb usually is sent to another net_device
or socket, so the transport header isn't referenced.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

ae782bb1

ipv4/ip_gre: set transport header correctly to gre header · 861aa6d5

Committed by Isaku Yamahata on Dec 24, 2012

ipgre_tunnel_xmit() incorrectly sets the transport header to the inner payload
instead of the GRE header. It seems to have been copied and pasted from ipip.c.
So set the transport header to the GRE header.
(In the ipip case the transport header is the inner IP header, so that's
correct.)

Found by inspection. In practice the incorrect transport header
doesn't matter because the skb usually is sent to another net_device
or socket, so the transport header isn't referenced.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

861aa6d5

IB/rds: suppress incompatible protocol when version is known · a4967598

Committed by Marciniszyn, Mike on Dec 21, 2012

Add an else to only print the incompatible protocol message
when the version hasn't been established.
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a4967598

IB/rds: Correct ib_api use with gs_dma_address/sg_dma_len · f2e9bd70

Committed by Marciniszyn, Mike on Dec 21, 2012

0b088e00 ("RDS: Use page_remainder_alloc() for recv bufs")
added uses of sg_dma_len() and sg_dma_address(). This makes
RDS DOA with the qib driver.

IB ULPs should use ib_sg_dma_len() and ib_sg_dma_address()
respectively, since some HCAs overload ib_sg_dma* operations.
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f2e9bd70

tcp: should drop incoming frames without ACK flag set · c3ae62af

Committed by Eric Dumazet on Dec 26, 2012

In commit 96e0bf4b (tcp: Discard segments that ack data not yet
sent) John Dykstra enforced a check against ack sequences.

In commit 354e4aa3 (tcp: RFC 5961 5.2 Blind Data Injection Attack
Mitigation) I added more safety tests.

But we missed the fact that these tests are not performed if the ACK bit is
not set.

RFC 793 3.9 mandates that TCP should drop a frame without the ACK flag set.

" fifth check the ACK field,
      if the ACK bit is off drop the segment and return"

Not doing so permits an attacker to only guess an acceptable sequence
number, evading stronger checks.

Many thanks to Zhiyun Qian for bringing this issue to our attention.

See:
http://web.eecs.umich.edu/~zhiyunq/pub/ccs12_TCP_sequence_number_inference.pdf
Reported-by: Zhiyun Qian <zhiyunq@umich.edu>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: John Dykstra <john.dykstra1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c3ae62af
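The RFC 793 "fifth check" quoted in c3ae62af reduces to a single flag test. The sketch below is a user-space illustration with hypothetical names, not the kernel's TCP input path; it assumes the segment has already passed the earlier sequence, RST and SYN checks.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_TH_ACK 0x10u  /* ACK bit in the TCP header flags byte */

/* RFC 793, section 3.9, fifth check: if the ACK bit is off,
 * drop the segment and return. */
static inline bool tcp_ack_check_passes(uint8_t flags)
{
    return (flags & SKETCH_TH_ACK) != 0;
}
```

A PSH-only segment (flags 0x08) fails the check and is dropped before any of the ACK sequence validation runs, while PSH|ACK (0x18) proceeds to those stronger tests.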

batman-adv: fix random jitter calculation · 143cdd8f

Committed by Akinobu Mita on Dec 26, 2012

batadv_iv_ogm_emit_send_time() attempts to calculate a random integer
in the range of 'orig_interval +- BATADV_JITTER' using the lines below.

        msecs = atomic_read(&bat_priv->orig_interval) - BATADV_JITTER;
        msecs += (random32() % 2 * BATADV_JITTER);

But it actually gets 'orig_interval' or 'orig_interval - BATADV_JITTER'
because '%' and '*' have the same precedence and associativity is
left-to-right.

This adds parentheses at the appropriate position so that it matches the
original intention.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Antonio Quartulli <ordex@autistici.org>
Cc: Marek Lindner <lindner_marek@yahoo.de>
Cc: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Cc: Antonio Quartulli <ordex@autistici.org>
Cc: b.a.t.m.a.n@lists.open-mesh.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

143cdd8f
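The precedence problem fixed in 143cdd8f can be reproduced in user space. In this sketch, `SKETCH_JITTER` is an assumed value standing in for BATADV_JITTER, and the random number is passed in as a parameter so that the result is deterministic; the function names are illustrative.

```c
#include <stdint.h>

#define SKETCH_JITTER 20u  /* assumed stand-in for BATADV_JITTER */

/* Buggy form: parsed as ((r % 2) * SKETCH_JITTER), so the added
 * offset is only ever 0 or SKETCH_JITTER. */
static inline uint32_t send_time_buggy(uint32_t interval, uint32_t r)
{
    uint32_t msecs = interval - SKETCH_JITTER;

    msecs += (r % 2 * SKETCH_JITTER);
    return msecs;
}

/* Fixed form: the offset spans 0 .. 2 * SKETCH_JITTER - 1. */
static inline uint32_t send_time_fixed(uint32_t interval, uint32_t r)
{
    uint32_t msecs = interval - SKETCH_JITTER;

    msecs += r % (2 * SKETCH_JITTER);
    return msecs;
}
```

For an interval of 1000, the buggy form can only return 980 or 1000, whereas the fixed form covers every value from 980 to 1019.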

netfilter: ctnetlink: fix leak in error path of ctnetlink_create_expect · 1310b955

Committed by Jesper Juhl on Dec 26, 2012

This patch fixes a leak in one of the error paths of
ctnetlink_create_expect if no helper and no timeout is specified.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

1310b955

netfilter: xt_hashlimit: fix namespace destroy path · 32263dd1

Committed by Vitaly E. Lavrov on Dec 24, 2012

recent_net_exit() is called before recent_mt_destroy() in the
destroy path of network namespaces. Make sure there are no entries
in the parent proc entry xt_recent before removing it.
Signed-off-by: Vitaly E. Lavrov <lve@guap.ru>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

32263dd1

netfilter: xt_recent: fix namespace destroy path · 665e205c

Committed by Vitaly E. Lavrov on Dec 24, 2012

recent_net_exit() is called before recent_mt_destroy() in the
destroy path of network namespaces. Make sure there are no entries
in the parent proc entry xt_recent before removing it.
Signed-off-by: Vitaly E. Lavrov <lve@guap.ru>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

665e205c

netfilter: xt_hashlimit: fix race that results in duplicated entries · 09181842

Committed by Pablo Neira Ayuso on Dec 24, 2012

Two packets may race to create the same entry in the hashtable, so
double-check whether this packet lost the race. This double check only
happens in the path of the packet that creates the hashtable for the
first time.

Note that, with this patch, no packet drops occur if the race happens.
Reported-by: Feng Gao <gfree.wind@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

09181842

25 Dec, 2012 1 commit

arp: fix a regression in arp_solicit() · cf0be880

Committed by Cong Wang on Dec 23, 2012

Sedat reported the following commit caused a regression:

commit 9650388b
Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Dec 21 07:32:10 2012 +0000

    ipv4: arp: fix a lockdep splat in arp_solicit

This is because the 6th parameter of arp_send() needs to be NULL
for the broadcast case; the above commit changed it to an all-zero
array by mistake.
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cf0be880

24 Dec, 2012 1 commit

netfilter: xt_CT: recover NOTRACK target support · 10db9069

Committed by Pablo Neira Ayuso on Dec 20, 2012

Florian Westphal reported that the removal of the NOTRACK target
(96550501 netfilter: remove xt_NOTRACK) is breaking some existing
setups.

That removal had been scheduled a long time ago, as
described in Documentation/feature-removal-schedule.txt

What:  xt_NOTRACK
Files: net/netfilter/xt_NOTRACK.c
When:  April 2011
Why:   Superseded by xt_CT

Still, people may not have noticed, or may have decided to stick to an
old iptables version. I agree with him that a more conservative
approach, adding a printk to warn users for some time, is less
aggressive.

Current iptables 1.4.16.3 already contains the aliasing support
that makes it point to the CT target, so upgrading would fix it.
Still, the policy so far has been to avoid pushing our users to
upgrade.

As a solution, this patch recovers the NOTRACK target inside the CT
target, and it now emits a warning.
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

10db9069

22 Dec, 2012 7 commits

net: sched: integer overflow fix · d2fe85da

Committed by Stefan Hasko on Dec 21, 2012

Fix an integer overflow in htb_dequeue().
Signed-off-by: Stefan Hasko <hasko.stevo@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d2fe85da

CONFIG_HOTPLUG removal from networking core · 8baf82b3

Committed by Greg KH on Dec 21, 2012

CONFIG_HOTPLUG is always enabled now, so remove the unused code that was
trying to be compiled out when this option was disabled, in the
networking core.

Cc: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

8baf82b3

bridge: call br_netpoll_disable in br_add_if · 9b1536c4

Committed by Gao feng on Dec 19, 2012

When netdev_set_master() fails in br_add_if(), we should
call br_netpoll_disable() to do some cleanup, such
as freeing the struct netpoll memory that was allocated
in br_netpoll_enable().
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Acked-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9b1536c4

ipv4: arp: fix a lockdep splat in arp_solicit() · 9650388b

Committed by Eric Dumazet on Dec 21, 2012

Yan Burman reported the following lockdep warning:

=============================================
[ INFO: possible recursive locking detected ]
3.7.0+ #24 Not tainted
---------------------------------------------
swapper/1/0 is trying to acquire lock:
  (&n->lock){++--..}, at: [<ffffffff8139f56e>] __neigh_event_send+0x2e/0x2f0

but task is already holding lock:
  (&n->lock){++--..}, at: [<ffffffff813f63f4>] arp_solicit+0x1d4/0x280

other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&n->lock);
   lock(&n->lock);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

4 locks held by swapper/1/0:
  #0:  (((&n->timer))){+.-...}, at: [<ffffffff8104b350>] call_timer_fn+0x0/0x1c0
  #1:  (&n->lock){++--..}, at: [<ffffffff813f63f4>] arp_solicit+0x1d4/0x280
  #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff81395400>] dev_queue_xmit+0x0/0x5d0
  #3:  (rcu_read_lock_bh){.+....}, at: [<ffffffff813cb41e>] ip_finish_output+0x13e/0x640

stack backtrace:
Pid: 0, comm: swapper/1 Not tainted 3.7.0+ #24
Call Trace:
  <IRQ>  [<ffffffff8108c7ac>] validate_chain+0xdcc/0x11f0
  [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30
  [<ffffffff81120565>] ? kmem_cache_free+0xe5/0x1c0
  [<ffffffff8108d570>] __lock_acquire+0x440/0xc30
  [<ffffffff813c3570>] ? inet_getpeer+0x40/0x600
  [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30
  [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0
  [<ffffffff8108ddf5>] lock_acquire+0x95/0x140
  [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0
  [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30
  [<ffffffff81448d4b>] _raw_write_lock_bh+0x3b/0x50
  [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0
  [<ffffffff8139f56e>] __neigh_event_send+0x2e/0x2f0
  [<ffffffff8139f99b>] neigh_resolve_output+0x16b/0x270
  [<ffffffff813cb62d>] ip_finish_output+0x34d/0x640
  [<ffffffff813cb41e>] ? ip_finish_output+0x13e/0x640
  [<ffffffffa046f146>] ? vxlan_xmit+0x556/0xbec [vxlan]
  [<ffffffff813cb9a0>] ip_output+0x80/0xf0
  [<ffffffff813ca368>] ip_local_out+0x28/0x80
  [<ffffffffa046f25a>] vxlan_xmit+0x66a/0xbec [vxlan]
  [<ffffffffa046f146>] ? vxlan_xmit+0x556/0xbec [vxlan]
  [<ffffffff81394a50>] ? skb_gso_segment+0x2b0/0x2b0
  [<ffffffff81449355>] ? _raw_spin_unlock_irqrestore+0x65/0x80
  [<ffffffff81394c57>] ? dev_queue_xmit_nit+0x207/0x270
  [<ffffffff813950c8>] dev_hard_start_xmit+0x298/0x5d0
  [<ffffffff813956f3>] dev_queue_xmit+0x2f3/0x5d0
  [<ffffffff81395400>] ? dev_hard_start_xmit+0x5d0/0x5d0
  [<ffffffff813f5788>] arp_xmit+0x58/0x60
  [<ffffffff813f59db>] arp_send+0x3b/0x40
  [<ffffffff813f6424>] arp_solicit+0x204/0x280
  [<ffffffff813a1a70>] ? neigh_add+0x310/0x310
  [<ffffffff8139f515>] neigh_probe+0x45/0x70
  [<ffffffff813a1c10>] neigh_timer_handler+0x1a0/0x2a0
  [<ffffffff8104b3cf>] call_timer_fn+0x7f/0x1c0
  [<ffffffff8104b350>] ? detach_if_pending+0x120/0x120
  [<ffffffff8104b748>] run_timer_softirq+0x238/0x2b0
  [<ffffffff813a1a70>] ? neigh_add+0x310/0x310
  [<ffffffff81043e51>] __do_softirq+0x101/0x280
  [<ffffffff814518cc>] call_softirq+0x1c/0x30
  [<ffffffff81003b65>] do_softirq+0x85/0xc0
  [<ffffffff81043a7e>] irq_exit+0x9e/0xc0
  [<ffffffff810264f8>] smp_apic_timer_interrupt+0x68/0xa0
  [<ffffffff8145122f>] apic_timer_interrupt+0x6f/0x80
  <EOI>  [<ffffffff8100a054>] ? mwait_idle+0xa4/0x1c0
  [<ffffffff8100a04b>] ? mwait_idle+0x9b/0x1c0
  [<ffffffff8100a6a9>] cpu_idle+0x89/0xe0
  [<ffffffff81441127>] start_secondary+0x1b2/0x1b6

The bug is in arp_solicit(), which releases the neigh lock only after arp_send().
In the case of vxlan, we eventually need to write-lock a neigh lock later.

It's a false positive, but we can get rid of it without lockdep
annotations.

We can instead use the neigh_ha_snapshot() helper.
Reported-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9650388b

net: devnet_rename_seq should be a seqcount · 30e6c9fa

Committed by Eric Dumazet on Dec 20, 2012

Using a seqlock for devnet_rename_seq is not a good idea,
as device_rename() can sleep.

As we hold RTNL, we don't need protection for writers,
and only need a seqcount so that readers can catch a change done
by a writer.

Bug added in commit c91f6df2 (sockopt: Change getsockopt() of
SO_BINDTODEVICE to return an interface name)
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

30e6c9fa
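The reader/writer split described in 30e6c9fa can be sketched in user space with C11 atomics. This is only an illustration of the sequence-counter idea, not the kernel's seqcount API: the struct and function names are hypothetical, writers are assumed to be serialized by an outer lock (the role RTNL plays in the commit), and the plain copy of the name does not model the kernel's memory barriers.

```c
#include <stdatomic.h>
#include <string.h>

#define NAME_LEN 16

struct named_dev {
    atomic_uint seq;          /* even: stable, odd: rename in progress */
    char name[NAME_LEN];
};

/* Writer side. No lock is taken here, so the writer may sleep;
 * callers must already be serialized against each other. */
static inline void dev_rename_sketch(struct named_dev *d, const char *new_name)
{
    size_t n = strlen(new_name);

    if (n >= NAME_LEN)
        n = NAME_LEN - 1;
    atomic_fetch_add(&d->seq, 1);     /* counter becomes odd */
    memcpy(d->name, new_name, n);
    d->name[n] = '\0';
    atomic_fetch_add(&d->seq, 1);     /* counter is even again */
}

/* Reader side: retry if a rename was in progress or completed
 * while the name was being copied. */
static inline void dev_read_name_sketch(struct named_dev *d, char out[NAME_LEN])
{
    unsigned int start;

    do {
        start = atomic_load(&d->seq);
        memcpy(out, d->name, NAME_LEN);
    } while ((start & 1u) || start != atomic_load(&d->seq));
}
```

Readers never block the writer; they only detect that a change happened and copy again, which is all that is needed when writers are already protected by another lock.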

ip_gre: fix possible use after free · f7e75ba1

Committed by Eric Dumazet on Dec 20, 2012

Once skb_realloc_headroom() is called, tiph might point to freed memory.

Cache tiph->ttl value before the reallocation, to avoid unexpected
behavior.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

f7e75ba1
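The fix in f7e75ba1 follows a general rule: copy what you need out of a buffer before reallocating it. The sketch below is a user-space analogue that uses realloc() in place of skb_realloc_headroom(); the struct and function names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

struct hdr_sketch {
    uint8_t ttl;
};

/* Grow *buf to new_len bytes and report the TTL stored at its start.
 * The TTL is read BEFORE the reallocation, because the old header
 * pointer may refer to freed memory afterwards. */
static inline int grow_and_get_ttl(unsigned char **buf, size_t new_len,
                                   uint8_t *ttl_out)
{
    const struct hdr_sketch *h = (const struct hdr_sketch *)*buf;
    uint8_t ttl = h->ttl;             /* cache the value first */
    unsigned char *grown = realloc(*buf, new_len);

    if (!grown)
        return -1;                    /* the old buffer is still valid */
    *buf = grown;                     /* h must not be used past this point */
    *ttl_out = ttl;
    return 0;
}
```

Reading `h->ttl` after the realloc() call might appear to work when the block is not moved, which is what makes this class of bug easy to miss in testing.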

ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionally · 412ed947

Committed by Isaku Yamahata on Dec 20, 2012

ipgre_tunnel_xmit() parses the network header as IP unconditionally.
But transmitted packets are not always IP packets. For example, such a packet
can be sent via a packet socket with sockaddr_ll.sll_protocol set.
So make the function check whether skb->protocol is IP.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
412ed947
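The guard described in 412ed947 amounts to checking the protocol field before interpreting any header bytes. A minimal user-space sketch follows; the struct, the choice of the TOS byte as the field being read, and the host-byte-order protocol value are assumptions made for illustration, not the ip_gre code.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PROTO_IPV4 0x0800u  /* EtherType for IPv4 */

struct pkt_sketch {
    uint16_t protocol;        /* EtherType, host byte order in this sketch */
    const uint8_t *net_hdr;   /* start of the network header */
    size_t len;               /* bytes available at net_hdr */
};

/* Return the IPv4 TOS byte, or the fallback when the packet is not
 * IPv4 or is too short to hold a minimal 20-byte IPv4 header. */
static inline uint8_t pkt_tos_or(const struct pkt_sketch *p, uint8_t fallback)
{
    if (p->protocol != SKETCH_PROTO_IPV4 || p->len < 20)
        return fallback;
    return p->net_hdr[1];     /* TOS is the second byte of the header */
}
```

Without the protocol test, the same bytes of an ARP or other non-IP frame would be misread as IP header fields.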

21 Dec, 2012 3 commits

libceph: register request before unregister linger · c89ce05e

Committed by Alex Elder on Dec 06, 2012

In kick_requests(), we need to register the request before we
unregister the linger request.  Otherwise the unregister will
reset the request's osd pointer to NULL.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

c89ce05e

libceph: don't use rb_init_node() in ceph_osdc_alloc_request() · a978fa20

Committed by Alex Elder on Dec 17, 2012

The red-black node in the ceph osd request structure is initialized
in ceph_osdc_alloc_request() using rb_init_node().  We do need to
initialize this, because in __unregister_request() we call
RB_EMPTY_NODE(), which expects the node it's checking to have
been initialized.  But rb_init_node() is apparently overkill, and
may in fact be on its way out.  So use RB_CLEAR_NODE() instead.

For a little more background, see this commit:
    4c199a93 "rbtree: empty nodes have no color"
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

a978fa20
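Why the node must be initialized before RB_EMPTY_NODE() is used (a978fa20) can be seen from a small model of the convention, in which an unlinked node is marked by pointing its parent field at itself. The names below are hypothetical user-space stand-ins modelled on RB_CLEAR_NODE() and RB_EMPTY_NODE(), not the kernel header.

```c
#include <stdint.h>

struct rb_node_sketch {
    uintptr_t parent_color;               /* parent pointer plus colour bit */
    struct rb_node_sketch *left, *right;
};

/* Mark the node as "not in any tree" by making it its own parent. */
static inline void node_clear(struct rb_node_sketch *n)
{
    n->parent_color = (uintptr_t)n;
}

/* A node counts as empty only if it carries that self-reference. */
static inline int node_is_empty(const struct rb_node_sketch *n)
{
    return n->parent_color == (uintptr_t)n;
}
```

A zero-filled node does not pass the emptiness test under this convention, so a structure that is merely zeroed at allocation would look as if it were still linked into a tree.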

libceph: init event->node in ceph_osdc_create_event() · 3ee5234d

Committed by Alex Elder on Dec 17, 2012

The red-black node in the ceph osd event structure is not
initialized in ceph_osdc_create_event().  Because this node can
be the subject of an RB_EMPTY_NODE() call later on, we should ensure
the node is initialized properly for that.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

3ee5234d