Commits · 7fda702f9315e6f4a74fee155c540750788a2d66 · openeuler / raspberrypi-kernel

17 Nov, 2016 · 5 commits

sctp: use new rhlist interface on sctp transport rhashtable · 7fda702f

Authored by Xin Long 8 years ago

Now sctp transport rhashtable uses hash(lport, dport, daddr) as the key
to hash a node to one chain. If in one host thousands of assocs connect
to one server with the same lport and different laddrs (although it's
not a normal case), all the transports would be hashed into the same
chain.

Inserting a new node may then keep returning -EBUSY because the chain is
too long, and since sctp retries the insertion in a loop, this could even
hang the system.

The new rhlist interface handles the case where many nodes in one chain
share the same key: it puts them into a list and makes that list a single
node of the chain.

This patch replaces the rhashtable_ interface with the rhltable_ interface.
Since a chain can no longer grow too long and inserting a node no longer
returns -EBUSY with this fix, the reinsert loop is also removed here.
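
As a rough illustration of the idea described above (a userspace sketch, not the kernel's rhltable implementation), entries that share a key can hang off a single chain node, so a bucket chain grows only with the number of distinct keys. All struct and function names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS 8

/* One entry (think: one transport). */
struct entry {
    unsigned int key;
    struct entry *next_same;   /* next entry with the same key */
    struct entry *next_chain;  /* next distinct key in the bucket; only meaningful on a list head */
};

struct table {
    struct entry *bucket[NBUCKETS];
};

/* Insert e. A duplicate key is pushed onto that key's list and takes over
 * the list-head slot in the chain, so the chain itself does not get longer. */
static void table_insert(struct table *t, struct entry *e)
{
    assert(t && e);
    struct entry **pp = &t->bucket[e->key % NBUCKETS];

    for (; *pp; pp = &(*pp)->next_chain) {
        if ((*pp)->key == e->key) {
            e->next_same = *pp;
            e->next_chain = (*pp)->next_chain;
            *pp = e;
            return;
        }
    }
    e->next_same = NULL;
    e->next_chain = NULL;
    *pp = e;
}

/* Number of distinct keys chained in the bucket that key hashes to. */
static size_t bucket_chain_length(const struct table *t, unsigned int key)
{
    size_t n = 0;
    for (const struct entry *e = t->bucket[key % NBUCKETS]; e; e = e->next_chain)
        n++;
    return n;
}

/* Number of entries stored under exactly this key. */
static size_t same_key_count(const struct table *t, unsigned int key)
{
    for (const struct entry *e = t->bucket[key % NBUCKETS]; e; e = e->next_chain) {
        if (e->key == key) {
            size_t n = 0;
            for (; e; e = e->next_same)
                n++;
            return n;
        }
    }
    return 0;
}
```

With a thousand entries inserted under one key, the chain length for that bucket stays at one while the per-key list holds all thousand entries, which is the property that lets the reinsert loop go away.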

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netpoll: more efficient locking · 89c4b442

Authored by Eric Dumazet 8 years ago

Callers of netpoll_poll_lock() own NAPI_STATE_SCHED.

Callers of netpoll_poll_unlock() have BH blocked between
NAPI_STATE_SCHED being cleared and poll_lock being released.

We can avoid the spinlock which has no contention, and use cmpxchg()
on poll_owner which we need to set anyway.
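
The ownership handoff described above can be sketched in userspace with C11 atomics. This is only an illustration of replacing a lock with a compare-and-exchange on an owner field; the struct, the function names, and the NO_OWNER sentinel are assumptions, and the kernel uses its own cmpxchg() primitive rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER (-1)  /* assumed sentinel meaning "nobody is polling" */

struct poll_ctx {
    atomic_int poll_owner;
};

static void poll_ctx_init(struct poll_ctx *p)
{
    atomic_init(&p->poll_owner, NO_OWNER);
}

/* Try to become the owner: succeeds only if the field still holds NO_OWNER.
 * The owner id has to be written anyway, so the compare-and-exchange does
 * the job of the lock and the assignment in one step. */
static bool poll_try_lock(struct poll_ctx *p, int cpu)
{
    int expected = NO_OWNER;
    return atomic_compare_exchange_strong(&p->poll_owner, &expected, cpu);
}

/* Release ownership; only the current owner may call this. */
static void poll_unlock(struct poll_ctx *p, int cpu)
{
    assert(atomic_load(&p->poll_owner) == cpu);
    atomic_store(&p->poll_owner, NO_OWNER);
}
```

A caller that must not fail would retry poll_try_lock() in a loop until it succeeds; whether and how to spin is left out of this sketch.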

This removes a possible lockdep violation after the cited commit,
since sk_busy_loop() re-enables BH before calling busy_poll_stop().

Fixes: 217f6974 ("net: busy-poll: allow preemption in sk_busy_loop()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: busy-poll: return busypolling status to drivers · 364b6055

Authored by Eric Dumazet 8 years ago

NAPI drivers call napi_complete_done() or napi_complete() once
they have drained the RX ring, right before re-enabling device interrupts.

In busy polling, we can avoid interrupts being delivered since
we are polling the RX ring in a controlled loop.

Drivers can choose to use the napi_complete_done() return value
to reduce interrupt overhead while busy polling is active.

This is optional; legacy drivers should work fine even
if not updated.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Adam Belay <abelay@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Yuval Mintz <Yuval.Mintz@cavium.com>
Cc: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: busy-poll: allow preemption in sk_busy_loop() · 217f6974

Authored by Eric Dumazet 8 years ago

After commit 4cd13c21 ("softirq: Let ksoftirqd do its job"),
sk_busy_loop() needs a bit of care:
softirqs might be delayed since we do not allow preemption yet.

This patch adds preemption points in sk_busy_loop(),
and makes sure no unnecessary cache line dirtying
or atomic operations are done while looping.

A new flag is added to napi->state: NAPI_STATE_IN_BUSY_POLL.

This prevents napi_complete_done() from clearing NAPIF_STATE_SCHED,
so that sk_busy_loop() does not have to grab it again.

Similarly, netpoll_poll_lock() is done only once.

This gives about a 10 to 20% improvement in various busy polling
tests, especially when many threads are busy polling in
configurations with a large number of NIC queues.

This should allow experimenting with bigger delays without
hurting overall latencies.

Tested:
 On a 40Gb mlx4 NIC, 32 RX/TX queues.

 echo 70 >/proc/sys/net/core/busy_read
 for i in `seq 1 40`; do echo -n $i: ; ./super_netperf $i -H lpaa24 -t UDP_RR -- -N -n; done

    Before:  After:
 1:   90072   92819
 2:  157289  184007
 3:  235772  213504
 4:  344074  357513
 5:  394755  458267
 6:  461151  487819
 7:  549116  625963
 8:  544423  716219
 9:  720460  738446
10:  794686  837612
11:  915998  923960
12:  937507  925107
13: 1019677  971506
14: 1046831 1113650
15: 1114154 1148902
16: 1105221 1179263
17: 1266552 1299585
18: 1258454 1383817
19: 1341453 1312194
20: 1363557 1488487
21: 1387979 1501004
22: 1417552 1601683
23: 1550049 1642002
24: 1568876 1601915
25: 1560239 1683607
26: 1640207 1745211
27: 1706540 1723574
28: 1638518 1722036
29: 1734309 1757447
30: 1782007 1855436
31: 1724806 1888539
32: 1717716 1944297
33: 1778716 1869118
34: 1805738 1983466
35: 1815694 2020758
36: 1893059 2035632
37: 1843406 2034653
38: 1888830 2086580
39: 1972827 2143567
40: 1877729 2181851

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Adam Belay <abelay@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Yuval Mintz <Yuval.Mintz@cavium.com>
Cc: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: sr: add option to control lwtunnel support · 46738b13

Authored by David Lebrun 8 years ago

This patch adds a new option CONFIG_IPV6_SEG6_LWTUNNEL to enable/disable
support of encapsulation with the lightweight tunnels. When this option
is enabled, CONFIG_LWTUNNEL is automatically selected.

Fix commit 6c8702c6 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")

Without a proper option to control lwtunnel support for SR-IPv6, if
CONFIG_LWTUNNEL=n then the IPv6 initialization fails as a consequence
of seg6_iptunnel_init() failure with EOPNOTSUPP:

NET: Registered protocol family 10
IPv6: Attempt to unregister permanent protocol 6
IPv6: Attempt to unregister permanent protocol 136
IPv6: Attempt to unregister permanent protocol 17
NET: Unregistered protocol family 10

Tested (compiling, booting, and loading ipv6 module when relevant)
with possible combinations of CONFIG_IPV6={y,m,n},
CONFIG_IPV6_SEG6_LWTUNNEL={y,n} and CONFIG_LWTUNNEL={y,n}.

Reported-by: Lorenzo Colitti <lorenzo@google.com>
Suggested-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 Nov, 2016 · 3 commits

tcp: allow to enable the repair mode for non-listening sockets · 319b0534

Authored by Andrey Vagin 8 years ago

The repair mode is used to get and restore sequence numbers and
data from queues. It is used to checkpoint/restore connections.

Currently the repair mode can be enabled for sockets in the established
and closed states, but for other states we have to dump the same socket
properties, so let's allow repair mode to be enabled for these sockets too.

The repair mode reveals nothing more for sockets in other states.

Signed-off-by: Andrei Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

dctcp: update cwnd on congestion event · 47805667

Authored by Florian Westphal 8 years ago

draft-ietf-tcpm-dctcp-02 says:

... when the sender receives an indication of congestion
(ECE), the sender SHOULD update cwnd as follows:

         cwnd = cwnd * (1 - DCTCP.Alpha / 2)

So, let's do this and reduce cwnd more smoothly (and faster), as per
the current congestion estimate.
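
The draft's formula can be sketched in integer arithmetic as follows. This is a minimal userspace illustration, assuming alpha is kept as a fixed-point value scaled by 1024 and that cwnd is floored at 2 segments; the function name, the scale constant, and the floor are assumptions made for the example, not details taken from this commit message.

```c
#include <assert.h>
#include <stdint.h>

#define ALPHA_SCALE 1024U  /* assumed fixed-point scale: 1024 means alpha == 1.0 */
#define MIN_CWND    2U     /* assumed lower bound, in segments */

/* cwnd = cwnd * (1 - alpha / 2), with alpha in units of 1/ALPHA_SCALE. */
static uint32_t dctcp_reduce_cwnd(uint32_t cwnd, uint32_t alpha)
{
    assert(alpha <= ALPHA_SCALE);

    /* cwnd * alpha / ALPHA_SCALE / 2, widened to 64 bits to avoid overflow */
    uint32_t cut = (uint32_t)(((uint64_t)cwnd * alpha) / (2U * ALPHA_SCALE));
    uint32_t reduced = cwnd - cut;

    return reduced < MIN_CWND ? MIN_CWND : reduced;
}
```

With alpha at its maximum the window is halved, as in classic TCP; with alpha at zero it is left untouched, and intermediate estimates give a proportionally smaller cut.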

Cc: Lawrence Brakmo <brakmo@fb.com>
Cc: Andrew Shewmaker <agshew@gmail.com>
Cc: Glenn Judd <glenn.judd@morganstanley.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

udplite: fix NULL pointer dereference · c915fe13

Authored by Paolo Abeni 8 years ago

The commit 850cbadd ("udp: use it's own memory accounting schema")
assumes that the socket proto has memory accounting enabled,
but this is not the case for UDPLITE.
Fix it by enabling memory accounting for UDPLITE and reclaiming
forward-allocated memory on socket shutdown.
UDP and UDPLITE now share the same memory accounting limits.
Also drop the backlog receive operation, since it is no longer needed.

Fixes: 850cbadd ("udp: use it's own memory accounting schema")
Reported-by: Andrei Vagin <avagin@gmail.com>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Nov, 2016 · 2 commits

sctp: change sk state only when it has assocs in sctp_shutdown · 5bf35ddf

Authored by Xin Long 8 years ago

Now when users shut down a sock with SEND_SHUTDOWN in sctp, even if
this sock has no connection (assoc), the sk state is changed to
SCTP_SS_CLOSING, which is not what we expect.

Besides, if users then try to listen on this sock, the kernel
could even panic when it dereferences sctp_sk(sk)->bind_hash in
sctp_inet_listen, as bind_hash is NULL when the sock has no assoc.

This patch moves the sk state change after the check that the sk's
assocs are not empty, and also merges these two if() conditions to
reduce the indent level.

Fixes: d46e416c ("sctp: sctp should change socket state when shutdown is received")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fix sleeping for sk_wait_event() · d9dc8b0f

Authored by WANG Cong 8 years ago

Similar to commit 14135f30 ("inet: fix sleeping inside inet_wait_for_connect()"),
sk_wait_event() needs to be fixed too: release_sock() can block, and
it changes the process state back to running after sleeping, which breaks
the preceding prepare_to_wait().

Switch to the new wait API.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

14 Nov, 2016 · 3 commits

netfilter: x_tables: simplify IS_ERR_OR_NULL to NULL test · eb1a6bdc

Authored by Julia Lawall 8 years ago

Since commit 7926dbfa ("netfilter: don't use
mutex_lock_interruptible()"), the function xt_find_table_lock can only
return NULL on an error.  Simplify the call sites and update the
comment before the function.

The semantic patch that changes the code is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression t,e;
@@

t = \(xt_find_table_lock(...)\|
      try_then_request_module(xt_find_table_lock(...),...)\)
... when != t=e
- ! IS_ERR_OR_NULL(t)
+ t

@@
expression t,e;
@@

t = \(xt_find_table_lock(...)\|
      try_then_request_module(xt_find_table_lock(...),...)\)
... when != t=e
- IS_ERR_OR_NULL(t)
+ !t

@@
expression t,e,e1;
@@

t = \(xt_find_table_lock(...)\|
      try_then_request_module(xt_find_table_lock(...),...)\)
... when != t=e
?- t ? PTR_ERR(t) : e1
+ e1
... when any

// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tcp: take care of truncations done by sk_filter() · ac6e7800

Authored by Eric Dumazet 8 years ago

With syzkaller's help, Marco Grassi found a bug in the TCP stack,
crashing in tcp_collapse().

The root cause is that sk_filter() can truncate the incoming skb,
but the TCP stack was not really expecting this to happen.
It was probably expecting a simple DROP or ACCEPT behavior.

We first need to make sure no part of the TCP header can be removed.
Then we need to adjust TCP_SKB_CB(skb)->end_seq.
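
The two steps can be sketched in userspace as follows. This is an illustration under stated assumptions, not the patch itself: struct segment is a hypothetical stand-in for an skb plus its TCP control block, and apply_filter_trim() is an invented helper that plays the role of a filter asking to keep only the first `keep` bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an skb and its TCP control block. */
struct segment {
    uint32_t len;      /* bytes currently in the buffer, TCP header included */
    uint32_t hdr_len;  /* TCP header length in bytes */
    uint32_t end_seq;  /* sequence number just past the last payload byte */
};

/* Apply a filter verdict that wants to keep only `keep` bytes.
 * Returns the number of bytes actually trimmed. */
static uint32_t apply_filter_trim(struct segment *seg, uint32_t keep)
{
    assert(seg && seg->hdr_len <= seg->len);
    uint32_t old_len = seg->len;

    if (keep < seg->hdr_len)   /* step 1: never cut into the TCP header */
        keep = seg->hdr_len;
    if (keep < seg->len)
        seg->len = keep;

    uint32_t trimmed = old_len - seg->len;
    seg->end_seq -= trimmed;   /* step 2: keep end_seq consistent with the shorter payload */
    return trimmed;
}
```

For a segment with a 20-byte header and 100 bytes of payload ending at sequence 1100, keeping 60 bytes leaves 40 bytes of payload and moves end_seq back to 1040.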

Many thanks to syzkaller team and Marco for giving us a reproducer.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Marco Grassi <marco.gra@gmail.com>
Reported-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: use new_gw for redirect neigh lookup · 969447f2

Authored by Stephen Suryaputra Lin 8 years ago

In v2.6, ip_rt_redirect() calls arp_bind_neighbour() which returns 0
and then the state of the neigh for the new_gw is checked. If the state
isn't valid then the redirected route is deleted. This behavior is
maintained up to v3.5.7 by check_peer_redirect() because rt->rt_gateway
is assigned to peer->redirect_learned.a4 before calling
ipv4_neigh_lookup().

After commit 5943634f ("ipv4: Maintain redirect and PMTU info in
struct rtable again."), ipv4_neigh_lookup() is performed without the
rt_gateway assigned to the new_gw. In the case when rt_gateway (old_gw)
isn't zero, the function uses it as the key. The neigh is most likely
valid since the old_gw is the one that sends the ICMP redirect message.
Then the new_gw is assigned to fib_nh_exception. The problem is: the
new_gw ARP may never get resolved and the traffic is blackholed.

So, use the new_gw for neigh lookup.

Changes from v1:
 - use __ipv4_neigh_lookup instead (per Eric Dumazet).

Fixes: 5943634f ("ipv4: Maintain redirect and PMTU info in struct rtable again.")
Signed-off-by: Stephen Suryaputra Lin <ssurya@ieee.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

13 Nov, 2016 · 10 commits

openvswitch: allow L3 netdev ports · 217ac77a

Authored by Jiri Benc 8 years ago

Allow ARPHRD_NONE interfaces to be added to ovs bridge.

Based on previous versions by Lorand Jakab and Simon Horman.

Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

openvswitch: add Ethernet push and pop actions · 91820da6

Authored by Jiri Benc 8 years ago

It's not allowed to push Ethernet header in front of another Ethernet
header.

It's not allowed to pop Ethernet header if there's a vlan tag. This
preserves the invariant that L3 packet never has a vlan tag.

Based on previous versions by Lorand Jakab and Simon Horman.

Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

openvswitch: netlink: support L3 packets · 0a6410fb

Authored by Jiri Benc 8 years ago

Extend the ovs flow netlink protocol to support L3 packets. Packets without
OVS_KEY_ATTR_ETHERNET attribute specify L3 packets; for those, the
OVS_KEY_ATTR_ETHERTYPE attribute is mandatory.

Push/pop vlan actions are only supported for Ethernet packets.

Based on previous versions by Lorand Jakab and Simon Horman.

Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

openvswitch: add processing of L3 packets · 5108bbad

Authored by Jiri Benc 8 years ago

Support receiving, extracting flow key and sending of L3 packets (packets
without an Ethernet header).

Note that even after this patch, non-Ethernet interfaces are still not
allowed to be added to bridges. Similarly, the netlink interface for sending and
receiving L3 packets to/from user space is not in place yet.

Based on previous versions by Lorand Jakab and Simon Horman.

Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

openvswitch: support MPLS push and pop for L3 packets · 1560a074

Authored by Jiri Benc 8 years ago

Update the Ethernet header only if there is one.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

openvswitch: pass mac_proto to ovs_vport_send · e2d9d835

Authored by Jiri Benc 8 years ago

We'll need it to alter packets sent to ARPHRD_NONE interfaces.

Change do_output() to use the actual L2 header size of the packet when
deciding on the minimum cutlen. The assumption here is that what matters is
not the output interface hard_header_len but rather the L2 header of the
particular packet. For example, ARPHRD_NONE tunnels that encapsulate
Ethernet should get at least the Ethernet header.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

openvswitch: add mac_proto field to the flow key · 329f45bc

Authored by Jiri Benc 8 years ago

Use a hole in the structure. We support only Ethernet so far and will add
support for L2-less packets shortly. We could use a bool to indicate
whether the Ethernet header is present or not but the approach with the
mac_proto field is more generic and occupies the same number of bytes in the
struct, while allowing later extensibility. It also makes the code in the
next patches more self-explanatory.

It would be nice to use ARPHRD_ constants but those are u16, which would be
a waste. Thus define our own constants.

Another upside of this is that we can overload this new field to also denote
whether the flow key is valid. This has the advantage that on
refragmentation, we don't have to reparse the packet but can rely on the
stored eth.type. This is especially important for the next patches in this
series - instead of adding another branch for L2-less packets before calling
ovs_fragment, we can just remove all those branches completely.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

openvswitch: use hard_header_len instead of hardcoded ETH_HLEN · 738314a0

Authored by Jiri Benc 8 years ago

On tx, use hard_header_len while deciding whether to refragment or drop the
packet. That way, all combinations are calculated correctly:

* L2 packet going to L2 interface (the L2 header len is subtracted),
* L2 packet going to L3 interface (the L2 header is included in the packet
  length),
* L3 packet going to L3 interface.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: __skb_flow_dissect() must cap its return value · 34fad54c

Authored by Eric Dumazet 8 years ago

After Tom's patch, the thoff field could point past the end of the buffer,
which could fool some callers.

If an skb was provided, skb->len should be the upper limit.
If not, hlen is supposed to be the upper limit.
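
The capping rule stated above can be sketched as a small userspace helper. The struct and function names are hypothetical, chosen only for this illustration; the real dissector works on struct sk_buff and its own key structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sk_buff; only the length matters here. */
struct fake_skb {
    unsigned int len;
};

/* Clamp a transport-header offset to the buffer it was computed for:
 * skb->len when an skb was provided, otherwise the caller-supplied hlen. */
static uint16_t cap_thoff(uint16_t thoff, const struct fake_skb *skb, unsigned int hlen)
{
    unsigned int limit = skb ? skb->len : hlen;

    /* When thoff >= limit, limit fits in 16 bits, so the cast is lossless. */
    return thoff < limit ? thoff : (uint16_t)limit;
}

/* Usage sketch: offsets inside the buffer pass through, larger ones are capped. */
static void cap_thoff_examples(void)
{
    struct fake_skb skb = { .len = 60 };

    assert(cap_thoff(40, &skb, 0) == 40);   /* within skb->len */
    assert(cap_thoff(80, &skb, 0) == 60);   /* capped to skb->len */
    assert(cap_thoff(54, NULL, 34) == 34);  /* no skb: capped to hlen */
}
```

Callers can then rely on the returned offset never exceeding the data they are allowed to read.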

Fixes: a6e544b0 ("flow_dissector: Jump to exit code in __skb_flow_dissect")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Yibin Yang <yibyang@cisco.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: Fix bpf_redirect to an ipip/ip6tnl dev · 4e3264d2

Authored by Martin KaFai Lau 8 years ago

If the bpf program calls bpf_redirect(dev, 0) and dev is
an ipip/ip6tnl, it currently includes the mac header.
For example, if dev is ipip, the end result is IP-EthHdr-IP instead
of IP-IP.

The fix is to pull the mac header.  At ingress, skb_postpull_rcsum()
is not needed because the ethhdr should have been pulled once already
and then got pushed back just before calling the bpf_prog.
At egress, this patch calls skb_postpull_rcsum().

If bpf_redirect(dev, BPF_F_INGRESS) is called,
it also fails now because it calls dev_forward_skb() which
eventually calls eth_type_trans(skb, dev).  The eth_type_trans()
will set skb->pkt_type = PACKET_OTHERHOST because the mac address
does not match the redirecting dev->dev_addr.  The PACKET_OTHERHOST
will eventually cause ip_rcv() to error out.  To fix this,
____dev_forward_skb() is added.

Joint work with Daniel Borkmann.

Fixes: cfc7381b ("ip_tunnel: add collect_md mode to IPIP tunnel")
Fixes: 8d79266b ("ip6_tunnel: add collect_md mode to IPv6 tunnels")
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Nov, 2016 · 4 commits

libceph: initialize last_linger_id with a large integer · 264048af

Authored by Ilya Dryomov 8 years ago

osdc->last_linger_id is a counter for lreq->linger_id, which is used
for watch cookies.  Starting with a large integer should ease the task
of telling apart kernel and userspace clients.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph: fix legacy layout decode with pool 0 · 3890dce1

Authored by Yan, Zheng 8 years ago

If your data pool was pool 0, ceph_file_layout_from_legacy()
transformed that to -1 unconditionally, which broke upgrades.
We only want to do that for a fully zeroed ceph_file_layout,
so that it still maps to a file_layout_t. If any fields
are set, though, we trust the fl_pgpool to be a valid pool.
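
The decode rule can be sketched as follows. This is a simplified userspace illustration: the two structs are reduced, hypothetical versions of the legacy and in-memory layouts (real fields, endianness conversion, and names differ), and layout_from_legacy() is an invented name.

```c
#include <assert.h>
#include <stdint.h>

/* Reduced, hypothetical legacy (wire) layout. */
struct legacy_layout {
    uint32_t stripe_unit;
    uint32_t stripe_count;
    uint32_t object_size;
    uint32_t pg_pool;
};

/* Reduced, hypothetical in-memory layout; pool_id == -1 means "no pool". */
struct file_layout {
    uint32_t stripe_unit;
    uint32_t stripe_count;
    uint32_t object_size;
    int64_t  pool_id;
};

static void layout_from_legacy(struct file_layout *fl, const struct legacy_layout *legacy)
{
    assert(fl && legacy);

    fl->stripe_unit  = legacy->stripe_unit;
    fl->stripe_count = legacy->stripe_count;
    fl->object_size  = legacy->object_size;
    fl->pool_id      = legacy->pg_pool;

    /* Only a fully zeroed legacy layout maps to "no pool"; pool 0 with
     * any other field set is trusted as a real pool. */
    if (fl->pool_id == 0 && fl->stripe_unit == 0 &&
        fl->stripe_count == 0 && fl->object_size == 0)
        fl->pool_id = -1;
}
```

A layout that names pool 0 together with a non-zero stripe unit therefore keeps pool_id 0, which is the upgrade case the commit message describes.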

Fixes: 7627151e ("libceph: define new ceph_file_layout structure")
Link: http://tracker.ceph.com/issues/17825
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ipv4: update comment to document GSO fragmentation cases. · 0ace81ec

Authored by Lance Richardson 8 years ago

This is a follow-up to commit 9ee6c5dc ("ipv4: allow local
fragmentation in ip_finish_output_gso()"), updating the comment
documenting cases in which fragmentation is needed for egress
GSO packets.

Suggested-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Reviewed-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

xprtrdma: Fix DMAR failure in frwr_op_map() after reconnect · 62bdf94a

Authored by Chuck Lever 8 years ago

When a LOCALINV WR is flushed, the frmr is marked STALE, then
frwr_op_unmap_sync DMA-unmaps the frmr's SGL. These STALE frmrs
are then recovered when frwr_op_map hunts for an INVALID frmr to
use.

All other cases that need frmr recovery leave that SGL DMA-mapped.
The FRMR recovery path unconditionally DMA-unmaps the frmr's SGL.

To avoid DMA unmapping the SGL twice for flushed LOCAL_INV WRs,
alter the recovery logic (rather than the hot frwr_op_unmap_sync
path) to distinguish among these cases. This solution also takes
care of the case where multiple LOCAL_INV WRs are issued for the
same rpcrdma_req, some complete successfully, but some are flushed.

Reported-by: Vasco Steinmetz <linux@kyberraum.net>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Vasco Steinmetz <linux@kyberraum.net>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

10 Nov, 2016 · 13 commits

netfilter: ipset: hash: fix boolreturn.cocci warnings · 737d387b

Authored by kbuild test robot 8 years ago

net/netfilter/ipset/ip_set_hash_ipmac.c:70:8-9: WARNING: return of 0/1 in function 'hash_ipmac4_data_list' with return type bool
net/netfilter/ipset/ip_set_hash_ipmac.c:178:8-9: WARNING: return of 0/1 in function 'hash_ipmac6_data_list' with return type bool

Return statements in functions returning bool should use
true/false instead of 1/0.
Generated by: scripts/coccinelle/misc/boolreturn.cocci

CC: Tomasz Chilinski <tomasz.chilinski@chilan.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: use setup_timer() and mod_timer(). · fcb58a03

Authored by Jozsef Kadlecsik 8 years ago

Use setup_timer() instead of init_timer(), being the preferred way
of setting up a timer.

Also, quoting the mod_timer() function comment:
-> mod_timer() is a more efficient way to update the expire field of an
   active timer (if the timer is inactive it will be activated).

Use setup_timer() and mod_timer() to setup and arm a timer, making the
code compact and easier to read.

Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: hash:ipmac type support added to ipset · c0db95c7

Authored by Tomasz Chilinski 8 years ago

Introduce the hash:ipmac type.

Signed-off-by: Tomasz Chiliński <tomasz.chilinski@chilan.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Fix reported memory size for hash:* types · a71bdbfa

Authored by Jozsef Kadlecsik 9 years ago

The calculation of the full allocated memory did not take
into account the size of the base hash bucket structure in some
places.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Collapse same condition body to a single one · 9be37d2a

Authored by Jozsef Kadlecsik 9 years ago

The set full case (with net_ratelimit()-ed pr_warn()) is already
handled, simply jump there.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Make struct htype per ipset family · 21956ab2

Authored by Jozsef Kadlecsik 9 years ago

Before this patch, struct htype was created at the first source
of ip_set_hash_gen.h and was common to both the IPv4 and IPv6
set variants.

Make struct htype per ipset family and use NLEN to make
nets array fixed size to simplify struct htype allocation.

Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Optimize hash creation routine · 961509ac

Authored by Jozsef Kadlecsik 8 years ago

Exit as early as possible on error and use RCU_INIT_POINTER()
as the set is not seen at creation time.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Make sure element data size is a multiple of u32 · 5a902e6d

Authored by Jozsef Kadlecsik 9 years ago

Data for hashing is required to be an array of u32. Make sure that
the element data size is always a multiple of u32.

Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Make NLEN compile time constant for hash types · cee8b97b

Authored by Jozsef Kadlecsik 8 years ago

Hash types define HOST_MASK before inclusion of ip_set_hash_gen.h
and the only place where NLEN needs to be calculated at runtime
is the *_create() method.

Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Simplify mtype_expire() for hash types · 509debc9

Authored by Jozsef Kadlecsik 8 years ago

Remove one level of indentation by using continue while
iterating over elements in a bucket.

Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Remove redundant mtype_expire() arguments · 5fdb5f69

Authored by Jozsef Kadlecsik 8 years ago

Remove the redundant parameters nets_length and dsize, because
they can be derived from the other parameters.

Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Count non-static extension memory for userspace · 9e41f26a

Authored by Jozsef Kadlecsik 8 years ago

Non-static (i.e. comment) extensions were not counted into the memory
size. A new internal counter is introduced for this. In the case of
the hash types the sizes of the arrays are counted there as well so
that we can avoid scanning the whole set when just the header data
is requested.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Add element count to all set types header · 702b71e7

Authored by Jozsef Kadlecsik 8 years ago

It is better to list the set elements for all set types, so that the
header information is uniform. Element counts are therefore added
to the bitmap and list types.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
