1. 10 Dec 2013 (6 commits)
    • tipc: remove interface state mirroring in bearer · 512137ee
      Authored by Erik Hugne
      struct 'tipc_bearer' is a generic representation of the underlying
      media type, and exists in a one-to-one relationship with each interface
      TIPC is using. The struct contains a 'blocked' flag that mirrors the
      operational and execution state of the represented interface, and is
      updated through notification calls from the latter. The users of
      tipc_bearer check this flag before each attempt to send a
      packet via the interface.
      
      This state mirroring serves no purpose in the current code base. TIPC
      links will not discover a media failure any faster through this
      mechanism, and in reality the flag only adds overhead at packet
      sending and reception.
      
      Furthermore, the fact that the flag needs to be protected by a spinlock
      aggregated into tipc_bearer has turned out to cause a serious and
      completely unnecessary deadlock problem.
      
      CPU0                                    CPU1
      ----                                    ----
      Time 0: bearer_disable()                link_timeout()
      Time 1:   spin_lock_bh(&b_ptr->lock)      tipc_link_push_queue()
      Time 2:   tipc_link_delete()                tipc_bearer_blocked(b_ptr)
      Time 3:     k_cancel_timer(&req->timer)       spin_lock_bh(&b_ptr->lock)
      Time 4:       del_timer_sync(&req->timer)
      
      I.e., del_timer_sync() on CPU0 never returns, because the timer handler
      on CPU1 is waiting for the bearer lock (a generic sketch of this pattern
      follows at the end of this entry).
      
      We eliminate the 'blocked' flag from struct tipc_bearer, along with all
      tests on this flag. This not only resolves the deadlock, but also
      simplifies and speeds up the data path execution of TIPC. It also fits
      well into our ongoing effort to make the locking policy simpler and
      more manageable.
      
      An effect of this change is that we can get rid of functions such as
      tipc_bearer_blocked(), tipc_continue() and tipc_block_bearer().
      We replace the latter with a new function, tipc_reset_bearer(), which
      resets all links associated with the bearer immediately after an
      interface goes down.
      
      Users might notice one slight change in link behaviour after this
      commit: when an interface goes down (e.g. through a NETDEV_DOWN
      event), all attached links will be reset immediately, instead of
      leaving it to each link to detect the failure through a timer-driven
      mechanism. We consider this an improvement, and see no obvious risks
      with the new behaviour.
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <Paul.Gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      512137ee
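
      A minimal, generic illustration of the deadlock pattern described above
      (this is not TIPC code; the lock, timer and function names are made up):
      del_timer_sync() is called while holding a BH-disabled spinlock that the
      timer handler itself needs, so the synchronous wait can never finish.

      #include <linux/jiffies.h>
      #include <linux/spinlock.h>
      #include <linux/timer.h>

      static DEFINE_SPINLOCK(bearer_lock);          /* stand-in for b_ptr->lock */
      static struct timer_list disc_timer;          /* stand-in for req->timer  */

      static void disc_timeout(unsigned long data)  /* runs in timer (BH) context */
      {
              spin_lock_bh(&bearer_lock);   /* CPU1: spins while CPU0 holds the lock */
              /* ... e.g. push queued packets, check bearer state ... */
              spin_unlock_bh(&bearer_lock);
      }

      static void arm_example_timer(void)
      {
              setup_timer(&disc_timer, disc_timeout, 0);
              mod_timer(&disc_timer, jiffies + HZ);
      }

      static void bearer_disable_like(void)
      {
              spin_lock_bh(&bearer_lock);   /* CPU0: takes the lock ...          */
              del_timer_sync(&disc_timer);  /* ... then waits for disc_timeout()
                                             * to finish, which it never does:
                                             * deadlock                          */
              spin_unlock_bh(&bearer_lock);
      }
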
    • x25: convert printks to pr_<level> · b73e9e3c
      Authored by wangweidong
      Use pr_<level> instead of printk(KERN_<LEVEL>); a small before/after
      example follows this entry.
      Suggested-by: Joe Perches <joe@perches.com>
      Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b73e9e3c
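
      A small before/after illustration of the kind of conversion this patch
      makes (the message text and format argument here are made up for the
      example, not quoted from the patch):

      /* At the top of the file, before any include that pulls in printk.h: */
      #define pr_fmt(fmt) "X.25: " fmt

      /* Before: log level and subsystem prefix spelled out at every call site */
      printk(KERN_WARNING "X.25: unknown frame type 0x%02x\n", frametype);

      /* After: shorter pr_<level> helper; pr_fmt() adds the prefix once */
      pr_warn("unknown frame type 0x%02x\n", frametype);
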
    • packet: introduce PACKET_QDISC_BYPASS socket option · d346a3fa
      Authored by Daniel Borkmann
      This patch introduces a PACKET_QDISC_BYPASS socket option that
      allows for using a similar xmit() function as in pktgen instead
      of taking the dev_queue_xmit() path. This can be very useful when
      PF_PACKET applications need to be used in a pktgen-like scenario,
      but with fully flexible packet payloads, for example.
      
      By default, nothing changes in behaviour for normal PF_PACKET
      TX users, so everything stays as is for applications. New users,
      however, can now set PACKET_QDISC_BYPASS if needed, in order to
      i) prevent their own packets from reentering packet_rcv() and
      ii) push frames directly to the driver (see the usage sketch
      after this entry).
      
      In doing so we can increase pps (here 64-byte packets) for
      PF_PACKET a bit:
      
        # CPUs -- QDISC_BYPASS   -- qdisc path -- qdisc path[**]
        1 CPU  ==  1,509,628 pps --  1,208,708 --  1,247,436
        2 CPUs ==  3,198,659 pps --  2,536,012 --  1,605,779
        3 CPUs ==  4,787,992 pps --  3,788,740 --  1,735,610
        4 CPUs ==  6,173,956 pps --  4,907,799 --  1,909,114
        5 CPUs ==  7,495,676 pps --  5,956,499 --  2,014,422
        6 CPUs ==  9,001,496 pps --  7,145,064 --  2,155,261
        7 CPUs == 10,229,776 pps --  8,190,596 --  2,220,619
        8 CPUs == 11,040,732 pps --  9,188,544 --  2,241,879
        9 CPUs == 12,009,076 pps -- 10,275,936 --  2,068,447
       10 CPUs == 11,380,052 pps -- 11,265,337 --  1,578,689
       11 CPUs == 11,672,676 pps -- 11,845,344 --  1,297,412
       [...]
       20 CPUs == 11,363,192 pps -- 11,014,933 --  1,245,081
      
       [**]: qdisc path with packet_rcv(), which is probably how most
             people use it (hopefully not anymore where it is not needed)
      
      The test was done using a modified trafgen, sending a simple
      static 64-byte packet, on all CPUs. The trick in the fast
      "qdisc path" case is to avoid reentering packet_rcv() by
      setting the RAW socket protocol to zero, like:
      socket(PF_PACKET, SOCK_RAW, 0);
      
      Tradeoffs are documented in this patch as well: clearly, if
      queues are busy, we will drop more packets; tc disciplines are
      ignored, and these packets are not visible to taps anymore. For
      a pktgen-like scenario, we argue that this is acceptable.
      
      The pointer to the xmit function has been placed in a packet
      socket structure hole between cached_dev and prot_hook, which
      is hot anyway since we work on cached_dev in each send path.
      
      Done in joint work with Jesper Dangaard Brouer.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d346a3fa
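
      A userspace usage sketch of the new option (error handling trimmed; the
      fallback #define mirrors the value added to linux/if_packet.h by this
      patch, since headers from an older kernel will not know it):

      #include <linux/if_packet.h>
      #include <stdio.h>
      #include <sys/socket.h>

      #ifndef PACKET_QDISC_BYPASS
      #define PACKET_QDISC_BYPASS 20
      #endif

      int main(void)
      {
              /* protocol 0: TX-only socket, own frames do not reenter packet_rcv() */
              int fd = socket(PF_PACKET, SOCK_RAW, 0);
              int one = 1;

              if (fd < 0) {
                      perror("socket");
                      return 1;
              }
              /* Opt in to bypassing the qdisc layer: faster, but busy queues mean
               * more drops, tc disciplines are ignored and taps do not see the
               * frames. */
              if (setsockopt(fd, SOL_PACKET, PACKET_QDISC_BYPASS,
                             &one, sizeof(one)) < 0)
                      perror("setsockopt(PACKET_QDISC_BYPASS)");
              /* ... bind to an interface and transmit via sendto()/PACKET_TX_RING ... */
              return 0;
      }
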
    • net: dev: move inline skb_needs_linearize helper to header · 4262e5cc
      Authored by Daniel Borkmann
      As we need it elsewhere, move the inline helper function
      skb_needs_linearize() over to the skbuff.h include file. While
      at it, also convert the return type to 'bool' instead of 'int'
      and add proper kernel doc (the rough shape of the helper is
      sketched below).
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4262e5cc
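
      Roughly the shape of the moved helper as it would sit in skbuff.h (a
      sketch based on the description above, not necessarily the verbatim
      version that landed): a nonlinear skb has to be linearized when the
      device lacks the matching frag-list or scatter/gather feature.

      /**
       * skb_needs_linearize - check if we need to linearize a given skb
       *                       depending on the given device features
       * @skb: socket buffer to check
       * @features: net device features
       *
       * Returns true if either:
       *    1. the skb has a frag list and the device does not support FRAGLIST, or
       *    2. the skb is fragmented and the device does not support SG.
       */
      static inline bool skb_needs_linearize(struct sk_buff *skb,
                                             netdev_features_t features)
      {
              return skb_is_nonlinear(skb) &&
                     ((skb_has_frag_list(skb) && !(features & NETIF_F_FRAGLIST)) ||
                      (skb_shinfo(skb)->nr_frags && !(features & NETIF_F_SG)));
      }
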
    • packet: fix send path when running with proto == 0 · 66e56cd4
      Authored by Daniel Borkmann
      Commit e40526cb introduced a cached dev pointer whose updates are
      hooked into register_prot_hook() and __unregister_prot_hook(), in
      order to track the device used for the send path.
      
      We need to fix this up, as otherwise it will not work with
      sockets created with protocol = 0, together with sll_protocol = 0
      passed via sockaddr_ll when doing the bind.
      
      So instead, assign the pointer directly. The compiler can inline
      these helper functions automagically.
      
      While at it, also mark the cached dev fast path as likely(), and
      document this variant of socket creation, since it does not seem
      to be widely used (apparently not even the author of TX_RING was
      aware of it in his reference example [1]); a userspace sketch of
      this variant follows this entry. Tested with the reproducer
      from e40526cb.
      
       [1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap#Example
      
      Fixes: e40526cb ("packet: fix use after free race in send path when dev is released")
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Tested-by: Salam Noureddine <noureddine@aristanetworks.com>
      Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      66e56cd4
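
      A userspace sketch of the socket-creation variant this fix is about (the
      interface name is illustrative): the socket is opened with protocol 0,
      so nothing is delivered to packet_rcv(), and it is then bound with
      sll_protocol = 0 for TX-only use.

      #include <linux/if_packet.h>
      #include <net/if.h>
      #include <stdio.h>
      #include <string.h>
      #include <sys/socket.h>

      int main(void)
      {
              struct sockaddr_ll sll;
              /* protocol 0: no RX protocol hook is registered, send path only */
              int fd = socket(PF_PACKET, SOCK_RAW, 0);

              if (fd < 0) {
                      perror("socket");
                      return 1;
              }
              memset(&sll, 0, sizeof(sll));
              sll.sll_family   = AF_PACKET;
              sll.sll_protocol = 0;                        /* stays 0 on bind, too */
              sll.sll_ifindex  = if_nametoindex("eth0");   /* illustrative name */
              if (bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0)
                      perror("bind");
              /* ... frames are now sent on that device via sendto()/PACKET_TX_RING ... */
              return 0;
      }
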
    • pkt_sched: give visibility to mq slave qdiscs · 95dc1929
      Authored by Eric Dumazet
      Commit 6da7c8fc ("qdisc: allow setting default queuing discipline")
      added the ability to change the default qdisc from pfifo_fast to, say, fq.
      
      But as most modern Ethernet devices are multiqueue, we can't really
      see all the statistics from "tc -s qdisc show", as the default root
      qdisc is mq.
      
      This patch adds the calls to qdisc_list_add() to mq and mqprio
      (a sketch of the attach-path change follows below).
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      95dc1929
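
      A sketch of what the attach-path change amounts to (simplified; the field
      and helper names follow the in-tree mq scheduler, but the exact hunks may
      differ): once each per-queue child qdisc has been grafted, it is also
      registered with qdisc_list_add() so that a later "tc -s qdisc show" dump
      can find it.

      static void mq_attach(struct Qdisc *sch)
      {
              struct net_device *dev = qdisc_dev(sch);
              struct mq_sched *priv = qdisc_priv(sch);
              struct Qdisc *qdisc, *old;
              unsigned int ntx;

              for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
                      qdisc = priv->qdiscs[ntx];
                      old = dev_graft_qdisc(qdisc->dev_queue, qdisc);
                      if (old)
                              qdisc_destroy(old);
                      /* new: make the slave qdisc visible to qdisc dumps */
                      if (ntx < dev->real_num_tx_queues)
                              qdisc_list_add(qdisc);
              }
              kfree(priv->qdiscs);
              priv->qdiscs = NULL;
      }
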
  2. 07 Dec 2013 (20 commits)
  3. 06 Dec 2013 (8 commits)
  4. 04 Dec 2013 (1 commit)
    • rds: prevent BUG_ON triggered on congestion update to loopback · 18fc25c9
      Authored by Venkat Venkatsubra
      After a congestion update on a local connection, when rds_ib_xmit returns
      fewer bytes than there are in the message, rds_send_xmit calls
      rds_ib_xmit back with an offset that causes BUG_ON(off & RDS_FRAG_SIZE)
      to trigger.
      
      For a 4KB PAGE_SIZE, rds_ib_xmit returns min(8240,4096)=4096 when the
      message actually contains 8240 bytes. rds_send_xmit thinks there is more
      to send and calls rds_ib_xmit again with a data offset "off" of
      4096-48 (rds header) = 4048 bytes, thus hitting the
      BUG_ON(off & RDS_FRAG_SIZE) [RDS_FRAG_SIZE=4k].
      
      Commit 6094628b
      "rds: prevent BUG_ON triggering on congestion map updates" introduced
      this regression. That change was addressing the triggering of a different
      BUG_ON in rds_send_xmit() on the PowerPC architecture with 64KB PAGE_SIZE:
       	BUG_ON(ret != 0 &&
          		 conn->c_xmit_sg == rm->data.op_nents);
      This was the sequence it was going through:
      (rds_ib_xmit)
      /* Do not send cong updates to IB loopback */
      if (conn->c_loopback
         && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
        	rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
          	return sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
      }
      rds_ib_xmit returns 8240
      rds_send_xmit:
        c_xmit_data_off = 0 + 8240 - 48 (rds header accounted only the first time)
         		 = 8192
        c_xmit_data_off < 65536 (sg->length), so calls rds_ib_xmit again
      rds_ib_xmit returns 8240
      rds_send_xmit:
        c_xmit_data_off = 8192 + 8240 = 16432, calls rds_ib_xmit again
        and so on (c_xmit_data_off 24672,32912,41152,49392,57632)
      rds_ib_xmit returns 8240
      On this iteration this sequence causes the BUG_ON in rds_send_xmit:
          while (ret) {
          	tmp = min_t(int, ret, sg->length - conn->c_xmit_data_off);
          	[tmp = 65536 - 57632 = 7904]
          	conn->c_xmit_data_off += tmp;
          	[c_xmit_data_off = 57632 + 7904 = 65536]
          	ret -= tmp;
          	[ret = 8240 - 7904 = 336]
          	if (conn->c_xmit_data_off == sg->length) {
          		conn->c_xmit_data_off = 0;
          		sg++;
          		conn->c_xmit_sg++;
          		BUG_ON(ret != 0 &&
          			conn->c_xmit_sg == rm->data.op_nents);
          		[c_xmit_sg = 1, rm->data.op_nents = 1]
      
      What the current fix does:
      Since the congestion update over loopback is not actually transmitted
      as a message, all that rds_ib_xmit needs to do is let the caller think
      the full message has been transmitted and not return partial byte counts.
      It will return 8240 (RDS_CONG_MAP_BYTES+48) when PAGE_SIZE is 4KB,
      and 64KB+48 when the page size is 64KB (see the sketch at the end of
      this entry).
      Reported-by: Josh Hunt <joshhunt00@gmail.com>
      Tested-by: Honggang Li <honli@redhat.com>
      Acked-by: Bang Nguyen <bang.nguyen@oracle.com>
      Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      18fc25c9
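
      A sketch consistent with the fix described above (the shape is assumed
      from the description, not quoted from the patch; conn, rm, scat, sg and
      ret are locals/arguments of rds_ib_xmit()): the loopback congestion-update
      branch reports the whole payload as sent, so rds_send_xmit() never calls
      back with a partial offset.

      /* Do not send cong updates to IB loopback */
      if (conn->c_loopback
          && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
              rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
              scat = &rm->data.op_sg[sg];
              /* Report the full message as transmitted: the sg entry can be
               * larger than RDS_CONG_MAP_BYTES on 64K-page systems.
               * 8192 + 48 = 8240 with 4K pages; 65536 + 48 with 64K pages. */
              ret = max_t(int, RDS_CONG_MAP_BYTES, scat->length);
              return sizeof(struct rds_header) + ret;
      }
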
  5. 03 Dec 2013 (3 commits)
  6. 02 Dec 2013 (2 commits)