提交 · 1aee6cc2a5193aa9963ea49b0d452723c1d493d8 · openeuler / raspberrypi-kernel

18 12月, 2013 5 次提交

net/hsr: using kfree_rcu() to simplify the code · 1aee6cc2

由 Wei Yongjun 提交于 12月 16, 2013

The callback function of call_rcu() just calls a kfree(), so we
can use kfree_rcu() instead of call_rcu() + callback function.
Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: NArvid Brodin <arvid.brodin@alten.se>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1aee6cc2

pkt_sched: fq: more robust memory allocation · c3bd8549

由 Eric Dumazet 提交于 12月 15, 2013

This patch brings NUMA support and automatic fallback to vmalloc()
in case kmalloc() failed to allocate FQ hash table.

NUMA support depends on XPS being setup for the device before
qdisc allocation. After a XPS change, it might be worth creating
qdisc hierarchy again.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c3bd8549

tcp: refine TSO splits · d4589926

由 Eric Dumazet 提交于 12月 13, 2013

While investigating performance problems on small RPC workloads,
I noticed linux TCP stack was always splitting the last TSO skb
into two parts (skbs). One being a multiple of MSS, and a small one
with the Push flag. This split is done even if TCP_NODELAY is set,
or if no small packet is in flight.

Example with request/response of 4K/4K

IP A > B: . ack 68432 win 2783 <nop,nop,timestamp 6524593 6525001>
IP A > B: . 65537:68433(2896) ack 69632 win 2783 <nop,nop,timestamp 6524593 6525001>
IP A > B: P 68433:69633(1200) ack 69632 win 2783 <nop,nop,timestamp 6524593 6525001>
IP B > A: . ack 68433 win 2768 <nop,nop,timestamp 6525001 6524593>
IP B > A: . 69632:72528(2896) ack 69633 win 2768 <nop,nop,timestamp 6525001 6524593>
IP B > A: P 72528:73728(1200) ack 69633 win 2768 <nop,nop,timestamp 6525001 6524593>
IP A > B: . ack 72528 win 2783 <nop,nop,timestamp 6524593 6525001>
IP A > B: . 69633:72529(2896) ack 73728 win 2783 <nop,nop,timestamp 6524593 6525001>
IP A > B: P 72529:73729(1200) ack 73728 win 2783 <nop,nop,timestamp 6524593 6525001>

We can avoid this split by including the Nagle tests at the right place.

Note : If some NIC had trouble sending TSO packets with a partial
last segment, we would have hit the problem in GRO/forwarding workload already.

tcp_minshall_update() is moved to tcp_output.c and is updated as we might
feed a TSO packet with a partial last segment.

This patch tremendously improves performance, as the traffic now looks
like :

IP A > B: . ack 98304 win 2783 <nop,nop,timestamp 6834277 6834685>
IP A > B: P 94209:98305(4096) ack 98304 win 2783 <nop,nop,timestamp 6834277 6834685>
IP B > A: . ack 98305 win 2768 <nop,nop,timestamp 6834686 6834277>
IP B > A: P 98304:102400(4096) ack 98305 win 2768 <nop,nop,timestamp 6834686 6834277>
IP A > B: . ack 102400 win 2783 <nop,nop,timestamp 6834279 6834686>
IP A > B: P 98305:102401(4096) ack 102400 win 2783 <nop,nop,timestamp 6834279 6834686>
IP B > A: . ack 102401 win 2768 <nop,nop,timestamp 6834687 6834279>
IP B > A: P 102400:106496(4096) ack 102401 win 2768 <nop,nop,timestamp 6834687 6834279>
IP A > B: . ack 106496 win 2783 <nop,nop,timestamp 6834280 6834687>
IP A > B: P 102401:106497(4096) ack 106496 win 2783 <nop,nop,timestamp 6834280 6834687>
IP B > A: . ack 106497 win 2768 <nop,nop,timestamp 6834688 6834280>
IP B > A: P 106496:110592(4096) ack 106497 win 2768 <nop,nop,timestamp 6834688 6834280>

Before :

lpq83:~# nstat >/dev/null;perf stat ./super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K
280774

Performance counter stats for './super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K':

205719.049006 task-clock # 9.278 CPUs utilized
8,449,968 context-switches # 0.041 M/sec
1,935,997 CPU-migrations # 0.009 M/sec
160,541 page-faults # 0.780 K/sec
548,478,722,290 cycles # 2.666 GHz [83.20%]
455,240,670,857 stalled-cycles-frontend # 83.00% frontend cycles idle [83.48%]
272,881,454,275 stalled-cycles-backend # 49.75% backend cycles idle [66.73%]
166,091,460,030 instructions # 0.30 insns per cycle
# 2.74 stalled cycles per insn [83.39%]
29,150,229,399 branches # 141.699 M/sec [83.30%]
1,943,814,026 branch-misses # 6.67% of all branches [83.32%]

22.173517844 seconds time elapsed

lpq83:~# nstat | egrep "IpOutRequests|IpExtOutOctets"
IpOutRequests 16851063 0.0
IpExtOutOctets 23878580777 0.0

After patch :

lpq83:~# nstat >/dev/null;perf stat ./super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K
280877

Performance counter stats for './super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K':

107496.071918 task-clock # 4.847 CPUs utilized
5,635,458 context-switches # 0.052 M/sec
1,374,707 CPU-migrations # 0.013 M/sec
160,920 page-faults # 0.001 M/sec
281,500,010,924 cycles # 2.619 GHz [83.28%]
228,865,069,307 stalled-cycles-frontend # 81.30% frontend cycles idle [83.38%]
142,462,742,658 stalled-cycles-backend # 50.61% backend cycles idle [66.81%]
95,227,712,566 instructions # 0.34 insns per cycle
# 2.40 stalled cycles per insn [83.43%]
16,209,868,171 branches # 150.795 M/sec [83.20%]
874,252,952 branch-misses # 5.39% of all branches [83.37%]

22.175821286 seconds time elapsed

lpq83:~# nstat | egrep "IpOutRequests|IpExtOutOctets"
IpOutRequests 11239428 0.0
IpExtOutOctets 23595191035 0.0

Indeed, the occupancy of tx skbs (IpExtOutOctets/IpOutRequests) is higher :
2099 instead of 1417, thus helping GRO to be more efficient when using FQ packet
scheduler.

Many thanks to Neal for review and ideas.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Van Jacobson <vanj@google.com>
Acked-by: NNeal Cardwell <ncardwell@google.com>
Tested-by: NNeal Cardwell <ncardwell@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d4589926

net: remove dead code for add/del multiple · 477bb933

由 stephen hemminger 提交于 12月 13, 2013

These function to manipulate multiple addresses are not used anywhere
in current net-next tree. Some out of tree code maybe using these but
too bad; they should submit their code upstream..

Also, make __hw_addr_flush local since only used by dev_addr_lists.c
Signed-off-by: NStephen Hemminger <stephen@networkplumber.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

477bb933

net: ovs: use CRC32 accelerated flow hash if available · 500f8087

由 Francesco Fusco 提交于 12月 12, 2013

Currently OVS uses jhash2() for calculating flow hashes in its
internal flow_hash() function. The performance of the flow_hash()
function is critical, as the input data can be hundreds of bytes
long.

OVS is largely deployed in x86_64 based datacenters.  Therefore,
we argue that the performance critical fast path of OVS should
exploit underlying CPU features in order to reduce the per packet
processing costs. We replace jhash2 with the hash implementation
provided by the kernel hash lib, which exploits the crc32l
instruction to achieve high performance

Our patch greatly reduces the hash footprint from ~200 cycles of
jhash2() to around ~90 cycles in case of ovs_flow_hash_crc()
(measured with rdtsc over maximum length flow keys on an i7 Intel
CPU).

Additionally, we wrote a microbenchmark to stress the flow table
performance. The benchmark inserts random flows into the flow
hash and then performs lookups. Our hash deployed on a CRC32
capable CPU reduces the lookup for 1000 flows, 100 masks from
~10,100us to ~6,700us, for example.

Thus, simply use the newly introduced arch_fast_hash2() as a
drop-in replacement.
Signed-off-by: NFrancesco Fusco <ffusco@redhat.com>
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NThomas Graf <tgraf@redhat.com>
Acked-by: NJesse Gross <jesse@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

500f8087

17 12月, 2013 4 次提交

tipc: change lock_sock order in connect() · d3fbccf2

由 wangweidong 提交于 12月 12, 2013

Instead of reaquiring the socket lock and taking the normal exit
path when a connection times out, we bail out early with a
return -ETIMEDOUT.
Reviewed-by: NJon Maloy <jon.maloy@ericsson.com>
Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
Signed-off-by: NWang Weidong <wangweidong1@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d3fbccf2

tipc: Use <linux/uaccess.h> instead of <asm/uaccess.h> · 776a74ce

由 wangweidong 提交于 12月 12, 2013

As warned by checkpatch.pl, use #include <linux/uaccess.h>
instead of <asm/uaccess.h>
Reviewed-by: NJon Maloy <jon.maloy@ericsson.com>
Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
Signed-off-by: NWang Weidong <wangweidong1@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

776a74ce

tipc: kill unnecessary goto's · 3b8401fe

由 wangweidong 提交于 12月 12, 2013

Remove a number of needless 'goto exit' in send_stream
when the socket is in an unconnected state.
This patch is cosmetic and does not alter the operation of
TIPC in any way.
Reviewed-by: NJon Maloy <jon.maloy@ericsson.com>
Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
Signed-off-by: NWang Weidong <wangweidong1@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3b8401fe

tipc: remove unnecessary variables and conditions · 0cee6bbe

由 wangweidong 提交于 12月 12, 2013

We remove a number of unnecessary variables and branches
in TIPC. This patch is cosmetic and does not change the
operation of TIPC in any way.
Reviewed-by: NJon Maloy <jon.maloy@ericsson.com>
Reviewed-by: NErik Hugne <erik.hugne@ericsson.com>
Signed-off-by: NWang Weidong <wangweidong1@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0cee6bbe

16 12月, 2013 1 次提交

net-ipv6: Fix alleged compiler warning in ipv6_exthdrs_len() · 810c23a3

由 Jerry Chu 提交于 12月 15, 2013

It was reported that Commit 299603e8
("net-gro: Prepare GRO stack for the upcoming tunneling support")
triggered a compiler warning in ipv6_exthdrs_len():

net/ipv6/ip6_offload.c: In function ‘ipv6_gro_complete’:
net/ipv6/ip6_offload.c:178:24: warning: ‘optlen’ may be used uninitialized in this function [-Wmaybe-u
    opth = (void *)opth + optlen;
    			^
    net/ipv6/ip6_offload.c:164:22: note: ‘optlen’ was declared here
    int len = 0, proto, optlen;
                        ^
Note that there was no real bug here - optlen was never uninitialized
before use. (Was the version of gcc I used smarter to not complain?)
Reported-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NH.K. Jerry Chu <hkchu@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

810c23a3

14 12月, 2013 5 次提交

ipv6: fix compiler warning in ipv6_exthdrs_len · f52d81dc

由 Hannes Frederic Sowa 提交于 12月 14, 2013

Commit 299603e8 ("net-gro: Prepare GRO
stack for the upcoming tunneling support") used an uninitialized variable
which leads to the following compiler warning:

net/ipv6/ip6_offload.c: In function ‘ipv6_gro_complete’:
net/ipv6/ip6_offload.c:178:24: warning: ‘optlen’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    opth = (void *)opth + optlen;
                        ^
net/ipv6/ip6_offload.c:164:22: note: ‘optlen’ was declared here
  int len = 0, proto, optlen;
                      ^
Fix it up.

Cc: Jerry Chu <hkchu@google.com>
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f52d81dc

bonding: create bond_first_slave_rcu() · e001bfad

由 dingtianhong 提交于 12月 13, 2013

The bond_first_slave_rcu() will be used to instead of bond_first_slave()
in rcu_read_lock().

According to the Jay Vosburgh's suggestion, the struct netdev_adjacent
should hide from users who wanted to use it directly. so I package a
new function to get the first slave of the bond.
Suggested-by: NNikolay Aleksandrov <nikolay@redhat.com>
Suggested-by: NJay Vosburgh <fubar@us.ibm.com>
Suggested-by: NVeaceslav Falico <vfalico@redhat.com>
Signed-off-by: NDing Tianhong <dingtianhong@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e001bfad

pkt_sched: set root qdisc before change() in attach_default_qdiscs() · e57a784d

由 Eric Dumazet 提交于 12月 12, 2013

After commit 95dc1929 ("pkt_sched: give visibility to mq slave
qdiscs") we call disc_list_add() while the device qdisc might be
the noop_qdisc one.

This shows up as duplicates in "tc qdisc show", as all inactive devices
point to noop_qdisc.

Fix this by setting dev->qdisc to the new qdisc before calling
ops->change() in attach_default_qdiscs()

Add a WARN_ON_ONCE() to catch any future similar problem.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e57a784d

packet: fix using smp_processor_id() in preemptible code · 1cbac010

由 Li Zhong 提交于 12月 12, 2013

This patches fixes the following warning by replacing smp_processor_id()
with raw_smp_processor_id():

[   11.120893] BUG: using smp_processor_id() in preemptible [00000000] code: arping/3510
[   11.120913] caller is .packet_sendmsg+0xc14/0xe68
[   11.120920] CPU: 13 PID: 3510 Comm: arping Not tainted 3.13.0-rc3-next-20131211-dirty #1
[   11.120926] Call Trace:
[   11.120932] [c0000001f803f6f0] [c0000000000138dc] .show_stack+0x110/0x25c (unreliable)
[   11.120942] [c0000001f803f7e0] [c00000000083dd24] .dump_stack+0xa0/0x37c
[   11.120951] [c0000001f803f870] [c000000000493fd4] .debug_smp_processor_id+0xfc/0x12c
[   11.120959] [c0000001f803f900] [c0000000007eba78] .packet_sendmsg+0xc14/0xe68
[   11.120968] [c0000001f803fa80] [c000000000700968] .sock_sendmsg+0xa0/0xe0
[   11.120975] [c0000001f803fbf0] [c0000000007014d8] .SyS_sendto+0x100/0x148
[   11.120983] [c0000001f803fd60] [c0000000006fff10] .SyS_socketcall+0x1c4/0x2e8
[   11.120990] [c0000001f803fe30] [c00000000000a1e4] syscall_exit+0x0/0x9c
Signed-off-by: NLi Zhong <zhong@linux.vnet.ibm.com>
Acked-by: NJesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1cbac010

netconf: add proxy-arp support · f085ff1c

由 stephen hemminger 提交于 12月 12, 2013

Add support to netconf to show changes to proxy-arp status on a per
interface basis via netlink in a manner similar to forwarding
and reverse path state.
Signed-off-by: NStephen Hemminger <stephen@networkplumber.org>
Acked-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f085ff1c

13 12月, 2013 2 次提交

ipv6: fix incorrect type in declaration · 68536053

由 Florent Fourcot 提交于 12月 12, 2013

Introduced by 1397ed35
  "ipv6: add flowinfo for tcp6 pkt_options for all cases"
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>

V2: fix the title, add empty line after the declaration (Sergei Shtylyov
feedbacks)
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

68536053

net-gro: Prepare GRO stack for the upcoming tunneling support · 299603e8

由 Jerry Chu 提交于 12月 11, 2013

This patch modifies the GRO stack to avoid the use of "network_header"
and associated macros like ip_hdr() and ipv6_hdr() in order to allow
an arbitary number of IP hdrs (v4 or v6) to be used in the
encapsulation chain. This lays the foundation for various IP
tunneling support (IP-in-IP, GRE, VXLAN, SIT,...) to be added later.

With this patch, the GRO stack traversing now is mostly based on
skb_gro_offset rather than special hdr offsets saved in skb (e.g.,
skb->network_header). As a result all but the top layer (i.e., the
the transport layer) must have hdrs of the same length in order for
a pkt to be considered for aggregation. Therefore when adding a new
encap layer (e.g., for tunneling), one must check and skip flows
(e.g., by setting NAPI_GRO_CB(p)->same_flow to 0) that have a
different hdr length.

Note that unlike the network header, the transport header can and
will continue to be set by the GRO code since there will be at
most one "transport layer" in the encap chain.
Signed-off-by: NH.K. Jerry Chu <hkchu@google.com>
Suggested-by: NEric Dumazet <edumazet@google.com>
Reviewed-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

299603e8

12 12月, 2013 5 次提交

ipv6: router reachability probing · 7e980569

由 Jiri Benc 提交于 12月 11, 2013

RFC 4191 states in 3.5:

   When a host avoids using any non-reachable router X and instead sends
   a data packet to another router Y, and the host would have used
   router X if router X were reachable, then the host SHOULD probe each
   such router X's reachability by sending a single Neighbor
   Solicitation to that router's address.  A host MUST NOT probe a
   router's reachability in the absence of useful traffic that the host
   would have sent to the router if it were reachable.  In any case,
   these probes MUST be rate-limited to no more than one per minute per
   router.

Currently, when the neighbour corresponding to a router falls into
NUD_FAILED, it's never considered again. Introduce a new rt6_nud_state
value, RT6_NUD_FAIL_PROBE, which suggests the route should not be used but
should be probed with a single NS. The probe is ratelimited by the existing
code. To better distinguish meanings of the failure values, rename
RT6_NUD_FAIL_SOFT to RT6_NUD_FAIL_DO_RR.
Signed-off-by: NJiri Benc <jbenc@redhat.com>
Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7e980569

sctp: remove redundant null check on asoc · e4772668

由 wangweidong 提交于 12月 11, 2013

In sctp_err_lookup, goto out while the asoc is not NULL, so remove the
check NULL. Also, in sctp_err_finish which called by sctp_v4_err and
sctp_v6_err, they pass asoc to sctp_err_finish while the asoc is not
NULL, so remove the check.
Signed-off-by: NWang Weidong <wangweidong1@huawei.com>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Acked-by: NVlad Yasevich <vyasevich@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e4772668

sch_htb: remove unnecessary NULL pointer judgment · 6b1dd856

由 Yang Yingliang 提交于 12月 11, 2013

It already has a NULL pointer judgment of rtab in qdisc_put_rtab().
Remove the judgment outside of qdisc_put_rtab().
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6b1dd856

ipv4: fix wildcard search with inet_confirm_addr() · b601fa19

由 Nicolas Dichtel 提交于 12月 10, 2013

Help of this function says: "in_dev: only on this interface, 0=any interface",
but since commit 39a6d063 ("[NETNS]: Process inet_confirm_addr in the
correct namespace."), the code supposes that it will never be NULL. This
function is never called with in_dev == NULL, but it's exported and may be used
by an external module.

Because this patch restore the ability to call inet_confirm_addr() with in_dev
== NULL, I partially revert the above commit, as suggested by Julian.

CC: Julian Anastasov <ja@ssi.bg>
Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b601fa19

net_sched: expand control flow of macro SKIP_NONLOCAL · 4f8f61eb

由 Yang Yingliang 提交于 12月 11, 2013

SKIP_NONLOCAL hides the control flow. The control flow should be
inlined and expanded explicitly in code so that someone who reads
it can tell the control flow can be changed by the statement.
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4f8f61eb

11 12月, 2013 18 次提交

nfc: Fix FSF address in file headers · 98b32dec

由 Jeff Kirsher 提交于 12月 06, 2013

Several files refer to an old address for the Free Software Foundation
in the file header comment.  Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.

CC: linux-wireless@vger.kernel.org
CC: Lauro Ramos Venancio <lauro.venancio@openbossa.org>
CC: Aloisio Almeida Jr <aloisio.almeida@openbossa.org>
CC: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

98b32dec

rfkill: Fix FSF address in file headers · 8232f1f3

由 Jeff Kirsher 提交于 12月 06, 2013

Several files refer to an old address for the Free Software Foundation
in the file header comment.  Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.

CC: linux-wireless@vger.kernel.org
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

8232f1f3

tipc: remove unused 'blocked' flag from tipc_link struct · 77a7e07a

由 Ying Xue 提交于 12月 10, 2013

In early versions of TIPC it was possible to administratively block
individual links through the use of the member flag 'blocked'. This
functionality was deemed redundant, and since commit 7368dd ("tipc:
clean out all instances of #if 0'd unused code"), this flag has been
unused.

In the current code, a link only needs to be blocked for sending and
reception if it is subject to an ongoing link failover. In that case,
it is sufficient to check if the number of expected failover packets
is non-zero, something which is done via the funtion 'link_blocked()'.

This commit finally removes the redundant 'blocked' flag completely.
Signed-off-by: NYing Xue <ying.xue@windriver.com>
Reviewed-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

77a7e07a

tipc: eliminate code duplication in media layer · e4d050cb

由 Ying Xue 提交于 12月 10, 2013

Currently TIPC supports two L2 media types, Ethernet and Infiniband.
Because both these media are accessed through the common net_device API,
several functions in the two media adaptation files turn out to be
fully or almost identical, leading to unnecessary code duplication.

In this commit we extract this common code from the two media files
and move them to the generic bearer.c. Additionally, we change
the function names to reflect their real role: to access L2 media,
irrespective of type.
Signed-off-by: NYing Xue <ying.xue@windriver.com>
Cc: Patrick McHardy <kaber@trash.net>
Reviewed-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e4d050cb

tipc: relocate common functions from media to bearer · 6e967adf

由 Ying Xue 提交于 12月 10, 2013

Currently, registering a TIPC stack handler in the network device layer
is done twice, once for Ethernet (eth_media) and Infiniband (ib_media)
repectively. But, as this registration is not media specific, we can
avoid some code duplication by moving the registering function to
the generic bearer layer, to the file bearer.c, and call it only once.
The same is true for the network device event notifier.

As a side effect, the two workqueues we are using for for setting up/
cleaning up media can now be eliminated. Furthermore, the array for
storing the specific media type structs, media_array[], can be entirely
deleted.

Note that the eth_started and ib_started flags were removed during the
code relocation.  There is now only one call to bearer_setup and
bearer_cleanup, and these can logically not race against each other.

Despite its size, this cleanup work incurs no functional changes in TIPC.
In particular, it should be noted that the sequence ordering of received
packets is unaffected by this change, since packet reception never was
subject to any work queue handling in the first place.
Signed-off-by: NYing Xue <ying.xue@windriver.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Reviewed-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6e967adf

tipc: remove TIPC usage of field af_packet_priv in struct net_device · 37cb0620

由 Ying Xue 提交于 12月 10, 2013

TIPC is currently using the field 'af_packet_priv' in struct net_device
as a handle to find the bearer instance associated to the given network
device. But, by doing so it is blocking other networking cleanups, such
as the one discussed here:

http://patchwork.ozlabs.org/patch/178044/

This commit removes this usage from TIPC. Instead, we introduce a new
field, 'tipc_ptr', to the net_device structure, to serve this purpose.
When TIPC bearer is enabled, the bearer object is associated to
'tipc_ptr'. When a TIPC packet arrives in the recv_msg() upcall
from a networking device, the bearer object can now be obtained from
'tipc_ptr'. When a bearer is disabled, the bearer object is detached
from its underlying network device by setting 'tipc_ptr' to NULL.

Additionally, an RCU lock is used to protect the new pointer.
Henceforth, the existing tipc_net_lock is used in write mode to
serialize write accesses to this pointer, while the new RCU lock is
applied on the read side to ensure that the pointer is 100% valid
within its wrapped area for all readers.
Signed-off-by: NYing Xue <ying.xue@windriver.com>
Cc: Patrick McHardy <kaber@trash.net>
Reviewed-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

37cb0620

tipc: improve naming and comment consistency in media layer · ef72a7e0

由 Jon Paul Maloy 提交于 12月 10, 2013

struct 'tipc_media' represents the specific info that the media
layer adaptors (eth_media and ib_media) expose to the generic
bearer layer. We clarify this by improved commenting, and by giving
the 'media_list' array the more appropriate name 'media_info_array'.

There are no functional changes in this commit.
Signed-off-by: NYing Xue <ying.xue@windriver.com>
Reviewed-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ef72a7e0

tipc: initiate media type array at compile time · 5702dbab

由 Jon Paul Maloy 提交于 12月 10, 2013

Communication media types are abstracted through the struct 'tipc_media',
one per media type. These structs are allocated statically inside their
respective media file.

Furthermore, in order to be able to reach all instances from a central
location, we keep a static array with pointers to these structs. This
array is currently initialized at runtime, under protection of
tipc_net_lock. However, since the contents of the array itself never
changes after initialization, we can just as well initialize it at
compile time and make it 'const', at the same time making it obvious
that no lock protection is needed here.

This commit makes the array constant and removes the redundant lock
protection.
Signed-off-by: NYing Xue <ying.xue@windriver.com>
Reviewed-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5702dbab

tipc: eliminate redundant code with kfree_skb_list routine · d77b3831

由 Ying Xue 提交于 12月 10, 2013

sk_buff lists are currently relased by looping over the list and
explicitly releasing each buffer.

We replace all occurrences of this loop with a call to kfree_skb_list().
Signed-off-by: NYing Xue <ying.xue@windriver.com>
Reviewed-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d77b3831

net_sched: sfq: put sfq_unlink in a do - while loop · fa08943b

由 Yang Yingliang 提交于 12月 10, 2013

Macros with multiple statements should be enclosed in a do - while loop
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fa08943b

net_sched: add space around '>' and before '(' · 833fa743

由 Yang Yingliang 提交于 12月 10, 2013

Spaces required around that '>' (ctx:VxV) and
before the open parenthesis '('.
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

833fa743

net_sched: change "foo* bar" to "foo *bar" · 82d567c2

由 Yang Yingliang 提交于 12月 10, 2013

"foo* bar" or "foo * bar" should be "foo *bar".
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

82d567c2

net_sched: cls_bpf: use tabs to do indent · 1fab9abc

由 Yang Yingliang 提交于 12月 10, 2013

Code indent should use tabs where possible
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1fab9abc

net_sched: remove unnecessary parentheses while return · 17569fae

由 Yang Yingliang 提交于 12月 10, 2013

return is not a function, parentheses are not required.
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

17569fae

net: handle error more gracefully in socketpair() · d73aa286

由 Yann Droneaud 提交于 12月 09, 2013

This patch makes socketpair() use error paths which do not
rely on heavy-weight call to sys_close(): it's better to try
to push the file descriptor to userspace before installing
the socket file to the file descriptor, so that errors are
catched earlier and being easier to handle.

Using sys_close() seems to be the exception, while writing the
file descriptor before installing it look like it's more or less
the norm: eg. except for code used in init/, error handling
involve fput() and put_unused_fd(), but not sys_close().

This make socketpair() usage of sys_close() quite unusual.
So it deserves to be replaced by the common pattern relying on
fput() and put_unused_fd() just like, for example, the one used
in pipe(2) or recvmsg(2).

Three distinct error paths are still needed since calling
fput() on file structure returned by sock_alloc_file() will
implicitly call sock_release() on the associated socket
structure.

Cc: David S. Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NYann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=1385979146-13825-1-git-send-email-ydroneaud@opteya.comSigned-off-by: NDavid S. Miller <davem@davemloft.net>

d73aa286

net: more spelling fixes · 8e3bff96

由 stephen hemminger 提交于 12月 08, 2013

Various spelling fixes in networking stack
Signed-off-by: NStephen Hemminger <stephen@networkplumber.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8e3bff96

ipv4: add support for IFA_FLAGS nl attribute · ad6c8135

由 Jiri Pirko 提交于 12月 08, 2013

Signed-off-by: NJiri Pirko <jiri@resnulli.us>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ad6c8135

dn_dev: add support for IFA_FLAGS nl attribute · 9a32b860

由 Jiri Pirko 提交于 12月 08, 2013

Signed-off-by: NJiri Pirko <jiri@resnulli.us>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9a32b860