提交 · 6f70eb7552abc73584f43b701d71c1c377887af0 · openanolis / cloud-kernel

07 6月, 2015 8 次提交

batman-adv: split name from variable for uint mesh attributes · 6f70eb75

由 Antonio Quartulli 提交于 3月 01, 2015

Some mesh attributes are behind substructs in the
batadv_priv object and for this reason the name cannot be
used anymore to refer to them.

This patch allows to specify the variable name where the
attribute is stored inside batadv_priv instead of using the
name
Signed-off-by: NAntonio Quartulli <antonio@open-mesh.com>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>

6f70eb75

batman-adv: Use common Jenkins Hash implementation · 36fd61cb

由 Sven Eckelmann 提交于 3月 01, 2015

An unoptimized version of the Jenkins one-at-a-time hash function is used
and partially copied all over the code wherever an hashtable is used.
Instead the optimized version shared between the whole kernel should be
used to reduce code duplication and use better optimized code.

Only the DAT code must use the old implementation because it is used as
distributed hash function which has to be common for all nodes.
Signed-off-by: NSven Eckelmann <sven@narfation.org>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>

36fd61cb

bpf: allow programs to write to certain skb fields · d691f9e8

由 Alexei Starovoitov 提交于 6月 04, 2015

allow programs read/write skb->mark, tc_index fields and
((struct qdisc_skb_cb *)cb)->data.

mark and tc_index are generically useful in TC.
cb[0]-cb[4] are primarily used to pass arguments from one
program to another called via bpf_tail_call() which can
be seen in sockex3_kern.c example.

All fields of 'struct __sk_buff' are readable to socket and tc_cls_act progs.
mark, tc_index are writeable from tc_cls_act only.
cb[0]-cb[4] are writeable by both sockets and tc_cls_act.

Add verifier tests and improve sample code.
Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d691f9e8

bpf: make programs see skb->data == L2 for ingress and egress · 3431205e

由 Alexei Starovoitov 提交于 6月 04, 2015

eBPF programs attached to ingress and egress qdiscs see inconsistent skb->data.
For ingress L2 header is already pulled, whereas for egress it's present.
This is known to program writers which are currently forced to use
BPF_LL_OFF workaround.
Since programs don't change skb internal pointers it is safe to do
pull/push right around invocation of the program and earlier taps and
later pt->func() will not be affected.
Multiple taps via packet_rcv(), tpacket_rcv() are doing the same trick
around run_filter/BPF_PROG_RUN even if skb_shared.

This fix finally allows programs to use optimized LD_ABS/IND instructions
without BPF_LL_OFF for higher performance.
tc ingress + cls_bpf + samples/bpf/tcbpf1_kern.o
       w/o JIT   w/JIT
before  20.5     23.6 Mpps
after   21.8     26.6 Mpps

Old programs with BPF_LL_OFF will still work as-is.

We can now undo most of the earlier workaround commit:
a166151c ("bpf: fix bpf helpers to use skb->mac_header relative offsets")
Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3431205e

tcp: remove redundant checks II · 98da81a4

由 Eric Dumazet 提交于 6月 04, 2015

For same reasons than in commit 12e25e10 ("tcp: remove redundant
checks"), we can remove redundant checks done for timewait sockets.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

98da81a4

fddi: print an address with %p format specifier rather than %x · 908e80d6

由 Colin Ian King 提交于 6月 05, 2015

The debug is printing the struct smt_header * address using
the %x format specifier. Fix it to use %p instead.
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

908e80d6

wan: dscc4: fix build warning Wunused-but-set-variable · c5726d26

由 Nicholas Mc Guire 提交于 6月 06, 2015

Fix:
drivers/net/wan/dscc4.c: In function 'dscc4_open':
drivers/net/wan/dscc4.c:1049:25: warning: variable 'ppriv' set but not used
[-Wunused-but-set-variable]

This has been in there unused since 1da177e4 (Linux-2.6.12-rc2) simply
remove it.
Signed-off-by: NNicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c5726d26

inet: add IP_BIND_ADDRESS_NO_PORT to overcome bind(0) limitations · 90c337da

由 Eric Dumazet 提交于 6月 06, 2015

When an application needs to force a source IP on an active TCP socket
it has to use bind(IP, port=x).

As most applications do not want to deal with already used ports, x is
often set to 0, meaning the kernel is in charge to find an available
port.
But kernel does not know yet if this socket is going to be a listener or
be connected.
It has very limited choices (no full knowledge of final 4-tuple for a
connect())

With limited ephemeral port range (about 32K ports), it is very easy to
fill the space.

This patch adds a new SOL_IP socket option, asking kernel to ignore
the 0 port provided by application in bind(IP, port=0) and only
remember the given IP address.

The port will be automatically chosen at connect() time, in a way
that allows sharing a source port as long as the 4-tuples are unique.

This new feature is available for both IPv4 and IPv6 (Thanks Neal)

Tested:

Wrote a test program and checked its behavior on IPv4 and IPv6.

strace(1) shows sequences of bind(IP=127.0.0.2, port=0) followed by
connect().
Also getsockname() show that the port is still 0 right after bind()
but properly allocated after connect().

socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 5
setsockopt(5, SOL_IP, IP_BIND_ADDRESS_NO_PORT, [1], 4) = 0
bind(5, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.2")}, 16) = 0
getsockname(5, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.2")}, [16]) = 0
connect(5, {sa_family=AF_INET, sin_port=htons(53174), sin_addr=inet_addr("127.0.0.3")}, 16) = 0
getsockname(5, {sa_family=AF_INET, sin_port=htons(38050), sin_addr=inet_addr("127.0.0.2")}, [16]) = 0

IPv6 test :

socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = 7
setsockopt(7, SOL_IP, IP_BIND_ADDRESS_NO_PORT, [1], 4) = 0
bind(7, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
getsockname(7, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
connect(7, {sa_family=AF_INET6, sin6_port=htons(57300), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
getsockname(7, {sa_family=AF_INET6, sin6_port=htons(60964), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0

I was able to bind()/connect() a million concurrent IPv4 sockets,
instead of ~32000 before patch.

lpaa23:~# ulimit -n 1000010
lpaa23:~# ./bind --connect --num-flows=1000000 &
1000000 sockets

lpaa23:~# grep TCP /proc/net/sockstat
TCP: inuse 2000063 orphan 0 tw 47 alloc 2000157 mem 66

Check that a given source port is indeed used by many different
connections :

lpaa23:~# ss -t src :40000 | head -10
State      Recv-Q Send-Q   Local Address:Port          Peer Address:Port
ESTAB      0      0           127.0.0.2:40000         127.0.202.33:44983
ESTAB      0      0           127.0.0.2:40000         127.2.27.240:44983
ESTAB      0      0           127.0.0.2:40000           127.2.98.5:44983
ESTAB      0      0           127.0.0.2:40000        127.0.124.196:44983
ESTAB      0      0           127.0.0.2:40000         127.2.139.38:44983
ESTAB      0      0           127.0.0.2:40000          127.1.59.80:44983
ESTAB      0      0           127.0.0.2:40000          127.3.6.228:44983
ESTAB      0      0           127.0.0.2:40000          127.0.38.53:44983
ESTAB      0      0           127.0.0.2:40000         127.1.197.10:44983
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

90c337da

06 6月, 2015 8 次提交

cxgb4: Fix static checker warning · 513d1a1d

由 Hariprasad Shenai 提交于 6月 05, 2015

The patch e85c9a7a: ("cxgb4/cxgb4vf: Add code to calculate T5 BAR2
Offsets for SGE Queue Registers") from Dec 3, 2014, leads to the
following static checker warning:

        drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:5358
	t4_bar2_sge_qregs()
        warn: should '(qid >> qpp_shift) << page_shift' be a 64 bit type?

This patch fixes it
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

513d1a1d

Merge branch 'cxgb4-next' · 4da44fdf

由 David S. Miller 提交于 6月 05, 2015

Hariprasad Shenai says:

====================
Free VI, flush sge ec and some other misc. fixes

This patch series adds the following.
Free VI interface during remove, flush SGE ec routine, rename
t4_link_start to t4_link_l1cfg since it only does l1 configuration, set
mac addr from when we can't contact firmware for debug purpose, set pcie
completion timeout and use fw interface to access TP_PIO_XXX registers

This patch series has been created against net-next tree and includes
patches on cxgb4 driver.

We have included all the maintainers of respective drivers. Kindly review
the change and let us know in case of any review comments.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4da44fdf

cxgb4: Use FW LDST cmd to access TP_PIO_{ADDR, DATA} register first · c1e9af0c

由 Hariprasad Shenai 提交于 6月 05, 2015

The TP_PIO_{ADDR,DATA} registers are are in conflict with the firmware's
use of these registers. Added a routine to access it through FW LDST
cmd.
Access all TP_PIO_{ADDR,DATA} register access through new routine if FW
is alive. If firmware is dead, than fall back to indirect access.
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c1e9af0c

cxgb4: program pci completion timeout · eca0f6ee

由 Hariprasad Shenai 提交于 6月 05, 2015

Set pci completion timeout to 0xd.
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

eca0f6ee

cxgb4: Set mac addr from vpd, when we can't contact firmware · 098ef6c2

由 Hariprasad Shenai 提交于 6月 05, 2015

Grab the Adapter MAC Address out of the VPD and use it for the "debug"
network interface when either we can't contact the firmware
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

098ef6c2

cxgb4: Rename t4_link_start() to t4_link_l1cfg · 4036da90

由 Hariprasad Shenai 提交于 6月 05, 2015

t4_link_start() was completely misnamed. It does _not_ start up the
link. It merely does the L1 Configuration for the link. The Link Up
process is started automatically by the firmware when the number of
enabled Virtual Interfaces on a port goes from 0 to 1. So renaming
this routine to t4_link_l1cfg() for better documentation.
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4036da90

cxgb4: Add sge ec context flush service · 5d700ecb

由 Hariprasad Shenai 提交于 6月 05, 2015

Add function to flush the sge ec context cache, and utilize
this new function in the driver
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5d700ecb

cxgb4: Free Virtual Interfaces in remove routine · 4f3a0fcf

由 Hariprasad Shenai 提交于 6月 05, 2015

Free VI interfaces in remove routine. If we don't do this then the
firmware will never drop the physical link to the peer.
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4f3a0fcf

05 6月, 2015 24 次提交

Merge branch 'mlx5-next' · bb62f791

由 David S. Miller 提交于 6月 04, 2015

Or Gerlitz says:

====================
mlx5: Add Interface Step Sequence ID support

ISSI (Interface Step Sequence ID) defines the step sequence ID of the
interface between the driver to the firmware and is incremented by
steps of one. ISSI is used to enable deprecating/modifying features,
command interfaces and such, while maintaining compatibility.

As the driver serves both ConnectIB (CIB) and ConnectX4, we carefully
made sure that the IB functionality keeps running also on older CIB
firmware releases that don't support ISSI.

The Ethernet functionailty is available only on ConnectX4 where all
firmware releases support the feature since the very basic ISSI level.
So at this point no need for compatility code there.

As done prior to this series, when the Ethernet functionlity is enabled,
during the initialization flow, the core driver performs a query of the
supported ISSIs using the QUERY_ISSI command, and then, if ISSI is supported,
sets the actual issi value informing the firmware on which ISSI level to run,
using SET_ISSI command.

Previously, the IB driver wasn't ready to work on that mode, and hence
building both the IB driver and the Ethernet functionality in the core
driver were disallowed by Kconfigs, with this series, we allow users to
enable them both.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bb62f791

mlx5: Enable mutual support for IB and Ethernet · 4aa17b28

由 Haggai Abramonvsky 提交于 6月 04, 2015

Ethernet functionality is only available when working in ISSI > 0 mode.

Previously, the IB driver wasn't ready to work on that mode, and hence
building both the IB driver and the Ethernet functionality in the core
driver were disallowed by Kconfigs.

Now, once we have all the pre-steps in place, we can remove this limitation.

The last steps in the IB driver for getting that setup to work are:
create dummy SRQ for the driver's use (until now we could use XRC_SRQ
as SRQ and XRC_SRQ, after moving to ISSI > 0, we separate XRC SRQs from
basic SRQs) and adapt the create QP function to be compatible with ISSI > 0.
Signed-off-by: NHaggai Abramovsky <hagaya@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4aa17b28

IB/mlx5: Don't create IB instance over Ethernet ports · 647241ea