Commits · 4f0087812648b7611157ae22954acfaed820d24e · openanolis / cloud-kernel

06 Jan, 2016 2 commits

sctp: apply rhashtable api to send/recv path · 4f008781

Committed by Xin Long on Dec 30, 2015

Apply the lookup APIs to two functions. For __sctp_endpoint_lookup_assoc
and __sctp_lookup_association, which are invoked under the protection of
the sock lock, this is safe; sctp_lookup_association, however, needs to
call rcu_read_lock() and check t->dead to protect the lookup.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: add the rhashtable apis for sctp global transport hashtable · d6c0256a

Committed by Xin Long on Dec 30, 2015

The transport hashtable will replace the association hashtable for
transport lookup; the association is then obtained via t->assoc. The
rhashtable APIs are used because rhashtable is resizable, scalable, and
uses RCU.

lport + rport + paddr will be the base hashkey to locate the chain,
with net to protect one netns from another, then plus the laddr to
compare to get the target.
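
The key layout described above can be pictured with a small user-space sketch. It is not the kernel code: the demo_* names, the IPv4-only addresses, and the FNV-1a style mixing function are all assumptions made for illustration. The chain is chosen from lport + rport + paddr mixed with a namespace identifier, and laddr is only compared once a candidate in the chain is found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a transport entry (IPv4 only). */
struct demo_transport {
    uint32_t net_id;   /* stands in for the network namespace */
    uint16_t lport;    /* local port */
    uint16_t rport;    /* remote (peer) port */
    uint32_t paddr;    /* peer address */
    uint32_t laddr;    /* local address: compared, but not hashed */
};

/* FNV-1a step over one 32-bit word; an arbitrary mixing choice. */
static uint32_t mix32(uint32_t h, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;
    }
    return h;
}

/* Base hash key: lport + rport + paddr, mixed with the namespace id. */
uint32_t demo_hash(uint32_t net_id, uint16_t lport, uint16_t rport,
                   uint32_t paddr)
{
    uint32_t h = 2166136261u;

    h = mix32(h, net_id);
    h = mix32(h, ((uint32_t)lport << 16) | rport);
    h = mix32(h, paddr);
    return h;
}

/* Full comparison: everything that was hashed, plus laddr. */
static bool demo_match(const struct demo_transport *t, uint32_t net_id,
                       uint16_t lport, uint16_t rport,
                       uint32_t paddr, uint32_t laddr)
{
    return t->net_id == net_id && t->lport == lport && t->rport == rport &&
           t->paddr == paddr && t->laddr == laddr;
}

/* Linear scan standing in for a walk of one hash chain. */
const struct demo_transport *
demo_lookup(const struct demo_transport *tbl, size_t n, uint32_t net_id,
            uint16_t lport, uint16_t rport, uint32_t paddr, uint32_t laddr)
{
    assert(tbl != NULL || n == 0);
    uint32_t want = demo_hash(net_id, lport, rport, paddr);

    for (size_t i = 0; i < n; i++) {
        const struct demo_transport *t = &tbl[i];

        if (demo_hash(t->net_id, t->lport, t->rport, t->paddr) != want)
            continue; /* entry would live in a different chain */
        if (demo_match(t, net_id, lport, rport, paddr, laddr))
            return t;
    }
    return NULL;
}
```

Two entries that differ only in laddr land in the same chain in this sketch and are told apart by the final comparison, which mirrors the "then plus the laddr to compare" step.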

This patch provides the lookup functions:
- sctp_epaddr_lookup_transport
- sctp_addrs_lookup_transport

hash/unhash functions:
- sctp_hash_transport
- sctp_unhash_transport

init/destroy functions:
- sctp_transport_hashtable_init
- sctp_transport_hashtable_destroy
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Jan, 2016 32 commits

Merge branch 'faster-soreuseport' · 6a5ef90c

Committed by David S. Miller on Jan 04, 2016

Craig Gallek says:

====================
Faster SO_REUSEPORT

This series contains two optimizations for the SO_REUSEPORT feature:
Faster lookup when selecting a socket for an incoming packet and
the ability to select the socket from the group using a BPF program.

This series only includes the UDP path.  I plan to submit a follow-up
including the TCP path if the implementation in this series is
acceptable.

Changes in v4:
- pskb_may_pull is unnecessary with pskb_pull (per Alexei Starovoitov)

Changes in v3:
- skb_pull_inline -> pskb_pull (per Alexei Starovoitov)
- reuseport_attach* -> sk_reuseport_attach* and simple return statement
  syntax change (per Daniel Borkmann)

Changes in v2:
- Fix ARM build; remove unnecessary include.
- Handle case where protocol header is not in linear section (per
  Alexei Starovoitov).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

soreuseport: BPF selection functional test · 3ca8e402

Committed by Craig Gallek on Jan 04, 2016

This program will build classic and extended BPF programs and
validate the socket selection logic when used with
SO_ATTACH_REUSEPORT_CBPF and SO_ATTACH_REUSEPORT_EBPF.

It also validates the re-programming flow and several edge cases.
Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF · 538950a1

Committed by Craig Gallek on Jan 04, 2016

Expose socket options for setting a classic or extended BPF program
for use when selecting sockets in an SO_REUSEPORT group.  These options
can be used on the first socket to belong to a group before bind or
on any socket in the group after bind.

This change includes refactoring of the existing sk_filter code to
allow reuse of the existing BPF filter validation checks.
Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

soreuseport: fast reuseport UDP socket selection · e32ea7e7

Committed by Craig Gallek on Jan 04, 2016

Include a struct sock_reuseport instance when a UDP socket binds to
a specific address for the first time with the reuseport flag set.
When selecting a socket for an incoming UDP packet, use the information
available in sock_reuseport if present.

This required adding an additional field to the UDP source address
equality function to differentiate between exact and wildcard matches.
The original use case allowed wildcard matches when checking for
existing port uses during bind.  The new use case of adding a socket
to a reuseport group requires exact address matching.
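
The difference between the two matching modes can be sketched in a few lines. This is a simplified IPv4-only illustration with hypothetical names, not the kernel's equality function: an address of 0 stands for the wildcard, and the extra boolean plays the role of the new field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ADDR_ANY 0u /* stands in for INADDR_ANY */

/*
 * Return true if two bound receive addresses should be treated as equal.
 * match_wildcard = true:  bind-conflict check, a wildcard matches anything.
 * match_wildcard = false: reuseport grouping, addresses must be identical.
 */
bool demo_rcv_saddr_equal(uint32_t a, uint32_t b, bool match_wildcard)
{
    if (a == b)
        return true;
    if (match_wildcard)
        return a == DEMO_ADDR_ANY || b == DEMO_ADDR_ANY;
    return false;
}

/* Usage: the two modes disagree only when exactly one side is a wildcard. */
void demo_rcv_saddr_check(void)
{
    assert(demo_rcv_saddr_equal(DEMO_ADDR_ANY, 0x7f000001u, true));
    assert(!demo_rcv_saddr_equal(DEMO_ADDR_ANY, 0x7f000001u, false));
}
```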

Performance test (using a machine with 2 CPU sockets and a total of
48 cores):  Create reuseport groups of varying size.  Use one socket
from this group per user thread (pinning each thread to a different
core) calling recvmmsg in a tight loop.  Record number of messages
received per second while saturating a 10G link.
  10 sockets: 18% increase (~2.8M -> 3.3M pkts/s)
  20 sockets: 14% increase (~2.9M -> 3.3M pkts/s)
  40 sockets: 13% increase (~3.0M -> 3.4M pkts/s)

This work is based off a similar implementation written by
Ying Cai <ycai@google.com> for implementing policy-based reuseport
selection.
Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

soreuseport: define reuseport groups · ef456144

Committed by Craig Gallek on Jan 04, 2016

struct sock_reuseport is an optional shared structure referenced by each
socket belonging to a reuseport group.  When a socket is bound to an
address/port not yet in use and the reuseport flag has been set, the
structure will be allocated and attached to the newly bound socket.
When subsequent calls to bind are made for the same address/port, the
shared structure will be updated to include the new socket and the
newly bound socket will reference the group structure.

Usually, when an incoming packet was destined for a reuseport group,
all sockets in the same group needed to be considered before a
dispatching decision was made.  With this structure, an appropriate
socket can be found after looking up just one socket in the group.
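
The idea can be sketched in user space as follows. The struct layout and names are hypothetical and much simpler than the kernel's struct sock_reuseport; the point is only that once any member of the group has been found, the receiving socket is picked by indexing a shared array with a scaled flow hash instead of examining every socket. Scaling a 32-bit hash into [0, n) by multiply-and-shift is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_SOCKS 16

/* Hypothetical shared group record referenced by every member socket. */
struct demo_reuseport_group {
    unsigned int num_socks;
    int socks[DEMO_MAX_SOCKS]; /* file descriptors stand in for sockets */
};

/* Add a newly bound socket to the group; returns 0, or -1 if full. */
int demo_group_add(struct demo_reuseport_group *g, int fd)
{
    assert(g != NULL);
    if (g->num_socks >= DEMO_MAX_SOCKS)
        return -1;
    g->socks[g->num_socks++] = fd;
    return 0;
}

/* Map a 32-bit hash onto [0, n) without a division. */
static uint32_t demo_scale(uint32_t hash, uint32_t n)
{
    return (uint32_t)(((uint64_t)hash * n) >> 32);
}

/* Pick the receiving socket in O(1) from the flow hash; -1 if empty. */
int demo_group_select(const struct demo_reuseport_group *g, uint32_t hash)
{
    assert(g != NULL);
    if (g->num_socks == 0)
        return -1;
    return g->socks[demo_scale(hash, g->num_socks)];
}
```

The cost of demo_group_select() does not depend on the number of sockets in the group, which is the property the commit message relies on.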

This shared structure will also allow for more complicated decisions to
be made when selecting a socket (e.g. a BPF filter).

This work is based off a similar implementation written by
Ying Cai <ycai@google.com> for implementing policy-based reuseport
selection.
Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mlxsw-fixes' · ebb3cf41

Committed by David S. Miller on Jan 04, 2016

Jiri Pirko says:

====================
mlxsw: couple of fixes

Couple of fixes from Ido.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Change bridge port attributes only when bridged · 6c72a3d0

Committed by Ido Schimmel on Jan 04, 2016

Bridge port attributes are offloaded to hardware when invoked with SELF
flag set, but it really makes no sense to reflect them when port is not
bridged.

Allow a user to change these attributes only when the port is bridged and
initialize them correctly when joining or leaving a bridge.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Set bridge status in appropriate functions · 5a8f4525

Committed by Ido Schimmel on Jan 04, 2016

Set the bridge status of physical ports in the appropriate functions, to
be consistent with LAG join/leave and vPorts joining/leaving bridge.

Also, remove the error messages in these two functions, as we already
emit errors in both the single functions they call.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Return NOTIFY_BAD on bridge failure · 78124078

Committed by Ido Schimmel on Jan 04, 2016

It is possible for us to fail when joining or leaving a bridge, so let
the user know about that by returning NOTIFY_BAD, as already done for
LAG join/leave and 802.1D bridges.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Initialize PVID only once · 7b31abe7

Committed by Ido Schimmel on Jan 04, 2016

We set PVID to 1 in mlxsw_sp_port_vlan_init(), so we can remove this
statement.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

chelsio: constify cphy_ops structures · 46f85a92

Committed by Julia Lawall on Jan 03, 2016

The cphy_ops structures are never modified, so declare them as const.
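
As a generic illustration of the pattern (the struct and function names below are hypothetical and unrelated to the chelsio driver), a table of function pointers that is never written after initialization can be declared const, which lets the compiler reject accidental writes and allows the table to be placed in read-only data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operations table, loosely modelled on a PHY ops struct. */
struct demo_phy_ops {
    int (*reset)(int phy_id);
    int (*get_link)(int phy_id);
};

static int demo_reset(int phy_id)    { return phy_id >= 0 ? 0 : -1; }
static int demo_get_link(int phy_id) { return phy_id == 0; }

/* const: the table is fixed at build time and never modified. */
static const struct demo_phy_ops demo_ops = {
    .reset    = demo_reset,
    .get_link = demo_get_link,
};

/* Callers only ever need a pointer to const. */
int demo_phy_bring_up(const struct demo_phy_ops *ops, int phy_id)
{
    assert(ops != NULL && ops->reset != NULL && ops->get_link != NULL);
    if (ops->reset(phy_id) != 0)
        return -1;
    return ops->get_link(phy_id);
}
```

An assignment such as demo_ops.reset = NULL; no longer compiles once the table is const, which is the kind of guarantee this change adds.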

Done with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

fsl/fman: allow modular build · 46678612

Committed by Arnd Bergmann on Jan 01, 2016

ARM allmodconfig fails because of the addition of the FMAN driver:

drivers/built-in.o: In function `dtsec_restart_autoneg':
binder.c:(.text+0x173328): undefined reference to `mdiobus_read'
binder.c:(.text+0x173348): undefined reference to `mdiobus_write'
drivers/built-in.o: In function `dtsec_config':
binder.c:(.text+0x173d24): undefined reference to `of_phy_find_device'
drivers/built-in.o: In function `init_phy':
binder.c:(.text+0x1763b0): undefined reference to `of_phy_connect'
drivers/built-in.o: In function `stop':
binder.c:(.text+0x176014): undefined reference to `phy_stop'
drivers/built-in.o: In function `start':
binder.c:(.text+0x176078): undefined reference to `phy_start'

The reason is that the driver uses PHYLIB, but that is a loadable
module here, and fman itself is built-in.

This patch makes it possible to configure fman as a module as well
so we don't change the status of PHYLIB in an allmodconfig kernel,
and it adds a 'select PHYLIB' statement to ensure that phylib is
always built-in when fman is.

The driver uses "builtin_platform_driver(fman_driver);", which means
it cannot be unloaded, but it's still possible to have it as a loadable
module that gets loaded once and never removed.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 5adae51a ("fsl/fman: Add FMan MURAM support")
Signed-off-by: David S. Miller <davem@davemloft.net>

net: make ip6tunnel_xmit definition conditional · 0efeff29

Committed by Arnd Bergmann on Jan 01, 2016

Moving the caller of iptunnel_xmit_stats causes a build error in
randconfig builds that disable CONFIG_INET:

In file included from ../net/xfrm/xfrm_input.c:17:0:
../include/net/ip6_tunnel.h: In function 'ip6tunnel_xmit':
../include/net/ip6_tunnel.h:93:2: error: implicit declaration of function 'iptunnel_xmit_stats' [-Werror=implicit-function-declaration]
  iptunnel_xmit_stats(dev, pkt_len);

The reason is that the iptunnel_xmit_stats definition is hidden
inside #ifdef CONFIG_INET but the caller is not. We can change
one or the other to fix it, and this patch adds a second #ifdef
around ip6tunnel_xmit() to avoid seeing the invalid call.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 039f5062 ("ip_tunnel: Move stats update to iptunnel_xmit()")
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'nfc-next-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next · 15ab90f4

Committed by David S. Miller on Jan 04, 2016

Samuel Ortiz says:

====================
NFC 4.5 pull request

This is the first NFC pull request for 4.5 and it brings:

- A new driver for the STMicroelectronics ST95HF NFC chipset.
  The ST95HF is an NFC digital transceiver with an embedded analog
  front-end and as such relies on the Linux NFC digital
  implementation. This is the 3rd user of the NFC digital stack.

- ACPI support for the ST st-nci and st21nfca drivers.

- A small improvement for the nfcsim driver, as we can now tune
  the Rx delay through sysfs.

- A bunch of minor cleanups and small fixes from Christophe Ricard,
  for a few drivers and the NFC core code.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

udp: properly support MSG_PEEK with truncated buffers · 197c949e

Committed by Eric Dumazet on Dec 30, 2015

The backport of the upstream commit
89c22d8c ("net: Fix skb csum races when peeking")
into stable kernels exposed a bug in the UDP stack's MSG_PEEK support
when the user provides a buffer smaller than the skb payload.

In this case,
skb_copy_and_csum_datagram_iovec(skb, sizeof(struct udphdr),
                                 msg->msg_iov);
returns -EFAULT.

This bug does not happen in upstream kernels, since Al Viro did a great
job of replacing this with:
skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr), msg);
This variant is safe vs short buffers.

For the time being, instead of reverting Herbert Xu's patch and adding
back the invalid skb->ip_summed changes, simply store the result of
udp_lib_checksum_complete() so that we avoid computing the checksum a
second time, and avoid the problematic
skb_copy_and_csum_datagram_iovec() call.

This patch can be applied on recent kernels as it avoids a double
checksumming, then backported to stable kernels as a bug fix.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'r8169-hw-programming-typo-fixes' · 815bc580

Committed by David S. Miller on Jan 04, 2016

Chunhao Lin says:

====================
Fix some typos in setting hardware parameter

The typos are in setting RTL8168DP, RTL8168EP and RTL8168H hardware parameters.
This patch series fixes these typos.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169:Correct the way of setting RTL8168DP ephy · 1016a4a1

Committed by Chun-Hao Lin on Dec 29, 2015

The original way is wrong: it always writes ephy reg 0x03.
Signed-off-by: Chunhao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169:Fix typo in setting RTL8168H PHY PFM mode. · c832c35f

Committed by Chun-Hao Lin on Dec 29, 2015

The PHY PFM register is in PHY page 0x0a44 register 0x11, not 0x14.
Signed-off-by: Chunhao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169:Fix typo in setting RTL8168EP and RTL8168H D3cold PFM mode · 69f3dc37

Committed by Chun-Hao Lin on Dec 29, 2015

The register for setting D3cold PFM mode is MISC_1, not DLLPR.
Signed-off-by: Chunhao Lin <hau@realtek.com>
Reviewed-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

l2tp: rely on ppp layer for skb scrubbing · 98f40b3e

Committed by Guillaume Nault on Dec 29, 2015

Since 79c441ae ("ppp: implement x-netns support"), the PPP layer
calls skb_scrub_packet() whenever the skb is received on the PPP
device. Manually resetting packet meta-data in the L2TP layer is thus
redundant.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'sh_eth-remove-BE-desc-support' · 04c67a90

Committed by David S. Miller on Jan 04, 2016

Sergei Shtylyov says:

====================
sh_eth: remove unused BE descriptor support

Here's a set of 2 patches against DaveM's 'net-next.git' repo plus the
fix for the 16-bit descriptor endianness recently merged to the 'net.git'
repo. We get rid of ~30 LoCs and ~300 bytes of code.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: get rid of {cpu|edmac}_to_{edmac|cpu}() · 7cf72477

Committed by Sergei Shtylyov on Dec 28, 2015

Now that {cpu|edmac}_to_{edmac|cpu}() functions boiled down to the mere
{cpu|le32}_to_{le32|cpu}() calls, there's no need for these functions
anymore, so just get rid of them.
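
In user-space terms, the same idea can be sketched with the glibc helpers htole32() and le32toh() from <endian.h>, which play the role of the kernel's cpu_to_le32() and le32_to_cpu(). The descriptor layout and the demo_* names are hypothetical; the sketch only shows that once descriptors are always little-endian, a wrapper around the conversion adds nothing over calling it directly.

```c
#define _DEFAULT_SOURCE /* for htole32()/le32toh() on glibc */
#include <assert.h>
#include <endian.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical DMA descriptor whose fields are stored little-endian. */
struct demo_desc {
    uint32_t status; /* little-endian in memory */
    uint32_t addr;   /* little-endian in memory */
};

/* Write a descriptor from CPU-order values. */
void demo_desc_fill(struct demo_desc *d, uint32_t status, uint32_t addr)
{
    assert(d != NULL);
    d->status = htole32(status);
    d->addr = htole32(addr);
}

/* Read the status back in CPU order. */
uint32_t demo_desc_status(const struct demo_desc *d)
{
    assert(d != NULL);
    return le32toh(d->status);
}

/* Return the first byte in memory of the status field. */
uint8_t demo_desc_first_byte(const struct demo_desc *d)
{
    uint8_t b;

    assert(d != NULL);
    memcpy(&b, &d->status, 1);
    return b;
}
```

Regardless of host byte order, the least significant byte of the status value is the first byte in memory, which is what a device expecting little-endian descriptors would read.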
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: remove EDMAC_BIG_ENDIAN · 888cc8c2

Committed by Sergei Shtylyov on Dec 28, 2015

Commit 71557a37 ("[netdrvr] sh_eth: Add SH7619 support") added support
for the big-endian EDMAC descriptors. However, it was never used and never
worked right until the recent driver fixes. I think we now can just remove
this support, it was only burdening the driver from the start. It should be
easy to do without disturbing the SH platform code, at least for now...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

tilepro: use to_delayed_work · 6e898bfd

Committed by Geliang Tang on Jan 01, 2016

Use to_delayed_work() instead of open-coding it.
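
For readers unfamiliar with the helper, to_delayed_work() recovers the enclosing struct delayed_work from a pointer to its embedded work_struct, which is a container_of() pattern. The following user-space sketch uses hypothetical demo_* types in place of the kernel structures to show the open-coded form next to the helper form.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical miniature versions of work_struct and delayed_work. */
struct demo_work {
    int pending;
};

struct demo_delayed_work {
    long delay;
    struct demo_work work; /* embedded member handed to callbacks */
};

/* Helper in the spirit of to_delayed_work(). */
static inline struct demo_delayed_work *demo_to_delayed_work(struct demo_work *w)
{
    return demo_container_of(w, struct demo_delayed_work, work);
}

/* A callback receives only the embedded work pointer. */
long demo_work_handler(struct demo_work *w)
{
    assert(w != NULL);
    /* Open-coded form: demo_container_of(w, struct demo_delayed_work, work) */
    struct demo_delayed_work *dw = demo_to_delayed_work(w);

    return dw->delay;
}
```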
Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'bnxt_en-combined-rx-tx-channels' · 48b874cc

Committed by David S. Miller on Jan 04, 2016

Michael Chan says:

====================
bnxt_en: Support combined and rx/tx channels.

The bnxt hardware uses a completion ring for rx and tx events.  The driver
has to process the completion ring entries sequentially for the events.
The current code only supports an rx/tx ring pair for each completion ring.
This patch series adds support for using a dedicated completion ring for
rx only or tx only as an option configurable using ethtool -L.

The benefits for using dedicated completion rings are:

1. A burst of rx packets can cause delay in processing tx events if the
completion ring is shared.  If tx queue is stopped by BQL, this can cause
delay in re-starting the tx queue.

2. A completion ring is sized according to the rx and tx ring size rounded
up to the nearest power of 2.  When the completion ring is shared, it is
sized by adding the rx and tx ring sizes and then rounded to the next power
of 2, often with a lot of wasted space.

3. Using dedicated completion rings, we can adjust the coalescing
parameters independently for rx and tx.
====================
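
The sizing argument in point 2 of the cover letter is simple arithmetic and can be checked with a short sketch. The helper names and the ring sizes used in the note below are hypothetical, and the sketch assumes one completion entry per rx or tx entry, which the cover letter does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Round n up to the next power of two (1 <= n <= 2^31). */
uint32_t demo_roundup_pow2(uint32_t n)
{
    uint32_t p = 1;

    assert(n >= 1 && n <= 0x80000000u);
    while (p < n)
        p <<= 1;
    return p;
}

/* Shared completion ring: sized for rx + tx together. */
uint32_t demo_shared_cp_size(uint32_t rx, uint32_t tx)
{
    return demo_roundup_pow2(rx + tx);
}

/* Dedicated completion rings: one per direction, each rounded separately. */
uint32_t demo_dedicated_cp_size(uint32_t rx, uint32_t tx)
{
    return demo_roundup_pow2(rx) + demo_roundup_pow2(tx);
}
```

With hypothetical ring sizes rx = 1024 and tx = 256, the shared ring is rounded up to 2048 entries to hold 1280, while two dedicated rings need 1024 + 256 = 1280 entries in total.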
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Modify ethtool -l|-L to support combined or rx/tx rings. · 068c9ec6