Commits · 3268e5cb494d8778a5a67a9fa2b1bdb0243b77ad · openeuler / raspberrypi-kernel

18 Dec, 2015 12 commits

team: Advertise tunneling offload features · 3268e5cb

Committed by Eran Ben Elisha on Dec 17, 2015

When the underlying device supports offloading encapsulated traffic,
we need to reflect that through the hw_enc_features field of the
team net-device.

This will cause the xmit path in the core networking stack to provide
team with encapsulated GSO frames to offload into the HW etc.

Using this over Mellanox ConnectX3-pro (mlx4 driver) card that supports
VXLAN offloads we got 36.0 Gbits/sec using eight iperf streams.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3268e5cb

net: qmi_wwan: ignore bogus CDC Union descriptors · 34a55d5e

Committed by Bjørn Mork on Dec 17, 2015

The CDC descriptors found on these vendor specific functions should
not be considered authoritative.  They seem to be ignored by drivers
for other systems, and the quality is therefore low.

One device (1e0e:9001) has been reported to have such a bogus union
descriptor on the QMI function, making it fail probing even if the
device id was dynamically added.  The report was not complete enough
to allow adding a device entry for this modem. But this should at
least fix the dynamic id probing problem.

Reported-by: Kanerva Topi <Topi.Kanerva@cinia.fi>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

34a55d5e

net/macb: Update device tree binding for resetting PHY using GPIO · 270c499f

Committed by Gregory CLEMENT on Dec 17, 2015

Instead of being at the MAC level, the reset gpio property is moved to the
PHY child node level. It is still managed by the MAC, but from the point
of view of the binding it makes more sense to be part of the PHY node.

This commit also fixes a build error if GPIOLIB is not selected.

Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

270c499f

Merge branch 'cxgb4-l2-table-enhancements' · 08f411d6

Committed by David S. Miller on Dec 17, 2015

Hariprasad Shenai says:

====================
Few l2 table related enhancements for cxgb4

This series adds a new API to allocate and update an l2t entry and replaces
arpq_head/arpq_tail with an skb doubly linked list. Use t4_mgmt_tx()
to send control packets of l2t write request. Use symbolic constants
while calculating vlan priority.

This patch series has been created against net-next tree and includes
patches on cxgb4 driver.

We have included all the maintainers of respective drivers. Kindly review
the change and let us know in case of any review comments.

Thanks

V2: Remove unnecessary MAS operation while calculating vlan prio in
    PATCH 1/4 ("cxgb4: Use symbolic constant for VLAN priority calculation")
    based on review comment by David Miller
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

08f411d6

cxgb4: Replace arpq_head/arpq_tail with SKB double link-list code · 749cb5fe

Committed by Hariprasad Shenai on Dec 17, 2015

Based on original work by Michael Werner <werner@chelsio.com>

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

749cb5fe

cxgb4: Use t4_mgmt_tx() API for sending write l2t request ctrl packets. · 9baeb9d7

Committed by Hariprasad Shenai on Dec 17, 2015

Based on original work by Michael Werner <werner@chelsio.com>

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9baeb9d7

cxgb4: Add API to alloc l2t entry; also update existing ones · f7502659

Committed by Hariprasad Shenai on Dec 17, 2015

Based on original work by Kumar Sanghvi <kumaras@chelsio.com>

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f7502659

cxgb4: Use symbolic constant for VLAN priority calculation · e41e2824

Committed by Hariprasad Shenai on Dec 17, 2015

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e41e2824

nfp: clear ring delayed kick counters · 301c141d

Committed by Jakub Kicinski on Dec 16, 2015

We need to clear the delayed kick counters when we free rings; otherwise,
after ndo_close()/ndo_open(), we could kick the HW by more entries than
were actually written to the rings.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

301c141d
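
The stale-counter problem described in this commit can be illustrated with a toy model. This is only a sketch: the `Ring` class, its fields, and `close_open_kick` are hypothetical names invented for illustration and do not mirror the nfp driver's actual structures.

```python
class Ring:
    """Toy descriptor ring; all names here are hypothetical."""

    def __init__(self):
        self.entries = []      # descriptors written by the driver
        self.pending_kick = 0  # written but not yet announced to the HW
        self.hw_told = 0       # entries the HW has been told about

    def write(self, desc):
        self.entries.append(desc)
        self.pending_kick += 1

    def kick(self):
        # Announce all pending entries to the HW in one go.
        self.hw_told += self.pending_kick
        self.pending_kick = 0

    def free(self, clear_counter):
        # Model of ring teardown: descriptors are dropped.
        self.entries.clear()
        self.hw_told = 0
        if clear_counter:
            self.pending_kick = 0


def close_open_kick(clear_counter):
    ring = Ring()
    ring.write("a")
    ring.write("b")           # two entries written, kick still delayed
    ring.free(clear_counter)  # models ndo_close(): rings are freed
    ring.write("c")           # models ndo_open(): one fresh entry
    ring.kick()
    return ring.hw_told, len(ring.entries)
```

With `clear_counter=False` the model kicks the HW by three entries although only one is in the ring; with `clear_counter=True` the two numbers agree.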

tun: honor IFF_UP in tun_get_user() · 1bd4978a

Committed by Eric Dumazet on Dec 16, 2015

If a tun interface is turned down, we should not allow packet injection
into the kernel.

The kernel already does not send packets to the tun in that case.

TUNATTACHFILTER cannot be used for this, as only tun_net_xmit() takes
care of it.

Reported-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1bd4978a

ipv6: add IPV6_HDRINCL option for raw sockets · 715f504b

Committed by Hannes Frederic Sowa on Dec 16, 2015

Add the IPV6_HDRINCL option for SOL_IPV6 and SOL_RAW, which we were
missing; Windows provides the same option.
The SOL_IP/IP_HDRINCL option is not available for IPv6 sockets.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

715f504b
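
With a header-include option, the application supplies the IPv6 header itself. The sketch below only builds the fixed 40-byte header that such an application would prepend to its payload. The constant value 36 for `IPV6_HDRINCL` is an assumption taken from the Linux UAPI headers, `build_ipv6_header` is a hypothetical helper, and the `setsockopt` call is shown only as a comment because it needs a raw socket and root privileges.

```python
import socket
import struct

# Assumed value from linux/in6.h; Python's socket module may not export it.
IPV6_HDRINCL = 36


def build_ipv6_header(src, dst, payload_len,
                      next_header=socket.IPPROTO_UDP, hop_limit=64):
    """Return the fixed 40-byte IPv6 header (no extension headers)."""
    ver_tc_flow = 6 << 28  # version 6, traffic class 0, flow label 0
    return (struct.pack("!IHBB", ver_tc_flow, payload_len,
                        next_header, hop_limit)
            + socket.inet_pton(socket.AF_INET6, src)
            + socket.inet_pton(socket.AF_INET6, dst))


# Intended use, as root, on a kernel that has this option (not executed here):
#   s = socket.socket(socket.AF_INET6, socket.SOCK_RAW, socket.IPPROTO_RAW)
#   s.setsockopt(socket.IPPROTO_IPV6, IPV6_HDRINCL, 1)
#   s.sendto(build_ipv6_header(src, dst, len(payload)) + payload, (dst, 0))
```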

ipv6: allow routes to be configured with expire values · 32bc201e

Committed by Xin Long on Dec 16, 2015

Add support for setting an expire value on routes, as requested by
Tom Gundersen <teg@jklm.no> for systemd-networkd; NetworkManager
wants it too.

Implement it by adding the new RTNETLINK attribute RTA_EXPIRES.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

32bc201e
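
As a rough illustration of what carrying an expire value in a route message looks like on the wire, the sketch below encodes a single rtnetlink attribute. It assumes that `RTA_EXPIRES` has the numeric value 23 and carries a 32-bit number of seconds; both are assumptions about the UAPI, and `pack_rtattr` and `pack_expires` are hypothetical helpers, not part of any library.

```python
import struct

RTA_EXPIRES = 23  # assumed attribute type from linux/rtnetlink.h
RTA_HDRLEN = 4    # struct rtattr: u16 rta_len, u16 rta_type


def rta_align(length):
    """Round up to the 4-byte boundary used by netlink attributes."""
    return (length + 3) & ~3


def pack_rtattr(rta_type, payload):
    rta_len = RTA_HDRLEN + len(payload)
    attr = struct.pack("=HH", rta_len, rta_type) + payload
    # Pad so that the next attribute starts on an aligned offset.
    return attr + b"\x00" * (rta_align(rta_len) - rta_len)


def pack_expires(seconds):
    # Assumption: the attribute payload is an unsigned 32-bit value.
    return pack_rtattr(RTA_EXPIRES, struct.pack("=I", seconds))
```

A complete RTM_NEWROUTE request would append this attribute after the `rtmsg` header together with the destination and gateway attributes; that part is omitted here.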

17 Dec, 2015 2 commits

net: Pass ndm_state to route netlink FDB notifications. · b3379041

Committed by Hubert Sokolowski on Dec 15, 2015

Before this change applications monitoring FDB notifications
were not able to determine whether a new FDB entry is permanent
or not:

bridge fdb add f1:f2:f3:f4:f5:f8 dev sw0p1 temp self
bridge fdb add f1:f2:f3:f4:f5:f9 dev sw0p1 self

bridge monitor fdb

f1:f2:f3:f4:f5:f8 dev sw0p1 self permanent
f1:f2:f3:f4:f5:f9 dev sw0p1 self permanent

With this change ndm_state from the original netlink message
is passed to the new netlink message sent as notification.

bridge fdb add f1:f2:f3:f4:f5:f6 dev sw0p1 self
bridge fdb add f1:f2:f3:f4:f5:f7 dev sw0p1 temp self

bridge monitor fdb

f1:f2:f3:f4:f5:f6 dev sw0p1 self permanent
f1:f2:f3:f4:f5:f7 dev sw0p1 self static

Signed-off-by: Hubert Sokolowski <hubert.sokolowski@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b3379041
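
A monitoring application that receives ndm_state has to decode it as a bit mask. The sketch below lists the NUD_* bits as defined in linux/neighbour.h and returns the names of the bits that are set. `describe_state` is a hypothetical helper, and the mapping from these bits to the words printed by `bridge monitor fdb` is left to iproute2 and not reproduced here.

```python
from enum import IntFlag


class NUD(IntFlag):
    """Neighbour state bits (values from linux/neighbour.h)."""
    INCOMPLETE = 0x01
    REACHABLE = 0x02
    STALE = 0x04
    DELAY = 0x08
    PROBE = 0x10
    FAILED = 0x20
    NOARP = 0x40
    PERMANENT = 0x80


def describe_state(ndm_state):
    """Return the names of all state bits set in ndm_state."""
    return [flag.name for flag in NUD if ndm_state & flag]
```

Before the change every notification carried the same state regardless of what was requested, so a decoder like this could not tell the two entries apart.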

Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge · 04ad3783

Committed by David S. Miller on Dec 16, 2015

Antonio Quartulli says:

====================
Included changes:
- change my email in MAINTAINERS and Doc files
- create and export list of single hop neighs per interface
- protect CRC in the BLA code by means of its own lock
- minor fixes and code cleanups
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

04ad3783

16 Dec, 2015 26 commits

Merge branch 'geneve-udp-port-offload' · 897ca373

Committed by David S. Miller on Dec 16, 2015

Anjali Singhai Jain says:

====================
Add support for Geneve udp port offload

This patch series adds new ndo ops for Geneve add/del port, so as
to help offload Geneve tunnel functionalities such as RX checksum,
RSS, filters etc.

i40e driver has been tested with the changes to make sure the offloads
happen.

We do understand that this is not the ideal solution and most likely
will be redone with a more generic offload framework.
But this certainly will enable us to start seeing benefits of the
accelerations for Geneve tunnels.

As a side note, we did find an existing issue in i40e driver where a
service task can modify tunnel data structures with no locks held to
help linearize access. A separate patch will be taking care of that issue.

A question out to the community is regarding the driver Kconfig parameters
for VxLAN and Geneve, it would be ideal to drop those if there is a way
to help resolve vxlan/geneve_get_rx_port symbols while the tunnel modules
are not loaded.

Performance numbers:
With the offloads enabled on X722 devices with remote checksum enabled
and no other tuning in terms of cpu governor etc on my test machine:

With offload
Throughput: 5527Mbits/sec with a single thread
%cpu: ~43% per core with 4 threads

Without offload
Throughput: 2364Mbits/sec with a single thread
%cpu: ~99% per core with 4 threads

These numbers will get better for X722 as it is being worked on. But
this does bring out the delta between the stack being notified with
csum_level 1 and CHECKSUM_UNNECESSARY and running without the RX offload.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

897ca373

i40e: Call geneve_get_rx_port to get the existing Geneve ports · cd866606

Committed by Singhai, Anjali on Dec 14, 2015

This patch adds a call to geneve_get_rx_port in i40e so that when it
comes up it can learn about the existing geneve tunnels.

Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cd866606

geneve: Add geneve_get_rx_port support · 05ca4029

Committed by Singhai, Anjali on Dec 14, 2015

This patch adds an op that the drivers can call into to get existing
geneve ports.

Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05ca4029

i40e: Kernel dependency update for i40e to support geneve offload · c110c311

Committed by Singhai, Anjali on Dec 14, 2015

Update the Kconfig file with dependency for supporting GENEVE tunnel
offloads.

Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c110c311

i40e: geneve tunnel offload support · 6a899024

Committed by Singhai, Anjali on Dec 14, 2015

This patch adds driver hooks to implement ndo_ops to add/del
udp port in the HW to identify GENEVE tunnels.

Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6a899024

geneve: Add geneve udp port offload for ethernet devices · a8170d2b

Committed by Singhai, Anjali on Dec 14, 2015

Add ndo_ops to add/del UDP ports to a device that supports geneve
offload.

v2: Comment fix.

Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a8170d2b

net: sctp: dynamically enable or disable pf state · 566178f8

Committed by Zhu Yanjun on Dec 16, 2015

Setting pf_retrans >= max_retrans_path disables the pf (potentially
failed) state. Both pf_retrans and max_retrans_path can be changed by
the userspace application.

Sometimes the user expects the pf state to stay disabled even though
the two variables have been changed to values that enable it. So it is
necessary to introduce a new variable to disable the pf state.

According to the suggestions from Vlad Yasevich, extra1 and extra2
are removed. The initialization of pf_enable is added.

Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

566178f8
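
The interaction between the two thresholds and the new switch can be written as a small predicate. This is a simplified sketch of the decision described in the commit message, not the kernel's actual condition (which also depends on the transport's current state); `enters_pf_state` and its parameter names are hypothetical.

```python
def enters_pf_state(error_count, pf_retrans, path_max_retrans, pf_enable):
    """Decide whether a path would be marked potentially failed.

    Simplified model: the path's current state is ignored.
    """
    if not pf_enable:
        # New explicit switch: pf state is off regardless of thresholds.
        return False
    if pf_retrans >= path_max_retrans:
        # Old implicit way of disabling pf state via the thresholds.
        return False
    return error_count > pf_retrans
```

The point of the extra flag is visible in the second branch: without it, changing either threshold from userspace could silently re-enable the pf state.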

batman-adv: lock crc access in bridge loop avoidance · 5a1dd8a4

Committed by Simon Wunderlich on Sep 11, 2015

We have found some networks in which nodes were constantly requesting
other nodes' BLA claim tables to synchronize, just to ask for that again
once completed. The reason was that the crc checksum of the asked nodes
was out of sync due to missing locking and multiple writes to the same
crc checksum when adding/removing entries. Therefore the asked nodes
constantly reported the wrong crc, which caused repeating requests.

To avoid multiple functions changing a backbone gateway's crc entry at
the same time, lock it using a spinlock.

Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Tested-by: Alfons Name <AlfonsName@web.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>

5a1dd8a4
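
The fix pattern is a lock around a read-modify-write of a shared checksum. The sketch below models it in userspace with `threading.Lock` standing in for the spinlock. `BackboneGateway`, `toggle_claim`, and the use of a truncated `zlib.crc32` as the per-claim checksum are all assumptions made for illustration, not the batman-adv implementation.

```python
import threading
import zlib


class BackboneGateway:
    """Toy backbone gateway holding an XOR-combined claim checksum."""

    def __init__(self):
        self.crc = 0
        self.crc_lock = threading.Lock()

    def toggle_claim(self, mac):
        # XOR is its own inverse, so adding and removing a claim
        # apply the same update to the checksum.
        delta = zlib.crc32(mac) & 0xFFFF
        with self.crc_lock:  # serialise the read-modify-write
            self.crc ^= delta


def churn(gw, macs, rounds):
    for _ in range(rounds):
        for mac in macs:
            gw.toggle_claim(mac)  # add the claim
            gw.toggle_claim(mac)  # remove it again


def run_demo(n_threads=4, rounds=200):
    gw = BackboneGateway()
    macs = [bytes([2, 0, 0, 0, 0, i]) for i in range(8)]
    threads = [threading.Thread(target=churn, args=(gw, macs, rounds))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return gw.crc
```

Since every claim is added and removed the same number of times, the checksum must be back at zero once all threads finish; a lost update would leave a non-zero value, which is the kind of mismatch that triggered the repeated table requests.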

batman-adv: Fix typo 'wether' -> 'whether' · c05a57f6

Committed by Sven Eckelmann on Aug 26, 2015

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>

c05a57f6

batman-adv: Use chain pointer when purging fragments · 01f6b5c7

Committed by Sven Eckelmann on Aug 26, 2015

The chain pointer was already created in batadv_frag_purge_orig to make the
checks more readable. Just use the chain pointer everywhere instead of
having the same dereference + array access in most lines of this
function.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Acked-by: Martin Hundebøll <martin@hundeboll.net>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>

01f6b5c7

batman-adv: unify flags access style in tt global add · ad7e2c46

Committed by Simon Wunderlich on Aug 26, 2015

This should slightly improve readability.

Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>

ad7e2c46

batman-adv: detect local excess vlans in TT request · c169c59d

Committed by Simon Wunderlich on Sep 02, 2015

If the local representation of the global TT table of one originator has
more VLAN entries than the respective TT update, there is some
inconsistency present. By detecting and reporting this inconsistency,
the global table gets updated and the excess VLAN will get removed in
the process.

Reported-by: Alessandro Bolletta <alessandro@mediaspot.net>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Acked-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>

c169c59d

sctp: use GFP_KERNEL in sctp_init() · 6857a02a

Committed by Eric Dumazet on Dec 15, 2015

Module init functions are called from process context, so we had better
use GFP_KERNEL allocations to increase our chances of getting the
high-order pages we want for the SCTP hash tables.

This mostly matters if the SCTP module is loaded after memory has become
fragmented.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6857a02a

Merge branch 'sock-diag-destroy' · 5cfe6d8a

Committed by David S. Miller on Dec 15, 2015

Lorenzo Colitti says:

====================
Support administratively closing application sockets

This patchset adds the ability to administratively close a socket
without any action from the process owning the socket or the
socket protocol.

It implements this by adding a new diag_destroy function pointer
to struct proto. In-kernel callers can access this functionality
directly by calling sk->sk_prot->diag_destroy(sk, err).

It also exposes this functionality to userspace via a new
SOCK_DESTROY operation in the NETLINK_SOCK_DIAG sockets. This
allows a privileged userspace process, such as a connection
manager or system administration tool, to close sockets belonging
to other apps when the network they were established on has
disconnected. It is needed on laptops and mobile hosts to ensure
that network switches / disconnects do not result in applications
being blocked for long periods of time (minutes) in read or
connect calls on TCP sockets that will never succeed because the
IP address they are bound to is no longer on the system. Closing
the sockets causes these calls to fail fast and allows the apps
to reconnect on another network.

Userspace intervention is necessary because in many cases the
kernel does not have enough information to know that a connection
is now inoperable. The kernel can know if a packet can't be
routed, but in general it won't know if a TCP connection is stuck
because it is now routed to a network where its source address is
no longer valid [5][6].

Many other operating systems offer similar functionality:

 - FreeBSD has had this since 5.4 in 2005 [2]. It is available
   to privileged userspace and there is a tool to use it [3].
 - The FreeBSD commit description states that the idea came
   from OpenBSD.
 - iOS has been administratively closing app sockets since
   iOS 4 - see [4], which states that a socket "might get
   reclaimed by the kernel" and after that will return EBADF.
 - For many years Android kernels have supported an out-of-tree
   SIOCKILLADDR ioctl that is called when a network disconnects
   or an RTM_DELADDR event is received. This solution is cleaner,
   more robust and more flexible. The connection manager can
   implement SIOCKILLADDR by iterating over all connections on
   the deleted IP address and closing all of them, but it can also
   close all sockets opened by a given app process (for example
   if the user has restricted that app from using the network),
   or close connections when a secure network such as a VPN has
   connected and security policy requires all of an application's
   connections to be routed via the VPN, etc.

Alternative schemes such as TCP keepalives in combination with
"iptables -j REJECT --reject-with tcp-reset", could be used to
achieve similar results, but on mobile devices TCP keepalives are
very expensive, and in such a scheme detecting stuck connections
has to wait for a keepalive to be sent or the application to
perform a write. An explicit notification from userspace is
cheaper and faster in the common case where an application is
blocked on read.

SOCK_DESTROY is placed behind an INET_DIAG_DESTROY configuration
option, which is currently off by default.

The TCP implementation of diag_destroy causes a TCP ABORT as
specified by RFC 793 [1]: immediately send a RST and clear local
connection state. This is what happens today if an application
enables SO_LINGER with a timeout of 0 and then calls close.

The first versions of the patchset did not send a RST, but that
is not graceful/correct TCP behaviour. tcp_abort now does a
proper RFC 793 ABORT and sends a RST to the peer. This is
consistent with BSD's tcpdrop, and is more correct in general,
even though in many use cases tcp_abort will only be called when
sending a RST is no longer possible (e.g., the network has
disconnected).

The original patchset also behaved like SIOCKILLADDR and closed
TCP sockets with ETIMEDOUT. Tom Herbert pointed out that it would
be better if applications could distinguish between a timeout and
an administrative close. ECONNABORTED was chosen because it is
consistent with BSD.

[1] http://tools.ietf.org/html/rfc793#page-50
[2] http://svnweb.freebsd.org/base?view=revision&revision=141381
[3] https://www.freebsd.org/cgi/man.cgi?query=tcpdrop&sektion=8&manpath=FreeBSD+5.4-RELEASE
[4] https://developer.apple.com/library/ios/technotes/tn2277/_index.html#//apple_ref/doc/uid/DTS40010841-CH1-SUBSECTION3
[5] http://www.spinics.net/lists/netdev/msg352775.html
[6] http://www.spinics.net/lists/netdev/msg352952.html
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

5cfe6d8a
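
The cover letter compares the TCP ABORT to what happens when an application enables SO_LINGER with a timeout of 0 and then calls close. That existing behaviour can be observed from userspace without any of the new code. The sketch below is a minimal loopback demonstration, assuming a Linux host; `abortive_close_errno` is a hypothetical helper. It shows what the peer sees (a reset), not the ECONNABORTED that the owner of a destroyed socket would get from SOCK_DESTROY.

```python
import errno
import socket
import struct


def abortive_close_errno():
    """Abort a loopback TCP connection; return the errno the peer sees."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    server, _ = listener.accept()
    server.settimeout(5)  # avoid hanging if no reset arrives

    # struct linger { l_onoff = 1, l_linger = 0 }: close() drops the
    # connection state immediately and sends a RST instead of a FIN.
    client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                      struct.pack("ii", 1, 0))
    client.close()

    try:
        server.recv(1)
        return 0  # orderly shutdown, no error observed
    except OSError as exc:
        return exc.errno
    finally:
        server.close()
        listener.close()
```

On Linux the peer's blocked or subsequent read fails with ECONNRESET rather than waiting for a timeout, which is the fail-fast property the patchset wants to make available to a privileged manager process.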

net: diag: Support destroying TCP sockets. · c1e64e29

Committed by Lorenzo Colitti on Dec 16, 2015

This implements SOCK_DESTROY for TCP sockets. It causes all
blocking calls on the socket to fail fast with ECONNABORTED and
causes a protocol close of the socket. It informs the other end
of the connection by sending a RST, i.e., initiating a TCP ABORT
as per RFC 793. ECONNABORTED was chosen for consistency with
FreeBSD.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c1e64e29

net: diag: Support SOCK_DESTROY for inet sockets. · 6eb5d2e0

Committed by Lorenzo Colitti on Dec 16, 2015

This passes the SOCK_DESTROY operation to the underlying protocol
diag handler, or returns -EOPNOTSUPP if that handler does not
define a destroy operation.

Most of this patch is just renaming functions. This is not
strictly necessary, but it would be fairly counterintuitive to
have the code to destroy inet sockets be in a function whose name
starts with inet_diag_get.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6eb5d2e0

net: diag: Add the ability to destroy a socket. · 64be0aed

Committed by Lorenzo Colitti on Dec 16, 2015

This patch adds a SOCK_DESTROY operation, a destroy function
pointer to sock_diag_handler, and a diag_destroy function
pointer.  It does not include any implementation code.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

64be0aed

net: diag: split inet_diag_dump_one_icsk into two · b613f56e

Committed by Lorenzo Colitti on Dec 16, 2015

Currently, inet_diag_dump_one_icsk finds a socket and then dumps
its information to userspace. Split it into a part that finds the
socket and a part that dumps the information.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b613f56e

Merge branch 'ila-early-demux' · fec65bd4

Committed by David S. Miller on Dec 15, 2015

Tom Herbert says:

====================
ila: Optimization to preserve value of early demux

In the current implementation of ILA, LWT is used to perform
translation on both the input and output paths. This is functional,
however there is a big performance hit in the receive path. Early
demux occurs before the routing lookup (a hit actually obviates the
route lookup). Therefore the stack currently performs early
demux before translation so that a local connection with ILA
addresses is never matched. Note that this issue is not just
with ILA, but pretty much any translated or encapsulated packet
handled by LWT would miss the opportunity for early demux. Solving
the general problem seems non-trivial since we would need to move
the route lookup before early demux, thereby diminishing its value.

This patch set addresses the issue for ILA by adding a fast locator
lookup that occurs before early demux. This is done by hooking into
NF_INET_PRE_ROUTING.

For the backend we implement an rhashtable that contains identifier
to locator mappings. The table also allows more specific matches
that include original locator and interface.

This patch set:
 - Add an rhashtable function to atomically replace an element.
   This is useful to implement sub-trees from a table entry
   without needing to use a special anchor structure as the
   table entry.
 - Add a start callback for starting a netlink dump.
 - Creates an ila directory under net/ipv6 and moves ila.c to it.
   ila.c is split into ila_common.c and ila_lwt.c.
 - Implement a table to do identifier->locator mapping. This is
   an rhashtable (in ila_xlat.c).
 - Configuration for the table with netlink.
 - Add a hook into NF_INET_PRE_ROUTING to perform ILA translation
   before early demux.

Changes in v2:
 - Use iptables targets instead of a new xfrm function

Changes in v3:
 - Add __rcu to next pointer in struct ila_map

Changes in v4:
 - Use hook for NF_INET_PRE_ROUTING

Changed in v5:
 - Register hooks per namespace using nf_register_net_hooks
 - Only register hooks when first mapping is actually added

Changed in v6:
  - Remove gfp argument in alloc_ila_locks, it is unnecessary
  - Set registered_hooks properly when hooks are registered

Testing:
   Running 200 netperf TCP_RR streams

No ILA, baseline
   79.26% CPU utilization
   1678282 tps
   104/189/390 50/90/99% latencies

ILA before fix (LWT on both input and output)
   81.91% CPU utilization
   1464723 tps (-14.5% from baseline)
   121/215/411 50/90/99% latencies

ILA after fix
   80.62% CPU utilization
   1622985 tps (-3.4% from baseline)
   110/191/347 50/90/99% latencies
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

fec65bd4

ila: Add generic ILA translation facility · 7f00feaf

Committed by Tom Herbert on Dec 15, 2015

This patch implements an ILA translation table. This table can be
configured with identifier to locator mappings, and can be queried
to resolve a mapping. Queries can be parameterized based on interface,
direction (incoming or outgoing), and matching locator.  The table is
implemented using rhashtable and is configured via netlink (through
"ip ila .." in iproute).

The table may be used as an alternative means to do ILA translations
other than the lw tunnels.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7f00feaf
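
The identifier to locator mapping described in this commit can be sketched in a few lines. The sketch assumes the usual ILA split of an IPv6 address into an upper 64-bit locator and a lower 64-bit identifier, uses a plain dict in place of the rhashtable, and ignores the interface, direction, and matching-locator parameters as well as any checksum handling. `IlaTable` and `split_address` are hypothetical names.

```python
import ipaddress

LOW64 = (1 << 64) - 1


def split_address(addr):
    """Split an IPv6 address into (locator, identifier) 64-bit halves."""
    value = int(ipaddress.IPv6Address(addr))
    return value >> 64, value & LOW64


class IlaTable:
    """Toy identifier -> locator table (a dict stands in for rhashtable)."""

    def __init__(self):
        self._map = {}

    def add(self, identifier, locator):
        self._map[identifier] = locator

    def translate(self, addr):
        _, identifier = split_address(addr)
        locator = self._map.get(identifier)
        if locator is None:
            return addr  # no mapping configured: leave the address alone
        # Keep the identifier, overwrite the locator half.
        return str(ipaddress.IPv6Address((locator << 64) | identifier))
```

For example, after `table.add(0x1, 0x20010db800010000)`, translating `2001:db8:ffff::1` yields `2001:db8:1::1`, while an address with an unknown identifier passes through unchanged.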

netlink: add a start callback for starting a netlink dump · fc9e50f5

Committed by Tom Herbert on Dec 15, 2015

The start callback allows the caller to set up a context for the
dump callbacks. Presumably, the context can then be destroyed in
the done callback.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fc9e50f5
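
The start/dump/done lifecycle described here can be modelled as follows. This is a userspace sketch of the calling order only; `run_dump` and the callbacks in `demo` are hypothetical, and the real netlink dump passes an skb and a callback structure rather than a dict.

```python
import itertools


def run_dump(start, dump, done):
    """Drive a dump: one start, repeated dump calls, one done."""
    ctx = {}
    start(ctx)                # set up per-dump context once
    chunks = []
    while True:
        chunk = dump(ctx)     # called until it has nothing more to emit
        if not chunk:
            break
        chunks.append(chunk)
    done(ctx)                 # tear the context down
    return chunks


def demo():
    events = []

    def start(ctx):
        events.append("start")
        ctx["iter"] = iter(range(1, 6))  # walker state kept in the context

    def dump(ctx):
        events.append("dump")
        return list(itertools.islice(ctx["iter"], 2))  # two items per call

    def done(ctx):
        events.append("done")
        ctx.clear()

    return run_dump(start, dump, done), events
```

Keeping the iterator in the context is what the start callback makes convenient: the dump callback no longer needs to detect its own first invocation to initialise state.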

rhashtable: add function to replace an element · 3502cad7

Committed by Tom Herbert on Dec 15, 2015

Add the rhashtable_replace_fast function. This replaces one object in
the table with another atomically. The hashes of the new and old objects
must be equal.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3502cad7
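
Why the two hashes must be equal becomes clear in a small chained hash table: the replacement is a swap of one slot in one bucket chain, so the new object has to belong in that same chain. The sketch below is a single-threaded model with hypothetical names (`Entry`, `ChainedTable`); it does not reproduce the RCU and locking that make the kernel version atomic.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    key: int
    value: str


class ChainedTable:
    """Toy hash table with per-bucket chains."""

    def __init__(self, nbuckets=8):
        self.buckets = [[] for _ in range(nbuckets)]

    def _chain(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, entry):
        self._chain(entry.key).append(entry)

    def lookup(self, key):
        for entry in self._chain(key):
            if entry.key == key:
                return entry
        return None

    def replace(self, old, new):
        if hash(old.key) != hash(new.key):
            raise ValueError("old and new objects must hash identically")
        chain = self._chain(old.key)
        for i, cur in enumerate(chain):
            if cur is old:
                chain[i] = new  # swap in place, chain position unchanged
                return True
        return False            # old object is not in the table
```

Because the swap touches a single slot, a reader walking the chain sees either the old object or the new one, never a state in which the key is missing.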

ila: Create net/ipv6/ila directory · 33f11d16

Committed by Tom Herbert on Dec 15, 2015

Create ila directory in preparation for supporting other hooks in the
kernel than LWT for doing ILA. This includes:
  - Moving ila.c to ila/ila_lwt.c
  - Splitting out some common functions into ila_common.c

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

33f11d16

Merge branch 'stmmac-mdio-compat' · 3026043d

Committed by David S. Miller on Dec 15, 2015

Merge branch 'stmmac-mdio-compat'

Phil Reid says:

====================
stmmac: create of compatible mdio bus for stmmac driver

Provide ability to specify a fixed phy in the device tree and
retain the mdio bus if no phy is found. This is needed where
a dsa is connected via a fixed phy and uses the mdio bus for config.
Fixed ptp ref clock calculations for the stmmac when ptp ref clock
is running at <= 50Mhz. Also add device tree setting to config
ptp clk source on socfpga platforms.

Changes from V5:
- Restore behaviour of unregister mdio bus when no phys found
  if there is no device tree node create the bus.
- Modify condition to allocate mdio_base_data conditional
  on fixed phy presence as well. Maintains existing behaviour
  in conditions where a fixed phy is not present.

Changes from V4:
- Restore #ifdef CONFIG_OF around setting of reset_gpio.
  Member doesn't exist when this isn't defined.

Changes from V3:
- Use if (IS_ENABLED(CONFIG_OF)) instead of #if.
  Reorder some code to reduce if statements.
- of_mdiobus_register already falls back to mdiobus_register
- Tested on system with CONFIG_OF

Changes from V2:
- Formatting, spaces & lines > 80 chars. Using checkpatch
- Drop PTP register debugfs patch.

Changes from V1:
- Fixed mismatch doc / code for ptp_ref_clk dt node.
- Remove unit address from doc example.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

3026043d

stmmac: socfpga: Provide dt node to config ptp clk source. · 43569814

Committed by Phil Reid on Dec 14, 2015

Provides an option to use the ptp clock routed from the Altera FPGA
fabric instead of the default eosc1 clock connected to the ARM HPS core.
This setting affects all emacs in the core as the ptp clock is common.

Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Phil Reid <preid@electromag.com.au>
Acked-by: Dinh Nguyen <dinguyen@opensource.altera.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

43569814

stmmac: Fix calculations for ptp counters when clock input = 50Mhz. · 19d857c9

Committed by Phil Reid on Dec 14, 2015

stmmac_config_sub_second_increment set the sub second increment to 20ns.
The driver is configured to use the fine adjustment method, where the sub
second register is incremented when the accumulator, incremented by the
addend register, overflows. This accumulator is updated on every ptp clk
cycle. If a ptp clk with a period of greater than 20ns was used, the
sub second register would not get updated correctly.

Instead, set the sub sec increment to twice the period of the ptp clk.
This results in the addend register being set mid range and overflowing
the accumulator every 2 clock cycles.

Signed-off-by: Phil Reid <preid@electromag.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

19d857c9
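
The arithmetic behind this fix can be checked with a short calculation. The sketch assumes a 32-bit accumulator and the relation addend = 2^32 * (10^9 / increment_ns) / ptp_clk_hz, which is an assumption inferred from the description of the fine adjustment method rather than quoted from the driver; the function names are hypothetical.

```python
NSEC_PER_SEC = 1_000_000_000
ACC_BITS = 32  # assumed accumulator / addend register width


def sub_second_increment_ns(ptp_clk_hz):
    """New scheme: increment is twice the ptp clock period, in ns."""
    return 2 * NSEC_PER_SEC // ptp_clk_hz


def addend(ptp_clk_hz, increment_ns):
    """Addend that makes the accumulator overflow once per increment."""
    overflow_rate_hz = NSEC_PER_SEC // increment_ns
    return (overflow_rate_hz << ACC_BITS) // ptp_clk_hz


def fits_register(value):
    return value < (1 << ACC_BITS)
```

Under these assumptions a 50 MHz clock with the old fixed 20 ns increment needs an addend of 2^32, which no longer fits the register, and slower clocks need even more. With the increment set to twice the period (40 ns at 50 MHz, 80 ns at 25 MHz) the addend comes out as 2^31, exactly mid range, so the accumulator overflows every second clock cycle as the commit message states.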