Commits · 1d2e3f396c532b95a37bbee92269f37efe908457 · openanolis / cloud-kernel

26 Aug, 2015 4 commits

RDS: restore return value in rds_cmsg_rdma_args() · 1d2e3f39

Committed by santosh.shilimkar@oracle.com on Aug 22, 2015

In rds_cmsg_rdma_args() 'ret' is used by rds_pin_pages() which returns
number of pinned pages on success. And the same value is returned to the
caller of rds_cmsg_rdma_args() on success which is not intended.

Commit f4a3fc03 ("RDS: Clean up error handling in rds_cmsg_rdma_args")
removed the 'ret = 0' line which broke RDS RDMA mode.

Fix it by restoring the return value on rds_pin_pages() success
keeping the clean-up in place.
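
The effect of the missing 'ret = 0' can be sketched with a small Python model. This is not the kernel code: `pin_pages`, `rdma_args_buggy` and `rdma_args_fixed` are hypothetical stand-ins for rds_pin_pages() and rds_cmsg_rdma_args(), and the sketch assumes the caller expects 0 on success.

```python
EINVAL = 22


def pin_pages(nr_pages):
    """Hypothetical stand-in for rds_pin_pages().

    Returns the number of pinned pages on success, or a negative errno.
    """
    if nr_pages <= 0:
        return -EINVAL
    return nr_pages


def rdma_args_buggy(nr_pages):
    """Model of the regression: the page count leaks out as the return value."""
    ret = pin_pages(nr_pages)
    if ret < 0:
        return ret
    # ... remaining setup elided ...
    return ret  # still holds the pinned-page count, not 0


def rdma_args_fixed(nr_pages):
    """Model of the fix: reset ret once pinning has succeeded."""
    ret = pin_pages(nr_pages)
    if ret < 0:
        return ret
    ret = 0  # the restored line
    # ... remaining setup elided ...
    return ret
```

With four pages, the buggy variant returns 4 while the fixed variant returns 0; the error path is unchanged in both.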
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: refine pacing rate determination · 43e122b0

Committed by Eric Dumazet on Aug 21, 2015

When TCP pacing was added back in linux-3.12, we chose
to apply a fixed ratio of 200 % against current rate,
to allow probing for optimal throughput even during
slow start phase, where cwnd can be doubled every other RTT.

At Google, we found it was better applying a different ratio
while in Congestion Avoidance phase.
This ratio was set to 120 %.

We've used the normal tcp_in_slow_start() helper for a while,
then tuned the condition to select the conservative ratio
as soon as cwnd >= ssthresh/2 :

- After cwnd reduction, it is safer to ramp up more slowly,
  as we approach optimal cwnd.
- Initial ramp up (ssthresh == INFINITY) still allows doubling
  cwnd every other RTT.
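
The ratio selection described above can be sketched as follows. This is a simplified model, not the kernel implementation: the function name, the parameter names `ss_ratio` and `ca_ratio`, and the use of microseconds for the smoothed RTT are assumptions made for illustration.

```python
def pacing_rate(mss, cwnd, ssthresh, srtt_us, ss_ratio=200, ca_ratio=120):
    """Return a pacing rate in bytes per second (simplified model).

    The current rate is approximated as mss * cwnd / srtt. The aggressive
    ratio applies only while cwnd < ssthresh / 2; from cwnd >= ssthresh / 2
    onwards the conservative ratio is used.
    """
    if srtt_us <= 0:
        raise ValueError("srtt_us must be positive")
    ratio = ss_ratio if cwnd < ssthresh // 2 else ca_ratio
    base = mss * cwnd * 1_000_000 // srtt_us  # bytes per second
    return base * ratio // 100


# ssthresh == INFINITY is modelled here with a very large integer.
INFINITE_SSTHRESH = 2**31 - 1
initial_ramp = pacing_rate(1000, 10, INFINITE_SSTHRESH, 100_000)
after_reduction = pacing_rate(1000, 10, 20, 100_000)
```

In this model the initial ramp-up is paced at twice the current rate, while a flow whose cwnd has reached half of ssthresh is paced at 1.2 times the current rate.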
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

xfrm: Use VRF master index if output device is enslaved · 4ec3b28c

Committed by David Ahern on Aug 20, 2015

Directs route lookups to the VRF table. Compiles out if NET_VRF is not
enabled. With this patch, ipsec tunnels can be brought up successfully
in VRFs, even with duplicate network configuration.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fix slow start after idle vs TSO/GSO · 6f021c62

Committed by Eric Dumazet on Aug 21, 2015

slow start after idle might reduce cwnd, but we perform this
after first packet was cooked and sent.

With TSO/GSO, it means that we might send a full TSO packet
even if cwnd should have been reduced to IW10.

Moving the SSAI check in skb_entail() makes sense, because
we slightly reduce number of times this check is done,
especially for large send() and TCP Small queue callbacks from
softirq context.

As Neal pointed out, we also need to perform the check
if/when receive window opens.

Tested:

Following packetdrill test demonstrates the problem
// Test of slow start after idle

`sysctl -q net.ipv4.tcp_slow_start_after_idle=1`

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0    setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0    bind(3, ..., ...) = 0
+0    listen(3, 1) = 0

+0    < S 0:0(0) win 65535 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+0    > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
+.100 < . 1:1(0) ack 1 win 511
+0    accept(3, ..., ...) = 4
+0    setsockopt(4, SOL_SOCKET, SO_SNDBUF, [200000], 4) = 0

+0    write(4, ..., 26000) = 26000
+0    > . 1:5001(5000) ack 1
+0    > . 5001:10001(5000) ack 1
+0    %{ assert tcpi_snd_cwnd == 10 }%

+.100 < . 1:1(0) ack 10001 win 511
+0    %{ assert tcpi_snd_cwnd == 20, tcpi_snd_cwnd }%
+0    > . 10001:20001(10000) ack 1
+0    > P. 20001:26001(6000) ack 1

+.100 < . 1:1(0) ack 26001 win 511
+0    %{ assert tcpi_snd_cwnd == 36, tcpi_snd_cwnd }%

+4 write(4, ..., 20000) = 20000
// If slow start after idle works properly, we should send 5 MSS here (cwnd/2)
+0    > . 26001:31001(5000) ack 1
+0    %{ assert tcpi_snd_cwnd == 10, tcpi_snd_cwnd }%
+0    > . 31001:36001(5000) ack 1
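
The cwnd values asserted in the packetdrill trace above can be reproduced with a small model of slow start after idle. This is a sketch under assumptions: the halving-per-RTO rule, the 300 ms RTO and the function name are illustrative and are not taken from this commit message.

```python
def cwnd_after_idle(cwnd, idle_ms, rto_ms, initial_window=10):
    """Model of slow start after idle.

    Assumption: cwnd is halved once for every RTO spent idle, but never
    drops below the restart window (the initial window, or the current
    cwnd if that is already smaller).
    """
    restart = min(initial_window, cwnd)
    while idle_ms > rto_ms and cwnd > restart:
        cwnd //= 2
        idle_ms -= rto_ms
    return max(cwnd, restart)


# In the trace, cwnd is 36 before a 4 second idle period.
restarted = cwnd_after_idle(36, idle_ms=4000, rto_ms=300)
```

The point of the fix is ordering: this reduction has to happen before the first TSO packet is built, otherwise that packet is still sized from the stale cwnd of 36.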
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

25 Aug, 2015 27 commits

Merge branch 'fjes' · 56fff0a0

Committed by David S. Miller on Aug 24, 2015

Taku Izumi says:

====================
FUJITSU Extended Socket network device driver

This patchset adds FUJITSU Extended Socket network device driver.
Extended Socket network device is a shared memory based high-speed
network interface between Extended Partitions of PRIMEQUEST 2000 E2
series.

You can get some information about Extended Partition and Extended
Socket by referring to the following manual.

http://globalsp.ts.fujitsu.com/dmsp/Publications/public/CA92344-0537.pdf
    3.2.1 Extended Partitioning
    3.2.2 Extended Socket

v2.2 -> v3:
   - Fix up according to David's comment (No functional change)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: ethtool support · 786eec27

Committed by Taku Izumi on Aug 21, 2015

This patch adds implementation for ethtool support.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: handle receive cancellation request interrupt · cb79eaae

Committed by Taku Izumi on Aug 21, 2015

This patch adds handling of the IRQ raised by
another receiver's receive cancellation request.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: epstop_task · b5a9152d

Committed by Taku Izumi on Aug 21, 2015

This patch adds epstop_task.
This task is used to process another receiver's
cancellation request.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: update_zone_task · 785f28e0

Committed by Taku Izumi on Aug 21, 2015

This patch adds update_zone_task.
Zoning information can be changed by the user.
This task is used to monitor whether the zoning information
has changed.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: unshare_watch_task · 8fc4cadb

Committed by Taku Izumi on Aug 21, 2015

This patch adds unshare_watch_task.
A shared buffer's status can change to unshared.
This task is used to monitor the shared buffer's status.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: force_close_task · ff5b4210

Committed by Taku Izumi on Aug 21, 2015

This patch adds force_close_task.
This task is used to close the network device forcibly.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: interrupt_watch_task · 8edb62a8

Committed by Taku Izumi on Aug 21, 2015

This patch adds interrupt_watch_task.
This task is used to prevent delay of interrupts.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: net_device_ops.ndo_vlan_rx_add/kill_vid · 3e3fedda

Committed by Taku Izumi on Aug 21, 2015

This patch adds net_device_ops.ndo_vlan_rx_add_vid and
net_device_ops.ndo_vlan_rx_kill_vid callbacks.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: net_device_ops.ndo_tx_timeout · 4393e767

Committed by Taku Izumi on Aug 21, 2015

This patch adds net_device_ops.ndo_tx_timeout callback.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: net_device_ops.ndo_change_mtu · b9e23a67

Committed by Taku Izumi on Aug 21, 2015

This patch adds net_device_ops.ndo_change_mtu.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: net_device_ops.ndo_get_stats64 · 879bc9a3

Committed by Taku Izumi on Aug 21, 2015

This patch adds net_device_ops.ndo_get_stats64 callback.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: NAPI polling function · 26585930

Committed by Taku Izumi on Aug 21, 2015

This patch adds NAPI polling function and receive related work.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: tx_stall_task · ac63b947

Committed by Taku Izumi on Aug 21, 2015

This patch adds tx_stall_task.
When the receiver's buffer is full, the sender stops
its tx queue. This task is used to monitor the
receiver's status, and when the receiver's buffer
is available again, it resumes the tx queue.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: raise_intr_rxdata_task · b772b9dc

Committed by Taku Izumi on Aug 21, 2015

This patch adds raise_intr_rxdata_task.
Extended Socket Network Device is shared memory
based, so one endpoint's transmission is another's
reception. In order to notify receivers, the sender
has to raise an interrupt on the receivers.
raise_intr_rxdata_task does this work.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: net_device_ops.ndo_start_xmit · 9acf51cb

Committed by Taku Izumi on Aug 21, 2015

This patch adds net_device_ops.ndo_start_xmit callback,
which is called when sending packets.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: net_device_ops.ndo_open and .ndo_stop · e5d486dc

Committed by Taku Izumi on Aug 21, 2015

This patch adds net_device_ops.ndo_open and .ndo_stop
callbacks. These functions are called on network device
activation and deactivation.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: buffer address regist/unregistration routine · 7950e6c5

Committed by Taku Izumi on Aug 21, 2015

This patch adds buffer address registration/unregistration routine.

This function is mainly invoked on network device
activation (open) and deactivation (close)
in order to register/unregister the shared buffer address.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: ES information acquisition routine · 3bb025d4

Committed by Taku Izumi on Aug 21, 2015

This patch adds ES information acquisition routine.
ES information can be retrieved by issuing an information
request command. ES information includes which
receivers are in the same zone.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: platform_driver's .probe and .remove routine · 2fcbca68

Committed by Taku Izumi on Aug 21, 2015

This patch implements platform_driver's .probe and .remove
routines, and also adds board specific private data structure.

This driver registers net_device in platform_driver's .probe
routine and unregisters net_device in its .remove routine.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: Hardware cleanup routine · a18aaec2

Committed by Taku Izumi on Aug 21, 2015

This patch adds hardware cleanup routine to be
invoked at driver's .remove routine.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: Hardware initialization routine · 8cdc3f6c

Committed by Taku Izumi on Aug 21, 2015

This patch adds hardware initialization routine to be
invoked at driver's .probe routine.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fjes: Introduce FUJITSU Extended Socket Network Device driver · 658d439b

Committed by Taku Izumi on Aug 21, 2015

This patch adds the basic code of FUJITSU Extended Socket
Network Device driver.

When "PNP0C02" is found in ACPI DSDT, it evaluates "_STR"
to check if "PNP0C02" is for Extended Socket device driver
and retrieves ACPI resource information. It then creates
a platform_device.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3c59x: Add BQL support for 3c59x ethernet driver. · 4a89ba04

Committed by Loganaden Velvindron on Aug 20, 2015

This BQL patch is based on work done by Tino Reichardt.

Tested on 0000:05:00.0: 3Com PCI 3c905C Tornado at ffffc90000e6e000 by running
Flent several times.
Signed-off-by: Loganaden Velvindron <logan@elandsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ila-precompute' · b17f2964

Committed by David S. Miller on Aug 24, 2015

Tom Herbert says:

====================
ila: Precompute checksums

This patch set:
 - Adds argument to LWT build_state that holds a pointer to the fib
   configuration being applied to the new route
 - Adds support in ILA to precompute checksum difference for
   performance optimization

v2:
 - Move return argument in build_state to end of arguments

v3:
 - Update the signature for ip6_tun_build_state()
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ila: Precompute checksum difference for translations · 92b78aff

Committed by Tom Herbert on Aug 24, 2015

In the ILA build state for LWT compute the checksum difference to apply
to transport checksums that include the IPv6 pseudo header. The
difference is between the route destination (from fib6_config) and the
locator to write.
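
Why a precomputed difference is enough can be shown with one's-complement arithmetic. The sketch below is illustrative and is not the ILA code: the helper names, the example addresses and the payload are hypothetical, and inputs are assumed to have an even number of bytes.

```python
def fold(x):
    """Fold carries back into 16 bits (one's-complement addition)."""
    while x >> 16:
        x = (x & 0xFFFF) + (x >> 16)
    return x


def words_sum(data):
    """One's-complement sum of the 16-bit big-endian words in data."""
    return fold(sum(int.from_bytes(data[i:i + 2], "big")
                    for i in range(0, len(data), 2)))


def checksum(data):
    """Internet-style checksum: complement of the one's-complement sum."""
    return ~words_sum(data) & 0xFFFF


def csum_diff(old, new):
    """Difference to precompute once per route: sum(new) - sum(old)."""
    return fold(words_sum(new) + (~words_sum(old) & 0xFFFF))


def apply_diff(csum, diff):
    """Adjust an existing checksum without touching the payload."""
    return ~fold((~csum & 0xFFFF) + diff) & 0xFFFF


# Hypothetical addresses: only the upper 64 bits (the locator) change.
old_dst = bytes.fromhex("20010db8000000010000000000000001")
new_dst = bytes.fromhex("20010db8000a000b0000000000000001")
payload = b"hello world!"

diff = csum_diff(old_dst, new_dst)
adjusted = apply_diff(checksum(old_dst + payload), diff)
recomputed = checksum(new_dst + payload)
```

Here `adjusted` and `recomputed` agree, up to the two one's-complement representations of zero (0x0000 and 0xffff), so the per-packet work reduces to one addition with the stored difference.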
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

lwt: Add cfg argument to build_state · 127eb7cd

Committed by Tom Herbert on Aug 24, 2015

Add cfg and family arguments to lwt build state functions. cfg is a void
pointer and will either be a pointer to a fib_config or fib6_config
structure. The family parameter indicates which one (either AF_INET
or AF_INET6).

LWT encapsulation implementations may use the fib configuration to build
the LWT state.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24 Aug, 2015 9 commits

net: phy: add interrupt support for aquantia phy · 54cf7be9

Committed by Shaohui Xie on Aug 21, 2015

By implementing config_intr & ack_interrupt, now the phy can support
link connect/disconnect interrupt.
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'nfc-next-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next · d9893d13

Committed by David S. Miller on Aug 23, 2015

Samuel Ortiz says:

====================
NFC 4.3 pull request

This is the NFC pull request for 4.3.
With this one we have:

- A new driver for Samsung's S3FWRN5 NFC chipset. In order to
  properly support this driver, a few NCI core routines needed
  to be exported. Future drivers like Intel's Fields Peak will
  benefit from this.

- SPI support as a physical transport for STM st21nfcb.

- An additional netlink API for sending replies back to userspace
  from vendor commands.

- 2 small fixes for TI's trf7970a

- A few st-nci fixes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

route: fix breakage after moving lwtunnel state · 751a587a

Committed by Jiri Benc on Aug 21, 2015

__refcnt and related fields need to be in their own cacheline for performance
reasons. Commit 61adedf3 ("route: move lwtunnel state to dst_entry")
broke that on 32bit archs, causing BUILD_BUG_ON in dst_hold to be triggered.

This patch fixes the breakage by moving the lwtunnel state to the end of
dst_entry on 32bit archs. Unfortunately, this makes it share the cacheline
with __refcnt and may affect performance, thus further patches may be
needed.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 61adedf3 ("route: move lwtunnel state to dst_entry")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'linux-can-next-for-4.3-20150820' of... · 31fbde99

Committed by David S. Miller on Aug 23, 2015

Merge tag 'linux-can-next-for-4.3-20150820' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
this is a pull request of two patches for net-next.

The first patch is by Nik Nyby and fixes a typo in a function name. The
second patch by Lucas Stach demotes register output to debug level.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'tipc-failover-fixes' · c5f98b56

Committed by David S. Miller on Aug 23, 2015

Jon Maloy says:

====================
tipc: fix link failover/synch problems

We fix three problems with the new link failover/synch implementation,
which was introduced earlier in this release cycle. They are all related
to situations where there is a very short interval between the disabling
and enabling of interfaces.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: fix stale link problem during synchronization · 2be80c2d

Committed by Jon Paul Maloy on Aug 20, 2015

Recent changes to the link synchronization means that we can now just
drop packets arriving on the synchronizing link before the synch point
is reached. This has led to significant simplifications to the
implementation, but also turns out to have a flip side that we need
to consider.

Under unlucky circumstances, the two endpoints may end up
repeatedly dropping each other's packets, while immediately
asking for retransmission of the same packets, just to drop
them once more. This pattern will eventually be broken when
the synch point is reached on the other link, but before that,
the endpoints may have arrived at the retransmission limit
(stale counter) that indicates that the link should be broken.
We see this happen on rare occasions.

The fix for this is to not ask for retransmissions when a link is in
state LINK_SYNCHING. The fact that the link has reached this state
means that it has already received the first SYNCH packet, and that it
knows the synch point. Hence, it doesn't need any more packets until the
other link has reached the synch point, whereafter it can go ahead and
ask for the missing packets.

However, because of the reduced traffic on the synching link that
follows this change, it may now take longer to discover that the
synch point has been reached. We compensate for this by letting all
packets, on any of the links, trigger a check for synchronization
termination. This is possible because the packets themselves don't
contain any information that is needed for discovering this condition.
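
The retransmission rule described above can be sketched as a small decision function. This is a model under assumptions, not TIPC code: the function name and the plain integer sequence-number comparison (no wrap-around handling) are hypothetical, and only the state names come from this commit message.

```python
from enum import Enum, auto


class LinkState(Enum):
    LINK_ESTABLISHED = auto()
    LINK_SYNCHING = auto()


def should_request_retransmission(state, expected_seqno, received_seqno):
    """Decide whether a sequence gap should trigger a retransmission request.

    Assumption: a gap exists when the received sequence number is ahead of
    the expected one. While the link is in LINK_SYNCHING the request is
    suppressed, because the packets would only be dropped again.
    """
    gap_detected = received_seqno > expected_seqno
    if not gap_detected:
        return False
    return state is not LinkState.LINK_SYNCHING


synching = should_request_retransmission(LinkState.LINK_SYNCHING, 10, 15)
established = should_request_retransmission(LinkState.LINK_ESTABLISHED, 10, 15)
```

Suppressing the request during synchronization is what keeps the stale counter from climbing towards the retransmission limit.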
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: interrupt link synchronization when a link goes down · 5ae2f8e6

Committed by Jon Paul Maloy on Aug 20, 2015

When we introduced the new link failover/synch mechanism
in commit 6e498158
("tipc: move link synch and failover to link aggregation level"),
we missed the case when the non-tunnel link goes down during the link
synchronization period. In this case the tunnel link will remain in
state LINK_SYNCHING, leading to unpredictable behavior when
the failover procedure is initiated.

In this commit, we ensure that the node and remaining link go
back to regular communication state (SELF_UP_PEER_UP/LINK_ESTABLISHED)
when one of the parallel links goes down. We also ensure that we don't
re-enter synch mode if subsequent SYNCH packets arrive on the remaining
link.
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: eliminate risk of premature link setup during failover · 17b20630

Committed by Jon Paul Maloy on Aug 20, 2015

When a link goes down, and there is still a working link towards its
destination node, a failover is initiated, and the failed link is not
allowed to re-establish until that procedure is finished. To ensure
this, the concerned link endpoints are set to state LINK_FAILINGOVER,
and the node endpoints to NODE_FAILINGOVER during the failover period.

However, if the link reset is due to a disabled bearer, the
corresponding link endpoint is deleted, and only the node endpoint knows
about the ongoing failover. Now, if the disabled bearer is re-enabled
during the failover period, the discovery mechanism may create a new
link endpoint that is ready to be established, even though this is not
permitted. This situation may cause both the ongoing failover and any
subsequent link synchronization to fail.

In this commit, we ensure that a newly created link goes directly to
state LINK_FAILINGOVER if the corresponding node state is
NODE_FAILINGOVER. This eliminates the problem described above.

Furthermore, we tighten the criteria for which packets are allowed
to end a failover state in the function tipc_node_check_state().
By checking that the receiving link is up and running, instead of just
checking that it is not in failover mode, we eliminate the risk that
protocol packets from the re-created link may cause the failover to
be prematurely terminated.
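
Both changes can be sketched as two small predicates. This is a model, not the TIPC implementation: `LINK_INITIAL` is a hypothetical placeholder for whatever state a new link normally starts in, the function names are invented, and treating LINK_ESTABLISHED as "up and running" is an assumption.

```python
from enum import Enum, auto


class NodeState(Enum):
    SELF_UP_PEER_UP = auto()
    NODE_FAILINGOVER = auto()


class LinkState(Enum):
    LINK_INITIAL = auto()       # hypothetical placeholder state
    LINK_ESTABLISHED = auto()
    LINK_FAILINGOVER = auto()


def initial_link_state(node_state):
    """State given to a newly created link endpoint."""
    if node_state is NodeState.NODE_FAILINGOVER:
        # The node is mid-failover, so the new link must not establish yet.
        return LinkState.LINK_FAILINGOVER
    return LinkState.LINK_INITIAL


def may_end_failover_old(receiving_link_state):
    """Earlier criterion: the receiving link is merely not failing over."""
    return receiving_link_state is not LinkState.LINK_FAILINGOVER


def may_end_failover_new(receiving_link_state):
    """Tightened criterion: the receiving link must be up and running."""
    return receiving_link_state is LinkState.LINK_ESTABLISHED
```

A re-created link that has not yet been established passes the old check but fails the new one, which is the case the tightened criterion is meant to exclude.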
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'nps_enet_fixes' · 7f629be1

Committed by David S. Miller on Aug 23, 2015

Noam Camus says:

====================
*** nps_enet fixups ***

Change v2
TX done is handled back with NAPI poll.

Change v1
This patch set is a bunch of fixes to make nps_enet work correctly with
all platforms, i.e. real device, emulation system, and simulation system.
The main trigger for this patch set was that in our emulation system
the TX end interrupt is "edge-sensitive" and therefore we cannot use the
cause register since it is not sticky.
Also:
TX is handled during HW interrupt context and not NAPI job.
race with TX done was fixed.
added acknowledge for TX when device is "level sensitive".
enable drop of control frames which is not needed for regular usage.

So most of this patch set is about TX handling, which is now more complete.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

openanolis / cloud-kernel · synced successfully more than 1 year ago
