Commits · 718e14bb292a2e16b506133d191886110417df51 · openeuler / raspberrypi-kernel

Jan 14, 2017 16 commits

Merge branch 'tcp-RACK-fast-recovery' · 718e14bb

Authored by David S. Miller on Jan 13, 2017

Yuchung Cheng says:

====================
tcp: RACK fast recovery

The patch set enables RACK loss detection (draft-ietf-tcpm-rack-01)
to trigger fast recovery with a reordering timer.

Previously RACK ran in an auxiliary mode where it was
used to detect packet losses only once recovery had been triggered by
other algorithms (e.g., FACK). By inspecting packet timestamps,
RACK can start ACK-driven repairs in a timely manner. A few similar heuristics
are no longer needed and are either removed or disabled to reduce
the complexity of the Linux TCP loss recovery engine:

  1. FACK (Forward Acknowledgement)
  2. Early Retransmit (RFC5827)
  3. thin_dupack (fast recovery on single DUPACK for thin-streams)
  4. NCR (Non-Congestion Robustness, RFC4653)
  5. Forward Retransmit

After this change, Linux's loss recovery algorithms consist of
  1. Conventional DUPACK threshold approach (RFC6675)
  2. RACK and Tail Loss Probe (draft-ietf-tcpm-rack-01)
  3. RTO plus F-RTO extension (RFC5682)

The patch set has been tested on Google servers extensively and
presented in several IETF meetings. The data suggests that RACK
successfully improves recovery performance:
https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-draft-ietf-tcpm-rack-01.pdf
https://www.ietf.org/proceedings/96/slides/slides-96-tcpm-3.pdf
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

718e14bb

tcp: disable fack by default · 94bdc978

Authored by Yuchung Cheng on Jan 12, 2017

This patch disables FACK by default as RACK is the successor of FACK
(inspired by the insights behind FACK).

FACK[1] in Linux works as follows: a packet P is deemed lost,
if packet Q of higher sequence is s/acked and P and Q are distant
by at least dupthresh number of packets in sequence space.
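
The FACK rule above can be sketched as a small predicate. This is only an illustration, not the kernel code: the function name is hypothetical, and packets are identified by their position in the send order rather than by byte sequence numbers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: p and highest_sacked are packet positions in
 * send order (an assumption made for this sketch).
 * P is deemed lost once some packet at least dupthresh positions
 * ahead of it has been (s)acked.
 */
bool fack_deems_lost(uint32_t p, uint32_t highest_sacked, uint32_t dupthresh)
{
    return highest_sacked > p && highest_sacked - p >= dupthresh;
}
```

With dupthresh = 3, a single SACK for packet 4 is enough to mark packet 1 lost, which is why one SACK may trigger fast recovery.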

FACK is more aggressive than the IETF recommended recovery for SACK
(RFC3517 A Conservative Selective Acknowledgment (SACK)-based Loss
 Recovery Algorithm for TCP), because a single SACK may trigger
fast recovery. This obviously won't work well with reordering so
FACK is dynamically disabled upon detecting reordering.

RACK supersedes FACK by using time distance instead of sequence
distance. On reordering, RACK waits for a quarter of an RTT after receiving
a single SACK before starting recovery. (The timer can be made more
adaptive in the future by measuring reordering distance in time,
but currently RTT/4 seems to work well.) Once the recovery starts,
RACK behaves almost like FACK because it reduces the reordering
window to 1ms, so it fast retransmits quickly. In addition RACK
can detect lost retransmissions as it does not care about the packet
sequences (being repeated or not), which is extremely useful when
the connection is going through a traffic policer.

Google server experiments indicate that disabling FACK after enabling
RACK has negligible impact on the overall loss recovery performance
with more reordering events detected.  But we still keep the FACK
implementation as a backup in case RACK has bugs and needs to be disabled.

[1] M. Mathis, J. Mahdavi, "Forward Acknowledgment: Refining
TCP Congestion Control," In Proceedings of SIGCOMM '96, August 1996.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

94bdc978

tcp: remove thin_dupack feature · 4a7f6009

Authored by Yuchung Cheng on Jan 12, 2017

Thin stream DUPACK is to start fast recovery on only one DUPACK
provided the connection is a thin stream (i.e., low inflight).  But
this older feature is now subsumed by RACK. If a connection
receives only a single DUPACK, RACK would arm a reordering timer
and soon start fast recovery instead of timeout if no further
ACKs are received.

The socket option (THIN_DUPACK) is kept as a nop for compatibility.
Note that this patch does not change another thin-stream feature
which enables linear RTO, although it might be good to generalize
that in the future (i.e., linear RTO for the first say 3 retries).
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4a7f6009

tcp: remove RFC4653 NCR · ac229dca

Authored by Yuchung Cheng on Jan 12, 2017

This patch removes the (partial) implementation of the aggressive
limited transmit in RFC4653 TCP Non-Congestion Robustness (NCR).

NCR is a mitigation to the problem created by the dynamic
DUPACK threshold.  The current adaptive DUPACK threshold
(tp->reordering) could cause timeouts by preventing fast recovery.
For example, if the last packet of a cwnd burst was reordered, the
threshold will be set to the size of cwnd. But if next application
burst is smaller than threshold and has drops instead of reorderings,
the sender would not trigger fast recovery but instead resorts to a
timeout recovery.
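
The arithmetic behind this example can be sketched as follows. The helper is hypothetical and deliberately simplified: it assumes each delivered packet of the burst produces at most one DUPACK, so a burst can never generate more DUPACKs than it has delivered packets.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: can a burst of `burst` packets with `lost`
 * drops ever reach the DUPACK threshold? Assumes at most one DUPACK
 * per delivered packet (a simplification for this sketch).
 */
bool dupthresh_can_trigger(uint32_t burst, uint32_t lost, uint32_t dupthresh)
{
    if (lost == 0 || lost > burst)
        return false;
    return burst - lost >= dupthresh;
}
```

If reordering raised the threshold to a cwnd of 10, a later 5-packet burst with one drop yields at most 4 DUPACKs, so the threshold is never reached and only a timeout can repair the loss; with the default threshold of 3 the same burst could still enter fast recovery.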

NCR mitigates this issue by checking the number of DUPACKs against
the current flight size additionally. The technique is similar to
the early retransmit RFC.

With RACK loss detection, this mitigation is not needed, because RACK
does not use DUPACK threshold to detect losses. RACK arms a reordering
timer to fire at most a quarter RTT later to start fast recovery.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ac229dca

tcp: remove early retransmit · bec41a11

Authored by Yuchung Cheng on Jan 12, 2017

This patch removes the support of RFC5827 early retransmit (i.e.,
fast recovery on small inflight with <3 dupacks) because it is
subsumed by the new RACK loss detection. More specifically when
RACK receives DUPACKs, it'll arm a reordering timer to start fast
recovery after a quarter of (min)RTT, hence it covers the early
retransmit except RACK does not limit itself to specific inflight
or dupack numbers.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bec41a11

tcp: remove forward retransmit feature · 840a3cbe

Authored by Yuchung Cheng on Jan 12, 2017

Forward retransmit is an esoteric feature in RFC3517 (condition(3)
in the NextSeg()). Basically if a packet is not considered lost by
the current criteria (# of dupacks etc), but the congestion window
has room for more packets, then retransmit this packet.

However it actually conflicts with the rest of recovery design. For
example, when reordering is detected we want to be conservative
in retransmitting packets but forward-retransmit feature would
break that to force more retransmission. Also the implementation is
fairly complicated inside the retransmission logic inducing extra
iterations in the write queue. With RACK, losses are detected in a
timely manner and this heuristic is no longer necessary. Therefore this
patch removes the feature.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

840a3cbe

tcp: extend F-RTO to catch more spurious timeouts · 89fe18e4

Authored by Yuchung Cheng on Jan 12, 2017

Current F-RTO reverts cwnd reset whenever a never-retransmitted
packet was (s)acked. The timeout can be declared spurious because
the packets acknowledged with this ACK were transmitted before the
timeout, so clearly not all the packets were lost and the cwnd reset
was not warranted.

This nice detection does not really depend on F-RTO internals. This
patch applies the detection universally. On Google servers this
change detected 20% more spurious timeouts.
Suggested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

89fe18e4
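
As a rough illustration of the detection described in the commit above, the check reduces to asking whether an ACK covers any packet that was never retransmitted. The struct and function names below are hypothetical and do not mirror the kernel's data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-packet record for the packets an ACK newly (s)acks. */
struct acked_pkt {
    bool ever_retransmitted;
};

/*
 * A never-retransmitted packet must have been sent before the timeout,
 * so its delivery shows that not everything in flight was lost.
 */
bool ack_proves_spurious_timeout(const struct acked_pkt *pkts, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!pkts[i].ever_retransmitted)
            return true;
    return false;
}
```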

tcp: enable RACK loss detection to trigger recovery · a0370b3f

Authored by Yuchung Cheng on Jan 12, 2017

This patch changes two things:

1. Start fast recovery with RACK in addition to other heuristics
   (e.g., DUPACK threshold, FACK). Prior to this change RACK
   is enabled to detect losses only after the recovery has
   started by other algorithms.

2. Disable TCP early retransmit. RACK subsumes the early retransmit
   with the new reordering timer feature. A later patch in this
   series removes the early retransmit code.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a0370b3f

tcp: check undo conditions before detecting losses · 98e36d44

Authored by Yuchung Cheng on Jan 12, 2017

Currently RACK would mark loss before the undo operations in TCP
loss recovery. This could incorrectly identify real losses as
spurious. For example a sender first experiences a delay spike and
then eventually some packets are lost due to buffer overrun.
In this case, the sender should perform fast recovery because not all
the packets were lost.

But the sender may first trigger a (spurious) RTO and reset
cwnd to 1. The following ACKs may be used to mark real losses by
tcp_rack_mark_lost. Then in tcp_process_loss this ACK could trigger
F-RTO undo condition and unmark real losses and revert the cwnd
reduction. If there are no more ACKs coming back, eventually the
sender would timeout again instead of performing fast recovery.

The patch fixes this incorrect process by always performing
the undo checks before detecting losses.

Fixes: 4f41b1c5 ("tcp: use RACK to detect losses")
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

98e36d44

tcp: use sequence to break TS ties for RACK loss detection · 1d0833df

Authored by Yuchung Cheng on Jan 12, 2017

The packets inside a jumbo skb (e.g., TSO) share the same skb
timestamp, even though they are sent sequentially on the wire. Since
RACK is based on time, it cannot detect that some packets inside the
same skb are lost.  However, we can leverage the packet sequence
numbers as extended timestamps to detect losses. Therefore, when
RACK timestamp is identical to skb's timestamp (i.e., one of the
packets of the skb is acked or sacked), we use the sequence numbers
of the acked and unacked packets to break ties.

We can use the same sequence logic to advance RACK xmit time as
well to detect more losses and avoid timeout.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1d0833df
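
The tie-breaking idea in the commit above can be sketched as a comparison that falls back to sequence numbers when two timestamps are equal. The names are hypothetical and timestamps are plain integers here, which is an assumption of this sketch rather than the kernel's representation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe test that 32-bit TCP sequence number a comes after b. */
bool seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/*
 * Was packet 1 sent after packet 2? Compare transmit timestamps first;
 * packets from the same skb share a timestamp, so the sequence number
 * acts as an extended timestamp to break the tie.
 */
bool rack_sent_after(uint64_t t1, uint64_t t2, uint32_t seq1, uint32_t seq2)
{
    return t1 > t2 || (t1 == t2 && seq_after(seq1, seq2));
}
```

With this ordering, a SACK for a later segment of a TSO skb still counts as evidence that the earlier segments of that skb were sent before it.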

tcp: add reordering timer in RACK loss detection · 57dde7f7

Authored by Yuchung Cheng on Jan 12, 2017

This patch makes RACK install a reordering timer when it suspects
some packets might be lost, but wants to delay the decision
a little bit to accommodate reordering.

It does not create a new timer but instead repurposes the existing
RTO timer, because both are meant to retransmit packets.
Specifically it arms a timer ICSK_TIME_REO_TIMEOUT when
the RACK timing check fails. The wait time is set to

  RACK.RTT + RACK.reo_wnd - (NOW - Packet.xmit_time) + fudge

This translates to expecting that a packet (Packet) should take
(RACK.RTT + RACK.reo_wnd + fudge) to be delivered after it was sent.

When there are multiple packets that need a timer, we use one timer
with the maximum timeout. Therefore the timer conservatively uses
the maximum window to expire N packets by one timeout, instead of
N timeouts to expire N packets sent at different times.

The fudge factor is 2 jiffies to ensure that when the timer fires, all
the suspected packets would exceed the deadline and be marked lost
by tcp_rack_detect_loss(). It has to be at least 1 jiffy because the
clock may tick between calling icsk_reset_xmit_timer(timeout) and
actually hanging the timer. The next jiffy is to lower-bound the timeout
to 2 jiffies when reo_wnd is < 1ms.
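
The wait-time formula and the single-timer rule can be sketched as below. This is a minimal illustration with hypothetical names; all times are in microseconds and the fudge is passed in as a parameter, whereas the patch expresses the fudge in jiffies.

```c
#include <stdint.h>

/*
 * Remaining wait for one packet:
 *   RACK.RTT + RACK.reo_wnd - (NOW - Packet.xmit_time)
 * A result <= 0 means the packet is already past its deadline.
 */
int64_t rack_remaining_us(int64_t rtt_us, int64_t reo_wnd_us,
                          int64_t now_us, int64_t xmit_us)
{
    return rtt_us + reo_wnd_us - (now_us - xmit_us);
}

/*
 * One timer for n suspected packets: take the largest remaining wait
 * and add the fudge. Returns 0 when no packet still needs to wait.
 */
int64_t rack_reo_timeout_us(int64_t rtt_us, int64_t reo_wnd_us,
                            int64_t now_us, const int64_t *xmit_us,
                            int n, int64_t fudge_us)
{
    int64_t max_rem = 0;

    for (int i = 0; i < n; i++) {
        int64_t rem = rack_remaining_us(rtt_us, reo_wnd_us,
                                        now_us, xmit_us[i]);
        if (rem > max_rem)
            max_rem = rem;
    }
    return max_rem > 0 ? max_rem + fudge_us : 0;
}
```

For example, with an RTT of 40 ms, a reordering window of 10 ms and packets sent 40 ms, 10 ms and 60 ms ago, the per-packet waits are 10 ms, 40 ms and -10 ms, so one timer is armed for 40 ms plus the fudge.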

When the reordering timer fires (tcp_rack_reo_timeout): If we aren't
in Recovery we'll enter fast recovery and force fast retransmit.
This is very similar to the early retransmit (RFC5827) except RACK
is not constrained to only enter recovery for small outstanding
flights.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

57dde7f7

tcp: record most recent RTT in RACK loss detection · deed7be7

Authored by Yuchung Cheng on Jan 12, 2017

Record the most recent RTT in RACK. It is often identical to the
"ca_rtt_us" values in tcp_clean_rtx_queue. But when the packet has
been retransmitted, RACK chooses to believe the ACK is for the
(latest) retransmitted packet if the RTT is over minimum RTT.

This requires passing the arrival time of the most recent ACK to
RACK routines. The timestamp is now recorded in the "ack_time"
in tcp_sacktag_state during the ACK processing.

This patch does not change the RACK algorithm itself. It only adds
the RTT variable to prepare the next main patch.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

deed7be7

tcp: new helper for RACK to detect loss · e636f8b0

Authored by Yuchung Cheng on Jan 12, 2017

Create a new helper tcp_rack_detect_loss to prepare the upcoming
RACK reordering timer patch.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e636f8b0

tcp: new helper function for RACK loss detection · db8da6bb

Authored by Yuchung Cheng on Jan 12, 2017

Create a new helper tcp_rack_mark_skb_lost to prepare the
upcoming RACK reordering timer support.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

db8da6bb

liquidio: use fallback for selecting txq · 7410191a

Authored by Satanand Burla on Jan 12, 2017

Remove assignment to ndo_select_queue so that fallback is used for
selecting txq.  Also remove the now-useless function that used to be
assigned to ndo_select_queue.
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7410191a

net: dsa: mv88e6xxx: add EEPROM support to 6390 · 98fc3c6f

Authored by Vivien Didelot on Jan 12, 2017

The Marvell 6352 chip has an 8-bit address/16-bit data EEPROM access.
The Marvell 6390 chip has a 16-bit address/8-bit data EEPROM access.

This patch implements the 8-bit data EEPROM access in the mv88e6xxx
driver and adds its support to chips of the 6390 family.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

98fc3c6f

Jan 13, 2017 11 commits

ipv6: sr: static percpu allocation for hmac_ring · 717ac5ce

Authored by Eric Dumazet on Jan 12, 2017

Current allocations are not NUMA aware, and lack proper
cleanup in case of error.

It is perfectly fine to use static per cpu allocations for 256 bytes
per cpu.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Lebrun <david.lebrun@uclouvain.be>
Acked-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>

717ac5ce

ipmr: improve hash scalability · 8fb472c0

Authored by Nikolay Aleksandrov on Jan 12, 2017

Recently we started using ipmr with thousands of entries and easily hit
soft lockups on smaller devices. The reason is that the hash function
uses the high order bits from the src and dst, but those don't change in
many common cases. Also, the hash table has only 64 elements, so with
thousands of entries it doesn't scale at all.
This patch migrates the hash table to rhashtable, and in particular the
rhl interface which allows for duplicate elements to be chained because
of the MFC_PROXY support (*,G; *,*,oif cases) which allows for multiple
duplicate entries to be added with different interfaces (IMO wrong, but
it's been in for a long time).
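
To see why such a hash degrades, consider a toy hash that mixes only the high-order bits of the source and group addresses into 64 buckets. The function below is a hypothetical stand-in written for this illustration, not the hash that ipmr actually used, and it treats addresses as host-order integers.

```c
#include <stdint.h>

#define OLD_BUCKETS 64u

/*
 * Hypothetical hash for illustration only: uses the top bits of the
 * source and group addresses and ignores everything else.
 */
uint32_t high_bits_hash(uint32_t src, uint32_t grp)
{
    return ((src >> 24) ^ (grp >> 26)) & (OLD_BUCKETS - 1);
}
```

Entries such as (10.0.0.1, 239.1.1.1) and (10.0.255.2, 239.1.2.255) differ only in low-order bits, so they land in the same bucket; with thousands of such routes a single chain ends up holding nearly everything, and every lookup walks it linearly.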

And here are some results from tests I've run in a VM:
 mr_table size (default, allocated for all namespaces):
  Before                    After
   49304 bytes               2400 bytes

 Add 65000 routes (the diff is much larger on smaller devices):
  Before                    After
   1m42s                     58s

 Forwarding 256 byte packets with 65000 routes (test done in a VM):
  Before                    After
   3 Mbps / ~1465 pps        122 Mbps / ~59000 pps

As a bonus we no longer see the soft lockups on smaller devices which
showed up even with 2000 entries before.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8fb472c0

secure_seq: fix sparse errors · c1ce1560

Authored by Eric Dumazet on Jan 11, 2017

Fixes the following warnings:

net/core/secure_seq.c:125:28: warning: incorrect type in argument 1
(different base types)
net/core/secure_seq.c:125:28:    expected unsigned int const [unsigned]
[usertype] a
net/core/secure_seq.c:125:28:    got restricted __be32 [usertype] saddr
net/core/secure_seq.c:125:35: warning: incorrect type in argument 2
(different base types)
net/core/secure_seq.c:125:35:    expected unsigned int const [unsigned]
[usertype] b
net/core/secure_seq.c:125:35:    got restricted __be32 [usertype] daddr
net/core/secure_seq.c:125:43: warning: cast from restricted __be16
net/core/secure_seq.c:125:61: warning: restricted __be16 degrades to
integer

Fixes: 7cd23e53 ("secure_seq: use SipHash in place of MD5")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c1ce1560
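
Warnings like the ones in the commit above come from sparse's checking of endian-annotated types. The sketch below shows the general pattern of silencing them with an explicit forced cast; it is a userspace approximation with hypothetical type and function names, and it is not claimed to be the change this commit made.

```c
#include <stdint.h>

/* The annotations only mean something to sparse (__CHECKER__);
 * for a normal compiler they expand to nothing. */
#ifndef __bitwise
# ifdef __CHECKER__
#  define __bitwise __attribute__((bitwise))
# else
#  define __bitwise
# endif
#endif
#ifndef __force
# ifdef __CHECKER__
#  define __force __attribute__((force))
# else
#  define __force
# endif
#endif

/* Hypothetical stand-in for a big-endian 16-bit kernel type. */
typedef uint16_t __bitwise be16_t;

/* The hash input only needs the raw bits, so the casts state that
 * dropping the endianness annotation is intentional. */
uint32_t pack_ports(be16_t sport, be16_t dport)
{
    return (__force uint32_t)sport << 16 | (__force uint32_t)dport;
}
```

The forced cast changes no generated code; it only records that mixing the annotated value into plain integer arithmetic is deliberate.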

liquidio VF: reduce load time of module · a8ac1a55

Authored by Prasad Kanneganti on Jan 11, 2017

Reduce the load time of the VF driver by decreasing the wait time between
iterations of the loop that polls for a mailbox response from the PF. Also
change the wait time units from jiffies to milliseconds.
Signed-off-by: Prasad Kanneganti <prasad.kanneganti@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a8ac1a55

liquidio: remove unnecessary code · cb2336b5

Authored by Felix Manlunas on Jan 11, 2017

Remove code that's no longer needed.  It used to serve a purpose, which was
to fix a link-related bug.  For a while now, the NIC firmware has had a
more elegant fix for that bug.
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cb2336b5

tilepro: Fix non-void return from void function · b65b09aa

Authored by Joe Perches on Jan 11, 2017

commit bc1f4470 ("net: make ndo_get_stats64 a void function")
mistakenly used a return value for this void conversion.

Fix it.
Signed-off-by: Joe Perches <joe@perches.com>
cc: stephen hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

b65b09aa

Merge branch 'mdio-gpio-next' · 72d13c15

Authored by David S. Miller on Jan 12, 2017

Florian Fainelli says:

====================
net: mdio-gpio: Use modern GPIO helpers

This patch series modernizes the mdio-gpio and makes it switch to the
latest and greatest API for manipulating GPIO lines, thus allowing
some simplifications in the driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

72d13c15

net: mdio-gpio: Use gpio subsystem to handle low-active pins · 52aab18e

Authored by Guenter Roeck on Jan 11, 2017

gpiod functions support handling low-active pins, so we can move
this code out of this driver into the gpio subsystem and simplify
the code a bit.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

52aab18e

net: mdio-gpio: Convert to use gpiod functions where possible · 7e5fbd1e

Authored by Guenter Roeck on Jan 11, 2017

Using gpiod functions lets us use functionality which is not available
with gpio functions.

There is no gpiod function to match devm_gpio_request_one, so leave it
in place and use gpio_to_desc() to convert absolute pin numbers to gpio
descriptors.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7e5fbd1e

net: mdio-gpio: Use devm_gpio_request_one instead of devm_gpio_request · 08d9665c

Authored by Guenter Roeck on Jan 11, 2017

Using devm_gpio_request_one lets us request gpio pins with initial state
in one go.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

08d9665c

cdc-ether: usbnet_cdc_zte_status() can be static · 37c9782c

Authored by Wei Yongjun on Jan 12, 2017

Fixes the following sparse warning:

drivers/net/usb/cdc_ether.c:469:6: warning:
 symbol 'usbnet_cdc_zte_status' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

37c9782c

Jan 12, 2017 13 commits

tools: psock_lib: harden socket filter used by psock tests · 4d7b9dc1

Authored by Sowmini Varadhan on Jan 12, 2017

The filter added by sock_setfilter is intended to only permit
packets matching the pattern set up by create_payload(), but
we only check the ip_len, and a single test-character in
the IP packet to ensure this condition.

Harden the filter by adding additional constraints so that we only
permit UDP/IPv4 packets that meet the ip_len and test-character
requirements. Include the bpf_asm src as a comment, in case this
needs to be enhanced in the future.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

4d7b9dc1

lwt_bpf: bpf_lwt_prog_cmp() can be static · 79471b10

Authored by Wei Yongjun on Jan 12, 2017

Fixes the following sparse warning:

net/core/lwt_bpf.c:355:5: warning:
 symbol 'bpf_lwt_prog_cmp' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

79471b10

Merge branch 's390-qeth-next' · 5df285f6

Authored by David S. Miller on Jan 12, 2017

Ursula Braun says:

====================
s390: qeth patches

Yesterday I came up with 13 qeth patches. Since you have not been
happy with the 13th patch, I want to make sure that at least the
remaining 12 qeth patches can be applied to net-next. Here is the
resend of them.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

5df285f6

s390/qeth: fix retrieval of vipa and proxy-arp addresses · e48b9eaa

Authored by Ursula Braun on Jan 12, 2017

qeth devices in layer3 mode need a separate handling of vipa and proxy-arp
addresses. vipa and proxy-arp addresses processed by qeth can be read from
userspace. Since commit 5f78e29c ("qeth: optimize IP handling
in rx_mode callback"), the retrieval of vipa and proxy-arp addresses has
been broken if more than one vipa or proxy-arp address is set.

The qeth code used local variable "int i" for 2 different purposes. This
patch now uses 2 separate local variables of type "int".
While touching these functions hash_for_each_safe() is converted to
hash_for_each(), since there is no removal of hash entries.
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reference-ID: RQM 3524
Signed-off-by: David S. Miller <davem@davemloft.net>

e48b9eaa

s390/qeth: issue STARTLAN as first IPA command · 10340510

Authored by Julian Wiedmann on Jan 12, 2017

STARTLAN needs to be the first IPA command after MPC initialization
completes.
So move the qeth_send_startlan() call from the layer disciplines
into the core path, right after the MPC handshake.
While at it, replace the magic LAN OFFLINE return code
with the existing enum.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

10340510

s390/qeth: shuffle MAC management functions around · ac988d78

Authored by Julian Wiedmann on Jan 12, 2017

Move all MAC utility functions in one place, and drop the
forward declarations.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ac988d78

s390/qeth: extract qeth_l2_remove_mac() · 979d7929

Authored by Julian Wiedmann on Jan 12, 2017

This matches qeth_l2_write_mac().
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

979d7929

s390/qeth: consolidate errno translation · 754e0b8d

Authored by Julian Wiedmann on Jan 12, 2017

Consolidate errno handling for MAC management: Instead of doing this in every
caller, do it in one place.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Suggested-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

754e0b8d

s390/qeth: don't convert return code twice · 4b764d1d

Authored by Julian Wiedmann on Jan 12, 2017

qeth_l2_send_groupmac() already translates the return code, so
calling qeth_setdel_makerc() a second time only produces garbage.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4b764d1d

s390/qeth: drop qeth_l2_del_all_macs() parameter · c07cbf2e

Authored by Julian Wiedmann on Jan 12, 2017

The only caller passes del = 0, so remove both the parameter and
the code that handles != 0.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c07cbf2e

s390/qeth: Remove QETH_IP_HEADER_SIZE · c2a7ee2a

Authored by Julian Wiedmann on Jan 12, 2017

Remove unused define QETH_IP_HEADER_SIZE.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c2a7ee2a

s390/qeth: Allow reading hsuid in state DOWN · dadc08c7

Authored by Julian Wiedmann on Jan 12, 2017

Accessing the current hsuid via card->options.hsuid is perfectly
fine, even when the card is DOWN.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dadc08c7

s390/qeth: display warning for OSA3 RX/TX checksum offloading · dae84c8e

Authored by Thomas Richter on Jan 12, 2017

When RX/TX checksum offloading is turned on and the adapter is
an OSA 3 card in layer 3 mode, the checksum offloading is only
performed when both peers use different adapters. If both peers
share an OSA 3 card, communication is a memory copy and
checksum offloading is not performed.

This patch adds a warning to inform the administrator.

OSA 3 in layer 2 mode does not offer the RX/TX checksum
offload feature.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dae84c8e