Commits · 357b6cc5834eabc1be7c28a9faae7da061df097d · openeuler / Kernel

19 Mar, 2020 · 13 commits

netfilter: revert introduction of egress hook · 357b6cc5

Committed by Daniel Borkmann on Mar 18, 2020

This reverts the following commits:

  8537f786 ("netfilter: Introduce egress hook")
  5418d388 ("netfilter: Generalize ingress hook")
  b030f194 ("netfilter: Rename ingress hook include file")

From the discussion in [0], the author's main motivation to add a hook
in fast path is for an out of tree kernel module, which is a red flag
to begin with. Other mentioned potential use cases like NAT{64,46}
are future extensions w/o concrete code in the tree yet. Revert as
suggested [1] given the weak justification to add more hooks to critical
fast-path.

  [0] https://lore.kernel.org/netdev/cover.1583927267.git.lukas@wunner.de/
  [1] https://lore.kernel.org/netdev/20200318.011152.72770718915606186.davem@davemloft.net/

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Miller <davem@davemloft.net>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Nacked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 's390-qeth-next' · ce7964bd

Committed by David S. Miller on Mar 18, 2020

Julian Wiedmann says:

====================
s390/qeth: updates 2020-03-18

please apply the following patch series for qeth to netdev's net-next
tree.

This consists of three parts:
1) support for __GFP_MEMALLOC,
2) several ethtool enhancements (.set_channels, SW Timestamping),
3) the usual cleanups.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: use dev->reg_state · cd652be5

Committed by Julian Wiedmann on Mar 18, 2020

To check whether a netdevice has already been registered, look at
NETREG_REGISTERED to replace some hacks I added a while ago.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: remove gratuitous NULL checks · 5bcd8ad9

Committed by Julian Wiedmann on Mar 18, 2020

qeth_do_ioctl() is only reached through our own net_device_ops, so we
can trust that dev->ml_priv still contains what we put there earlier.

qeth_bridgeport_an_set() is an internal function that doesn't require
such sanity checks.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: add phys_to_virt() translation for AOB · 86e7a4e4

Committed by Julian Wiedmann on Mar 18, 2020

Data addresses in the AOB are absolute, and need to be translated before
being fed into kmem_cache_free(). Currently this phys_to_virt() is a no-op.
Also see commit 2db01da8 ("s390/qdio: fill SBALEs with absolute addresses").
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: don't report hard-coded driver version · 54e73b9c

Committed by Julian Wiedmann on Mar 18, 2020

Versions are meaningless for an in-kernel driver.
Instead use the UTS_RELEASE that is set by ethtool_get_drvinfo().

Cc: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: add SW timestamping support for IQD devices · 8d145da2

Committed by Julian Wiedmann on Mar 18, 2020

This adds support for SOF_TIMESTAMPING_TX_SOFTWARE.
No support for non-IQD devices, since they orphan the skb in their xmit
path.

To play nice with TX bulking, set the timestamp when the buffer that
contains the skb(s) is actually flushed out to HW.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: balance the TX queue selection for IQD devices · 5d8ce41c

Committed by Julian Wiedmann on Mar 18, 2020

For ucast traffic, qeth_iqd_select_queue() falls back to
netdev_pick_tx(). This will potentially use skb_tx_hash() to distribute
the flow over all active TX queues - so txq 0 is a valid selection, and
qeth_iqd_select_queue() needs to check for this and put it on some other
queue. As a result, the distribution for ucast flows is unbalanced and
hits QETH_IQD_MIN_UCAST_TXQ more heavily than the other queues.

Open-coding a custom variant of skb_tx_hash() isn't an option, since
netdev_pick_tx() also gives us e.g. access to XPS. But we can pull a
little trick: add a single TC class that excludes the mcast txq, and
thus encourage skb_tx_hash() to not pick the mcast txq.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
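
As a rough illustration of the imbalance described in this commit, the following userspace C sketch compares the two selection strategies. It is not the qeth code: the function names and the MCAST_TXQ / MIN_UCAST_TXQ constants are hypothetical, and a plain modulo stands in for whatever skb_tx_hash() actually computes.

```c
#include <stdint.h>

/* Hypothetical constants: queue 0 is the dedicated mcast queue. */
#define MCAST_TXQ 0u
#define MIN_UCAST_TXQ 1u

/* Old behaviour (sketch): hash over all queues, then move any flow that
 * landed on the mcast queue onto the first ucast queue. */
static unsigned int pick_txq_with_remap(uint32_t flow_hash, unsigned int num_txq)
{
	unsigned int txq = flow_hash % num_txq;

	return txq == MCAST_TXQ ? MIN_UCAST_TXQ : txq;
}

/* New idea (sketch): hash only over the ucast range [1, num_txq), so the
 * mcast queue is never picked. The caller must guarantee num_txq >= 2. */
static unsigned int pick_txq_ucast_range(uint32_t flow_hash, unsigned int num_txq)
{
	unsigned int ucast_count = num_txq - MIN_UCAST_TXQ;

	return MIN_UCAST_TXQ + flow_hash % ucast_count;
}
```

With 4 queues and uniformly distributed hashes, the remapping variant sends every hash that lands on queue 0 or queue 1 to queue 1, so that queue carries twice the share of queues 2 and 3. Hashing over the ucast range spreads the flows evenly.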

s390/qeth: allow configuration of TX queues for IQD devices · 66cddf10

Committed by Julian Wiedmann on Mar 18, 2020

Similar to the support for z/VM NICs, but we need to take extra care
about the dedicated mcast queue:

1. netdev_pick_tx() is unaware of this limitation and might select the
   mcast txq. Catch this.
2. require at least _two_ TX queues - one for ucast, one for mcast.
3. when reducing the number of TX queues, there's a potential race
   where netdev_cap_txqueue() over-rules the selected txq index and
   falls back to index 0. This would place ucast traffic on the mcast
   queue, and result in TX errors.
   So for IQD, reject a reduction while the interface is running.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
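
The constraints in items 2 and 3 of this commit could be checked along these lines. This is a sketch under stated assumptions, not the driver's implementation: the function name, its parameters, and the choice of -EINVAL and -EBUSY as error codes are all hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Validate a requested TX queue count for a device with a dedicated
 * mcast queue (hypothetical helper, userspace sketch). */
static int iqd_check_tx_channels(unsigned int requested, unsigned int active,
				 unsigned int max, bool running)
{
	/* Cannot enable more queues than were pre-allocated. */
	if (requested > max)
		return -EINVAL;

	/* Need at least two queues: one for ucast, one for mcast. */
	if (requested < 2)
		return -EINVAL;

	/* Shrinking while the interface is running could let ucast
	 * traffic fall back to index 0, the mcast queue. */
	if (running && requested < active)
		return -EBUSY;

	return 0;
}
```

Growing the queue count while the interface is running passes this check; only a reduction is refused in that state.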

s390/qeth: allow configuration of TX queues for z/VM NICs · fcc2df8b

Committed by Julian Wiedmann on Mar 18, 2020

Add support for ETHTOOL_SCHANNELS to change the count of active
TX queues.

Since all TX queue structs are pre-allocated and -registered, we just
need to trivially adjust dev->real_num_tx_queues.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: remove prio-queueing support for z/VM NICs · 1c103cf8

Committed by Julian Wiedmann on Mar 18, 2020

z/VM NICs don't offer HW QoS for TX rings. So just use netdev_pick_tx()
to distribute the connections equally over all enabled TX queues.

We start with just 1 enabled TX queue (this matches the typical
configuration without prio-queueing). A follow-on patch will allow users
to enable additional TX queues.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: use memory reserves in TX slow path · b413ff8a

Committed by Julian Wiedmann on Mar 18, 2020

When falling back to an allocation from the HW header cache, check if
the skb is eligible for using memory reserves.
This only makes a difference if the cache is empty and needs to be
refilled.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: use memory reserves to back RX buffers · 714c9108

Committed by Julian Wiedmann on Mar 18, 2020

Use dev_alloc_page() for backing the RX buffers with pages. This way we
pick up __GFP_MEMALLOC.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Mar, 2020 · 27 commits

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · a58741ef

Committed by David S. Miller on Mar 17, 2020

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Use nf_flow_offload_tuple() to fetch flow stats, from Paul Blakey.

2) Add new xt_IDLETIMER hard mode, from Manoj Basapathi.
   Follow up patch to clean up this new mode, from Dan Carpenter.

3) Add support for geneve tunnel options, from Xin Long.

4) Make sets built-in and remove modular infrastructure for sets,
   from Florian Westphal.

5) Remove unused TEMPLATE_NULLS_VAL, from Li RongQing.

6) Statify nft_pipapo_get, from Chen Wandun.

7) Use C99 flexible-array member, from Gustavo A. R. Silva.

8) More descriptive variable names for bitwise, from Jeremy Sowden.

9) Four patches to add tunnel device hardware offload to the flowtable
   infrastructure, from wenxu.

10) pipapo set support for 8-bit grouping, from Stefano Brivio.

11) pipapo can switch between nibble and byte grouping, also from
    Stefano.

12) Add AVX2 vectorized version of pipapo, from Stefano Brivio.

13) Update pipapo so it can be used for single ranges, from Stefano.

14) Add stateful expression support to elements via control plane,
    e.g. counter per element.

15) Re-visit sysctls in unprivileged namespaces, from Florian Westphal.

16) Add new egress hook, from Lukas Wunner.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

mptcp: move msk state update to subflow_syn_recv_sock() · 7f20d5fc

Committed by Paolo Abeni on Mar 17, 2020

After commit 58b09919 ("mptcp: create msk early"), the
msk socket is already available at subflow_syn_recv_sock()
time. Let's move the state update there, to mirror more
closely the first subflow state.

This will also help with supporting multiple subflows.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-add-phylink-support-for-PCS' · 5dd32845

Committed by David S. Miller on Mar 17, 2020

Russell King says:

====================
net: add phylink support for PCS

This series adds support for IEEE 802.3 register set compliant PCS
for phylink.  In order to do this, we:

1. convert BUG_ON() in existing accessors to WARN_ON_ONCE() and return
   an error.
2. add accessors for modifying a MDIO device register, and use them in
   phylib, rather than duplicating the code from phylib.
3. add support for decoding the advertisement from clause 22 compatible
   register sets for clause 37 advertisements and SGMII advertisements.
4. add support for clause 45 register sets for 10GBASE-R PCS.

These have been tested on the LX2160A Clearfog-CX platform.

v2: eliminate use of BUG_ON() in the accessors.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phylink: pcs: add 802.3 clause 45 helpers · b8679ef8

Committed by Russell King on Mar 17, 2020

Implement helpers for PCS accessed via the MII bus using 802.3 clause
45 cycles for 10GBASE-R. Only link up/down is supported, 10G full
duplex is assumed.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phylink: pcs: add 802.3 clause 22 helpers · 74db1c18

Committed by Russell King on Mar 17, 2020

Implement helpers for PCS accessed via the MII bus using 802.3 clause
22 cycles, conforming to 802.3 clause 37 and Cisco SGMII specifications
for the advertisement word.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mdiobus: add APIs for modifying a MDIO device register · 6cc7cf81

Committed by Russell King on Mar 17, 2020

Add APIs for modifying a MDIO device register, similar to the existing
phy_modify() group of functions, but at mdiobus level instead.  Adapt
__phy_modify_changed() to use the new mdiobus level helper.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
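
A "modify" helper of this kind is typically a read-modify-write on one register. The userspace C sketch below shows the general pattern only; the fake_* names and the in-memory register array are hypothetical stand-ins, and locking and the real bus structures are left out.

```c
#include <stdint.h>

#define FAKE_NUM_REGS 32u

/* Hypothetical stand-in for one device's register file. */
struct fake_mdio_dev {
	uint16_t regs[FAKE_NUM_REGS];
};

/* Returns the register value, or a negative value on error. */
static int fake_read(const struct fake_mdio_dev *dev, unsigned int regnum)
{
	if (regnum >= FAKE_NUM_REGS)
		return -1;
	return dev->regs[regnum];
}

static int fake_write(struct fake_mdio_dev *dev, unsigned int regnum, uint16_t val)
{
	if (regnum >= FAKE_NUM_REGS)
		return -1;
	dev->regs[regnum] = val;
	return 0;
}

/* Clear the bits in 'mask', then set the bits in 'set'.
 * Skips the write when the value would not change. */
static int fake_modify(struct fake_mdio_dev *dev, unsigned int regnum,
		       uint16_t mask, uint16_t set)
{
	int old = fake_read(dev, regnum);
	uint16_t val;

	if (old < 0)
		return old;

	val = (uint16_t)((old & ~mask) | set);
	if (val == old)
		return 0;

	return fake_write(dev, regnum, val);
}
```

Keeping the read, the bit manipulation, and the write in one helper means callers do not have to repeat the sequence, which matches the commit's stated goal of avoiding duplicated code.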

net: mdiobus: avoid BUG_ON() in mdiobus accessors · 89e3e3dd

Committed by Russell King on Mar 17, 2020

Avoid using BUG_ON() in the mdiobus accessors, preferring instead to use
WARN_ON_ONCE() and returning an error.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
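
The shape of such a change can be sketched in userspace C: instead of halting on a violated precondition, warn once and hand an error back to the caller. Everything here is illustrative; warn_once() is a hypothetical stand-in for WARN_ON_ONCE(), and both the checked condition and the -EINVAL return value are assumptions rather than details taken from the commit.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for WARN_ON_ONCE(): report the first violation only,
 * and return the condition so it can be used inside an if (). */
static bool warn_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Hypothetical accessor: refuses to run in a context where it must
 * not be called, but lets the system keep going. */
static int checked_bus_read(bool in_forbidden_context)
{
	static bool warned;

	if (warn_once(in_forbidden_context, &warned,
		      "bus accessor called from a forbidden context"))
		return -EINVAL;

	return 0; /* the actual bus access would happen here */
}
```

A repeated misuse still returns the error each time, but the log only shows the first warning.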

Merge branch 'net-bridge-vlan-options-add-support-for-tunnel-mapping' · 54e1dc70

Committed by David S. Miller on Mar 17, 2020

Nikolay Aleksandrov says:

====================
net: bridge: vlan options: add support for tunnel mapping

In order to bring the new vlan API on par with the old one and be able
to completely migrate to the new one we need to support vlan tunnel mapping
and statistics. This patch-set takes care of the former by making it a
vlan option. There are two notable issues to deal with:
 - vlan range to tunnel range mapping
   * The tunnel ids are globally unique for the vlan code and a vlan can
     be mapped to one tunnel, so the old API took care of ranges by
     taking the starting tunnel id value and incrementally mapping
     vlan id(i) -> tunnel id(i). This set takes the same approach and
     uses one new attribute - BRIDGE_VLANDB_ENTRY_TUNNEL_ID. If used
     with a vlan range then it's the starting tunnel id to map.

 - tunnel mapping removal
   * Since there are no reserved/special tunnel ids defined, we can't
     encode mapping removal within the new attribute. In order to be
     able to remove a mapping we add a vlan flag which makes the new
     tunnel option remove the mapping.

The rest is pretty straight-forward, in fact we directly re-use the old
code for manipulating tunnels by just mapping the command (set/del). In
order to be able to keep detecting vlan ranges we check that the current
vlan has a tunnel and it's extending the current vlan range end's tunnel
id.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bridge: vlan options: add support for tunnel mapping set/del · 569da082

Committed by Nikolay Aleksandrov on Mar 17, 2020

This patch adds support for manipulating vlan/tunnel mappings. The
tunnel ids are globally unique and are one per-vlan. There were two
trickier issues - first in order to support vlan ranges we have to
compute the current tunnel id in the following way:
 - base tunnel id (attr) + current vlan id - starting vlan id
This is in line with how the old API does vlan/tunnel mapping with ranges. We
already have the vlan range present, so it's redundant to add another
attribute for the tunnel range end. It's simply base tunnel id + vlan
range. And second to support removing mappings we need an out-of-band way
to tell the option manipulating function because there are no
special/reserved tunnel id values, so we use a vlan flag to denote the
operation is tunnel mapping removal.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
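
The per-vlan tunnel id computation described in this commit is a single line of arithmetic. A minimal C sketch, with a hypothetical function name and parameter types chosen for illustration:

```c
#include <stdint.h>

/* tunnel id for 'vid' = base tunnel id (from the attribute)
 *                       + current vlan id - starting vlan id
 * The caller is expected to pass a vid inside the range, i.e.
 * vid >= range_start_vid. */
static uint32_t tunnel_id_for_vlan(uint32_t base_tunnel_id,
				   uint16_t range_start_vid, uint16_t vid)
{
	return base_tunnel_id + (uint32_t)(vid - range_start_vid);
}
```

For example, mapping the vlan range 10-15 with a base tunnel id of 1000 gives vlan 10 the id 1000 and vlan 15 the id 1005, so no separate attribute for the end of the tunnel range is needed.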

net: bridge: vlan options: add support for tunnel id dumping · 188c67dd

Committed by Nikolay Aleksandrov on Mar 17, 2020

Add a new option - BRIDGE_VLANDB_ENTRY_TUNNEL_ID which is used to dump
the tunnel id mapping. Since they're unique per vlan they can enter a
vlan range if they're consecutive, thus we can calculate the tunnel id
range map simply as: vlan range end id - vlan range start id. The
starting point is the tunnel id in BRIDGE_VLANDB_ENTRY_TUNNEL_ID. This
is similar to how the tunnel entries can be created in a range via the
old API (a vlan range maps to a tunnel range).
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
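
When dumping, the two ideas in this commit are a "does this vlan continue the range" test and a way to recover the end of the tunnel range. The C sketch below illustrates both under simplifying assumptions: the struct and function names are hypothetical, and every entry is assumed to already have a tunnel mapped.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one vlan with a mapped tunnel. */
struct vlan_tunnel_entry {
	uint16_t vid;
	uint32_t tunnel_id;
};

/* A vlan can join the range ending in 'range_end' only if both its
 * vlan id and its tunnel id are the direct successors. */
static bool tunnel_can_enter_range(const struct vlan_tunnel_entry *v_curr,
				   const struct vlan_tunnel_entry *range_end)
{
	return v_curr->vid - range_end->vid == 1 &&
	       v_curr->tunnel_id - range_end->tunnel_id == 1;
}

/* End of the tunnel id range: the dumped starting tunnel id plus
 * (vlan range end id - vlan range start id). */
static uint32_t tunnel_range_end_id(uint32_t start_tunnel_id,
				    uint16_t range_start_vid,
				    uint16_t range_end_vid)
{
	return start_tunnel_id + (uint32_t)(range_end_vid - range_start_vid);
}
```

A reader of the dump therefore only needs the vlan range and the single starting tunnel id to reconstruct the whole mapping.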

net: bridge: vlan tunnel: constify bridge and port arguments · 53e96632

Committed by Nikolay Aleksandrov on Mar 17, 2020

The vlan tunnel code changes vlan options; it shouldn't touch port or
bridge options so we can constify the port argument. This would later help
us to re-use these functions from the vlan options code.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bridge: vlan options: rename br_vlan_opts_eq to br_vlan_opts_eq_range · 99f7c5e0

Committed by Nikolay Aleksandrov on Mar 17, 2020

It is a more appropriate name as it shows the intent of why we need to
check the options' state. It also allows us to give meaning to the two
arguments of the function: the first is the current vlan (v_curr) being
checked if it could enter the range ending in the second one (range_end).
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'stmmac-100GB-Enterprise-MAC-support' · 0419c450

Committed by David S. Miller on Mar 17, 2020

Jose Abreu says:

====================
net: stmmac: 100GB Enterprise MAC support

Adds the support for Enterprise MAC IP version which allows operating
speeds up to 100GB.

Patch 1/4, adds the support in XPCS for XLGMII interface that is used in
this kind of Enterprise MAC IPs.

Patch 2/4, adds the XLGMII interface support in stmmac.

Patch 3/4, adds the HW specific support for Enterprise MAC.

We end in patch 4/4, by updating stmmac documentation to mention the
support for this new IP version.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: networking: stmmac: Mention new XLGMAC support · 2462a82c

Committed by Jose Abreu on Mar 17, 2020

Add the Enterprise MAC support to the list of supported IP versions and
the newly added XLGMII interface support.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: Add support for Enterprise MAC version · 4a4ccde0

Committed by Jose Abreu on Mar 17, 2020

Adds the support for Enterprise MAC IP version which is very similar to
XGMAC. It's so similar that we just need to check the device id and add
new speeds definitions and some minor callbacks.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: Add XLGMII support · 8a880936

Committed by Jose Abreu on Mar 17, 2020

Add XLGMII support for stmmac including the list of speeds and defines
for them.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: xpcs: Add XLGMII support · 7c6dbd29

Committed by Jose Abreu on Mar 17, 2020

Add XLGMII support for XPCS. This does not include Autoneg feature.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ionic-bits-and-bytes' · 9f57db9b

Committed by David S. Miller on Mar 17, 2020

Shannon Nelson says:

====================
ionic bits and bytes

These are a few little updates to the ionic driver while we are in between
other feature work.  While these are mostly Fixes, they are almost all low
priority and needn't be promoted to net.  The one higher need is patch 1,
but it is fixing something that hasn't made it out of net-next yet.

v3: allow decode of unknown transceiver and use type
    codes from sfp.h
v2: add Fixes tags to patches 1-4, and a little
    description for patch 5
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: add decode for IONIC_RC_ENOSUPP · b2133d8d

Committed by Shannon Nelson on Mar 16, 2020

Add decoding for a new firmware error code.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: print data for unknown xcvr type · 840eef59

Committed by Shannon Nelson on Mar 16, 2020

If we don't recognize the transceiver type, set the xcvr type
and data length such that ethtool can at least print the first
256 bytes and the reader can figure out why the transceiver
is not recognized.

While we're here, we can update the phy_id type values to use
the enum values in sfp.h.

Fixes: 4d03e00a ("ionic: Add initial ethtool support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: remove adminq napi instance · ba8fb6c8

Committed by Shannon Nelson on Mar 16, 2020

Remove the adminq's napi struct when tearing down
the adminq.

Fixes: 1d062b7b ("ionic: Add basic adminq support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: deinit rss only if selected · ad6fd4d3

Committed by Shannon Nelson on Mar 16, 2020

Don't bother de-initing RSS if it wasn't selected.

Fixes: aa319881 ("ionic: Add RSS support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: stop devlink warn on mgmt device · ecd2d8b0

Committed by Shannon Nelson on Mar 16, 2020

If we don't set a port type, the devlink code will eventually
print a WARN in the kernel log.  Because the mgmt device is
not really a useful port, don't register it as a devlink port.

Fixes: b3f064e9 ("ionic: add support for device id 0x1004")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net_sched-allow-use-of-hrtimer-slack' · c7cba832

Committed by David S. Miller on Mar 17, 2020

Eric Dumazet says:

====================
net_sched: allow use of hrtimer slack

Packet schedulers have used hrtimers with exact expiry times.

Some of them can afford having a slack, in order to reduce
the number of timer interrupts and feed bigger batches
to increase efficiency.

FQ for example does not care if throttled packets are
sent with an additional (small) delay.

Original observation of having maybe too many interrupts
was made by Willem de Bruijn.

v2: added strict netlink checking (Jakub Kicinski)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net_sched: sch_fq: enable use of hrtimer slack · 583396f4

Committed by Eric Dumazet on Mar 16, 2020

Add a new attribute to control the fq qdisc hrtimer slack.

Default is set to 10 usec.

When/if packets are throttled, fq sets up an hrtimer that can
lead to one interrupt per packet in the throttled queue.

By using a timer slack, we allow better use of timer interrupts,
by giving them a chance to call multiple timer callbacks
at each hardware interrupt.

Also, giving a slack allows FQ to dequeue batches of packets
instead of a single one, thus increasing xmit_more efficiency.

This has no negative effect on the rate a TCP flow can sustain,
since each TCP flow maintains its own precise vtime (tp->tcp_wstamp_ns)

v2: added strict netlink checking (as feedback from Jakub Kicinski)

Tested:
1000 concurrent flows all using paced packets.
1,000,000 packets sent per second.

Before the patch :

$ vmstat 2 10
procs ----------memory----------- -swap --io--- ---system---- -----cpu------
 r  b swpd     free  buff   cache si so  bi  bo      in    cs us sy id wa st
 0  0    0 60726784 23628 3485992  0  0 138   1     977   535  0 12 87  0  0
 0  0    0 60714700 23628 3485628  0  0   0   0 1568827 26462  0 22 78  0  0
 1  0    0 60716012 23628 3485656  0  0   0   0 1570034 26216  0 22 78  0  0
 0  0    0 60722420 23628 3485492  0  0   0   0 1567230 26424  0 22 78  0  0
 0  0    0 60727484 23628 3485556  0  0   0   0 1568220 26200  0 22 78  0  0
 2  0    0 60718900 23628 3485380  0  0   0  40 1564721 26630  0 22 78  0  0
 2  0    0 60718096 23628 3485332  0  0   0   0 1562593 26432  0 22 78  0  0
 0  0    0 60719608 23628 3485064  0  0   0   0 1563806 26238  0 22 78  0  0
 1  0    0 60722876 23628 3485236  0  0   0 130 1565874 26566  0 22 78  0  0
 1  0    0 60722752 23628 3484908  0  0   0   0 1567646 26247  0 22 78  0  0

After the patch, slack of 10 usec, we can see a reduction of interrupts
per second, and a small decrease of reported cpu usage.

$ vmstat 2 10
procs ----------memory----------- -swap --io--- ---system---- -----cpu------
 r  b swpd     free  buff   cache si so  bi  bo      in    cs us sy id wa st
 1  0    0 60722564 23628 3484728  0  0 133   1     696   545  0 13 87  0  0
 1  0    0 60722568 23628 3484824  0  0   0   0  977278 25469  0 20 80  0  0
 0  0    0 60716396 23628 3484764  0  0   0   0  979997 25326  0 20 80  0  0
 0  0    0 60713844 23628 3484960  0  0   0   0  981394 25249  0 20 80  0  0
 2  0    0 60720468 23628 3484916  0  0   0   0  982860 25062  0 20 80  0  0
 1  0    0 60721236 23628 3484856  0  0   0   0  982867 25100  0 20 80  0  0
 1  0    0 60722400 23628 3484456  0  0   0   8  982698 25303  0 20 80  0  0
 0  0    0 60715396 23628 3484428  0  0   0   0  981777 25176  0 20 80  0  0
 0  0    0 60716520 23628 3486544  0  0   0  36  978965 27857  0 21 79  0  0
 0  0    0 60719592 23628 3486516  0  0   0  22  977318 25106  0 20 80  0  0
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net_sched: do not reprogram a timer about to expire · b88948fb

Committed by Eric Dumazet on Mar 16, 2020

qdisc_watchdog_schedule_range_ns() can use the newly added slack
and avoid rearming the hrtimer a bit earlier than the current
value. This patch has no effect if delta_ns parameter
is zero.

Note that this means the max slack is potentially doubled.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
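
One way to read this commit: if a timer is already armed and its expiry falls inside the window [expires, expires + delta_ns] of the new request, the new request can simply ride on the existing timer. The userspace C sketch below models only that decision. The struct, its fields, and the function name are hypothetical, and the comparison shown is an assumption about how such a check could be written, not a copy of the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a watchdog timer's bookkeeping. */
struct fake_watchdog {
	bool queued;             /* is a timer currently armed? */
	uint64_t last_expires;   /* expiry the timer was armed with */
	unsigned int programmed; /* how many times we (re)armed it */
};

static void fake_schedule_range_ns(struct fake_watchdog *wd,
				   uint64_t expires, uint64_t delta_ns)
{
	/* Unsigned subtraction: if last_expires < expires the result
	 * wraps to a huge value and the test fails, so only an armed
	 * expiry inside [expires, expires + delta_ns] is kept. */
	if (wd->queued && wd->last_expires - expires <= delta_ns)
		return;

	wd->last_expires = expires;
	wd->queued = true;
	wd->programmed++;
}
```

With delta_ns set to zero the timer is kept only when the expiry is identical, which is consistent with the statement that the patch has no effect in that case. The doubling of the maximum slack also follows from this model: a timer armed for 1000 with a slack of 100 may fire as late as 1100, and a later request for 950 with the same slack, which on its own would allow at most 1050, is served by that timer.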

net_sched: add qdisc_watchdog_schedule_range_ns() · efe074c2

Committed by Eric Dumazet on Mar 16, 2020

Some packet schedulers might want to add a slack
when programming hrtimers. This can reduce number
of interrupts and increase batch sizes and thus
give good xmit_more savings.

This commit adds qdisc_watchdog_schedule_range_ns()
helper, with an extra delta_ns parameter.

Legacy qdisc_watchdog_schedule_ns() becomes an inline
passing a zero slack.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
