提交 · dc05360fee660a9dbe59824b3f7896534210432b · openeuler / Kernel

24 3月, 2019 2 次提交

net: convert rps_needed and rfs_needed to new static branch api · dc05360f

由 Eric Dumazet 提交于 3月 22, 2019

We prefer static_branch_unlikely() over static_key_false() these days.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Acked-by: NSoheil Hassas Yeganeh <soheil@google.com>
Acked-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dc05360f

net: dev: introduce support for sch BYPASS for lockless qdisc · ba27b4cd

由 Paolo Abeni 提交于 3月 22, 2019

With commit c5ad119f ("net: sched: pfifo_fast use skb_array")
pfifo_fast no longer benefit from the TCQ_F_CAN_BYPASS optimization.
Due to retpolines the cost of the enqueue()/dequeue() pair has become
relevant and we observe measurable regression for the uncontended
scenario when the packet-rate is below line rate.

After commit 46b1c18f ("net: sched: put back q.qlen into a
single location") we can check for empty qdisc with a reasonably
fast operation even for nolock qdiscs.

This change extends TCQ_F_CAN_BYPASS support to nolock qdisc.
The new chunk of code mirrors closely the existing one for traditional
qdisc, leveraging a newly introduced helper to read atomically the
qdisc length.

Tested with pktgen in queue xmit mode, with pfifo_fast, a MQ
device, and MQ root qdisc:

threads         vanilla         patched
                kpps            kpps
1               2465            2889
2               4304            5188
4               7898            9589

Same as above, but with a single queue device:

threads         vanilla         patched
                kpps            kpps
1               2556            2827
2               2900            2900
4               5000            5000
8               4700            4700

No mesaurable changes in the contended scenarios, and more 10%
improvement in the uncontended ones.

 v1 -> v2:
  - rebased after flag name change
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Tested-by: NIvan Vecera <ivecera@redhat.com>
Reviewed-by: NEric Dumazet <edumazet@google.com>
Reviewed-by: NIvan Vecera <ivecera@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ba27b4cd

21 3月, 2019 3 次提交

net: remove 'fallback' argument from dev->ndo_select_queue() · a350ecce

由 Paolo Abeni 提交于 3月 20, 2019

After the previous patch, all the callers of ndo_select_queue()
provide as a 'fallback' argument netdev_pick_tx.
The only exceptions are nested calls to ndo_select_queue(),
which pass down the 'fallback' available in the current scope
- still netdev_pick_tx.

We can drop such argument and replace fallback() invocation with
netdev_pick_tx(). This avoids an indirect call per xmit packet
in some scenarios (TCP syn, UDP unconnected, XDP generic, pktgen)
with device drivers implementing such ndo. It also clean the code
a bit.

Tested with ixgbe and CONFIG_FCOE=m

With pktgen using queue xmit:
threads		vanilla 	patched
		(kpps)		(kpps)
1		2334		2428
2		4166		4278
4		7895		8100

 v1 -> v2:
 - rebased after helper's name change
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a350ecce

packet: rework packet_pick_tx_queue() to use common code selection · b71b5837

由 Paolo Abeni 提交于 3月 20, 2019

Currently packet_pick_tx_queue() is the only caller of
ndo_select_queue() using a fallback argument other than
netdev_pick_tx.

Leveraging rx queue, we can obtain a similar queue selection
behavior using core helpers. After this change, ndo_select_queue()
is always invoked with netdev_pick_tx() as fallback.
We can change ndo_select_queue() signature in a followup patch,
dropping an indirect call per transmitted packet in some scenarios
(e.g. TCP syn and XDP generic xmit)

This changes slightly how af packet queue selection happens when
PACKET_QDISC_BYPASS is set. It's now more similar to plan dev_queue_xmit()
tacking in account both XPS and TC mapping.

 v1  -> v2:
  - rebased after helper name change
 RFC -> v1:
  - initialize sender_cpu to the expected value
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b71b5837

net: dev: rename queue selection helpers. · 4bd97d51

由 Paolo Abeni 提交于 3月 20, 2019

With the following patches, we are going to use __netdev_pick_tx() in
many modules. Rename it to netdev_pick_tx(), to make it clear is
a public API.

Also rename the existing netdev_pick_tx() to netdev_core_pick_tx(),
to avoid name clashes.
Suggested-by: NEric Dumazet <edumazet@google.com>
Suggested-by: NDavid Miller <davem@davemloft.net>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4bd97d51

25 2月, 2019 2 次提交

net: dev: add generic protodown handler · b5899679

由 Andy Roulin 提交于 2月 22, 2019

Introduce dev_change_proto_down_generic, a generic ndo_change_proto_down
implementation, which sets the netdev carrier state according to proto_down.

This adds the ability to set protodown on vxlan and macvlan devices in a
generic way for use by control protocols like VRRPD.
Signed-off-by: NAndy Roulin <aroulin@cumulusnetworks.com>
Acked-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b5899679

net: Skip GSO length estimation if transport header is not set · a0dce875

由 Maxim Mikityanskiy 提交于 2月 22, 2019

qdisc_pkt_len_init expects transport_header to be set for GSO packets.
Patch [1] skips transport_header validation for GSO packets that don't
have network_header set at the moment of calling virtio_net_hdr_to_skb,
and allows them to pass into the stack. After patch [2] no placeholder
value is assigned to transport_header if dissection fails, so this patch
adds a check to the place where the value of transport_header is used.

[1] https://patchwork.ozlabs.org/patch/1044429/
[2] https://patchwork.ozlabs.org/patch/1046122/Signed-off-by: NMaxim Mikityanskiy <maximmi@mellanox.com>
Acked-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a0dce875

16 2月, 2019 1 次提交

net: Fix for_each_netdev_feature on Big endian · 3b89ea9c

由 Hauke Mehrtens 提交于 2月 15, 2019

The features attribute is of type u64 and stored in the native endianes on
the system. The for_each_set_bit() macro takes a pointer to a 32 bit array
and goes over the bits in this area. On little Endian systems this also
works with an u64 as the most significant bit is on the highest address,
but on big endian the words are swapped. When we expect bit 15 here we get
bit 47 (15 + 32).

This patch converts it more or less to its own for_each_set_bit()
implementation which works on 64 bit integers directly. This is then
completely in host endianness and should work like expected.

Fixes: fd867d51 ("net/core: generic support for disabling netdev features down stack")
Signed-off-by: NHauke Mehrtens <hauke.mehrtens@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3b89ea9c

07 2月, 2019 1 次提交

net: Introduce ndo_get_port_parent_id() · d6abc596

由 Florian Fainelli 提交于 2月 06, 2019

In preparation for getting rid of switchdev_ops, create a dedicated NDO
operation for getting the port's parent identifier. There are
essentially two classes of drivers that need to implement getting the
port's parent ID which are VF/PF drivers with a built-in switch, and
pure switchdev drivers such as mlxsw, ocelot, dsa etc.

We introduce a helper function: dev_get_port_parent_id() which supports
recursion into the lower devices to obtain the first port's parent ID.

Convert the bridge, core and ipv4 multicast routing code to check for
such ndo_get_port_parent_id() and call the helper function when valid
before falling back to switchdev_port_attr_get(). This will allow us to
convert all relevant drivers in one go instead of having to implement
both switchdev_port_attr_get() and ndo_get_port_parent_id() operations,
then get rid of switchdev_port_attr_get().
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d6abc596

06 2月, 2019 1 次提交

net: xdp: allow generic and driver XDP on one interface · 9ee963d6

由 Jakub Kicinski 提交于 2月 05, 2019

Since commit a25717d2 ("xdp: support simultaneous driver and
hw XDP attachment") users can load an XDP program for offload and
in native driver mode simultaneously.  Allow a similar mix of
offload and SKB mode/generic XDP.
Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: NQuentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

9ee963d6

02 2月, 2019 1 次提交

xdp: Provide extack messages when prog attachment failed · 01dde20c

由 Maciej Fijalkowski 提交于 2月 01, 2019

In order to provide more meaningful messages to user when the process of
loading xdp program onto network interface failed, let's add extack
messages within dev_change_xdp_fd.
Suggested-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: NJesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

01dde20c

30 1月, 2019 1 次提交

net: set default network namespace in init_dummy_netdev() · 35edfdc7

由 Josh Elsasser 提交于 1月 26, 2019

Assign a default net namespace to netdevs created by init_dummy_netdev().
Fixes a NULL pointer dereference caused by busy-polling a socket bound to
an iwlwifi wireless device, which bumps the per-net BUSYPOLLRXPACKETS stat
if napi_poll() received packets:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000190
  IP: napi_busy_loop+0xd6/0x200
  Call Trace:
    sock_poll+0x5e/0x80
    do_sys_poll+0x324/0x5a0
    SyS_poll+0x6c/0xf0
    do_syscall_64+0x6b/0x1f0
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Fixes: 7db6b048 ("net: Commonize busy polling code to focus on napi_id instead of socket")
Signed-off-by: NJosh Elsasser <jelsasser@appneta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

35edfdc7

06 1月, 2019 1 次提交

jump_label: move 'asm goto' support test to Kconfig · e9666d10

由 Masahiro Yamada 提交于 12月 31, 2018

Currently, CONFIG_JUMP_LABEL just means "I _want_ to use jump label".

The jump label is controlled by HAVE_JUMP_LABEL, which is defined
like this:

  #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
  # define HAVE_JUMP_LABEL
  #endif

We can improve this by testing 'asm goto' support in Kconfig, then
make JUMP_LABEL depend on CC_HAS_ASM_GOTO.

Ugly #ifdef HAVE_JUMP_LABEL will go away, and CONFIG_JUMP_LABEL will
match to the real kernel capability.
Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Tested-by: NSedat Dilek <sedat.dilek@gmail.com>

e9666d10

16 12月, 2018 1 次提交

net: use indirect call wrappers at GRO network layer · aaa5d90b

由 Paolo Abeni 提交于 12月 14, 2018

This avoids an indirect calls for L3 GRO receive path, both
for ipv4 and ipv6, if the latter is not compiled as a module.

Note that when IPv6 is compiled as builtin, it will be checked first,
so we have a single additional compare for the more common path.

v1 -> v2:
 - adapted to INDIRECT_CALL_ changes
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aaa5d90b

14 12月, 2018 3 次提交

net: dev: Issue NETDEV_PRE_CHANGEADDR · d59cdf94

由 Petr Machata 提交于 12月 13, 2018

When a device address is about to be changed, or an address added to the
list of device HW addresses, it is necessary to ensure that all
interested parties can support the address. Therefore, send the
NETDEV_PRE_CHANGEADDR notification, and if anyone bails on it, do not
change the address.
Signed-off-by: NPetr Machata <petrm@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d59cdf94

net: dev: Add NETDEV_PRE_CHANGEADDR · 1570415f

由 Petr Machata 提交于 12月 13, 2018

The NETDEV_CHANGEADDR notification is emitted after a device address
changes. Extending this message to allow vetoing is certainly possible,
but several other notification types have instead adopted a simple
two-stage approach: first a "pre" notification is sent to make sure all
interested parties are OK with a change that's about to be done. Then
the change is done, and afterwards a "post" notification is sent.

This dual approach is easier to use: when the change is vetoed, nothing
has changed yet, and it's therefore unnecessary to roll anything back.
Therefore adopt it for NETDEV_CHANGEADDR as well.

To that end, add NETDEV_PRE_CHANGEADDR and an info structure to go along
with it.
Signed-off-by: NPetr Machata <petrm@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1570415f

net: dev: Add extack argument to dev_set_mac_address() · 3a37a963

由 Petr Machata 提交于 12月 13, 2018

A follow-up patch will add a notifier type NETDEV_PRE_CHANGEADDR, which
allows vetoing of MAC address changes. One prominent path to that
notification is through dev_set_mac_address(). Therefore give this
function an extack argument, so that it can be packed together with the
notification. Thus a textual reason for rejection (or a warning) can be
communicated back to the user.
Signed-off-by: NPetr Machata <petrm@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3a37a963

07 12月, 2018 5 次提交

net: core: dev: Attach extack to NETDEV_PRE_UP · 40c900aa