Commits · 36fd633ec98acd2028585c22128fcaa3da6d5770 · openanolis / cloud-kernel

25 Jan, 2018 1 commit

net: separate SIOCGIFCONF handling from dev_ioctl() · 36fd633e

Authored by Al Viro on Jun 26, 2017

Only two of dev_ioctl() callers may pass SIOCGIFCONF to it.
Separating that codepath from the rest of dev_ioctl() allows both
to simplify dev_ioctl() itself (all other cases work with struct ifreq *)
*and* seriously simplify the compat side of that beast: all it takes
is passing to inet_gifconf() an extra argument - the size of individual
records (sizeof(struct ifreq) or sizeof(struct compat_ifreq)).  With
dev_ifconf() called directly from sock_do_ioctl()/compat_dev_ifconf()
that's easy to arrange.

As a result, the compat side of SIOCGIFCONF doesn't need any
allocations, copy_in_user() back and forth, etc.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
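
For context, SIOCGIFCONF is the ioctl that userspace uses to enumerate interfaces; the kernel fills a caller-supplied buffer with fixed-size struct ifreq records, which is why the record size is the one thing the compat path needs to vary. The following is a minimal userspace sketch, not code from the commit: the helper name list_ifconf_names and the 64-record buffer are illustrative assumptions.

```c
#define _DEFAULT_SOURCE /* expose struct ifreq/ifconf under strict -std modes */
#include <assert.h>
#include <net/if.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: copy up to `max` interface names into `names`.
 * Returns the number of names copied, or -1 on error. */
int list_ifconf_names(char names[][IFNAMSIZ], int max)
{
    struct ifreq reqs[64]; /* assumed upper bound on records */
    struct ifconf ifc;
    int fd, n, i;

    assert(names != NULL && max >= 0);

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&ifc, 0, sizeof(ifc));
    ifc.ifc_len = sizeof(reqs); /* in: buffer size, out: bytes filled */
    ifc.ifc_req = reqs;

    if (ioctl(fd, SIOCGIFCONF, &ifc) < 0) {
        close(fd);
        return -1;
    }
    close(fd);

    /* Each record is exactly sizeof(struct ifreq) bytes for a native caller. */
    n = ifc.ifc_len / (int)sizeof(struct ifreq);
    for (i = 0; i < n && i < max; i++) {
        memcpy(names[i], reqs[i].ifr_name, IFNAMSIZ);
        names[i][IFNAMSIZ - 1] = '\0';
    }
    return i;
}
```

A 32-bit process on a 64-bit kernel makes the same call, but its records have the compat layout, so the kernel has to step through the buffer with a different record size.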


24 Jan, 2018 1 commit

net: core: Fix kernel-doc for carrier_* attributes · 9e55e5d3

Authored by Florian Fainelli on Jan 22, 2018

Fix the documentation warning:

include/linux/netdevice.h:1939: warning: Excess struct member 'carrier_changes' description in 'net_device'
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: b2d3bcfa ("net: core: Expose number of link up/down transitions")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


23 Jan, 2018 1 commit

net: core: Expose number of link up/down transitions · b2d3bcfa

Authored by David Decotigny on Jan 18, 2018

Expose the number of times the link has been going UP or DOWN, and
update the "carrier_changes" counter to be the sum of these two events.
While at it, also update the sysfs-class-net documentation to cover:
carrier_changes (3.15), carrier_up_count (4.16) and carrier_down_count
(4.16)
Signed-off-by: David Decotigny <decot@googlers.com>
[Florian:
* rebase
* add documentation
* merge carrier_changes with up/down counters]
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
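
The relationship between the three counters can be checked from userspace by reading the sysfs attributes the commit message names. This is a minimal sketch: the helper names read_net_counter and carrier_counts_consistent are hypothetical, and the path layout /sys/class/net/<ifname>/<attr> is assumed from the sysfs-class-net documentation the commit refers to.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read one numeric attribute from
 * /sys/class/net/<ifname>/<attr>. Returns -1 if it cannot be read. */
long read_net_counter(const char *ifname, const char *attr)
{
    char path[256];
    long value = -1;
    FILE *f;
    int len;

    assert(ifname != NULL && attr != NULL);
    len = snprintf(path, sizeof(path), "/sys/class/net/%s/%s", ifname, attr);
    if (len < 0 || (size_t)len >= sizeof(path))
        return -1;

    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Returns 1 if carrier_changes == up + down, 0 if not, and -1 if any
 * attribute is unavailable (for example on kernels before 4.16). */
int carrier_counts_consistent(const char *ifname)
{
    long up = read_net_counter(ifname, "carrier_up_count");
    long down = read_net_counter(ifname, "carrier_down_count");
    long total = read_net_counter(ifname, "carrier_changes");

    if (up < 0 || down < 0 || total < 0)
        return -1;
    return total == up + down;
}
```

The three reads are not atomic with respect to each other, so a link flap between them can make the check report 0 momentarily.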


15 Jan, 2018 2 commits

bpf: offload: add map offload infrastructure · a3884572

Authored by Jakub Kicinski on Jan 11, 2018

BPF map offload follows a similar path to program offload.  At creation
time users may specify ifindex of the device on which they want to
create the map.  Map will be validated by the kernel's
.map_alloc_check callback and device driver will be called for the
actual allocation.  Map will have an empty set of operations
associated with it (save for alloc and free callbacks).  The real
device callbacks are kept in map->offload->dev_ops because they
have slightly different signatures.  Map operations are called in
process context so the driver may communicate with HW freely,
msleep(), wait() etc.

Map alloc and free callbacks are muxed via existing .ndo_bpf, and
are always called with rtnl lock held.  Maps and programs are
guaranteed to be destroyed before .ndo_uninit (i.e. before
unregister_netdev() returns).  Map callbacks are invoked with
bpf_devs_lock *read* locked, drivers must take care of exclusive
locking if necessary.

All offload-specific branches are marked with unlikely() (through
bpf_map_is_dev_bound()), given that branch penalty will be
negligible compared to IO anyway, and we don't want to penalize
SW path unnecessarily.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


net: sch: prio: Add offload ability to PRIO qdisc · 7fdb61b4

Authored by Nogah Frankel on Jan 14, 2018

Add the ability to offload PRIO qdisc by using ndo_setup_tc.
There are three commands for PRIO offloading:
* TC_PRIO_REPLACE: handles set and tune
* TC_PRIO_DESTROY: handles qdisc destroy
* TC_PRIO_STATS: updates the qdisc's counters (given as reference)

As with the RED qdisc, the indication of whether PRIO is offloaded is
set and updated as part of the dump function. This is because the driver
could decide whether to offload based on the qdisc parent, which could
change without notifying the qdisc.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


11 Jan, 2018 1 commit

net: fix xdp_rxq_info build issue when CONFIG_SYSFS is not set · fd3ba214

Authored by Jesper Dangaard Brouer on Jan 09, 2018

The commit e817f856 ("xdp: generic XDP handling of xdp_rxq_info")
removed some ifdef CONFIG_SYSFS in net/core/dev.c, but forgot to
remove the corresponding ifdef's in include/linux/netdevice.h.

Fixes: e817f856 ("xdp: generic XDP handling of xdp_rxq_info")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


09 Jan, 2018 2 commits

net: No line break on netdev_WARN* formatting · e1cfe3d0

Authored by Gal Pressman on Jan 07, 2018

Remove the unnecessary line break between the netdev name and reg state
and the actual message that should be printed.

For example, this:
[86730.307236] ------------[ cut here ]------------
[86730.313496] netdevice: enp27s0f0
Message from the driver
[...]

Will be replaced with:
[86770.259289] ------------[ cut here ]------------
[86770.265191] netdevice: enp27s0f0: Message from the driver
[...]
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: Fix netdev_WARN_ONCE macro · 72dd831e

Authored by Gal Pressman on Jan 07, 2018

netdev_WARN_ONCE is broken (whoops...); this fix removes the
unnecessary "condition" parameter, adds the missing comma and changes
"arg" to "args".

Fixes: 375ef2b1 ("net: Introduce netdev_*_once functions")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
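
To illustrate the shape of a "once" logging macro with a variadic argument list, here is a userspace sketch. It is not the kernel macro: WARN_ONCE_SKETCH, warn_emitted and report_error are hypothetical names, fprintf stands in for the kernel's warning machinery, and the reg state that the kernel prints after the device name is omitted.

```c
#include <assert.h>
#include <stdio.h>

/* Counts messages actually emitted; exists only so the sketch is testable. */
int warn_emitted;

/* "format" is pasted onto the prefix by string-literal concatenation, so the
 * device name and the message share one line. ##__VA_ARGS__ is a GNU
 * extension (GCC and Clang) that drops the comma when no arguments follow. */
#define WARN_ONCE_SKETCH(name, format, ...)                               \
    do {                                                                  \
        static int warned_; /* one flag per expansion site */            \
        if (!warned_) {                                                   \
            warned_ = 1;                                                  \
            warn_emitted++;                                               \
            fprintf(stderr, "netdevice: %s: " format, (name),             \
                    ##__VA_ARGS__);                                       \
        }                                                                 \
    } while (0)

/* Hypothetical caller: reports an error code for a device, at most once. */
void report_error(const char *name, int err)
{
    assert(name != NULL);
    WARN_ONCE_SKETCH(name, "error %d\n", err);
}
```

Because the flag is a static local inside the macro body, "once" means once per call site, not once per device.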


06 Jan, 2018 1 commit

xdp: generic XDP handling of xdp_rxq_info · e817f856

Authored by Jesper Dangaard Brouer on Jan 03, 2018

Hook points for xdp_rxq_info:
 * reg  : netif_alloc_rx_queues
 * unreg: netif_free_rx_queues

The net_device has some members (num_rx_queues + real_num_rx_queues)
and a data area (dev->_rx with struct netdev_rx_queue's) that were
primarily used for exporting information about RPS (CONFIG_RPS) queues
to sysfs (CONFIG_SYSFS).

For generic XDP extend struct netdev_rx_queue with the xdp_rxq_info,
and remove some of the CONFIG_SYSFS ifdefs.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


31 Dec, 2017 1 commit

bpf: offload: allow netdev to disappear while verifier is running · cae1927c

Authored by Jakub Kicinski on Dec 27, 2017

To allow verifier instruction callbacks without any extra locking
NETDEV_UNREGISTER notification would wait on a waitqueue for verifier
to finish.  This design decision was made when rtnl lock was providing
all the locking.  Use the read/write lock instead and remove the
waitqueue.

Verifier will now call into the offload code, so dev_ops are moved
to offload structure.  Since verifier calls are all under
bpf_prog_is_dev_bound() we no longer need static inline implementations
to please builds with CONFIG_NET=n.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


21 Dec, 2017 1 commit

xfrm: wrap xfrmdev_ops with offload config · 9cb0d21d

Authored by Shannon Nelson on Dec 19, 2017

There's no reason to define netdev->xfrmdev_ops if
the offload facility is not CONFIG'd in.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>


20 Dec, 2017 1 commit

net: Add asynchronous callbacks for xfrm on layer 2. · f53c7239

Authored by Steffen Klassert on Dec 20, 2017

This patch implements asynchronous crypto callbacks
and a backlog handler that can be used when IPsec
is done at layer 2 in the TX path. It also extends
the skb validate functions so that we can update
the driver transmit return codes based on async
crypto operation or to indicate that we queued the
packet in a backlog queue.

Joint work with: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>


03 Dec, 2017 2 commits

net: xdp: report flags program was installed with on query · 92f0292b

Authored by Jakub Kicinski on Dec 01, 2017

Some drivers enforce that flags on program replacement and
removal must match the flags passed on install.  This leaves
the possibility open to enable simultaneous loading
of XDP programs both to HW and DRV.

Allow such drivers to report the flags back to the stack.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


net: xdp: avoid output parameters when querying XDP prog · 118b4aa2

Authored by Jakub Kicinski on Dec 01, 2017

The output parameters will get unwieldy if we want to add more
information about the program.  Simply pass the entire
struct netdev_bpf in.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


24 Nov, 2017 1 commit

net: accept UFO datagrams from tuntap and packet · 0c19f846

Authored by Willem de Bruijn on Nov 21, 2017

Tuntap and similar devices can inject GSO packets. Accept type
VIRTIO_NET_HDR_GSO_UDP, even though not generating UFO natively.

Processes are expected to use feature negotiation such as TUNSETOFFLOAD
to detect supported offload types and refrain from injecting other
packets. This process breaks down with live migration: guest kernels
do not renegotiate flags, so destination hosts need to expose all
features that the source host does.

Partially revert the UFO removal from 182e0b6b~1..d9d30adf.
This patch introduces nearly(*) no new code to simplify verification.
It brings back verbatim tuntap UFO negotiation, VIRTIO_NET_HDR_GSO_UDP
insertion and software UFO segmentation.

It does not reinstate protocol stack support, hardware offload
(NETIF_F_UFO), SKB_GSO_UDP tunneling in SKB_GSO_SOFTWARE or reception
of VIRTIO_NET_HDR_GSO_UDP packets in tuntap.

To support SKB_GSO_UDP reappearing in the stack, also reinstate
logic in act_csum and openvswitch. Achieve equivalence with v4.13 HEAD
by squashing in commit 93991221 ("net: skb_needs_check() removes
CHECKSUM_UNNECESSARY check for tx.") and reverting commit 8d63bee6
("net: avoid skb_warn_bad_offload false positives on UFO").

(*) To avoid having to bring back skb_shinfo(skb)->ip6_frag_id,
ipv6_proxy_select_ident is changed to return a __be32 and this is
assigned directly to the frag_hdr. Also, SKB_GSO_UDP is inserted
at the end of the enum to minimize code churn.

Tested
  Booted a v4.13 guest kernel with QEMU. On a host kernel before this
  patch `ethtool -k eth0` shows UFO disabled. After the patch, it is
  enabled, same as on a v4.13 host kernel.

  A UFO packet sent from the guest appears on the tap device:
    host:
      nc -l -u -p 8000 &
      tcpdump -n -i tap0

    guest:
      dd if=/dev/zero of=payload.txt bs=1 count=2000
      nc -u 192.16.1.1 8000 < payload.txt

  Direct tap to tap transmission of VIRTIO_NET_HDR_GSO_UDP succeeds,
  packets arriving fragmented:

    ./with_tap_pair.sh ./tap_send_ufo tap0 tap1
    (from https://github.com/wdebruij/kerneltools/tree/master/tests)

Changes
  v1 -> v2
    - simplified set_offload change (review comment)
    - documented test procedure

Link: http://lkml.kernel.org/r/<CAF=yD-LuUeDuL9YWPJD9ykOZ0QCjNeznPDr6whqZ9NGMNF12Mw@mail.gmail.com>
Fixes: fb652fdf ("macvlan/macvtap: Remove NETIF_F_UFO advertisement.")
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


10 Nov, 2017 1 commit

net: fix incorrect comment with regard to VLAN packet handling · 54985120

Authored by Girish Moodalbail on Nov 07, 2017

The commit bcc6d479 ("net: vlan: make non-hw-accel rx path similar
to hw-accel") unified accel and non-accel path for VLAN RX. With that
fix we do not register any packet_type handler for VLANs anymore, so fix
the incorrect comment.
Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


09 Nov, 2017 1 commit

net: Introduce netdev_*_once functions · 375ef2b1

Authored by Gal Pressman on Sep 17, 2017

Extend the net device error logging with netdev_*_once macros.
netdev_*_once are the equivalents of the dev_*_once macros which are
useful for messages that should only be logged once.

Also add netdev_WARN_ONCE, which is the "once" extension for the already
existing netdev_WARN macro.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>


08 Nov, 2017 3 commits

net_sch: cbs: Change TC_SETUP_CBS to TC_SETUP_QDISC_CBS · 8521db4c

Authored by Nogah Frankel on Nov 06, 2017

Change TC_SETUP_CBS to TC_SETUP_QDISC_CBS to match the new convention.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net_sch: mqprio: Change TC_SETUP_MQPRIO to TC_SETUP_QDISC_MQPRIO · 575ed7d3

Authored by Nogah Frankel on Nov 06, 2017

Change TC_SETUP_MQPRIO to TC_SETUP_QDISC_MQPRIO to match the new
convention.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net_sch: red: Add offload ability to RED qdisc · 602f3baf

Authored by Nogah Frankel on Nov 06, 2017

Add the ability to offload RED qdisc by using ndo_setup_tc.
There are four commands for RED offloading:
* TC_RED_SET: handles set and change.
* TC_RED_DESTROY: handles qdisc destroy.
* TC_RED_STATS: updates the qdisc's counters (given as reference).
* TC_RED_XSTAT: returns RED xstats.

Whether RED is offloaded is determined every time the dump action is
called, because a parent change of this qdisc could change its offload
state without requiring any RED function to be called.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


05 Nov, 2017 2 commits

bpf: offload: add infrastructure for loading programs for a specific netdev · ab3f0063

Authored by Jakub Kicinski on Nov 03, 2017

The fact that we don't know which device the program is going
to be used on is quite limiting in current eBPF infrastructure.
We have to reverse or limit the changes which kernel makes to
the loaded bytecode if we want it to be offloaded to a networking
device.  We also have to invent new APIs for debugging and
troubleshooting support.

Make it possible to load programs for a specific netdev.  This
helps us to bring the debug information closer to the core
eBPF infrastructure (e.g. we will be able to reuse the verifier
log in device JIT).  It allows device JITs to perform translation
on the original bytecode.

__bpf_prog_get() when called to get a reference for an attachment
point will now refuse to give it if program has a device assigned.
Following patches will add a version of that function which passes
the expected netdev in. @type argument in __bpf_prog_get() is
renamed to attach_type to make it clearer that it's only set on
attachment.

All calls to ndo_bpf are protected by rtnl, only verifier callbacks
are not.  We need a wait queue to make sure netdev doesn't get
destroyed while verifier is still running and calling its driver.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: bpf: rename ndo_xdp to ndo_bpf · f4e63525

Authored by Jakub Kicinski on Nov 03, 2017

ndo_xdp is a control path callback for setting up XDP in the
driver.  We can reuse it for other forms of communication
between the eBPF stack and the drivers.  Rename the callback
and associated structures and definitions.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


03 Nov, 2017 1 commit

net: core: introduce mini_Qdisc and eliminate usage of tp->q for clsact fastpath · 46209401

Authored by Jiri Pirko on Nov 03, 2017

In sch_handle_egress and sch_handle_ingress tp->q is used only in order
to update stats. So stats and filter list are the only things that are
needed in clsact qdisc fastpath processing. Introduce new mini_Qdisc
struct to hold those items. Also, introduce a helper to swap the
mini_Qdisc structures in case filter list head changes.

This removes the need for tp->q usage without added overhead.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


28 Oct, 2017 1 commit

net/sched: Add support for HW offloading for CBS · 3d0bd028

Authored by Vinicius Costa Gomes on Oct 16, 2017

This adds support for offloading the CBS algorithm to the controller,
if supported. Drivers wanting to support CBS offload must implement
the .ndo_setup_tc callback and handle the TC_SETUP_CBS (introduced
here) type.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Henrik Austad <henrik@austad.us>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>


21 Oct, 2017 1 commit

net: sched: add block bind/unbind notif. and extended block_get/put · 8c4083b3

Authored by Jiri Pirko on Oct 19, 2017

Introduce a new type of ndo_setup_tc message to propagate binding/unbinding
of a block to the driver. Call this ndo whenever a qdisc gets/puts a block.
Alongside this, there is a need to propagate the binder type from qdisc
code down to the notifier. So introduce extended variants of
block_get/put in order to pass this info.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


18 Oct, 2017 1 commit

bpf: cpumap xdp_buff to skb conversion and allocation · 1c601d82

Authored by Jesper Dangaard Brouer on Oct 16, 2017

This patch makes cpumap functional, by adding SKB allocation and
invoking the network stack on the dequeuing CPU.

For constructing the SKB on the remote CPU, the xdp_buff is converted
into a struct xdp_pkt, which is mapped into the top headroom of the
packet, to avoid allocating separate mem.  For now, struct xdp_pkt is
just a cpumap internal data structure, with info carried between
enqueue to dequeue.

If a driver doesn't provide enough headroom, the packet is simply dropped,
with return code -EOVERFLOW.  This will be picked up by the xdp tracepoint
infrastructure, to allow users to catch this.

V2: take into account xdp->data_meta

V4:
 - Drop busypoll tricks, keeping it more simple.
 - Skip RPS and Generic-XDP-recursive-reinjection, suggested by Alexei

V5: correct RCU read protection around __netif_receive_skb_core.

V6: Setting TASK_RUNNING vs TASK_INTERRUPTIBLE based on talk with Rik van Riel
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
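
The headroom trick described above can be sketched in userspace: metadata is written at the start of the frame's headroom instead of being allocated separately, and a frame with too little headroom is rejected with -EOVERFLOW. The struct and function names below (buff_sketch, pkt_sketch, buff_to_pkt) are hypothetical stand-ins, and the sketch ignores xdp->data_meta, which the V2 note says the real conversion accounts for.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct buff_sketch {
    unsigned char *hard_start; /* first byte of the frame's headroom */
    unsigned char *data;       /* first byte of packet payload */
    unsigned char *data_end;   /* one past the last payload byte */
};

struct pkt_sketch {
    unsigned char *data;
    size_t len;
    size_t headroom; /* headroom left after the metadata is stored */
};

/* Store the metadata in the frame's own headroom. Returns 0 on success
 * or -EOVERFLOW when the headroom cannot hold struct pkt_sketch. */
int buff_to_pkt(const struct buff_sketch *b, struct pkt_sketch **out)
{
    size_t headroom;
    struct pkt_sketch *pkt;

    assert(b != NULL && out != NULL);
    headroom = (size_t)(b->data - b->hard_start);
    if (headroom < sizeof(*pkt))
        return -EOVERFLOW;

    /* Assumes hard_start is suitably aligned for struct pkt_sketch. */
    pkt = (struct pkt_sketch *)b->hard_start;
    pkt->data = b->data;
    pkt->len = (size_t)(b->data_end - b->data);
    pkt->headroom = headroom - sizeof(*pkt);
    *out = pkt;
    return 0;
}
```

The design choice is that no allocation happens on the enqueue side; the cost is that drivers must leave at least sizeof(struct pkt_sketch) bytes of headroom in front of the payload.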


17 Oct, 2017 1 commit

tun: call dev_get_valid_name() before register_netdevice() · 0ad646c8

Authored by Cong Wang on Oct 13, 2017

register_netdevice() could fail early when we have an invalid
dev name, in which case ->ndo_uninit() is not called. For tun
device, this is a problem because a timer etc. are already
initialized and it expects ->ndo_uninit() to clean them up.

We could move these initializations into a ->ndo_init() so
that register_netdevice() knows better, however this is still
complicated due to the logic in tun_detach().

Therefore, I choose to just call dev_get_valid_name() before
register_netdevice(), which is quicker and much easier to audit.
And for this specific case, it is already enough.

Fixes: 96442e42 ("tuntap: choose the txq based on rxq")
Reported-by: Dmitry Alexeev <avekceeb@gmail.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


05 Oct, 2017 3 commits

net: Add extack to upper device linking · 42ab19ee

Authored by David Ahern on Oct 04, 2017

Add extack arg to netdev_upper_dev_link and netdev_master_upper_dev_link
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: Add extack to ndo_add_slave · 33eaf2a6

Authored by David Ahern on Oct 04, 2017

Pass extack to do_set_master and down to ndo_add_slave
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: Add extack to netdev_notifier_info · 51d0c047

Authored by David Ahern on Oct 04, 2017

Add netlink_ext_ack to netdev_notifier_info to allow notifier
handlers to return errors to userspace.

Clean up the initialization in dev.c such that extack is easily
added in subsequent patches where relevant. Specifically, remove
the init call in call_netdevice_notifiers_info and have callers
initialize on stack when info is declared.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


04 Oct, 2017 1 commit

net: core: decouple ifalias get/set from rtnl lock · 6c557001

Authored by Florian Westphal on Oct 02, 2017

Device alias can be set by either rtnetlink (rtnl is held) or sysfs.

rtnetlink holds the rtnl mutex, sysfs acquires it for this purpose.
Add an extra mutex for it and use rcu to protect concurrent accesses.

This allows the sysfs path to not take rtnl and would later allow
to not hold it when dumping ifalias.

Based on suggestion from Eric Dumazet.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>


01 Oct, 2017 1 commit

net: dsa: change dsa_ptr for a dsa_port · 2f657a60

Authored by Vivien Didelot on Sep 29, 2017

With DSA, a master net device (CPU facing interface) has a dsa_ptr
pointer to which hangs a dsa_switch_tree. This is not correct because a
master interface is wired to a dedicated switch port, and because we can
theoretically have several master interfaces pointing to several CPU
ports of the same switch fabric.

Change the master interface's dsa_ptr for the CPU dsa_port pointer.
This is a step towards supporting multiple CPU ports.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


02 Sep, 2017 1 commit

mlxsw: spectrum: Forbid linking to devices that have uppers · 25cc72a3

Authored by Ido Schimmel on Sep 01, 2017

The mlxsw driver relies on NETDEV_CHANGEUPPER events to configure the
device in case a port is enslaved to a master netdev such as bridge or
bond.

Since the driver ignores events unrelated to its ports and their
uppers, it's possible to engineer situations in which the device's data
path differs from the kernel's.

One example of such a situation is when a port is enslaved to a bond
that is already enslaved to a bridge. When the bond was enslaved the
driver ignored the event - as the bond wasn't one of its uppers - and
therefore a bridge port instance isn't created in the device.

Until such configurations are supported forbid them by checking that the
upper device doesn't have uppers of its own.

Fixes: 0d65fc13 ("mlxsw: spectrum: Implement LAG port join/leave")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Nogah Frankel <nogahf@mellanox.com>
Tested-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


01 Sep, 2017 1 commit

net: fix two typos in net_device_ops documentation. · f16ded59

Authored by Rami Rosen on Aug 31, 2017

This patch fixes two trivial typos in net_device_ops documentation,
related to ndo_xdp_flush callback.
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


30 Aug, 2017 1 commit

net: remove dmaengine.h inclusion from netdevice.h · 0dd5759d

Authored by Dave Jiang on Aug 29, 2017

Since the removal of NET_DMA, dmaengine.h header file shouldn't be needed
by netdevice.h anymore.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


29 Aug, 2017 1 commit

smp: Avoid using two cache lines for struct call_single_data · 966a9671

Authored by Ying Huang on Aug 08, 2017

struct call_single_data is used in IPIs to transfer information between
CPUs.  Its size is bigger than sizeof(unsigned long) and less than
cache line size.  Currently it is not allocated with any explicit alignment
requirements.  This makes it possible for allocated call_single_data to
cross two cache lines, which results in double the number of the cache lines
that need to be transferred among CPUs.

This can be fixed by requiring call_single_data to be aligned with the
size of call_single_data. Currently the size of call_single_data is a
power of 2.  If we add new fields to call_single_data, we may need to
add padding to make sure the size of the new definition is a power of 2
as well.

Fortunately, this is enforced by GCC, which will report bad sizes.

To set alignment requirements of call_single_data to the size of
call_single_data, a struct definition and a typedef are used.

To test the effect of the patch, I used the vm-scalability multiple
thread swap test case (swap-w-seq-mt).  The test will create multiple
threads and each thread will eat memory until all RAM and part of swap
is used, so that huge number of IPIs are triggered when unmapping
memory.  In the test, the throughput of memory writing improves ~5%
compared with misaligned call_single_data, because of faster IPIs.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Huang, Ying <ying.huang@intel.com>
[ Add call_single_data_t and align with size of call_single_data. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Aaron Lu <aaron.lu@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/87bmnqd6lz.fsf@yhuang-mobile.sh.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
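
The "struct definition plus typedef" approach can be sketched in userspace with the GCC/Clang aligned attribute. This is an illustration under assumptions: csd_sketch and its field layout are stand-ins for the kernel structure, and crosses_cache_line is a hypothetical helper added only to make the effect observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel structure; the field layout is an assumption. */
struct csd_sketch {
    void *next;
    void (*func)(void *info);
    void *info;
    unsigned int flags;
};

/* The aligned attribute only accepts powers of two, so a size that stops
 * being a power of two makes this typedef fail to compile. */
typedef struct csd_sketch csd_sketch_t
    __attribute__((aligned(sizeof(struct csd_sketch))));

_Static_assert((sizeof(struct csd_sketch) &
                (sizeof(struct csd_sketch) - 1)) == 0,
               "size must be a power of two");

/* Returns 1 if [p, p + size) spans more than one cache line of `line` bytes. */
int crosses_cache_line(const void *p, size_t size, size_t line)
{
    uintptr_t first = (uintptr_t)p;
    uintptr_t last = first + size - 1;

    assert(line != 0 && size != 0);
    return first / line != last / line;
}
```

An object whose size is a power of two no larger than the cache line, and whose address is a multiple of that size, cannot straddle a line boundary; that is the property the typedef buys.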


28 Aug, 2017 1 commit

netfilter: convert hook list to an array · 960632ec

Authored by Aaron Conole on Aug 24, 2017

This converts the storage and layout of netfilter hook entries from a
linked list to an array.  After this commit, hook entries will be
stored adjacent in memory.  The next pointer is no longer required.

The ops pointers are stored at the end of the array as they are only
used in the register/unregister path and in the legacy br_netfilter code.

nf_unregister_net_hooks() is slower than needed as it just calls
nf_unregister_net_hook in a loop (i.e. at least n synchronize_net()
calls); this will be addressed in a followup patch.

Test setup:
 - ixgbe 10gbit
 - netperf UDP_STREAM, 64 byte packets
 - 5 hooks: (raw + mangle prerouting, mangle+filter input, inet filter):
empty mangle and raw prerouting, mangle and filter input hooks:
353.9
this patch:
364.2
Signed-off-by: Aaron Conole <aconole@bytheb.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
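
The layout described above (entries adjacent in memory, with the ops pointers stored after the array) can be sketched in userspace as follows. All names here (hook_entries_sketch, entries_build, entries_run and the two sample hooks) are hypothetical, and the sketch leaves out priorities, RCU and everything else the real netfilter code handles.

```c
#include <assert.h>
#include <stdlib.h>

typedef int (*hook_fn)(void *priv, int pkt); /* 1 = accept, 0 = drop */

struct hook_ops_sketch {   /* registration-time description */
    hook_fn fn;
    void *priv;
};

struct hook_entry_sketch { /* what the packet path reads */
    hook_fn fn;
    void *priv;
};

struct hook_entries_sketch {
    unsigned int num;
    struct hook_entry_sketch hooks[];
    /* followed in memory by num pointers to the originating ops */
};

/* The ops pointers live right after the last hook entry. */
const struct hook_ops_sketch **entries_ops(struct hook_entries_sketch *e)
{
    return (const struct hook_ops_sketch **)&e->hooks[e->num];
}

struct hook_entries_sketch *entries_build(const struct hook_ops_sketch *ops[],
                                          unsigned int n)
{
    struct hook_entries_sketch *e;
    unsigned int i;

    assert(ops != NULL);
    e = malloc(sizeof(*e) + n * sizeof(e->hooks[0]) + n * sizeof(ops[0]));
    if (!e)
        return NULL;
    e->num = n;
    for (i = 0; i < n; i++) {
        e->hooks[i].fn = ops[i]->fn;
        e->hooks[i].priv = ops[i]->priv;
        entries_ops(e)[i] = ops[i];
    }
    return e;
}

/* Packet path: a linear walk over adjacent entries, no next pointers. */
int entries_run(const struct hook_entries_sketch *e, int pkt)
{
    unsigned int i;

    for (i = 0; i < e->num; i++)
        if (!e->hooks[i].fn(e->hooks[i].priv, pkt))
            return 0;
    return 1;
}

/* Sample hooks for demonstration only. */
int count_and_accept(void *priv, int pkt) { (void)pkt; ++*(int *)priv; return 1; }
int drop_odd(void *priv, int pkt) { (void)priv; return pkt % 2 == 0; }
```

Keeping the rarely used ops pointers in a trailer means the packet path touches only the densely packed fn/priv pairs.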


19 Aug, 2017 3 commits

net: drop unused attribute argument from sysfs queue funcs · 718ad681

Authored by stephen hemminger on Aug 18, 2017

The show and store functions don't need/use the attribute.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: constify net_ns_type_operations · 737aec57

Authored by stephen hemminger on Aug 18, 2017

This can be const.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: constify netdev_class_file · b793dc5c

Authored by stephen hemminger on Aug 18, 2017

These functions are wrappers around class_create_file, which can take a
const attribute.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

