Commits · 8b69a803814bb8b14155ea60df83f6d57527e69e · openeuler / Kernel

10 Jan, 2020 4 commits

skb: add helpers to allocate ext independently from sk_buff · 8b69a803

Authored by Paolo Abeni on Jan 09, 2020

Currently we can allocate the extension only after the skb.
This change allows the user to do the opposite, which will simplify
allocation failure handling in MPTCP.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8b69a803
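
The ordering described in this commit can be sketched in userspace C. This is a minimal illustration, not the kernel code: `struct ext`, `struct buf`, `ext_alloc()`, `buf_alloc()`, `buf_set_ext()` and `buf_build()` are hypothetical stand-ins for the skb extension, the sk_buff and the new helpers. The point it shows is that once the extension is allocated up front, attaching it to the buffer is a plain pointer store that cannot fail.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins, not the kernel's skb_ext or sk_buff. */
struct ext { int id; };
struct buf { struct ext *ext; size_t len; };

/* Allocate an extension on its own, before any buffer exists. */
static struct ext *ext_alloc(int id)
{
    struct ext *e = malloc(sizeof(*e));

    if (e)
        e->id = id;
    return e;
}

static struct buf *buf_alloc(size_t len)
{
    struct buf *b = calloc(1, sizeof(*b));

    if (b)
        b->len = len;
    return b;
}

/* Attaching a pre-allocated extension is a pointer store; it cannot fail. */
static void buf_set_ext(struct buf *b, struct ext *e)
{
    assert(b != NULL && e != NULL);
    b->ext = e;
}

static void buf_free(struct buf *b)
{
    if (b) {
        free(b->ext);
        free(b);
    }
}

/* Extension first, buffer second: both failure points come before the attach. */
static struct buf *buf_build(size_t len, int id)
{
    struct ext *e = ext_alloc(id);
    struct buf *b;

    if (!e)
        return NULL;
    b = buf_alloc(len);
    if (!b) {
        free(e);
        return NULL;
    }
    buf_set_ext(b, e);
    return b;
}
```

A caller only has to check the single return value of `buf_build()`; there is no state where a buffer exists but its extension could still fail to allocate.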

mptcp: Add MPTCP to skb extensions · 3ee17bc7

Authored by Mat Martineau on Jan 09, 2020

Add enum value for MPTCP and update config dependencies

v5 -> v6:
 - fixed '__unused' field size
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Co-developed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3ee17bc7

sock: Make sk_protocol a 16-bit value · bf976514

Authored by Mat Martineau on Jan 09, 2020

Match the 16-bit width of skbuff->protocol. Fills an 8-bit hole so
sizeof(struct sock) does not change.

Also take care of BPF field access for sk_type/sk_protocol. Both of them
are now outside the bitfield, so we can use load instructions without
further shifting/masking.

v5 -> v6:
 - update eBPF accessors, too (Intel's kbuild test robot)
v2 -> v3:
 - keep 'sk_type' 2 bytes aligned (Eric)
v1 -> v2:
 - preserve sk_pacing_shift as bit field (Eric)

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Co-developed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bf976514

flow_dissector: fix document for skb_flow_get_icmp_tci · 6b3acfc3

Authored by Li RongQing on Jan 09, 2020

Use the correct input parameter name to fix the warnings below:

net/core/flow_dissector.c:242: warning: Function parameter or member 'thoff' not described in 'skb_flow_get_icmp_tci'
net/core/flow_dissector.c:242: warning: Excess function parameter 'toff' description in 'skb_flow_get_icmp_tci'
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6b3acfc3

09 Jan, 2020 2 commits

devlink: add devink notification when reporter update health state · 97ff3bd3

Authored by Vikas Gupta on Jan 02, 2020

Add a devlink notification when a reporter updates its health
state.
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

97ff3bd3

devlink: add support for reporter recovery completion · 6181e5cb

Authored by Vikas Gupta on Jan 02, 2020

It is possible that a reporter recovery does not complete
successfully when recovery is triggered via
devlink_health_reporter_recover, as recovery could be processed in a
different context. In such a scenario the driver returns an error when
the recover hook is invoked, and successful recovery completion is
signalled later.
Expose a devlink recover done API to update recovery stats.
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6181e5cb

03 Jan, 2020 2 commits

page_pool: help compiler remove code in case CONFIG_NUMA=n · f13fc107

Authored by Jesper Dangaard Brouer on Dec 27, 2019

When kernel is compiled without NUMA support, then page_pool NUMA
config setting (pool->p.nid) doesn't make any practical sense. The
compiler cannot see that it can remove the code paths.

This patch avoids reading pool->p.nid setting in case of !CONFIG_NUMA,
in allocation and numa check code, which helps compiler to see the
optimisation potential. It leaves update code intact to keep API the
same.

 $ ./scripts/bloat-o-meter net/core/page_pool.o-numa-enabled \
                           net/core/page_pool.o-numa-disabled
 add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-113 (-113)
 Function                                     old     new   delta
 page_pool_create                             401     398      -3
 __page_pool_alloc_pages_slow                 439     426     -13
 page_pool_refill_alloc_cache                 425     328     -97
 Total: Before=3611, After=3498, chg -3.13%
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f13fc107
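
The idea behind this commit, avoiding a read of the configured nid when NUMA support is compiled out, can be sketched like this. `CONFIG_NUMA_SKETCH`, `struct pool`, `preferred_nid()` and `page_is_local()` are hypothetical names for illustration, and returning node 0 in the non-NUMA branch is an assumption about a single-node build. Because the non-NUMA branch returns a compile-time constant, the compiler can fold the comparison that depends on it.

```c
#include <assert.h>

/* Hypothetical switch standing in for CONFIG_NUMA; defaults to "disabled". */
#ifndef CONFIG_NUMA_SKETCH
#define CONFIG_NUMA_SKETCH 0
#endif

struct pool { int nid; };

/* Preferred node for allocations and reuse checks. */
static int preferred_nid(const struct pool *p)
{
#if CONFIG_NUMA_SKETCH
    return p->nid;      /* runtime value: the compiler must keep the check */
#else
    (void)p;            /* the configured nid is deliberately not read */
    return 0;           /* assumed single node: a compile-time constant */
#endif
}

/* With a constant preferred nid the compiler can simplify this comparison. */
static int page_is_local(const struct pool *p, int page_nid)
{
    assert(page_nid >= 0);
    return page_nid == preferred_nid(p);
}
```

The setter for `nid` is left out of the sketch; as the commit message notes, the update path stays intact so the API does not change.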

page_pool: handle page recycle for NUMA_NO_NODE condition · 44768dec

Authored by Jesper Dangaard Brouer on Dec 27, 2019

The check in pool_page_reusable (page_to_nid(page) == pool->p.nid) is
not valid if page_pool was configured with pool->p.nid = NUMA_NO_NODE.

The goal of the NUMA changes in commit d5394610 ("page_pool: Don't
recycle non-reusable pages") was to have RX-pages that belong to the
same NUMA node as the CPU processing the RX-packet during softirq/NAPI,
as illustrated by the performance measurements.

This patch moves the NAPI checks out of fast-path, and at the same time
solves the NUMA_NO_NODE issue.

First realize that alloc_pages_node() with pool->p.nid = NUMA_NO_NODE
will look up the current CPU nid (NUMA ID) via numa_mem_id(), which is
used as the preferred nid. It is only in rare situations, where
e.g. a NUMA zone runs dry, that a page doesn't get allocated from the
preferred nid. The page_pool API allows drivers to control the nid
themselves via controlling pool->p.nid.

This patch moves the NAPI check to when alloc cache is refilled, via
dequeuing/consuming pages from the ptr_ring. Thus, we can allow placing
pages from remote NUMA into the ptr_ring, as the dequeue/consume step
will check the NUMA node. All current drivers using page_pool will
alloc/refill RX-ring from same CPU running softirq/NAPI process.

Drivers that control the nid explicitly also use page_pool_update_nid
when changing the nid at runtime. To speed up the transition to the new
nid, the alloc cache is now flushed on nid changes. This forces pages to
come from the ptr_ring, which does the appropriate nid check.

For the NUMA_NO_NODE case, when a NIC IRQ is moved to another NUMA
node, we accept that transitioning the alloc cache doesn't happen
immediately. The preferred nid changes at runtime via consulting
numa_mem_id() based on the CPU processing RX-packets.

Notice, to avoid stressing the page buddy allocator and avoid doing too
much work under softirq with preempt disabled, the NUMA check at
ptr_ring dequeue will break the refill cycle, when detecting a NUMA
mismatch. This will cause a slower transition, but it's done on purpose.

Fixes: d5394610 ("page_pool: Don't recycle non-reusable pages")
Reported-by: Li RongQing <lirongqing@baidu.com>
Reported-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

44768dec
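
The refill behaviour described in this commit can be sketched as a small userspace simulation. Everything here is a hypothetical stand-in: `struct pool` uses a plain array instead of the ptr_ring, `cur_nid` plays the role of numa_mem_id(), and `released` merely counts pages that would be returned to the page allocator. The sketch shows the three points from the message: the NUMA check happens at dequeue time, a mismatch breaks the refill cycle, and changing the nid flushes the alloc cache.

```c
#include <assert.h>
#include <stddef.h>

#define NO_NODE   (-1)
#define RING_MAX  16
#define CACHE_MAX 8

struct page { int nid; };

struct pool {
    int nid;                        /* configured nid, or NO_NODE */
    struct page *ring[RING_MAX];    /* stand-in for the ptr_ring */
    size_t ring_head, ring_tail;
    struct page *cache[CACHE_MAX];  /* stand-in for the alloc cache */
    size_t cache_count;
    size_t released;                /* pages given back to the allocator */
};

/* Recycling accepts pages from any node; the check is deferred to dequeue. */
static void ring_produce(struct pool *p, struct page *pg)
{
    if (p->ring_tail < RING_MAX)
        p->ring[p->ring_tail++] = pg;
}

static struct page *ring_consume(struct pool *p)
{
    if (p->ring_head == p->ring_tail)
        return NULL;
    return p->ring[p->ring_head++];
}

/* Refill the cache from the ring; returns the number of pages added. */
static size_t refill_cache(struct pool *p, int cur_nid)
{
    int pref = (p->nid == NO_NODE) ? cur_nid : p->nid;
    size_t before = p->cache_count;
    struct page *pg;

    assert(p->cache_count <= CACHE_MAX);
    while (p->cache_count < CACHE_MAX && (pg = ring_consume(p)) != NULL) {
        if (pg->nid != pref) {
            p->released++;  /* remote page: release it and stop refilling */
            break;
        }
        p->cache[p->cache_count++] = pg;
    }
    return p->cache_count - before;
}

/* Changing the nid flushes the cache so new pages come through the ring. */
static void pool_update_nid(struct pool *p, int new_nid)
{
    p->nid = new_nid;
    p->released += p->cache_count;
    p->cache_count = 0;
}
```

Stopping at the first mismatch means each refill releases at most one remote page, which trades a slower transition for less work per softirq invocation.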

26 Dec, 2019 2 commits

net: Introduce peer to peer one step PTP time stamping. · b6fd7b96

Authored by Richard Cochran on Dec 25, 2019

The 1588 standard defines one step operation for both Sync and
PDelay_Resp messages.  Up until now, hardware with P2P one step has
been rare, and kernel support was lacking.  This patch adds support of
the mode in anticipation of new hardware developments.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b6fd7b96

net: Introduce a new MII time stamping interface. · 4715f65f

Authored by Richard Cochran on Dec 25, 2019

Currently the stack supports time stamping in PHY devices.  However,
there are newer, non-PHY devices that can snoop an MII bus and provide
time stamps.  In order to support such devices, this patch introduces
a new interface to be used by both PHY and non-PHY devices.

In addition, the one and only user of the old PHY time stamping API is
converted to the new interface.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

4715f65f

25 Dec, 2019 2 commits

net: Rephrased comments section of skb_mpls_pop() · 76f99f98

Authored by Martin Varghese on Dec 21, 2019

Rephrased comments section of skb_mpls_pop() to align it with
comments section of skb_mpls_push().
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

76f99f98

net: skb_mpls_push() modified to allow MPLS header push at start of packet. · e7dbfed1

Authored by Martin Varghese on Dec 21, 2019

The existing skb_mpls_push() implementation always inserts the MPLS header
after the mac header. L2 VPN use cases require the MPLS header to be
inserted before the ethernet header, as the ethernet packet gets tunnelled
inside the MPLS header in those cases.
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e7dbfed1
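
The difference between the two insertion points can be sketched on a plain byte buffer. `mpls_push()` below is a hypothetical userspace helper, not the kernel's skb_mpls_push(); it assumes the caller guarantees at least `MPLS_HLEN` bytes of headroom in front of `data`. With a non-zero `mac_len` the MAC header slides forward and the label lands right after it; with `mac_len == 0` the label lands at the very start of the packet, in front of the ethernet header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MPLS_HLEN 4

/*
 * data points at the current start of the frame inside a larger buffer with
 * at least MPLS_HLEN bytes of headroom before it (caller's responsibility).
 * Returns the new start of the frame.
 */
static uint8_t *mpls_push(uint8_t *data, size_t mac_len,
                          const uint8_t lse[MPLS_HLEN])
{
    uint8_t *new_start;

    assert(data != NULL && lse != NULL);
    new_start = data - MPLS_HLEN;

    /* Slide the MAC header (if any) forward to open a gap behind it. */
    memmove(new_start, data, mac_len);
    /* Write the label stack entry into the gap. */
    memcpy(new_start + mac_len, lse, MPLS_HLEN);
    return new_start;
}
```

For a frame made of a 2-byte header `EE` and payload `PP`, pushing with `mac_len = 2` yields `EE MMMM PP`, while `mac_len = 0` yields `MMMM EE PP`, which is the L2 VPN case where the whole ethernet frame is carried behind the MPLS header.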

20 Dec, 2019 6 commits

xdp: Simplify __bpf_tx_xdp_map() · 1170beaa