Commits · f4069cd7fa6583e7094001c6fce6f426d17a4c76 · openeuler / Kernel

12 Aug, 2019 1 commit

net: phy: prepare phylib to deal with PHY's extending Clause 22 · f4069cd7

Committed by Heiner Kallweit on Aug 09, 2019

The integrated PHY in the 2.5Gbps chip RTL8125 is the first (known to me)
PHY that uses standard Clause 22 for all modes up to 1Gbps and adds
2.5Gbps control using vendor-specific registers. To use phylib for
the standard part, a few small extensions are needed:
- Move most of genphy_config_aneg to a new function
  __genphy_config_aneg that takes a parameter indicating whether
  restarting auto-negotiation is needed (depending on whether the content
  of the vendor-specific advertisement register changed).
- Don't clear phydev->lp_advertising in genphy_read_status so that
  we can set non-C22 mode flags beforehand.

Basically both changes mimic the behavior of the equivalent Clause 45
functions.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f4069cd7

11 Aug, 2019 2 commits

mlx5: no need to check return value of debugfs_create functions · 9f818c8a

Committed by Greg Kroah-Hartman on Aug 10, 2019

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

This cleans up a lot of unneeded code and logic around the debugfs
files, making all of this much simpler and easier to understand as we
don't need to keep the dentries saved anymore.

Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9f818c8a

wimax: no need to check return value of debugfs_create functions · a62052ba

Committed by Greg Kroah-Hartman on Aug 10, 2019

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

This cleans up a lot of unneeded code and logic around the debugfs wimax
files, making all of this much simpler and easier to understand.

Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Cc: linux-wimax@intel.com
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a62052ba

10 Aug, 2019 3 commits

net/mlx5: E-switch, Removed unused hwid · ef2e4094

Committed by Parav Pandit on Jul 26, 2019

Currently mlx5_eswitch_rep stores the same hw ID for all representors.
However, it is never used from this structure.
It is always used from mlx5_vport.

Hence, remove the unused field.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

ef2e4094

net/mlx5e: Protect mod_hdr hash table with mutex · d2faae25

Committed by Vlad Buslov on Aug 09, 2019

To remove dependency on rtnl lock, protect mod_hdr hash table from
concurrent modifications with new mutex.

Implement helper function to get flow namespace to prevent code
duplication.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

d2faae25

net/mlx5e: Extend mod header entry with reference counter · dd58edc3

Committed by Vlad Buslov on Jun 01, 2018

The list of flows attached to a mod header entry is used as an implicit
reference counter (the mod header entry is deallocated when the list becomes
empty) and as a mechanism to obtain the mod header entry that a flow is
attached to (through the list head). This is not safe when concurrent
modification of the list of flows attached to a mod header entry is possible.
A proper atomic reference counter is required to support concurrent access.

As a preparation for extending mod header with reference counting, extract
the code that looks up and deletes a mod header entry into standalone put/get
helpers. In order to remove this dependency on external locking, extend the
mod header entry with a reference counter to manage its lifetime and extend
the flow structure with a direct pointer to the mod header entry that the flow
is attached to.
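
The get/put pattern described above can be illustrated with a small userspace sketch. This is not the driver's code: the struct and function names are hypothetical, C11 atomics stand in for the kernel's refcount helpers, and a real lookup would still have to make sure an entry cannot be freed between being found and being referenced.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a mod header entry. */
struct mod_hdr_entry_sketch {
	atomic_int refcnt;	/* explicit lifetime counter */
	int key;		/* placeholder for the real lookup key */
};

/* Hypothetical flow: holds a direct pointer instead of a list linkage. */
struct flow_sketch {
	struct mod_hdr_entry_sketch *mh;
};

/* Create an entry that starts with one reference, owned by the creator. */
static struct mod_hdr_entry_sketch *mod_hdr_create(int key)
{
	struct mod_hdr_entry_sketch *e = malloc(sizeof(*e));

	if (!e)
		return NULL;
	atomic_init(&e->refcnt, 1);
	e->key = key;
	return e;
}

/* Attach a flow to an existing entry by taking one more reference. */
static void mod_hdr_get(struct flow_sketch *flow, struct mod_hdr_entry_sketch *e)
{
	assert(e != NULL);
	atomic_fetch_add(&e->refcnt, 1);
	flow->mh = e;
}

/* Drop one reference; the last put frees the entry. Returns true if freed. */
static bool mod_hdr_put(struct mod_hdr_entry_sketch *e)
{
	assert(e != NULL);
	if (atomic_fetch_sub(&e->refcnt, 1) == 1) {
		free(e);
		return true;
	}
	return false;
}
```

With the counter explicit, no list of attached flows has to be inspected to decide when the entry can go away.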

To remove code duplication between legacy and switchdev mode
implementations that both support mod_hdr functionality, store mod_hdr
table in dedicated structure used by both fdb and kernel namespaces. New
table structure is extended with table lock by one of the following patches
in this series. Implement helper function to get correct mod_hdr table
depending on flow namespace.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

dd58edc3

09 Aug, 2019 3 commits

net: stmmac: Implement RSS and enable it in XGMAC core · 76067459

Committed by Jose Abreu on Aug 07, 2019

Implement the RSS functionality and add the corresponding callbacks in
XGMAC core.

Changes from v1:
	- Do not use magic constants (Jakub)
	- Use ethtool_rxfh_indir_default() (Jakub)
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

76067459

net: use listified RX for handling GRO_NORMAL skbs · 323ebb61

Committed by Edward Cree on Aug 06, 2019

When GRO decides not to coalesce a packet, in napi_frags_finish(), instead
 of passing it to the stack immediately, place it on a list in the napi
 struct.  Then, at flush time (napi_complete_done(), napi_poll(), or
 napi_busy_loop()), call netif_receive_skb_list_internal() on the list.
We'd like to do that in napi_gro_flush(), but it's not called if
 !napi->gro_bitmask, so we have to do it in the callers instead.  (There are
 a handful of drivers that call napi_gro_flush() themselves, but it's not
 clear why, or whether this will affect them.)
Because a full 64 packets is an inefficiently large batch, also consume the
 list whenever it exceeds gro_normal_batch, a new net/core sysctl that
 defaults to 8.
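
The batching idea can be sketched in plain C. Everything here is illustrative: the struct, the function names and the fixed-size array are assumptions, integer ids stand in for skbs, and the sketch flushes once the count reaches the limit, which may not match the exact comparison used in the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define RX_LIST_MAX 64	/* illustrative capacity: one full 64-packet poll */

/* Hypothetical per-NAPI batching state. */
struct rx_batch_sketch {
	int pkts[RX_LIST_MAX];	/* packet ids stand in for skbs */
	size_t count;		/* packets currently held on the list */
	size_t batch_limit;	/* plays the role of gro_normal_batch */
	size_t delivered;	/* total packets handed to the stack */
	size_t flushes;		/* number of list deliveries so far */
};

/* Hand the whole pending list to the stack in one call. */
static void rx_batch_flush(struct rx_batch_sketch *b)
{
	if (b->count == 0)
		return;
	b->delivered += b->count;
	b->flushes++;
	b->count = 0;
}

/* Queue a packet GRO did not coalesce; flush when the batch is full. */
static void rx_batch_add(struct rx_batch_sketch *b, int pkt)
{
	assert(b->batch_limit >= 1 && b->batch_limit <= RX_LIST_MAX);
	assert(b->count < RX_LIST_MAX);
	b->pkts[b->count++] = pkt;
	if (b->count >= b->batch_limit)
		rx_batch_flush(b);
}
```

A caller would invoke rx_batch_add() per packet and rx_batch_flush() once at the end of the poll, so leftovers smaller than a batch are still delivered.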
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

323ebb61

qed: Add new ethtool supported port types based on media. · 5e6d9fc7

Committed by Rahul Verma on Aug 05, 2019

Supported ports in ethtool <eth1> are displayed based on media type.
For media type fibre and twinaxial, port type is "FIBRE". Media type
Base-T is "TP" and media KR is "Backplane".

V1->V2:
Corrected the subject.
Signed-off-by: Rahul Verma <rahulv@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5e6d9fc7

03 Aug, 2019 1 commit

page flags: prioritize kasan bits over last-cpuid · ee38d94a

Committed by Arnd Bergmann on Aug 02, 2019

ARM64 randconfig builds regularly run into a build error, especially
when NUMA_BALANCING and SPARSEMEM are enabled but not SPARSEMEM_VMEMMAP:

  #error "KASAN: not enough bits in page flags for tag"

The last-cpuid bits are already conditional on the available space, so
the result of the calculation is a bit random on whether they were
already left out or not.

Adding the kasan tag bits before last-cpuid makes it much more likely to
end up with a successful build here, and should be reliable for
randconfig at least, as long as that does not randomize NR_CPUS or
NODES_SHIFT but uses the defaults.

In order for the modified check to not trigger in the x86 vdso32 code
where all constants are wrong (building with -m32), enclose all the
definitions with an #ifdef.

[arnd@arndb.de: build fix]
  Link: http://lkml.kernel.org/r/CAK8P3a3Mno1SWTcuAOT0Wa9VS15pdU6EfnkxLbDpyS55yO04+g@mail.gmail.com
Link: http://lkml.kernel.org/r/20190722115520.3743282-1-arnd@arndb.de
Link: https://lore.kernel.org/lkml/20190618095347.3850490-1-arnd@arndb.de/
Fixes: 2813b9c0 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ee38d94a

02 Aug, 2019 5 commits

net/mlx5: Add flow counter pool · 558101f1

Committed by Gavi Teitz on Jun 27, 2019

Add a pool of flow counters, based on flow counter bulks, removing the
need to allocate a new counter via a costly FW command during the flow
creation process. The time it takes to acquire/release a flow counter
is cut from ~50 [us] to ~50 [ns].

The pool is part of the mlx5 driver instance, and provides flow
counters for aging flows. mlx5_fc_create() was modified to provide
counters for aging flows from the pool by default, and
mlx5_destroy_fc() was modified to release counters back to the pool
for later reuse. If bulk allocation is not supported or fails, and for
non-aging flows, the fallback behavior is to allocate and free
individual counters.

The pool is comprised of three lists of flow counter bulks, one of
fully used bulks, one of partially used bulks, and one of unused
bulks. Counters are provided from the partially used bulks first, to
help limit bulk fragmentation.

The pool maintains a threshold, and strives to maintain the amount of
available counters below it. The pool is increased in size when a
counter acquisition request is made and there are no available
counters, and it is decreased in size when the last counter in a bulk
is released and there are more available counters than the threshold.
All pool size changes are done in the context of the
acquiring/releasing process.

The value of the threshold is directly correlated to the amount of
used counters the pool is providing, while constrained by a hard
maximum, and is recalculated every time a bulk is allocated/freed.
This ensures that the pool only consumes large amounts of memory for
available counters if the pool is being used heavily. When fully
populated and at the hard maximum, the buffer of available counters
consumes ~40 [MB].
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

558101f1

net/mlx5: E-Switch, Verify support QoS element type · 6cedde45

Committed by Eli Cohen on Jul 29, 2019

Check if firmware supports the requested element type before
attempting to create the element type.
In addition, explicitly specify the requested element type and tsar type.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

6cedde45

net/mlx5: Fix offset of tisc bits reserved field · 7761f9ee

Committed by Saeed Mahameed on Jul 29, 2019

The first reserved field is off by one: instead of reserved_at_1 it should
be reserved_at_2. Fix that.

Fixes: a12ff35e ("net/mlx5: Introduce TLS TX offload hardware bits and structures")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

7761f9ee

net/mlx5: Add flow counter bulk allocation hardware bits and command · 8536a6bf

Committed by Gavi Teitz on Jul 29, 2019

Add a handle to invoke the new FW capability of allocating a bulk of
flow counters.
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

8536a6bf

net/mlx5: Refactor and optimize flow counter bulk query · 6f06e04b

Committed by Gavi Teitz on Jul 29, 2019

Towards introducing the ability to allocate bulks of flow counters,
refactor the flow counter bulk query process, removing functions and
structs whose names indicated being used for flow counter bulk
allocation FW commands, despite them actually only being used to
support bulk querying, and migrate their functionality to correctly
named functions in their natural location, fs_counters.c.

Additionally, optimize the bulk query process by:
 * Extracting the memory used for the query to mlx5_fc_stats so
   that it is only allocated once, and not for each bulk query.
 * Querying all the counters in one function call.
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

6f06e04b

01 Aug, 2019 1 commit

xen/swiotlb: remember having called xen_create_contiguous_region() · b877ac98

Committed by Juergen Gross on Jun 14, 2019

Instead of always calling xen_destroy_contiguous_region() in case the
memory is DMA-able for the used device, do so only in case it has been
made DMA-able via xen_create_contiguous_region() before.

This will avoid a lot of xen_destroy_contiguous_region() calls for
64-bit capable devices.

As the memory in question is owned by swiotlb-xen the PG_owner_priv_1
flag of the first allocated page can be used for remembering.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>

b877ac98

31 Jul, 2019 7 commits

vsock/virtio: fix locking in virtio_transport_inc_tx_pkt() · 9632e9f6

Committed by Stefano Garzarella on Jul 30, 2019

fwd_cnt and last_fwd_cnt are protected by rx_lock, so we should use
the same spinlock also if we are in the TX path.

Also move buf_alloc under the same lock.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9632e9f6

vsock/virtio: reduce credit update messages · b89d882d

Committed by Stefano Garzarella on Jul 30, 2019

In order to reduce the number of credit update messages,
we send them only when the space available seen by the
transmitter is less than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE.
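
A minimal sketch of that decision, under stated assumptions: the function name is made up, the 64 KiB value for VIRTIO_VSOCK_MAX_PKT_BUF_SIZE is assumed rather than taken from this commit, and the transmitter's view of free space is approximated as buf_alloc minus the bytes forwarded since the last update it was sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of VIRTIO_VSOCK_MAX_PKT_BUF_SIZE; not stated in the commit. */
#define MAX_PKT_BUF_SIZE_SKETCH (64u * 1024u)

/*
 * Decide whether the receiver should send a credit update.
 * fwd_cnt:      bytes consumed by the receiver so far (free-running counter)
 * last_fwd_cnt: value of fwd_cnt last reported to the transmitter
 */
static bool credit_update_needed(uint32_t buf_alloc, uint32_t fwd_cnt,
				 uint32_t last_fwd_cnt)
{
	/* Unsigned subtraction stays correct across 32-bit wraparound. */
	uint32_t unreported = fwd_cnt - last_fwd_cnt;
	uint32_t free_space;

	assert(unreported <= buf_alloc);
	free_space = buf_alloc - unreported;

	/* Stay quiet while the transmitter still sees plenty of room. */
	return free_space < MAX_PKT_BUF_SIZE_SKETCH;
}
```

The threshold trades extra messages against transmitter stalls: a higher limit sends updates sooner, a lower one risks the sender running out of credit.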
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b89d882d

vsock/virtio: limit the memory used per-socket · 473c7391

Committed by Stefano Garzarella on Jul 30, 2019

Since virtio-vsock was introduced, the buffers filled by the host
and pushed to the guest using the vring, are directly queued in
a per-socket list. These buffers are preallocated by the guest
with a fixed size (4 KB).

The maximum amount of memory used by each socket should be
controlled by the credit mechanism.
The default credit available per-socket is 256 KB, but if we use
only 1 byte per packet, the guest can queue up to 262144 of 4 KB
buffers, using up to 1 GB of memory per-socket. In addition, the
guest will continue to fill the vring with new 4 KB free buffers
to avoid starvation of other sockets.
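
The worst-case figure above follows from simple arithmetic, which the following sketch reproduces. The function name is hypothetical; the inputs are the numbers quoted in the message (256 KB of credit, 1-byte payloads, 4 KB buffers).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Memory pinned when every queued packet occupies a whole preallocated
 * buffer regardless of how few payload bytes it carries.
 */
static uint64_t worst_case_queued_bytes(uint64_t credit_bytes,
					uint64_t payload_bytes,
					uint64_t buf_bytes)
{
	uint64_t packets;

	assert(payload_bytes > 0);
	/* Credit only counts payload, so it admits credit/payload packets. */
	packets = credit_bytes / payload_bytes;
	return packets * buf_bytes;
}
```

With 262144 bytes of credit and 1-byte packets, 262144 buffers of 4096 bytes each add up to 1073741824 bytes, i.e. 1 GB for a single socket.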

This patch mitigates this issue by copying the payload of small
packets (< 128 bytes) into the buffer of the last packet queued, in
order to avoid wasting memory.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

473c7391

compat_ioctl: pppoe: fix PPPOEIOCSFWD handling · 055d8824

Committed by Arnd Bergmann on Jul 30, 2019

Support for handling the PPPOEIOCSFWD ioctl in compat mode was added in
linux-2.5.69 along with hundreds of other commands, but was always broken
since only the structure is compatible, but the command number is not,
due to the size being sizeof(size_t), or at first sizeof(sizeof(struct
sockaddr_pppox)), which is different on 64-bit architectures.

Guillaume Nault adds:

  And the implementation was broken until 2016 (see 29e73269 ("pppoe:
  fix reference counting in PPPoE proxy")), and nobody ever noticed. I
  should probably have removed this ioctl entirely instead of fixing it.
  Clearly, it has never been used.

Fix it by adding a compat_ioctl handler for all pppoe variants that
translates the command number and then calls the regular ioctl function.
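
Why the command number differs, and what the translation amounts to, can be shown with the standard _IOW() macro. This is a userspace sketch, not the handler from the patch: the macro and function names are hypothetical, the 0xB1/0 type and number are assumed to match the PPPOEIOCSFWD definition, and uint32_t/uint64_t stand in for size_t as seen by 32-bit and 64-bit tasks.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* The argument size is encoded in the command number by _IOW(). */
#define SFWD_CMD_COMPAT32 _IOW(0xB1, 0, uint32_t)	/* size_t is 4 bytes */
#define SFWD_CMD_NATIVE64 _IOW(0xB1, 0, uint64_t)	/* size_t is 8 bytes */

/* Same type and number, different size field: the two never match. */
static_assert(SFWD_CMD_COMPAT32 != SFWD_CMD_NATIVE64,
	      "command numbers differ between 32-bit and 64-bit callers");

/*
 * Map the number a 32-bit caller passes to the one the regular handler
 * expects; every other command is passed through unchanged.
 */
static unsigned long translate_compat_cmd(unsigned long cmd)
{
	if (cmd == SFWD_CMD_COMPAT32)
		return SFWD_CMD_NATIVE64;
	return cmd;
}
```

After translation the regular ioctl function can be called as is, since the structure layout itself is the same for both kinds of caller.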

All other ioctl commands handled by pppoe are compatible between 32-bit
and 64-bit, and require compat_ptr() conversion.

This should apply to all stable kernels.
Acked-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

055d8824

linux: Remove bvec page_offset, use bv_offset · 65c84f14

Committed by Jonathan Lemon on Jul 30, 2019

Now that page_offset is referenced through accessors, remove
the union, and use bv_offset.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

65c84f14

linux: Add skb_frag_t page_offset accessors · 7240b60c

Committed by Jonathan Lemon on Jul 30, 2019

Add skb_frag_off(), skb_frag_off_add(), skb_frag_off_set(),
and skb_frag_off_copy() accessors for page_offset.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7240b60c

loop: Fix mount(2) failure due to race with LOOP_SET_FD · 89e524c0

Committed by Jan Kara on Jul 30, 2019

Commit 33ec3e53 ("loop: Don't change loop device under exclusive
opener") made LOOP_SET_FD ioctl acquire exclusive block device reference
while it updates loop device binding. However this can make perfectly
valid mount(2) fail with EBUSY due to racing LOOP_SET_FD holding
temporarily the exclusive bdev reference in cases like this:

for i in {a..z}{a..z}; do
        dd if=/dev/zero of=$i.image bs=1k count=0 seek=1024
        mkfs.ext2 $i.image
        mkdir mnt$i
done

echo "Run"
for i in {a..z}{a..z}; do
        mount -o loop -t ext2 $i.image mnt$i &
done

Fix the problem by not getting full exclusive bdev reference in
LOOP_SET_FD but instead just mark the bdev as being claimed while we
update the binding information. This just blocks new exclusive openers
instead of failing them with EBUSY thus fixing the problem.

Fixes: 33ec3e53 ("loop: Don't change loop device under exclusive opener")
Cc: stable@vger.kernel.org
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

89e524c0

29 Jul, 2019 3 commits

NFC: nxp-nci: Get rid of platform data · 3b0b2783

Committed by Andy Shevchenko on Jul 29, 2019

Legacy platform data must go away. We are on the safe side here since
there are no users of it in the kernel.

If anyone for some odd reason needs it, the GPIO lookup tables and
built-in device properties are at your service.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3b0b2783

mac80211: add support for the ADDBA extension element · 2ab45876

Committed by John Crispin on Jul 29, 2019

HE allows peers to negotiate the aggregation fragmentation level to be used
during transmission. The level can be 1-3. The Ext element is added behind
the ADDBA request inside the action frame. The responder will then reply
with the same level or a lower one if the requested one is not supported.
This patch only handles the negotiation part as the ADDBA frames get passed
to the ATH11k firmware, which does the rest of the magic for us as well as
generating the requests.
Signed-off-by: Shashidhar Lakkavalli <slakkavalli@datto.com>
Signed-off-by: John Crispin <john@phrozen.org>
Link: https://lore.kernel.org/r/20190729104512.27615-1-john@phrozen.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

2ab45876

mac80211: fix ieee80211_he_oper_size() comment · 90d4962c

Committed by John Crispin on Jul 29, 2019

Johannes mentioned that the comment should not reference mac80211 as other
subsystems might call the helper.
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20190729102342.8659-1-john@phrozen.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

90d4962c

28 Jul, 2019 2 commits

gpio: don't WARN() on NULL descs if gpiolib is disabled · ffe0bbab

Committed by Bartosz Golaszewski on Jul 08, 2019

If gpiolib is disabled, we use the inline stubs from gpio/consumer.h
instead of regular definitions of GPIO API. The stubs for 'optional'
variants of gpiod_get routines return NULL in this case as if the
relevant GPIO wasn't found. This is correct so far.

Calling other (non-gpiod_get) stubs from this header triggers a warning
because the GPIO descriptor couldn't have been requested. The warning
however is unconditional (WARN_ON(1)) and is emitted even if the passed
descriptor pointer is NULL.

We don't want to force the users of 'optional' gpiod_get to check the
returned pointer before calling e.g. gpiod_set_value() so let's only
WARN on non-NULL descriptors.
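
The resulting stub behaviour can be sketched in userspace C. The names below are hypothetical and a counter replaces the kernel's WARN machinery; the point is only that the warning condition becomes "descriptor is non-NULL" instead of "always".

```c
#include <assert.h>
#include <stddef.h>

struct gpio_desc;	/* opaque, as it is for consumers */

static int warn_count;	/* stands in for warnings in the kernel log */

/* Userspace substitute for WARN_ON(): count the hit and report it. */
#define SKETCH_WARN_ON(cond) ((cond) ? (warn_count++, 1) : 0)

/* 'Optional' getter stub: behaves as if the GPIO simply was not found. */
static struct gpio_desc *stub_gpiod_get_optional(void)
{
	return NULL;
}

/* Setter stub: a non-NULL descriptor cannot exist here, so only that warns. */
static void stub_gpiod_set_value(struct gpio_desc *desc, int value)
{
	(void)value;
	(void)SKETCH_WARN_ON(desc != NULL);
}
```

A driver can then pass the result of the optional getter straight to the setter without a NULL check and without triggering a warning.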

Cc: stable@vger.kernel.org
Reported-by: Claus H. Stovgaard <cst@phaseone.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>

ffe0bbab

net: stmmac: Make MDIO bus reset optional · 1a981c05

Committed by Thierry Reding on Jul 26, 2019

The Tegra EQOS driver already resets the MDIO bus at probe time via the
reset GPIO specified in the phy-reset-gpios device tree property. There
is no need to reset the bus again later on.

This avoids the need to query the device tree for the snps,reset GPIO,
which is not part of the Tegra EQOS device tree bindings. This quiesces
an error message from the generic bus reset code if it doesn't find the
snps,reset related delays.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1a981c05

27 Jul, 2019 2 commits

of: Fix typo in kerneldoc · f1765a18

Committed by Thierry Reding on Jul 26, 2019

"Findfrom" is not a word. Replace the function synopsis by something
that makes sense.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rob Herring <robh@kernel.org>

f1765a18

net: qualcomm: rmnet: Fix incorrect UL checksum offload logic · a7cf3d24

Committed by Subash Abhinov Kasiviswanathan on Jul 25, 2019

The udp_ip4_ind bit is set only for IPv4 UDP non-fragmented packets
so that the hardware can flip the checksum to 0xFFFF if the computed
checksum is 0 per RFC768.

However, this bit has to be set for IPv6 UDP non-fragmented packets
as well per hardware requirements. Otherwise, IPv6 UDP packets
with a computed checksum of 0 were transmitted by hardware and were
dropped in the network.

In addition to setting this bit for IPv6 UDP, the field is also
appropriately renamed to udp_ind as part of this change.
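
The checksum rule the hardware applies when the bit is set can be written as a one-line helper. This is a sketch with a hypothetical function name, showing only the RFC 768 substitution and not the rmnet header handling.

```c
#include <assert.h>
#include <stdint.h>

/*
 * RFC 768: a computed UDP checksum of zero is transmitted as all ones,
 * because an all-zero field means "no checksum" on IPv4 and is not a
 * valid checksum on IPv6.
 */
static uint16_t udp_wire_checksum(uint16_t computed)
{
	return computed == 0 ? 0xFFFF : computed;
}
```

Any non-zero computed value, including 0xFFFF itself, goes out unchanged.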

Fixes: 5eb5f860 ("net: qualcomm: rmnet: Add support for TX checksum offload")
Cc: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a7cf3d24

26 Jul, 2019 7 commits

mac80211: HE: add Spatial Reuse element parsing support · ef11a931

Committed by John Crispin on Jun 18, 2019

Add support to mac80211 for parsing SPR elements as per
P802.11ax_D4.0 section 9.4.2.241.
Signed-off-by: Shashidhar Lakkavalli <slakkavalli@datto.com>
Signed-off-by: John Crispin <john@phrozen.org>
Link: https://lore.kernel.org/r/20190618061915.7102-2-john@phrozen.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

ef11a931

mac80211: add support for parsing ADDBA_EXT IEs · 2aa485e1

Committed by John Crispin on Jul 13, 2019

ADDBA_EXT IEs can be used to negotiate the BA fragmentation level.
Signed-off-by: John Crispin <john@phrozen.org>
Link: https://lore.kernel.org/r/20190713163642.18491-2-john@phrozen.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

2aa485e1

net/mlx5e: Prevent encap flow counter update async to user query · 90bb7692

Committed by Ariel Levkovich on Jul 06, 2019

This patch prevents a race between user invoked cached counters
query and a neighbor last usage updater.

The cached flow counter stats can be queried by calling
"mlx5_fc_query_cached" which provides the number of bytes and
packets that passed via this flow since the last time this counter
was queried.
It does so by subtracting the last saved stats from the current, cached
stats and then updating the last saved stats with the cached stats.
It also provides the lastuse value for that flow.

Since "mlx5e_tc_update_neigh_used_value" needs to retrieve the
last usage time of encapsulation flows, it calls the flow counter
query method periodically and async to user queries of the flow counter
using cls_flower.
This call is causing the driver to update the last reported bytes and
packets from the cache and therefore, future user queries of the flow
stats will return lower than expected number for bytes and packets
since the last saved stats in the driver was updated async to the last
saved stats in cls_flower.

This causes wrong stats presentation of encapsulation flows to user.

Since the neighbor usage updater only needs the lastuse stats from the
cached counter, the fix is to use a dedicated lastuse query call that
returns the lastuse value without synching between the cached stats and
the last saved stats.
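
The difference between the two query styles can be sketched as follows. The struct and function names are hypothetical and locking is omitted; the sketch only shows that the delta query consumes state while the lastuse query does not.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cached counter: current cache plus last reported values. */
struct fc_sketch {
	uint64_t cache_bytes;
	uint64_t cache_packets;
	uint64_t cache_lastuse;
	uint64_t last_bytes;	/* cache_bytes at the previous delta query */
	uint64_t last_packets;	/* cache_packets at the previous delta query */
};

/* Delta query: reports traffic since the previous call, then resyncs. */
static void fc_query_cached(struct fc_sketch *c, uint64_t *bytes,
			    uint64_t *packets, uint64_t *lastuse)
{
	*bytes = c->cache_bytes - c->last_bytes;
	*packets = c->cache_packets - c->last_packets;
	*lastuse = c->cache_lastuse;

	c->last_bytes = c->cache_bytes;
	c->last_packets = c->cache_packets;
}

/* Lastuse-only query: reads the timestamp and leaves the deltas alone. */
static uint64_t fc_query_lastuse(const struct fc_sketch *c)
{
	return c->cache_lastuse;
}
```

If a periodic updater used the delta query, it would silently swallow bytes and packets that the next user query expects to see; the read-only variant avoids that.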

Fixes: f6dfb4c3 ("net/mlx5e: Update neighbour 'used' state using HW flow rules counters")
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

90bb7692

net/mlx5: Fix modify_cq_in alignment · 7a32f296

Committed by Edward Srouji on Jul 23, 2019

Fix modify_cq_in alignment to match the device specification.
After this fix the 'cq_umem_valid' field will be in the right offset.

Cc: <stable@vger.kernel.org> # 4.19
Fixes: bd371975 ("net/mlx5: Update mlx5_ifc with DEVX UID bits")
Signed-off-by: Edward Srouji <edwards@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

7a32f296

mm/hmm: move hmm_vma_range_done and hmm_vma_fault to nouveau · 02712bc3

Committed by Christoph Hellwig on Jul 24, 2019

These two functions are marked as legacy APIs to get rid of, but seem to
suit the current nouveau flow. Move them to the only user in preparation
for fixing a locking bug involving caller and callee. All comments
referring to the old API have been removed as this now is a driver private
helper.

Link: https://lore.kernel.org/r/20190724065258.16603-3-hch@lst.de
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

02712bc3

lib/dim: Fix -Wunused-const-variable warnings · f8be17b8

Committed by Leon Romanovsky on Jul 23, 2019

DIM causes the following warnings during kernel compilation,
which indicate that tx_profile and rx_profile are supposed to
be declared in *.c and not in *.h files.

In file included from ./include/rdma/ib_verbs.h:64,
                 from ./include/linux/mlx5/device.h:37,
                 from ./include/linux/mlx5/driver.h:51,
                 from ./include/linux/mlx5/vport.h:36,
                 from drivers/infiniband/hw/mlx5/ib_virt.c:34:
./include/linux/dim.h:326:1: warning: 'tx_profile' defined but not used [-Wunused-const-variable=]
  326 | tx_profile[DIM_CQ_PERIOD_NUM_MODES][NET_DIM_PARAMS_NUM_PROFILES] = {
      | ^~~~~~~~~~
./include/linux/dim.h:320:1: warning: 'rx_profile' defined but not used [-Wunused-const-variable=]
  320 | rx_profile[DIM_CQ_PERIOD_NUM_MODES][NET_DIM_PARAMS_NUM_PROFILES] = {
      | ^~~~~~~~~~

Fixes: 4f75da36 ("linux/dim: Move implementation to .c files")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f8be17b8

platform/x86: wmi: add missing struct parameter description · 8732d85a

Committed by Mattias Jacobsson on Jul 19, 2019

Add a description for the context parameter in the struct wmi_device_id.
Reported-by: kbuild test robot <lkp@intel.com>
Fixes: a48e2338 ("platform/x86: wmi: add context pointer field to struct wmi_device_id")
Signed-off-by: Mattias Jacobsson <2pi@mok.nu>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

8732d85a

25 Jul, 2019 3 commits

sched/fair: Use RCU accessors consistently for ->numa_group · cb361d8c

Committed by Jann Horn on Jul 16, 2019

The old code used RCU annotations and accessors inconsistently for
->numa_group, which can lead to use-after-frees and NULL dereferences.

Let all accesses to ->numa_group use proper RCU helpers to prevent such
issues.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Fixes: 8c8a743c ("sched/numa: Use {cpu, pid} to create task groups for shared faults")
Link: https://lkml.kernel.org/r/20190716152047.14424-3-jannh@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

cb361d8c

sched/fair: Don't free p->numa_faults with concurrent readers · 16d51a59

Committed by Jann Horn on Jul 16, 2019

When going through execve(), zero out the NUMA fault statistics instead of
freeing them.

During execve, the task is reachable through procfs and the scheduler. A
concurrent /proc/*/sched reader can read data from a freed ->numa_faults
allocation (confirmed by KASAN) and write it back to userspace.
I believe that it would also be possible for a use-after-free read to occur
through a race between a NUMA fault and execve(): task_numa_fault() can
lead to task_numa_compare(), which invokes task_weight() on the currently
running task of a different CPU.

Another way to fix this would be to make ->numa_faults RCU-managed or add
extra locking, but it seems easier to wipe the NUMA fault statistics on
execve.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Fixes: 82727018 ("sched/numa: Call task_numa_free() from do_execve()")
Link: https://lkml.kernel.org/r/20190716152047.14424-1-jannh@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

16d51a59

access: avoid the RCU grace period for the temporary subjective credentials · d7852fbd

Committed by Linus Torvalds on Jul 11, 2019

It turns out that 'access()' (and 'faccessat()') can cause a lot of RCU
work because it installs a temporary credential that gets allocated and
freed for each system call.

The allocation and freeing overhead is mostly benign, but because
credentials can be accessed under the RCU read lock, the freeing
involves an RCU grace period.

Which is not a huge deal normally, but if you have a lot of access()
calls, this causes a fair amount of secondary damage: instead of having a
nice alloc/free pattern that hits in hot per-CPU slab caches, you have
all those delayed free's, and on big machines with hundreds of cores,
the RCU overhead can end up being enormous.

But it turns out that all of this is entirely unnecessary.  Exactly
because access() only installs the credential as the thread-local
subjective credential, the temporary cred pointer doesn't actually need
to be RCU free'd at all.  Once we're done using it, we can just free it
synchronously and avoid all the RCU overhead.

So add a 'non_rcu' flag to 'struct cred', which can be set by users that
know they only use it in non-RCU context (there are other potential
users for this).  We can make it a union with the rcu freeing list head
that we need for the RCU case, so this doesn't need any extra storage.
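
The storage trick can be sketched in standalone C11. The types below are hypothetical stand-ins, not the kernel's struct cred or struct rcu_head; the sketch only shows that a flag sharing a union with the deferred-free list head adds no bytes, and that the flag is what selects the free path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the list head used to queue a deferred free. */
struct rcu_head_sketch {
	struct rcu_head_sketch *next;
	void (*func)(struct rcu_head_sketch *head);
};

/* Layout without the flag, for size comparison only. */
struct cred_plain_sketch {
	int usage;
	struct rcu_head_sketch rcu;
};

/* Layout with the flag overlaid on the list head via an anonymous union. */
struct cred_sketch {
	int usage;
	union {
		int non_rcu;			/* set: free synchronously */
		struct rcu_head_sketch rcu;	/* else: deferred-free hook */
	};
};

/* The flag is only meaningful until the list head gets used for freeing. */
static bool cred_can_free_synchronously(const struct cred_sketch *c)
{
	return c->non_rcu != 0;
}
```

Because the flag is read before the list head is ever written, the two members never need to hold valid data at the same time.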

Note that this also makes 'get_current_cred()' clear the new non_rcu
flag, in case we have filesystems that take a long-term reference to the
cred and then expect the RCU delayed freeing afterwards.  It's not
entirely clear that this is required, but it makes for clear semantics:
the subjective cred remains non-RCU as long as you only access it
synchronously using the thread-local accessors, but you _can_ use it as
a generic cred if you want to.

It is possible that we should just remove the whole RCU markings for
->cred entirely.  Only ->real_cred is really supposed to be accessed
through RCU, and the long-term cred copies that nfs uses might want to
explicitly re-enable RCU freeing if required, rather than have
get_current_cred() do it implicitly.

But this is a "minimal semantic changes" change for the immediate
problem.
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Paul E. McKenney <paulmck@linux.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jan Glauber <jglauber@marvell.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Jayachandran Chandrasekharan Nair <jnair@marvell.com>
Cc: Greg KH <greg@kroah.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d7852fbd
