Commits · 032a9966a22a3596addf81dacf0c1736dfedc32a · openeuler / Kernel

25 Jun, 2020 (6 commits)

nvme-rdma: assign completion vector correctly · 032a9966

Committed by Max Gurtovoy on Jun 23, 2020

The completion vector index that is given during CQ creation can't
exceed the number of vectors supported by the underlying RDMA device. This
violation can currently occur, for example, when one tries to
connect with N regular read/write queues and M poll queues and
N + M > num_supported_vectors. This leads to a failure to establish
a connection to the remote target. Instead, in that case, share a completion
vector between queues.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
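The sharing described in this commit can be sketched as a modulo mapping from queue index to completion vector. This is a minimal userspace illustration, not the driver code: the helper name pick_comp_vector and the convention that queue 0 is the admin queue are assumptions made for the example.

```c
#include <assert.h>

/*
 * Hypothetical helper: map a queue index to a completion vector.
 * Assumption: queue 0 is the admin queue and I/O queues start at 1.
 */
int pick_comp_vector(int queue_idx, int num_comp_vectors)
{
	assert(queue_idx >= 0 && num_comp_vectors > 0);

	/* The admin queue uses vector 0; I/O queue k uses (k - 1) mod count. */
	int io_idx = (queue_idx == 0) ? 0 : queue_idx - 1;

	/* Wrapping keeps the index in range, so extra queues share vectors. */
	return io_idx % num_comp_vectors;
}
```

With 4 supported vectors, queues 1 to 4 get vectors 0 to 3, and queue 5 wraps around to share vector 0 instead of requesting a vector the device does not have.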

nvme-loop: initialize tagset numa value to the value of the ctrl · 1b4ad7a5

Committed by Max Gurtovoy on Jun 16, 2020

Both the admin and the drive tagsets should be set according to the numa
node of the controller.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

nvme-tcp: initialize tagset numa value to the value of the ctrl · 610c8235

Committed by Max Gurtovoy on Jun 16, 2020

Both the admin and the drive tagsets should be set according to the numa
node of the controller.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

nvme-pci: initialize tagset numa value to the value of the ctrl · d4ec47f1

Committed by Max Gurtovoy on Jun 16, 2020

Both the admin and the drive tagsets should be set according to the numa
node of the controller.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

nvme-pci: override the value of the controller's numa node · 635333e4

Committed by Max Gurtovoy on Jun 16, 2020

Set the node value according to the PCI device numa node.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

nvme: set initial value for controller's numa node · 4fea243e

Committed by Max Gurtovoy on Jun 16, 2020

Initialize the node to the NUMA_NO_NODE value. Transports that are aware of
numa node affinity can override it (e.g. the RDMA transport sets the affinity
according to the RDMA HCA).

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
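The pattern in this NUMA series (a core default, an optional transport override, and tagsets that copy the controller's value) can be sketched in userspace as follows. The struct and function names are hypothetical stand-ins; only NUMA_NO_NODE and its value of -1 come from the kernel.

```c
#include <assert.h>

#define NUMA_NO_NODE (-1)	/* the kernel's value for "no node affinity" */

/* Hypothetical, stripped-down stand-ins for the controller and a tagset. */
struct ctrl_sketch {
	int numa_node;
};

struct tagset_sketch {
	int numa_node;
};

/* Core init: start with no affinity. */
void ctrl_init_sketch(struct ctrl_sketch *c)
{
	assert(c);
	c->numa_node = NUMA_NO_NODE;
}

/* A transport that knows its device's node may override the default. */
void transport_override_sketch(struct ctrl_sketch *c, int dev_node)
{
	assert(c);
	c->numa_node = dev_node;
}

/* Admin and I/O tagsets take whatever the controller ended up with. */
void tagset_init_sketch(struct tagset_sketch *t, const struct ctrl_sketch *c)
{
	assert(t && c);
	t->numa_node = c->numa_node;
}
```

A transport that never calls the override leaves its tagsets at NUMA_NO_NODE, which is the behaviour the default is meant to guarantee.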

18 Jun, 2020 (1 commit)

loop: replace kill_bdev with invalidate_bdev · f4bd34b1

Committed by Zheng Bin on Jun 18, 2020

When a filesystem is mounted on a loop device and the LOOP_SET_STATUS64
ioctl is issued on that device, kill_bdev destroys the buffer_head
mappings:

kill_bdev
  truncate_inode_pages
    truncate_inode_pages_range
      do_invalidatepage
        block_invalidatepage
          discard_buffer  -->clear BH_Mapped flag

sb_bread
  __bread_gfp
  bh = __getblk_gfp
  -->discard_buffer clear BH_Mapped flag
  __bread_slow
    submit_bh
      submit_bh_wbc
        BUG_ON(!buffer_mapped(bh))  --> hit this BUG_ON

Fixes: 5db470e2 ("loop: drop caches if offset or block_size are changed")
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

15 Jun, 2020 (4 commits)

bcache: pr_info() format clean up in bcache_device_init() · 4b25bbf5

Committed by Coly Li on Jun 15, 2020

scripts/checkpatch.pl reports the following warning for the patch
("bcache: check and adjust logical block size for backing devices"):
    WARNING: quoted string split across lines
    #146: FILE: drivers/md/bcache/super.c:896:
    +  pr_info("%s: sb/logical block size (%u) greater than page size "
    +	       "(%lu) falling back to device logical block size (%u)",

There are two things to fix up:
- The kernel message string should be on a single line.
- pr_info() won't automatically add a new line since v5.8, so a '\n' should
  be added.

This patch just does the above cleanup in bcache_device_init().

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bcache: use delayed kworker fo asynchronous devices registration · ee4a36f4

Committed by Coly Li on Jun 15, 2020

This patch changes the asynchronous registration kworker to a delayed
kworker. There is a (very small) probability that queue_work() queues the
async registration kworker on the same CPU, in which case the process
writing the sysfs interface to register the bcache device may not return
immediately. queue_delayed_work() in this patch delays 10 jiffies
before inserting the kworker into the run queue, which makes sure the
registering process always returns to user space in time.

Fixes: 9e23ccf8 ("bcache: asynchronous devices registration")
Signed-off-by: Coly Li <colyli@suse.de>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bcache: check and adjust logical block size for backing devices · dcacbc12

Committed by Mauricio Faria de Oliveira on Jun 15, 2020

It's possible for a block driver to set logical block size to
a value greater than page size incorrectly; e.g. bcache takes
the value from the superblock, set by the user w/ make-bcache.

This causes a BUG/NULL pointer dereference in the path:

  __blkdev_get()
  -> set_init_blocksize() // set i_blkbits based on ...
     -> bdev_logical_block_size()
        -> queue_logical_block_size() // ... this value
  -> bdev_disk_changed()
     ...
     -> blkdev_readpage()
        -> block_read_full_page()
           -> create_page_buffers() // size = 1 << i_blkbits
              -> create_empty_buffers() // give size/take pointer
                 -> alloc_page_buffers() // return NULL
                 .. BUG!

Because alloc_page_buffers() is called with size > PAGE_SIZE,
it initializes head = NULL, skips the loop, and returns head;
create_empty_buffers() then gets (and uses) the NULL pointer.

This has been around longer than commit ad6bf88a ("block:
fix an integer overflow in logical block size"); however, it
increased the range of values that can trigger the issue.

Previously only 8k/16k/32k (on x86/4k page size) would do it,
as greater values overflow unsigned short to zero, and queue_
logical_block_size() would then use the default of 512.

Now the range with unsigned int is much larger, and users w/
the 512k value, which happened to be zero'ed previously and
worked fine, started to hit this issue -- as the zero is gone,
and queue_logical_block_size() does return 512k (>PAGE_SIZE).

Fix this by checking the bcache device's logical block size,
and if it's greater than page size, fall back to the backing/
cached device's logical block size.

This doesn't affect cache devices as those are still checked
for block/page size in read_super(); only the backing/cached
devices are not.

Apparently it's a regression from commit 2903381f ("bcache:
Take data offset from the bdev superblock."), moving the check
into BCACHE_SB_VERSION_CDEV only. Now that we have superblocks
of backing devices out there with this larger value, we cannot
refuse to load them (i.e., have a similar check in _BDEV.)

Ideally perhaps bcache should use all values from the backing
device (physical/logical/io_min block size)? But for now just
fix the problematic case.

Test-case:

    # IMG=/root/disk.img
    # dd if=/dev/zero of=$IMG bs=1 count=0 seek=1G
    # DEV=$(losetup --find --show $IMG)
    # make-bcache --bdev $DEV --block 8k
      < see dmesg >

Before:

    # uname -r
    5.7.0-rc7

    [   55.944046] BUG: kernel NULL pointer dereference, address: 0000000000000000
    ...
    [   55.949742] CPU: 3 PID: 610 Comm: bcache-register Not tainted 5.7.0-rc7 #4
    ...
    [   55.952281] RIP: 0010:create_empty_buffers+0x1a/0x100
    ...
    [   55.966434] Call Trace:
    [   55.967021]  create_page_buffers+0x48/0x50
    [   55.967834]  block_read_full_page+0x49/0x380
    [   55.972181]  do_read_cache_page+0x494/0x610
    [   55.974780]  read_part_sector+0x2d/0xaa
    [   55.975558]  read_lba+0x10e/0x1e0
    [   55.977904]  efi_partition+0x120/0x5a6
    [   55.980227]  blk_add_partitions+0x161/0x390
    [   55.982177]  bdev_disk_changed+0x61/0xd0
    [   55.982961]  __blkdev_get+0x350/0x490
    [   55.983715]  __device_add_disk+0x318/0x480
    [   55.984539]  bch_cached_dev_run+0xc5/0x270
    [   55.986010]  register_bcache.cold+0x122/0x179
    [   55.987628]  kernfs_fop_write+0xbc/0x1a0
    [   55.988416]  vfs_write+0xb1/0x1a0
    [   55.989134]  ksys_write+0x5a/0xd0
    [   55.989825]  do_syscall_64+0x43/0x140
    [   55.990563]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [   55.991519] RIP: 0033:0x7f7d60ba3154
    ...

After:

    # uname -r
    5.7.0.bcachelbspgsz

    [   31.672460] bcache: bcache_device_init() bcache0: sb/logical block size (8192) greater than page size (4096) falling back to device logical block size (512)
    [   31.675133] bcache: register_bdev() registered backing device loop0

    # grep ^ /sys/block/bcache0/queue/*_block_size
    /sys/block/bcache0/queue/logical_block_size:512
    /sys/block/bcache0/queue/physical_block_size:8192

Reported-by: Ryan Finnie <ryan@finnie.org>
Reported-by: Sebastian Marsching <sebastian@marsching.com>
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
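The arithmetic behind this commit can be checked with a small userspace sketch. It models the old unsigned short storage with a 16-bit truncation, the default of 512 for a zero value, and the fallback that the fix introduces. All function names here are hypothetical; this is an illustration of the reasoning, not the bcache code.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the old storage: the size was kept in a 16-bit unsigned field. */
unsigned int stored_as_u16(unsigned int size)
{
	return (uint16_t)size;	/* 64k and larger powers of two become 0 */
}

/* Model of the "zero means unset" default of 512 described above. */
unsigned int effective_lbs(unsigned int stored)
{
	return stored ? stored : 512;
}

/*
 * Sketch of the fix: if the superblock's block size exceeds the page
 * size, use the backing device's logical block size instead.
 */
unsigned int pick_logical_block_size(unsigned int sb_block_size,
				     unsigned int page_size,
				     unsigned int backing_lbs)
{
	assert(page_size > 0 && backing_lbs > 0);
	if (sb_block_size > page_size)
		return backing_lbs;
	return sb_block_size;
}
```

With a 4k page, 8k survives the 16-bit truncation and used to trigger the bug, while 512k truncated to zero and silently became 512. After the widening to unsigned int both values stay intact, which is why the fallback check is needed.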

bcache: fix potential deadlock problem in btree_gc_coalesce · be23e837

Committed by Zhiqiang Liu on Jun 15, 2020

coccicheck reports:
  drivers/md//bcache/btree.c:1538:1-7: preceding lock on line 1417

In btree_gc_coalesce(), if the coalescing process fails, we jump to the
out_nocoalesce label directly without releasing new_nodes[i]->write_lock.
This causes a deadlock when we then try to acquire new_nodes[i]->
write_lock in order to free new_nodes[i] before returning.

The flow of btree_gc_coalesce() is as follows:
	if alloc new_nodes[i] fails:
		goto out_nocoalesce;
	// obtain new_nodes[i]->write_lock
	mutex_lock(&new_nodes[i]->write_lock)
	// main coalescing process
	for (i = nodes - 1; i > 0; --i)
		[snipped]
		if coalescing process fails:
			// Here, directly goto out_nocoalesce
			 // tag will cause a deadlock
			goto out_nocoalesce;
		[snipped]
	// release new_nodes[i]->write_lock
	mutex_unlock(&new_nodes[i]->write_lock)
	// coalescing succeeded, return
	return;
out_nocoalesce:
	btree_node_free(new_nodes[i])	// free new_nodes[i]
	// obtain new_nodes[i]->write_lock
	mutex_lock(&new_nodes[i]->write_lock);
	// set flag for reuse
	clear_bit(BTREE_NODE_dirty, &new_nodes[i]->flags);
	// release new_nodes[i]->write_lock
	mutex_unlock(&new_nodes[i]->write_lock);

To fix the problem, we add a new label 'out_unlock_nocoalesce' that
releases new_nodes[i]->write_lock before the out_nocoalesce label. If the
coalescing process fails, we go to out_unlock_nocoalesce to release
new_nodes[i]->write_lock before new_nodes[i] is freed under the
out_nocoalesce label.

(Coly Li helps to clean up commit log format.)

Fixes: 2a285686 ("bcache: btree locking rework")
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
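The two-label cleanup described in this commit can be sketched in userspace with a toy lock that asserts on a double acquire instead of hanging. Everything here (the toy lock, node_sketch, the fail_point enum) is hypothetical and only mirrors the control flow; it is not the bcache implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: a double acquire trips an assertion. */
struct toy_lock {
	bool held;
};

void lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

void lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct node_sketch {
	struct toy_lock write_lock;
	bool dirty;
};

/* Stand-in for the free step, which takes write_lock itself. */
void node_free_sketch(struct node_sketch *n)
{
	lock_acquire(&n->write_lock);
	n->dirty = false;
	lock_release(&n->write_lock);
}

enum fail_point { FAIL_NONE, FAIL_BEFORE_LOCK, FAIL_WHILE_LOCKED };

/* Returns 0 on success and -1 when coalescing is abandoned. */
int coalesce_sketch(struct node_sketch *n, enum fail_point fail)
{
	if (fail == FAIL_BEFORE_LOCK)
		goto out_nocoalesce;		/* lock not held yet */

	lock_acquire(&n->write_lock);
	if (fail == FAIL_WHILE_LOCKED)
		goto out_unlock_nocoalesce;	/* must drop the lock first */
	/* ... main coalescing work would go here ... */
	lock_release(&n->write_lock);
	return 0;

out_unlock_nocoalesce:
	lock_release(&n->write_lock);
out_nocoalesce:
	node_free_sketch(n);
	return -1;
}
```

Jumping straight to out_nocoalesce from the locked section would make node_free_sketch() acquire a lock that is already held, which is the deadlock the commit removes. The extra label falls through, so both failure paths still share the same free step.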

14 Jun, 2020 (3 commits)

net: ethernet: ti: ale: fix allmulti for nu type ale · bc139119

Committed by Grygorii Strashko on Jun 13, 2020

On AM65xx MCU CPSW2G NUSS and 66AK2E/L NUSS, the allmulti setting does not
allow unregistered mcast packets to pass.

This happens because ALE VLAN entries on these SoCs do not contain port
masks for reg/unreg mcast packets, but instead store indexes of
ALE_VLAN_MASK_MUXx_REG registers, which are intended to store the port
masks for reg/unreg mcast packets.
This path was missed by commit 9d1f6447 ("net: ethernet: ti: ale: fix
seeing unreg mcast packets with promisc and allmulti disabled").

Hence, fix it by taking into account ALE type in cpsw_ale_set_allmulti().

Fixes: 9d1f6447 ("net: ethernet: ti: ale: fix seeing unreg mcast packets with promisc and allmulti disabled")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: ti: am65-cpsw-nuss: fix ale parameters init · 2074f9ea

Committed by Grygorii Strashko on Jun 13, 2020

The ALE parameters structure is created on stack, so it has to be reset
before passing to cpsw_ale_create() to avoid garbage values.

Fixes: 93a76530 ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
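Why an on-stack parameter block needs clearing can be shown with a small sketch. The structure, its fields, and the default of 1024 entries are all hypothetical; the point is only that fields the caller never sets must hold zero, not leftover stack contents, before the consumer reads them.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical parameter block; the real structure has different fields. */
struct ale_params_sketch {
	int ale_ports;
	int ale_entries;
	unsigned int major_ver_mask;
};

/* Hypothetical consumer: treats a zero field as "use the default". */
int effective_entries(const struct ale_params_sketch *p)
{
	assert(p);
	return p->ale_entries ? p->ale_entries : 1024;
}

int create_with_defaults(int ports)
{
	struct ale_params_sketch params;

	memset(&params, 0, sizeof(params));	/* clear stack garbage first */
	params.ale_ports = ports;		/* then set only what we know */
	return effective_entries(&params);
}
```

Without the memset(), ale_entries would be indeterminate and the consumer could pick up an arbitrary value instead of its default. Declaring the variable with a zero initializer is an equivalent way to get the same guarantee.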

treewide: replace '---help---' in Kconfig files with 'help' · a7f7f624

Committed by Masahiro Yamada on Jun 14, 2020

Since commit 84af7a61 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.

This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.

There are a variety of indentation styles found.

  a) 4 spaces + '---help---'
  b) 7 spaces + '---help---'
  c) 8 spaces + '---help---'
  d) 1 space + 1 tab + '---help---'
  e) 1 tab + '---help---'    (correct indentation)
  f) 1 tab + 1 space + '---help---'
  g) 1 tab + 2 spaces + '---help---'

In order to convert all of them to 1 tab + 'help', I ran the
following command:

  $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

13 Jun, 2020 (1 commit)

ibmvnic: Flush existing work items before device removal · 6954a9e4

Committed by Thomas Falcon on Jun 12, 2020

Ensure that all scheduled work items have completed before continuing
with device removal and after further event scheduling has been
halted. This patch fixes a bug where a scheduled driver reset event
is processed following device removal.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

12 Jun, 2020 (25 commits)

net: ipa: header pad field only valid for AP->modem endpoint · f330fda3

Committed by Alex Elder on Jun 11, 2020

Only QMAP endpoints should be configured to find a pad size field
within packet headers.  They are found in the first byte of the QMAP
header (and the hardware fills only the 6 bits in that byte that
constitute the pad_len field).

The RMNet driver assumes the pad_len field is valid for received
packets, so we want to ensure the pad_len field is filled in that
case.  That driver also assumes the length in the QMAP header
includes the pad bytes.

The RMNet driver does *not* pad the packets it sends, so the pad_len
field can be ignored.

Fix ipa_endpoint_init_hdr_ext() so it only marks the pad field
offset valid for QMAP RX endpoints, and in that case indicates
that the length field in the header includes the pad bytes.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
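How a receiver might use the pad_len field can be sketched with a small header parser. The commit text only states that pad_len is six bits in the first byte and that the header length includes the pad bytes. The rest of the layout used here is an assumption for illustration: pad_len in the low six bits, mux_id in the second byte, and a 16-bit big-endian length in bytes 2 and 3. The names qmap_info and qmap_parse are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields recovered from a 4-byte QMAP-style header (layout assumed). */
struct qmap_info {
	uint8_t pad_len;	/* pad bytes appended to the payload */
	uint8_t mux_id;
	uint16_t pkt_len;	/* length from the header, pad included */
	uint16_t payload_len;	/* pkt_len minus pad_len */
};

/* Returns 0 on success, -1 if the header is short or inconsistent. */
int qmap_parse(const uint8_t *hdr, size_t len, struct qmap_info *out)
{
	assert(hdr && out);
	if (len < 4)
		return -1;

	out->pad_len = hdr[0] & 0x3f;	/* assumed: low six bits */
	out->mux_id = hdr[1];		/* assumed: second byte */
	out->pkt_len = (uint16_t)((hdr[2] << 8) | hdr[3]);	/* assumed big endian */

	if (out->pad_len > out->pkt_len)
		return -1;
	out->payload_len = (uint16_t)(out->pkt_len - out->pad_len);
	return 0;
}
```

Because the length includes the pad bytes, the receiver has to subtract pad_len to find the real payload size, which is why the field must be valid on the receive path.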

net: ipa: program upper nibbles of sequencer type · 636edeaa

Committed by Alex Elder on Jun 11, 2020

The upper two nibbles of the sequencer type were not used for
SDM845, and were assumed to be 0.  But for SC7180 they are used, and
so they must be programmed by ipa_endpoint_init_seq().  Fix this bug.

IPA_SEQ_PKT_PROCESS_NO_DEC_NO_UCP_DMAP doesn't have a descriptive
comment, so add one.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: fix modem LAN RX endpoint id · 9b8ad8da

Committed by Alex Elder on Jun 11, 2020

The endpoint id assigned to the modem LAN RX endpoint for the SC7180 SoC
is incorrect. The erroneous value might have been copied from SDM845 and
never updated. The correct endpoint id to use for this SoC is 11.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: program metadata mask differently · 8730f45d

Committed by Alex Elder on Jun 11, 2020

The way the mask value is programmed for QMAP RX endpoints was based
on some wrong assumptions about the way metadata containing the QMAP
mux_id value is formatted.  The metadata value supplied by the
modem is *not* in QMAP format, and in fact contains the mux_id we
want in its (big endian) low-order byte.  That byte must be written
by the IPA into offset 1 of the QMAP header it inserts before the
received packet.

QMAP TX endpoints *do* use a QMAP header as the metadata sent with
each packet.  The modem assumes this, and based on that assumes the
mux_id is in the second byte.  To match those assumptions we must
program the modem TX (QMAP) endpoint HDR register to indicate the
metadata will be found at offset 0 in the message header.

The previous configuration managed to work, but it was not working
correctly.  This patch fixes a bug whose symptom was receipt of
messages containing the wrong QMAP mux_id.

In fixing this, get rid of ipa_rmnet_mux_id_metadata_mask(), which
was more or less defined so there was a separate place to explain
what was happening as we generated the mask value.  Instead, put a
longer description of how this works above ipa_endpoint_init_hdr(),
and define the metadata mask to use as a simple constant.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ionic: add pcie_print_link_status · c25cba36

Committed by Shannon Nelson on Jun 11, 2020

Print the PCIe link information for our device.

Fixes: 77f972a7 ("ionic: remove support for mgmt device")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

amdgpu: a NULL ->mm does not mean a thread is a kthread · 8449d150

Committed by Christoph Hellwig on Jun 11, 2020

Use the proper API instead.

Fixes: 70539bd7 ("drm/amd: Update MEC HQD loading code for KFD")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: http://lkml.kernel.org/r/20200404094101.672954-1-hch@lst.de
Link: http://lkml.kernel.org/r/20200404094101.672954-2-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

net/mlx5: E-Switch, Fix some error pointer dereferences · 09a92975

Committed by Dan Carpenter on Jun 03, 2020

We can't leave "counter" set to an error pointer. Otherwise either it
will lead to an error pointer dereference later in the function or it
leads to an error pointer dereference when we call mlx5_fc_destroy().

Fixes: 07bab950 ("net/mlx5: E-Switch, Refactor eswitch ingress acl codes")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Don't fail driver on failure to create debugfs · 17e73d47

Committed by Leon Romanovsky on Jun 02, 2020

Clang warns:

drivers/net/ethernet/mellanox/mlx5/core/main.c:1278:6: warning: variable
'err' is used uninitialized whenever 'if' condition is true
[-Wsometimes-uninitialized]
        if (!priv->dbg_root) {
            ^~~~~~~~~~~~~~~
drivers/net/ethernet/mellanox/mlx5/core/main.c:1303:9: note:
uninitialized use occurs here
        return err;
               ^~~
drivers/net/ethernet/mellanox/mlx5/core/main.c:1278:2: note: remove the
'if' if its condition is always false
        if (!priv->dbg_root) {
        ^~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/mellanox/mlx5/core/main.c:1259:9: note: initialize
the variable 'err' to silence this warning
        int err;
               ^
                = 0
1 warning generated.

The check of the value returned by debugfs_create_dir() is wrong because,
by design, debugfs failures should never fail the driver. The check
itself was wrong too: a kernel compiled without CONFIG_DEBUG_FS
returns ERR_PTR(-ENODEV), not NULL as the check expected.

Fixes: 11f3b84d ("net/mlx5: Split mdev init and pci init")
Link: https://github.com/ClangBuiltLinux/linux/issues/1042
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
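The difference between a NULL check and an error-pointer check can be illustrated in userspace. The helpers below are hypothetical re-creations of the kernel's ERR_PTR()/IS_ERR() idea (an errno value encoded in the top 4095 addresses); they are a sketch of the convention, not the kernel macros themselves.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095	/* errno values live in the top 4095 addresses */

/* Encode a negative errno value as a pointer. */
void *err_ptr_sketch(long err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer falls in the range reserved for encoded errors. */
bool is_err_sketch(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* Stand-in for a create call on a build where the feature is disabled. */
void *create_dir_disabled_sketch(void)
{
	return err_ptr_sketch(-ENODEV);
}

/* The old-style check: only catches NULL. */
bool null_check_says_failed(const void *p)
{
	return p == NULL;
}
```

An encoded -ENODEV is a non-NULL pointer, so a `!ptr` test treats it as success, while the range check recognises it as an error. That mismatch is the reason the commit stops relying on a NULL check.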

net/mlx5e: CT: Fix ipv6 nat header rewrite actions · 0d156f2d

Committed by Oz Shlomo on Jun 07, 2020

Set the ipv6 word fields according to the hardware definitions.

Fixes: ac991b48 ("net/mlx5e: CT: Offload established flows")
Signed-off-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Fix devlink objects and devlink device unregister sequence · 98f91c45

Committed by Parav Pandit on May 15, 2020

Currently the following problems exist.

1. The devlink device is registered by mlx5_load_one(), but it is
not unregistered by mlx5_unload_one(). This is incorrect.

2. As a result of the above, when the mlx5 PCI device is removed, the
devlink device is currently unregistered before the devlink ports are
unregistered, as shown in the ladder diagram below.

remove_one()
  mlx5_devlink_unregister()
    [..]
    devlink_unregister() <- ports are still registered!
  mlx5_unload_one()
    mlx5_unregister_device()
      mlx5_remove_device()
        mlx5e_remove()
          mlx5e_devlink_port_unregister()
            devlink_port_unregister()

3. The conditions checked when registering and unregistering the device
are not symmetric in these routines either.

Hence, fix the sequence by making the load and unload routines symmetric
and in the right order, i.e.
(a) register the devlink device, followed by registering the devlink ports
(b) unregister the devlink ports, followed by the devlink device

Do this based on boot and cleanup flags instead of different
conditions.

Fixes: c6acd629 ("net/mlx5e: Add support for devlink-port in non-representors mode")
Fixes: f60f315d ("net/mlx5e: Register devlink ports for physical link, PCI PF, VFs")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Disable reload while removing the device · 60904cd3

Committed by Parav Pandit on May 14, 2020

While unregistration is in progress, the user might be reloading the
interface.
This can race with unregistration in the flow below, which uses
resources that are being disabled by the reload flow.

Hence, disable devlink reloading first when removing the device.

     CPU0                                   CPU1
     ----                                   ----
local_pci_remove()                  devlink_mutex
  remove_one()                       devlink_nl_cmd_reload()
    mlx5_unregister_device()           devlink_reload()
                                       ops->reload_down()
                                         mlx5_unload_one()

Fixes: 4383cfcc ("net/mlx5: Add devlink reload")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Fix ethtool hfunc configuration change · 5f1572e6

Committed by Aya Levin on May 17, 2020

Changing the RX hash function requires rearranging the RQT internal indexes.
The user isn't exposed to such changes, and they do not affect
the user-configured indirection table. Rebuild the RQ table on hfunc change.

Fixes: bdfc028d ("net/mlx5e: Fix ethtool RX hash func configuration change")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Fix repeated XSK usage on one channel · 36d45fb9

Committed by Maxim Mikityanskiy on Jun 01, 2020

After an XSK is closed, the relevant structures in the channel are not
zeroed. If an XSK is opened the second time on the same channel without
recreating channels, the stray values in the structures will lead to
incorrect operation of queues, which causes CQE errors, and the new
socket doesn't work at all.

This patch fixes the issue by explicitly zeroing XSK-related structs in
the channel on XSK close. Note that those structs are zeroed on channel
creation, and usually a configuration change (XDP program is set)
happens on XSK open, which leads to recreating channels, so typical XSK
usecases don't suffer from this issue. However, if XSKs are opened and
closed on the same channel without removing the XDP program, this bug
reproduces.

Fixes: db05815b ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: DR, Fix freeing in dr_create_rc_qp() · 47a357de

Committed by Denis Efremov on Jun 01, 2020

Variable "in" in dr_create_rc_qp() is allocated with kvzalloc() and
should be freed with kvfree().

Fixes: 297ccceb ("net/mlx5: DR, Expose an internal API to issue RDMA operations")
Cc: stable@vger.kernel.org
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Fix fatal error handling during device load · b6e0b6be

Committed by Shay Drory on May 07, 2020

Currently, in case of a fatal error during mlx5_load_one(), we cannot
enter the error state until mlx5_load_one() has finished, which can take
several minutes while commands time out, because these commands
can't be processed due to the fatal error.
Fix it by setting dev->state to MLX5_DEVICE_STATE_INTERNAL_ERROR before
requesting the lock.

Fixes: c1d4d2e9 ("net/mlx5: Avoid calling sleeping function by the health poll thread")
Signed-off-by: Shay Drory <shayd@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: drain health workqueue in case of driver load error · 42ea9f1b

Committed by Shay Drory on May 06, 2020

If there is work in the health WQ when we tear down the driver
in the driver load error flow, the health work will try to read dev->iseg,
which was already unmapped in mlx5_pci_close().
Fix it by draining the health workqueue first thing in mlx5_pci_close().

Trace of the error:
BUG: unable to handle page fault for address: ffffb5b141c18014
PF: supervisor read access in kernel mode
PF: error_code(0x0000) - not-present page
PGD 1fe95d067 P4D 1fe95d067 PUD 1fe95e067 PMD 1b7823067 PTE 0
Oops: 0000 [#1] SMP PTI
CPU: 3 PID: 6755 Comm: kworker/u128:2 Not tainted 5.2.0-net-next-mlx5-hv_stats-over-last-worked-hyperv #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  04/28/2016
Workqueue: mlx5_healtha050:00:02.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
RIP: 0010:ioread32be+0x30/0x40
Code: 00 77 27 48 81 ff 00 00 01 00 76 07 0f b7 d7 ed 0f c8 c3 55 48 c7 c6 3b ee d5 9f 48 89 e5 e8 67 fc ff ff b8 ff ff ff ff 5d c3 <8b> 07 0f c8 c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 81 fe ff ff 03
RSP: 0018:ffffb5b14c56fd78 EFLAGS: 00010292
RAX: ffffb5b141c18000 RBX: ffff8e9f78a801c0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff8e9f7ecd7628 RDI: ffffb5b141c18014
RBP: ffffb5b14c56fd90 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8e9f372a2c30 R11: ffff8e9f87f4bc40 R12: ffff8e9f372a1fc0
R13: ffff8e9f78a80000 R14: ffffffffc07136a0 R15: ffff8e9f78ae6f20
FS:  0000000000000000(0000) GS:ffff8e9f7ecc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffb5b141c18014 CR3: 00000001c8f82006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ? mlx5_health_try_recover+0x4d/0x270 [mlx5_core]
 mlx5_fw_fatal_reporter_recover+0x16/0x20 [mlx5_core]
 devlink_health_reporter_recover+0x1c/0x50
 devlink_health_report+0xfb/0x240
 mlx5_fw_fatal_reporter_err_work+0x65/0xd0 [mlx5_core]
 process_one_work+0x1fb/0x4e0
 ? process_one_work+0x16b/0x4e0
 worker_thread+0x4f/0x3d0
 kthread+0x10d/0x140
 ? process_one_work+0x4e0/0x4e0
 ? kthread_cancel_delayed_work_sync+0x20/0x20
 ret_from_fork+0x1f/0x30
Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache 8021q garp mrp stp llc ipmi_devintf ipmi_msghandler rpcrdma rdma_ucm ib_iser rdma_cm ib_umad iw_cm ib_ipoib libiscsi scsi_transport_iscsi ib_cm mlx5_ib ib_uverbs ib_core mlx5_core sb_edac crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 mlxfw crypto_simd cryptd glue_helper input_leds hyperv_fb intel_rapl_perf joydev serio_raw pci_hyperv pci_hyperv_mini mac_hid hv_balloon nfsd auth_rpcgss nfs_acl lockd grace sunrpc sch_fq_codel ip_tables x_tables autofs4 hv_utils hid_generic hv_storvsc ptp hid_hyperv hid hv_netvsc hyperv_keyboard pps_core scsi_transport_fc psmouse hv_vmbus i2c_piix4 floppy pata_acpi
CR2: ffffb5b141c18014
---[ end trace b12c5503157cad24 ]---
RIP: 0010:ioread32be+0x30/0x40
Code: 00 77 27 48 81 ff 00 00 01 00 76 07 0f b7 d7 ed 0f c8 c3 55 48 c7 c6 3b ee d5 9f 48 89 e5 e8 67 fc ff ff b8 ff ff ff ff 5d c3 <8b> 07 0f c8 c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 81 fe ff ff 03
RSP: 0018:ffffb5b14c56fd78 EFLAGS: 00010292
RAX: ffffb5b141c18000 RBX: ffff8e9f78a801c0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff8e9f7ecd7628 RDI: ffffb5b141c18014
RBP: ffffb5b14c56fd90 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8e9f372a2c30 R11: ffff8e9f87f4bc40 R12: ffff8e9f372a1fc0
R13: ffff8e9f78a80000 R14: ffffffffc07136a0 R15: ffff8e9f78ae6f20
FS:  0000000000000000(0000) GS:ffff8e9f7ecc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffb5b141c18014 CR3: 00000001c8f82006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:38
in_atomic(): 0, irqs_disabled(): 1, pid: 6755, name: kworker/u128:2
INFO: lockdep is turned off.
CPU: 3 PID: 6755 Comm: kworker/u128:2 Tainted: G      D           5.2.0-net-next-mlx5-hv_stats-over-last-worked-hyperv #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  04/28/2016
Workqueue: mlx5_healtha050:00:02.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
Call Trace:
 dump_stack+0x63/0x88
 ___might_sleep+0x10a/0x130
 __might_sleep+0x4a/0x80
 exit_signals+0x33/0x230
 ? blocking_notifier_call_chain+0x16/0x20
 do_exit+0xb1/0xc30
 ? kthread+0x10d/0x140
 ? process_one_work+0x4e0/0x4e0

Fixes: 52c368dc ("net/mlx5: Move health and page alloc init to mdev_init")
Signed-off-by: Shay Drory <shayd@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

iavf: increase reset complete wait time · 8e3e4b9d

Committed by Paul Greenwalt on Jun 05, 2020

With an increased number of VFs, it's possible to encounter the following
issue during reset.

    iavf b8d4:00:02.0: Hardware reset detected
    iavf b8d4:00:02.0: Reset never finished (0)
    iavf b8d4:00:02.0: Reset task did not complete, VF disabled

Increase the reset complete wait count to allow for 128 VFs to complete
reset.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

iavf: Fix reporting 2.5 Gb and 5Gb speeds · 18c012d9

Committed by Brett Creeley on Jun 05, 2020

Commit 4ae4916b ("i40e: fix 'Unknown bps' in dmesg for 2.5Gb/5Gb
speeds") added the ability for the PF to report 2.5 and 5Gb speeds,
however, the iavf driver does not recognize those speeds as the values were
not added there. Add the proper enums and values so that iavf can properly
deal with those speeds.

Fixes: 4ae4916b ("i40e: fix 'Unknown bps' in dmesg for 2.5Gb/5Gb speeds")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Witold Fijalkowski <witoldx.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

iavf: use appropriate enum for comparison · 5071bda2

Committed by Aleksandr Loktionov on Jun 05, 2020

adapter->link_speed has type enum virtchnl_link_speed but our comparisons
are against enum iavf_aq_link_speed. Though they are, currently, the same
values, change the comparison to the matching enum virtchnl_link_speed
since that may not always be the case.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

5071bda2

iavf: fix speed reporting over virtchnl · e0ef26fb

Committed by Brett Creeley on Jun 05, 2020

Link speeds are communicated over virtchnl using an enum
virtchnl_link_speed. Currently, the highest link speed is 40Gbps which
leaves us unable to reflect some speeds that an ice VF is capable of.
This causes link speed to be misreported on the iavf driver.

Allow for communicating link speeds using Mbps so that the proper speed can
be reported for an ice VF. Moving away from the enum allows us to
communicate future speed changes without requiring a new enum to be added.

In order to support communicating link speeds over virtchnl in Mbps the
following functionality was added:
    - Added u32 link_speed_mbps in the iavf_adapter structure.
    - Added the macro ADV_LINK_SUPPORT(_a) to determine if the VF
      driver supports communicating link speeds in Mbps.
    - Added the function iavf_get_vpe_link_status() to fill the
      correct link_status in the event_data union based on the
      ADV_LINK_SUPPORT(_a) macro.
    - Added the function iavf_set_adapter_link_speed_from_vpe()
      to determine whether or not to fill the u32 link_speed_mbps or
      enum virtchnl_link_speed link_speed field in the iavf_adapter
      structure based on the ADV_LINK_SUPPORT(_a) macro.
    - Do not free vf_res in iavf_init_get_resources() as vf_res will be
      accessed in iavf_get_link_ksettings(); memset to 0 instead. This
      memory is subsequently freed in iavf_remove().

Fixes: 7c710869 ("ice: Add handlers for VF netdevice operations")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Sergey Nemov <sergey.nemov@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e0ef26fb
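
The capability-based selection described in the iavf commit above can be sketched in user-space C. This is only an illustration under stated assumptions: every `demo_`/`DEMO_` name, the enum values, and the capability bit are hypothetical stand-ins, and the event is modeled as a plain struct rather than the driver's union.

```c
#include <stdint.h>

/* Hypothetical stand-in for the legacy speed enum; values are illustrative. */
enum demo_link_speed {
	DEMO_SPEED_UNKNOWN = 0,
	DEMO_SPEED_10GB,
	DEMO_SPEED_40GB,
};

/* Hypothetical capability bit: "peer can report link speed in Mbps". */
#define DEMO_CAP_ADV_LINK_SPEED (1u << 0)

struct demo_adapter {
	uint32_t cap_flags;
	uint32_t link_speed_mbps;        /* filled when the capability is set */
	enum demo_link_speed link_speed; /* legacy field, filled otherwise */
};

/* Modeled on the ADV_LINK_SUPPORT(_a) idea named in the commit message. */
#define DEMO_ADV_LINK_SUPPORT(a) \
	(((a)->cap_flags & DEMO_CAP_ADV_LINK_SPEED) != 0)

/* Simplified link event carrying both representations. */
struct demo_link_event {
	uint32_t speed_mbps;
	enum demo_link_speed speed_enum;
};

/* Store the speed in whichever field matches the negotiated capability. */
static void demo_set_link_speed(struct demo_adapter *a,
				const struct demo_link_event *ev)
{
	if (DEMO_ADV_LINK_SUPPORT(a))
		a->link_speed_mbps = ev->speed_mbps;
	else
		a->link_speed = ev->speed_enum;
}
```

With the capability bit set, a 100000 Mbps event lands in `link_speed_mbps` and the legacy enum field is left untouched, which is how a speed above the enum's range can still be reported.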

ionic: remove support for mgmt device · 77f972a7

Committed by Shannon Nelson on Jun 10, 2020

We no longer support the mgmt device in the ionic driver,
so remove the device id and related code.

Fixes: b3f064e9 ("ionic: add support for device id 0x1004")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

77f972a7

drivers: dpaa2: Use devm_kcalloc() in setup_dpni() · 9334d5ba

Committed by Xu Wang on Jun 11, 2020

The size of the memory allocation is computed with a multiplication,
which indicates that an array is being allocated.
Use the corresponding array allocation function devm_kcalloc() instead.
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

9334d5ba
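
To illustrate why an array-style allocator is preferred over an open-coded multiplication, as in the devm_kcalloc() commit above, here is a user-space analogue. It is a sketch only: `demo_calloc_array` is a hypothetical helper, and it does not model the device-managed lifetime that the devm_ family provides in the kernel.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: allocate a zeroed array of n elements of the given
 * size, refusing requests whose total byte count would overflow size_t.
 * An open-coded malloc(n * size) would silently wrap around instead.
 */
static void *demo_calloc_array(size_t n, size_t size)
{
	void *p;

	if (size != 0 && n > SIZE_MAX / size)
		return NULL; /* n * size does not fit in size_t */

	p = malloc(n * size);
	if (p)
		memset(p, 0, n * size);
	return p;
}
```

A request such as `demo_calloc_array(SIZE_MAX, 2)` is rejected up front, whereas the unchecked product would wrap around to a small value and yield an undersized buffer.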

media: rkvdec: Fix H264 scaling list order · 2630e1bb

Committed by Jonas Karlman on May 22, 2020

The Rockchip Video Decoder driver is expecting that the values in a
scaling list are in zig-zag order and applies the inverse scanning process
to get the values in matrix order.

Commit 0b0393d5 ("media: uapi: h264: clarify expected
scaling_list_4x4/8x8 order") clarified that the values in the scaling list
should already be in matrix order.

Fix this by removing the reordering and using two memcpy() calls instead.

Fixes: cd33c830 ("media: rkvdec: Add the rkvdec driver")
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
Tested-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Reviewed-by: Ezequiel Garcia <ezequiel@collabora.com>
[hverkuil-cisco@xs4all.nl: rkvdec_scaling_matrix -> rkvdec_h264_scaling_list]
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>

2630e1bb
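
The "two memcpy" approach from the rkvdec commit above can be sketched as follows. The struct names and array dimensions (six 4x4 lists and six 8x8 lists) are assumptions made for this illustration and are not copied from the driver; the point is only that data already in matrix order is copied verbatim, with no inverse-scan step.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical source layout: scaling lists already in matrix order. */
struct demo_scaling_matrix {
	uint8_t list_4x4[6][16];
	uint8_t list_8x8[6][64];
};

/* Hypothetical destination layout consumed by the decoder hardware. */
struct demo_hw_scaling_list {
	uint8_t list_4x4[6][16];
	uint8_t list_8x8[6][64];
};

/* Copy both lists as-is; no zig-zag reordering is applied. */
static void demo_assemble_scaling_list(struct demo_hw_scaling_list *dst,
					const struct demo_scaling_matrix *src)
{
	memcpy(dst->list_4x4, src->list_4x4, sizeof(dst->list_4x4));
	memcpy(dst->list_8x8, src->list_8x8, sizeof(dst->list_8x8));
}
```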

media: v4l2-ctrls: Unset correct HEVC loop filter flag · 88441917

Committed by Jonas Karlman on May 27, 2020

The wrong loop filter flag is unset when the tiles enabled flag is not set.
This causes HEVC decoding issues with the Rockchip Video Decoder.

Fix this by unsetting the loop filter across tiles enabled flag instead of
the pps loop filter across slices enabled flag when tiles are disabled.

Fixes: 256fa392 ("media: v4l: Add definitions for HEVC stateless decoding")
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>

88441917
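
The flag handling described in the v4l2-ctrls commit above amounts to clearing the tiles-related bit, not the slices-related bit, when tiles are disabled. A minimal sketch follows; the `DEMO_` macro names and bit positions are hypothetical and do not match the real V4L2 definitions.

```c
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define DEMO_PPS_FLAG_TILES_ENABLED                     (1ULL << 0)
#define DEMO_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED  (1ULL << 1)
#define DEMO_PPS_FLAG_LOOP_FILTER_ACROSS_SLICES_ENABLED (1ULL << 2)

/*
 * When tiles are disabled, "loop filter across tiles" is meaningless and is
 * cleared. The "across slices" flag is independent and is left untouched.
 */
static uint64_t demo_sanitize_pps_flags(uint64_t flags)
{
	if (!(flags & DEMO_PPS_FLAG_TILES_ENABLED))
		flags &= ~DEMO_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED;
	return flags;
}
```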

media: videobuf2-dma-contig: fix bad kfree in vb2_dma_contig_clear_max_seg_size · 0d966872

Committed by Tomi Valkeinen on May 27, 2020

Commit 9495b7e9 ("driver core: platform:
Initialize dma_parms for platform devices") in v5.7-rc5 causes
vb2_dma_contig_clear_max_seg_size() to kfree memory that was not
allocated by vb2_dma_contig_set_max_seg_size().

The assumption in vb2_dma_contig_set_max_seg_size() seems to be that
dev->dma_parms is always NULL when the driver is probed, and the case
where dev->dma_parms has been initialized by someone other than the driver
(by calling vb2_dma_contig_set_max_seg_size) will cause a failure.

All the current users of these functions are platform devices, which now
always have dma_parms set by the driver core. To fix the issue for v5.7,
make vb2_dma_contig_set_max_seg_size() return an error if dma_parms is
NULL to be on the safe side, and remove the kfree code from
vb2_dma_contig_clear_max_seg_size().

For v5.8 we should remove the two functions and move the
dma_set_max_seg_size() calls into the drivers.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Fixes: 9495b7e9 ("driver core: platform: Initialize dma_parms for platform devices")
Cc: stable@vger.kernel.org
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>

0d966872
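
The ownership rule behind the videobuf2-dma-contig fix above can be modeled in a few lines: the helper no longer allocates dma_parms itself, so it fails when the structure is missing and never frees memory it did not allocate. The types, function names, and the choice of -ENODEV as the error code are assumptions for this sketch, not the kernel's actual code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct demo_dma_parms {
	unsigned int max_segment_size;
};

struct demo_device {
	struct demo_dma_parms *dma_parms; /* owned by the core, not the helper */
};

/* Fail instead of allocating when dma_parms was not set up by the owner. */
static int demo_set_max_seg_size(struct demo_device *dev, unsigned int size)
{
	if (!dev->dma_parms)
		return -ENODEV; /* assumed error code */
	dev->dma_parms->max_segment_size = size;
	return 0;
}

/* Nothing was allocated by the setter, so there is nothing to free here. */
static void demo_clear_max_seg_size(struct demo_device *dev)
{
	(void)dev;
}
```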
