Commits · 2a3192a3f3bc4fe1b077c55fffb6d8afe3213d57 · openeuler / Kernel

20 Mar, 2019 · 5 commits

scsi: qla2xxx: Add Serdes support for ISP28XX · 2a3192a3

Authored by Joe Carnuccio on Mar 12, 2019

This patch adds a sysfs node for serdes_version and also cleans up the
port_speed display.

Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Add Device ID for ISP28XX · ecc89f25

Authored by Joe Carnuccio on Mar 12, 2019

This patch adds PCI device ID ISP28XX for Gen7 support.  Also, signature
determination for the primary/secondary flash image for ISP27XX/28XX is
added as part of Gen7 support.

Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Fix routine qla27xx_dump_{mpi|ram}() · 24ef8f7e

Authored by Joe Carnuccio on Mar 12, 2019

This patch fixes the qla27xx_dump_{mpi|ram} API for ISP27XX.

Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Remove FW default template · 2ff6ae85

Authored by Joe Carnuccio on Mar 12, 2019

This patch removes the FW default template, as there will never be a case
where the default template would be invoked.

Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Add fw_attr and port_no SysFS node · df617ffb

Authored by Joe Carnuccio on Mar 12, 2019

This patch adds new sysfs nodes to display firmware attributes and port
number.

Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

19 Mar, 2019 · 16 commits

scsi: mpt3sas: Update mpt3sas driver version to 28.100.00.00 · 4bcb298e

Authored by Suganath Prabu on Feb 15, 2019

Updated driver version to 28.100.00.00, which is equivalent to OOB Phase 9.

Signed-off-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Improve the threshold value and introduce module param · 288addd6

Authored by Suganath Prabu on Feb 15, 2019

* Reduce the threshold value to 1/4 of the queue depth.

* With this, FW can find enough entries to post the Reply Descriptors in
  the reply descriptor post queue.

* With the module param, the user can tune the threshold value; the same
  irqpoll_weight is used as the budget when processing reply descriptor
  post queues in _base_process_reply_queue.

Signed-off-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
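
The threshold selection described in the commit above can be sketched as follows. This is a minimal user-space illustration, not the driver's code: the helper name `pick_threshold` is hypothetical, and the rule that a positive `irqpoll_weight` overrides the default is an assumption drawn from the commit text.

```c
/*
 * Hypothetical helper: choose the threshold (also used as the poll budget).
 * Assumption: a positive module parameter overrides the default, and the
 * default is one quarter of the queue depth, as the commit message states.
 */
static unsigned int pick_threshold(unsigned int queue_depth, int irqpoll_weight)
{
    if (irqpoll_weight > 0)
        return (unsigned int)irqpoll_weight;
    return queue_depth / 4;
}
```

With a queue depth of 1024 and no override, the sketch yields a threshold of 256; passing a positive weight returns that weight unchanged.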

scsi: mpt3sas: Load balance to improve performance and avoid soft lockups · 51e3b2ad

Authored by Suganath Prabu on Feb 15, 2019

The driver uses the "reply descriptor post queues" in round robin fashion
so that IOs are distributed equally across all the available reply
descriptor post queues.  With this, the load on each reply descriptor post
queue is balanced.

This is enabled only if the CPU count to MSI-X vector count ratio is X:1
(where X > 1).  This improves performance and also fixes soft lockups.

Signed-off-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
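
A round robin queue selector of the kind this commit describes might look like the sketch below. The structure and function names are hypothetical, the counter uses C11 atomics rather than kernel primitives, and `nr_cpus > nr_queues` is used as a simplified stand-in for the "X:1 where X > 1" condition.

```c
#include <stdatomic.h>

/* Hypothetical per-adapter state; field names are illustrative only. */
struct adapter {
    atomic_uint rr_counter;   /* shared submission counter */
    unsigned int nr_queues;   /* reply descriptor post queues (MSI-X vectors) */
    unsigned int nr_cpus;     /* online logical CPUs */
};

/* Round robin is only enabled when several CPUs share one vector. */
static int use_round_robin(const struct adapter *a)
{
    return a->nr_cpus > a->nr_queues;
}

/* Return the index of the reply queue to use for the next IO. */
static unsigned int next_reply_queue(struct adapter *a)
{
    return atomic_fetch_add(&a->rr_counter, 1) % a->nr_queues;
}
```

Each call advances the shared counter, so consecutive IOs land on consecutive queues regardless of which CPU submits them.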

scsi: mpt3sas: Irq poll to avoid CPU hard lockups · 320e77ac

Authored by Suganath Prabu on Feb 15, 2019

Issue Description:
We have seen CPU lockup issues in the field on systems with a large number
of logical CPUs (more than 96).  SAS3.0 controllers (Invader series)
support at most 96 MSI-X vectors and SAS3.5 products (Ventura) support at
most 128 MSI-X vectors.

This may be a generic issue (for any PCI device that supports completion
on multiple reply queues).  It is explained here w.r.t. mpt3sas supported
h/w just to simplify the problem and the possible changes to handle such
issues.  The IT HBA (mpt3sas) supports multiple reply queues in the
completion path.  The driver creates MSI-X vectors for the controller as
"min of (FW supported reply queues, logical CPUs)".  If the submitter is
not interrupted via completion on the same CPU, there is a loop in the IO
path.  This behavior can cause hard/soft CPU lockups, IO timeouts, system
sluggishness, etc.

Example - one CPU (e.g. CPU A) is busy submitting IOs and another CPU
(e.g. CPU B) is busy processing the corresponding IOs' reply descriptors
from the reply descriptor queue upon receiving interrupts from the HBA.
If CPU A is continuously pumping IOs, then CPU B (which is executing the
ISR) will always see valid reply descriptors in the reply descriptor queue
and will keep processing them in a loop without leaving the ISR handler.

The mpt3sas driver exits the ISR handler when it finds an unused reply
descriptor in the reply descriptor queue.  Since CPU A keeps sending IOs,
CPU B may always see a valid reply descriptor (posted by the HBA firmware
after processing the IO) in the reply descriptor queue.  In the worst
case, the driver never leaves this loop in the ISR handler.  Eventually,
the CPU lockup is detected by the watchdog.

The behavior described above is not common if "rq_affinity" is set to 2 or
the affinity_hint is honored by irqbalance as "exact".  If rq_affinity is
set to 2, the submitter is always interrupted via completion on the same
CPU.  If irqbalance is using the "exact" policy, the interrupt is
delivered to the submitter CPU.

If the ratio of CPU count to MSI-X vector (reply descriptor queue) count
is not 1:1, we are still exposed to the issue explained above, and for
that we don't have any solution.

Exposure to soft/hard lockups if the CPU count is more than the number of
MSI-X vectors supported by the device:

If the CPU count to MSI-X vector count ratio is not 1:1 (in other words,
if it is something like X:1, where X > 1), then the 'exact' irqbalance
policy or rq_affinity = 2 won't help to avoid CPU hard/soft lockups.
There is no one-to-one mapping between CPUs and MSI-X vectors; instead,
one MSI-X interrupt (or reply descriptor queue) is shared by a group/set
of CPUs, so a loop can form in the IO path within that CPU group and
lockups may be observed.

For example: Consider a system with two NUMA nodes, each having four
logical CPUs, and assume that the number of MSI-X vectors enabled on the
HBA is two; the CPU count to MSI-X vector count ratio is then 4:1.  E.g.
MSI-X vector 0 has affinity to CPU 0, CPU 1, CPU 2 & CPU 3 of NUMA node 0
and MSI-X vector 1 has affinity to CPU 4, CPU 5, CPU 6 & CPU 7 of NUMA
node 1.

numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3                 --> MSI-x 0
node 0 size: 65536 MB
node 0 free: 63176 MB
node 1 cpus: 4 5 6 7                 -->MSI-x 1
node 1 size: 65536 MB
node 1 free: 63176 MB

Assume that the user started an application which uses all the CPUs of
NUMA node 0 for issuing IOs.  Only one CPU from the affinity list (it can
be any CPU since this behavior depends upon irqbalance), say CPU 0, will
receive the interrupts from MSI-X vector 0 for all the IOs.  Eventually,
the IO submission percentage on CPU 0 decreases and its ISR processing
percentage increases as it gets busier processing the interrupts.
Gradually the IO submission percentage on CPU 0 drops to zero and its ISR
processing percentage reaches 100 percent, as an IO loop has already
formed within NUMA node 0, i.e. CPU 1, CPU 2 & CPU 3 are continuously busy
submitting heavy IO and only CPU 0 is busy in the ISR path, as it always
finds a valid reply descriptor in the reply descriptor queue.  Eventually,
we will observe the hard lockup here.

The chance of hard/soft lockups occurring increases with the value of X:
the higher the value of X, the higher the chance of observing CPU lockups.

Solution: Use the IRQ poll interface defined in "irq_poll.c".  The mpt3sas
driver will execute the ISR routine in softirq context and will always
quit the loop based on the budget provided to the IRQ poll interface.

In these scenarios (i.e. where the CPU count to MSI-X vector count ratio
is X:1, where X > 1), the IRQ poll interface avoids CPU hard lockups
through a voluntary exit from reply queue processing based on the budget.
Note - Only one MSI-X vector is busy doing processing.

Irqstat output:

IRQs / 1 second(s)
IRQ#   TOTAL   NODE0  NODE1  NODE2  NODE3  NAME
  44  122871  122871      0      0      0  IR-PCI-MSI-edge mpt3sas0-msix0
  45       0       0      0      0      0  IR-PCI-MSI-edge mpt3sas0-msix1

We use this approach only if the CPU count is more than the number of
MSI-X vectors supported by the FW.

Signed-off-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
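
The difference between "loop until an unused descriptor is found" and "loop until the budget is spent" can be sketched in plain C. This is a simplified user-space model under stated assumptions: the ring layout, `RING_SIZE`, and the function name are hypothetical and do not reflect the mpt3sas data structures or the kernel irq_poll API.

```c
#include <stddef.h>

#define RING_SIZE 8

/* Hypothetical reply ring: valid[i] != 0 means slot i holds a posted reply. */
struct reply_ring {
    int valid[RING_SIZE];
    size_t head;              /* next slot to examine */
};

/*
 * Process at most `budget` replies, then return voluntarily.
 * Returns the number of replies handled; a value smaller than `budget`
 * means an unused slot was reached, i.e. the queue was drained.
 */
static int process_replies(struct reply_ring *r, int budget)
{
    int done = 0;

    while (done < budget && r->valid[r->head]) {
        r->valid[r->head] = 0;                 /* consume the reply */
        r->head = (r->head + 1) % RING_SIZE;
        done++;
    }
    return done;
}
```

Because the loop condition includes the budget, a submitter that keeps refilling the ring can no longer pin the processing CPU inside the loop indefinitely; the caller regains control after at most `budget` replies and can reschedule further polling.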

scsi: mpt3sas: simplify interrupt handler · 233af108

Authored by Suganath Prabu on Feb 15, 2019

Separate out processing of the reply descriptor post queue from
_base_interrupt into _base_process_reply_queue.

Signed-off-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Fix typo in request_desript_type · 2c063507

Authored by Suganath Prabu on Feb 15, 2019

Fixed typo in request_desript_type.
request_desript_type --> request_descript_type.

Signed-off-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: Add device product id and revision configfs attributes · 0322913c

Authored by Alan Adamson on Mar 01, 2019

The product_id and revision attributes will allow for the modification of
the T10 Model and Revision strings returned in inquiry responses.  Their
values can be viewed and modified via the ConfigFS paths at:

target/core/$backstore/$name/wwn/product_id
target/core/$backstore/$name/wwn/revision

[mkp: dropped parentheses as requested by Bart]

Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: smartpqi: bump driver version · 171f1887

Authored by Don Brace on Mar 14, 2019

Reviewed-by: Gerry Morong <gerry.morong@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: smartpqi: add spdx · 2cc37b15

Authored by Don Brace on Mar 14, 2019

Reviewed-by: David Carroll <david.carroll@microsemi.com>
Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: smartpqi: update copyright · 2f4c4b92

Authored by Don Brace on Mar 14, 2019

Reviewed-by: Gerry Morong <gerry.morong@microsemi.com>
Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: smartpqi: add H3C controller IDs · 0595a0b4

Authored by Ajish Koshy on Mar 14, 2019

Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Ajish Koshy <ajish.koshy@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: smartpqi: increase LUN reset timeout · 429fab70

Authored by Kevin Barnett on Mar 14, 2019

Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Reviewed-by: David Carroll <david.carroll@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hpsa: bump driver version · c59c32cd

Authored by Don Brace on Mar 12, 2019

Reviewed-by: Gerry Morong <gerry.morong@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hpsa: remove timeout from TURs · 1edb6934

Authored by Don Brace on Mar 12, 2019

There are times when a TUR can take longer than the DEFAULT_TIMEOUT
value.  The timeout code is also not correct, as the function can exit
while an automatic (stack) variable is still in use as the completion
variable.  To be fixed later.

Remove the TUR timeout.

Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hpsa: correct device id issues · a45bcc4e

Authored by Don Brace on Mar 12, 2019

Correct a 'rare' race condition where a disk fails after a device list
has been obtained from the controller and before attempting to get the
device id.

Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hpsa: check for lv removal · 49ea45cb

Authored by Don Brace on Mar 12, 2019

Multipath failures are normally detected at the frequency of the event
thread. Detect LUN failures earlier by checking request completion status.

Reviewed-by: Bader Ali-saleh <bader.ali-saleh@microsemi.com>
Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Prasad Munirathnam <Prasad.Munirathnam@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

15 Mar, 2019 · 3 commits

iommu/amd: Fix NULL dereference bug in match_hid_uid · bb6bccba

Authored by Aaron Ma on Mar 13, 2019

Add a non-NULL check to fix a potential NULL pointer dereference.
Clean up the code to call the function once.

Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Fixes: 2bf9a0a1 ('iommu/amd: Add iommu support for ACPI HID devices')
Signed-off-by: Joerg Roedel <jroedel@suse.de>

xen/balloon: Fix mapping PG_offline pages to user space · 0266def9

Authored by David Hildenbrand on Mar 14, 2019

The XEN balloon driver - in contrast to other balloon drivers - allows
mapping some inflated pages to user space. Such pages are allocated via
alloc_xenballooned_pages() and freed via free_xenballooned_pages().
The pfn space of these allocated pages is used to map other things
by the hypervisor using hypercalls.

Pages marked with PG_offline must never be mapped to user space (as
this page type uses the mapcount field of struct page).

So what we can do is clear/set PG_offline when allocating/freeing an
inflated page. This way, most inflated pages can be excluded by
dumping tools and the "reused for other purpose" balloon pages are
correctly not marked as PG_offline.

Fixes: 77c4adf6 (xen/balloon: mark inflated pages PG_offline)
Reported-by: Julien Grall <julien.grall@arm.com>
Tested-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>

zram: default to lzo-rle instead of lzo · ce82f19f

Authored by Dave Rodgman on Mar 13, 2019

lzo-rle gives higher performance and similar compression ratios to lzo.

Link: http://lkml.kernel.org/r/20190205155944.16007-4-dave.rodgman@arm.com
Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

14 Mar, 2019 · 16 commits

scsi: aacraid: Fix performance issue on logical drives · 0015437c

Authored by Sagar Biradar on Mar 07, 2019

Fix a performance issue where the queue depth for SmartIOC logical volumes
is set to 1, and allow the usual logical volume code to be executed.

Fixes: a052865f (aacraid: Set correct Queue Depth for HBA1000 RAW disks)
Cc: stable@vger.kernel.org
Signed-off-by: Sagar Biradar <Sagar.Biradar@microchip.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix error codes in lpfc_sli4_pci_mem_setup() · 3a487ff7

Authored by Dan Carpenter on Mar 07, 2019

It used to be that "error" was set to -ENODEV at the start of the function
but we shifted some code around and now "error" is set to zero for most
error paths. There is a mix of direct returns and "goto out" but I changed
everything to direct returns for consistency.

Fixes: 56de8357 ("scsi: lpfc: fix calls to dma_set_mask_and_coherent()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

pptp: dst_release sk_dst_cache in pptp_sock_destruct · 9417d81f

Authored by Xin Long on Mar 13, 2019

sk_setup_caps() is called to set sk->sk_dst_cache in pptp_connect,
so we have to dst_release(sk->sk_dst_cache) in pptp_sock_destruct,
otherwise, the dst refcnt will leak.

It can be reproduced by this syz log:

  r1 = socket$pptp(0x18, 0x1, 0x2)
  bind$pptp(r1, &(0x7f0000000100)={0x18, 0x2, {0x0, @local}}, 0x1e)
  connect$pptp(r1, &(0x7f0000000000)={0x18, 0x2, {0x3, @remote}}, 0x1e)

Consecutive dmesg warnings will occur:

  unregister_netdevice: waiting for lo to become free. Usage count = 1

v1->v2:
  - use rcu_dereference_protected() instead of rcu_dereference_check(),
    as suggested by Eric.

Fixes: 00959ade ("PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol)")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

lan743x: Fix TX Stall Issue · deb6bfab

Authored by Bryan Whitehead on Mar 13, 2019

It has been observed that the tx queue may stall while downloading
from certain web sites (for example www.speedtest.net).

The cause has been tracked down to a corner case where
the tx interrupt vector was disabled automatically, but
was not re-enabled later.

The lan743x has two mechanisms to enable/disable individual
interrupts. Interrupts can be enabled/disabled by individual
source, and they can also be enabled/disabled by individual
vector which has been mapped to the source. Both must be
enabled for interrupts to work properly.

The TX code path primarily uses the interrupt enable/disable of
the TX source bit, while leaving the vector enabled all the time.

However, while investigating this issue it was noticed that
the driver requested the use of the vector auto clear feature.

The test above revealed a case where the vector enable was
cleared unintentionally.

This patch fixes the issue by deleting the lines that request
the vector auto clear feature to be used.

Fixes: 23f0703c ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
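
The two-level enable scheme described in this commit can be modelled with a few bit masks. The sketch below is a simplified illustration only: the structure, field names, and the assumption that auto clear drops the vector enable when the vector fires are hypothetical and do not correspond to the lan743x register layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two enable levels; not the lan743x registers. */
struct irq_ctrl {
    uint32_t source_en;    /* one bit per interrupt source */
    uint32_t vector_en;    /* one bit per interrupt vector */
    uint32_t auto_clear;   /* vectors whose enable bit clears when they fire */
};

/* An interrupt is delivered only if both its source and vector are enabled. */
static bool irq_deliverable(const struct irq_ctrl *c, unsigned int src,
                            unsigned int vec)
{
    return (c->source_en & (1u << src)) && (c->vector_en & (1u << vec));
}

/* Simulate a vector firing: with auto clear, its enable bit is dropped. */
static void irq_fire(struct irq_ctrl *c, unsigned int vec)
{
    if (c->auto_clear & (1u << vec))
        c->vector_en &= ~(1u << vec);
}
```

In this model, a code path that only toggles the source bit never restores a vector enable that auto clear removed, which matches the stall the commit describes; with auto clear not requested, the vector stays enabled.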

nvme-tcp: support C2HData with SUCCESS flag · 602d674c

Authored by Sagi Grimberg on Mar 13, 2019

A C2HData PDU with the SUCCESS flag set indicates that the I/O was
completed by the controller successfully and means that a subsequent
completion response capsule PDU will be omitted.

If we see this flag, first we check that the LAST_PDU flag is set as well,
and then we complete the request when the data transfer (and data digest
verification if it's on) is done.

While we're at it, reuse a bit of code with nvme_fail_request.

Reported-by: Steve Blightman <steve.blightman@oracle.com>
Suggested-by: Oliver Smith-Denny <osmithde@cisco.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Oliver Smith-Denny <osmithde@cisco.com>
Tested-by: Oliver Smith-Denny <osmithde@cisco.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
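
The flag handling this commit describes can be sketched as a small classifier. The macro names, bit positions, and the enum below are assumptions made for illustration; they are not taken from the driver or quoted from the specification.

```c
#include <stdint.h>

/* Illustrative flag bits; names and values are assumptions for this sketch. */
#define C2H_FLAG_LAST_PDU  (1u << 2)
#define C2H_FLAG_SUCCESS   (1u << 3)

enum c2h_action {
    C2H_WAIT_FOR_RESPONSE,    /* a response capsule PDU will follow */
    C2H_COMPLETE_AFTER_DATA,  /* complete once data (and digest) is done */
    C2H_PROTOCOL_ERROR,       /* SUCCESS without LAST_PDU is rejected */
};

/* Decide how to finish a request based on the C2HData PDU flags. */
static enum c2h_action classify_c2h(uint8_t flags)
{
    if (!(flags & C2H_FLAG_SUCCESS))
        return C2H_WAIT_FOR_RESPONSE;
    if (!(flags & C2H_FLAG_LAST_PDU))
        return C2H_PROTOCOL_ERROR;
    return C2H_COMPLETE_AFTER_DATA;
}
```

The ordering of the checks mirrors the commit text: the SUCCESS flag is examined first, and only then is LAST_PDU required.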

nvmet: ignore EOPNOTSUPP for discard · 005c674f

Authored by Christoph Hellwig on Mar 13, 2019

NVMe DSM is a pure hint, so if the underlying device / file system
does not support discard-like operations we should not fail the
operation but rather return success.

Fixes: 3b031d15 ("nvmet: add error log support for bdev backend")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Tested-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
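
The "treat unsupported discard as success" rule can be sketched as a one-line translation. The helper name is hypothetical and the sketch works on plain negative errno values rather than the status codes the target code uses.

```c
#include <errno.h>

/*
 * Hypothetical helper: translate the result of a discard-like operation
 * into the status reported for the DSM command. An unsupported discard is
 * reported as success because the command is only a hint.
 */
static int dsm_status_from_errno(int err)
{
    if (err == -EOPNOTSUPP)
        return 0;
    return err;
}
```

Other errors, such as `-EIO`, pass through unchanged, so only the "not supported" case is masked.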

nvme: add proper write zeroes setup for the multipath device · 9f0916ab

Authored by Christoph Hellwig on Mar 13, 2019

Add a gendisk argument to nvme_config_write_zeroes so that the call to
nvme_update_disk_info for the multipath device node updates the
proper request_queue.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Tested-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

nvme: add proper discard setup for the multipath device · 26318571

Authored by Christoph Hellwig on Mar 13, 2019

Add a gendisk argument to nvme_config_discard so that the call to
nvme_update_disk_info for the multipath device node updates the
proper request_queue.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Tested-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

nvme: remove nvme_ns_config_oncs · b1aafb35

Authored by Christoph Hellwig on Mar 13, 2019

Just open-code the two function calls in the caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Tested-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

nvme: disable Write Zeroes for qemu controllers · 7b210e4e

Authored by Christoph Hellwig on Mar 13, 2019

Qemu started out with a broken implementation of Write Zeroes written
by yours truly.  Disable Write Zeroes on qemu for now. Eventually
we need to go back and make all the qemu quirks version specific,
but that is left for another time.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

nvmet-fc: bring Disconnect into compliance with FC-NVME spec · 404ec31d

Authored by James Smart on Mar 13, 2019

The FC-NVME spec, when finally approved, modified the disconnect LS
such that the only scope available is the association.

Rework the Disconnect LS processing to be in accordance with the
change.

Signed-off-by: Nigel Kirkland <nigel.kirkland@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

nvmet-fc: fix issues with targetport assoc_list list walking · 0191e740

Authored by James Smart on Mar 13, 2019

There are two changes:

1) The logic in the __nvmet_fc_free_assoc() routine is bad. It uses
"safe" routines assuming pointers will come back valid.  However, the
intervening next structure being linked can be removed from the list and
the resulting safe pointers are bad, resulting in NULL ptrs being hit.

Correct by scheduling a work element to perform the association delete,
which can be done while under lock.

2) Prior patch that added the work element scheduling left a possible
reference on the object if the work element couldn't be scheduled.

Correct by doing the put on a failing schedule_work() call.

Signed-off-by: Nigel Kirkland <nigel.kirkland@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

nvme-fc: reject reconnect if io queue count is reduced to zero · 834d3710

Authored by James Smart on Mar 13, 2019

If:

 - A successful connect has occurred with an io queue count greater than
   zero and namespaces detected and running.
 - An error or something occurs which causes a termination of the prior
   association and then starts a reconnect,
 - The reconnect then creates a new controller, but for whatever reason,
   nvme_set_queue_count() results in io queue count set to zero.  This
   will skip io queue and tag set changes.
 - But... the controller will transition to live, calling
   nvme_start_ctrl, which calls nvme_start_queues(), which then releases
   I/Os into the transport which then sends them to the driver.

As there are no queues, things eventually hit the driver looking for a
handle, which was cleared when the original controller was reset, and it
can't proceed. Worst case, things progress, but everything fails.

In the failing scenario, the nvme_set_features(NVME_FEAT_NUM_QUEUES)
command actually failed with a NVME_SC_INTERNAL error.  For some reason,
although nvme_set_queue_count() saw the error and set io queue count to
zero, it doesn't return a failure status to the transport, which allows
the transport to continue using the controller.

Fix the problem by simply rejecting the new association if at least 1
I/O queue can't be created. The association reject will fail the
reconnect attempt and fall into the reconnect retry policy.

Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
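
The rejection rule from this commit can be sketched as a small check. The function name and parameters are hypothetical, and the `-ENOSPC` return value is an arbitrary choice for this illustration rather than a statement about what the driver returns.

```c
#include <errno.h>

/*
 * Hypothetical check run while re-establishing an association.
 * prior_ioq_cnt: IO queues the previous association had.
 * new_ioq_cnt:   IO queues granted on this reconnect attempt.
 * The -ENOSPC value is an assumption made for this sketch.
 */
static int check_reconnect_queues(unsigned int prior_ioq_cnt,
                                  unsigned int new_ioq_cnt)
{
    if (prior_ioq_cnt != 0 && new_ioq_cnt == 0)
        return -ENOSPC;   /* reject; the caller falls into its retry policy */
    return 0;
}
```

Returning an error here keeps the controller from transitioning to live with no IO queues, which is the failure sequence the commit message walks through.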

nvme-fc: fix numa_node when dev is null · 06f3d71e

Authored by James Smart on Mar 13, 2019

A recent change added a numa_node field to the nvme controller
and has the transport assign the node using dev_to_node().
However, fcloop registers with a NULL device struct, so the
dev_to_node() call oopses.

Revise the assignment to assign no node when the device struct is NULL.

Fixes: 103e515e ("nvme: add a numa_node field to struct nvme_ctrl")
Reported-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
[hch: small coding style fixup]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
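
The guarded assignment can be sketched in user space as shown below. `struct fake_device`, `fake_dev_to_node()`, and `NO_NODE` are hypothetical stand-ins for the kernel's `struct device`, `dev_to_node()`, and `NUMA_NO_NODE`; the sketch only illustrates the NULL check.

```c
#include <stddef.h>

#define NO_NODE (-1)   /* stand-in for the kernel's NUMA_NO_NODE */

/* Hypothetical user-space stand-in for struct device. */
struct fake_device {
    int numa_node;
};

/* Stand-in for dev_to_node(); it dereferences dev, so dev must not be NULL. */
static int fake_dev_to_node(const struct fake_device *dev)
{
    return dev->numa_node;
}

/* Guarded assignment: report no node when there is no device. */
static int ctrl_numa_node(const struct fake_device *dev)
{
    return dev ? fake_dev_to_node(dev) : NO_NODE;
}
```

A caller that registers without a device, as fcloop does according to the commit message, would then get `NO_NODE` instead of dereferencing a NULL pointer.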

nvme-fc: use nr_phys_segments to determine existence of sgl · 9f7d8ae2

Authored by James Smart on Mar 13, 2019

For some nvme commands, when issued by the nvme core layer, there
is an internal buffer which can cause blk_rq_payload_bytes() to
return a non-zero value yet there is no actual/real command payload
and sg list.  An example is the WRITE ZEROES command.

To address this, when making choices on whether to dma map an sgl,
use blk_rq_nr_phys_segments() instead of blk_rq_payload_bytes().
When there is an sgl, blk_rq_payload_bytes() will return the amount
of data to be transferred by the sgl.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
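
Why the segment count is the safer test can be shown with a simplified request model. `struct fake_request` and `needs_sgl_mapping()` are hypothetical; the two fields merely stand for the values that `blk_rq_payload_bytes()` and `blk_rq_nr_phys_segments()` would report.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of a block request. */
struct fake_request {
    unsigned int payload_bytes;       /* as blk_rq_payload_bytes() would report */
    unsigned short nr_phys_segments;  /* as blk_rq_nr_phys_segments() would report */
};

/* Map an sgl only when the request really carries data segments. */
static bool needs_sgl_mapping(const struct fake_request *rq)
{
    return rq->nr_phys_segments != 0;
}
```

A request modelling WRITE ZEROES, with a non-zero `payload_bytes` but zero segments, is not mapped, while an ordinary read or write with at least one segment is.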

nvme-loop: init nvmet_ctrl fatal_err_work when allocate · d11de63f

Authored by Yufen Yu on Mar 13, 2019

After commit 4d43d395 (workqueue: Try to catch flush_work() without
INIT_WORK()), a warning can be triggered when deleting an nvme-loop
device, with a trace like:

[   76.601272] Call Trace:
[   76.601646]  ? del_timer+0x72/0xa0
[   76.602156]  __cancel_work_timer+0x1ae/0x270
[   76.602791]  cancel_work_sync+0x14/0x20
[   76.603407]  nvmet_ctrl_free+0x1b7/0x2f0 [nvmet]
[   76.604091]  ? free_percpu+0x168/0x300
[   76.604652]  nvmet_sq_destroy+0x106/0x240 [nvmet]
[   76.605346]  nvme_loop_destroy_admin_queue+0x30/0x60 [nvme_loop]
[   76.606220]  nvme_loop_shutdown_ctrl+0xc3/0xf0 [nvme_loop]
[   76.607026]  nvme_loop_delete_ctrl_host+0x19/0x30 [nvme_loop]
[   76.607871]  nvme_do_delete_ctrl+0x75/0xb0
[   76.608477]  nvme_sysfs_delete+0x7d/0xc0
[   76.609057]  dev_attr_store+0x24/0x40
[   76.609603]  sysfs_kf_write+0x4c/0x60
[   76.610144]  kernfs_fop_write+0x19a/0x260
[   76.610742]  __vfs_write+0x1c/0x60
[   76.611246]  vfs_write+0xfa/0x280
[   76.611739]  ksys_write+0x6e/0x120
[   76.612238]  __x64_sys_write+0x1e/0x30
[   76.612787]  do_syscall_64+0xbf/0x3a0
[   76.613329]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

We fix it by moving the fatal_err_work init to nvmet_alloc_ctrl(), which
may be more reasonable.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

openeuler / Kernel: synced successfully almost 2 years ago
