提交 · 5eb6126e1c5c5b69d90003444acc99743881b7b7 · openanolis / cloud-kernel

23 3月, 2017 13 次提交

blk-mq: improve blk_mq_try_issue_directly · 5eb6126e

由 Christoph Hellwig 提交于 3月 22, 2017

Rename blk_mq_try_issue_directly to __blk_mq_try_issue_directly and add a
new wrapper that takes care of RCU / SRCU locking to avoid having
boileplate code in the caller which would get duplicated with new callers.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

5eb6126e

blk-mq: merge mq and sq make_request instances · 254d259d

由 Christoph Hellwig 提交于 3月 22, 2017

They are mostly the same code anyway - this just one small conditional
for the plug case that is different for both variants.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

254d259d

blk-mq: remove BLK_MQ_F_DEFER_ISSUE · 7642747d

由 Christoph Hellwig 提交于 3月 22, 2017

This flag was never used since it was introduced.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

7642747d

block: Fix oops scsi_disk_get() · d01b2dcb

由 Jan Kara 提交于 3月 23, 2017

When device open races with device shutdown, we can get the following
oops in scsi_disk_get():

[11863.044351] general protection fault: 0000 [#1] SMP
[11863.045561] Modules linked in: scsi_debug xfs libcrc32c netconsole btrfs raid6_pq zlib_deflate lzo_compress xor [last unloaded: loop]
[11863.047853] CPU: 3 PID: 13042 Comm: hald-probe-stor Tainted: G W      4.10.0-rc2-xen+ #35
[11863.048030] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[11863.048030] task: ffff88007f438200 task.stack: ffffc90000fd0000
[11863.048030] RIP: 0010:scsi_disk_get+0x43/0x70
[11863.048030] RSP: 0018:ffffc90000fd3a08 EFLAGS: 00010202
[11863.048030] RAX: 6b6b6b6b6b6b6b6b RBX: ffff88007f56d000 RCX: 0000000000000000
[11863.048030] RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffffff81a8d880
[11863.048030] RBP: ffffc90000fd3a18 R08: 0000000000000000 R09: 0000000000000001
[11863.059217] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000fffffffa
[11863.059217] R13: ffff880078872800 R14: ffff880070915540 R15: 000000000000001d
[11863.059217] FS:  00007f2611f71800(0000) GS:ffff88007f0c0000(0000) knlGS:0000000000000000
[11863.059217] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11863.059217] CR2: 000000000060e048 CR3: 00000000778d4000 CR4: 00000000000006e0
[11863.059217] Call Trace:
[11863.059217]  ? disk_get_part+0x22/0x1f0
[11863.059217]  sd_open+0x39/0x130
[11863.059217]  __blkdev_get+0x69/0x430
[11863.059217]  ? bd_acquire+0x7f/0xc0
[11863.059217]  ? bd_acquire+0x96/0xc0
[11863.059217]  ? blkdev_get+0x350/0x350
[11863.059217]  blkdev_get+0x126/0x350
[11863.059217]  ? _raw_spin_unlock+0x2b/0x40
[11863.059217]  ? bd_acquire+0x7f/0xc0
[11863.059217]  ? blkdev_get+0x350/0x350
[11863.059217]  blkdev_open+0x65/0x80
...

As you can see RAX value is already poisoned showing that gendisk we got
is already freed. The problem is that get_gendisk() looks up device
number in ext_devt_idr and then does get_disk() which does kobject_get()
on the disks kobject. However the disk gets removed from ext_devt_idr
only in disk_release() (through blk_free_devt()) at which moment it has
already 0 refcount and is already on its way to be freed. Indeed we've
got a warning from kobject_get() about 0 refcount shortly before the
oops.

We fix the problem by using kobject_get_unless_zero() in get_disk() so
that get_disk() cannot get reference on a disk that is already being
freed.
Tested-by: NLekshmi Pillai <lekshmicpillai@in.ibm.com>
Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

d01b2dcb

kobject: Export kobject_get_unless_zero() · c70c176f

由 Jan Kara 提交于 3月 23, 2017

Make the function available for outside use and fortify it against NULL
kobject.

CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

c70c176f

block: Fix oops in locked_inode_to_wb_and_lock_list() · f759741d

由 Jan Kara 提交于 3月 23, 2017

When block device is closed, we call inode_detach_wb() in __blkdev_put()
which sets inode->i_wb to NULL. That is contrary to expectations that
inode->i_wb stays valid once set during the whole inode's lifetime and
leads to oops in wb_get() in locked_inode_to_wb_and_lock_list() because
inode_to_wb() returned NULL.

The reason why we called inode_detach_wb() is not valid anymore though.
BDI is guaranteed to stay along until we call bdi_put() from
bdev_evict_inode() so we can postpone calling inode_detach_wb() to that
moment.

Also add a warning to catch if someone uses inode_detach_wb() in a
dangerous way.
Reported-by: NThiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

f759741d

bdi: Rename cgwb_bdi_destroy() to cgwb_bdi_unregister() · b1c51afc

由 Jan Kara 提交于 3月 23, 2017

Rename cgwb_bdi_destroy() to cgwb_bdi_unregister() as it gets called
from bdi_unregister() which is not necessarily called from bdi_destroy()
and thus the name is somewhat misleading.
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

b1c51afc

bdi: Do not wait for cgwbs release in bdi_unregister() · 4514451e

由 Jan Kara 提交于 3月 23, 2017

Currently we wait for all cgwbs to get released in cgwb_bdi_destroy()
(called from bdi_unregister()). That is however unnecessary now when
cgwb->bdi is a proper refcounted reference (thus bdi cannot get
released before all cgwbs are released) and when cgwb_bdi_destroy()
shuts down writeback directly.
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

4514451e

bdi: Shutdown writeback on all cgwbs in cgwb_bdi_destroy() · 5318ce7d

由 Jan Kara 提交于 3月 23, 2017

Currently we waited for all cgwbs to get freed in cgwb_bdi_destroy()
which also means that writeback has been shutdown on them. Since this
wait is going away, directly shutdown writeback on cgwbs from
cgwb_bdi_destroy() to avoid live writeback structures after
bdi_unregister() has finished. To make that safe with concurrent
shutdown from cgwb_release_workfn(), we also have to make sure
wb_shutdown() returns only after the bdi_writeback structure is really
shutdown.
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

5318ce7d

bdi: Unify bdi->wb_list handling for root wb_writeback · e8cb72b3

由 Jan Kara 提交于 3月 23, 2017

Currently root wb_writeback structure is added to bdi->wb_list in
bdi_init() and never removed. That is different from all other
wb_writeback structures which get added to the list when created and
removed from it before wb_shutdown().

So move list addition of root bdi_writeback to bdi_register() and list
removal of all wb_writeback structures to wb_shutdown(). That way a
wb_writeback structure is on bdi->wb_list if and only if it can handle
writeback and it will make it easier for us to handle shutdown of all
wb_writeback structures in bdi_unregister().
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

e8cb72b3

bdi: Make wb->bdi a proper reference · 810df54a

由 Jan Kara 提交于 3月 23, 2017

Make wb->bdi a proper refcounted reference to bdi for all bdi_writeback
structures except for the one embedded inside struct backing_dev_info.
That will allow us to simplify bdi unregistration.
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

810df54a

bdi: Mark congested->bdi as internal · b7d680d7

由 Jan Kara 提交于 3月 23, 2017

congested->bdi pointer is used only to be able to remove congested
structure from bdi->cgwb_congested_tree on structure release. Moreover
the pointer can become NULL when we unregister the bdi. Rename the field
to __bdi and add a comment to make it more explicit this is internal
stuff of memcg writeback code and people should not use the field as
such use will be likely race prone.

We do not bother with converting congested->bdi to a proper refcounted
reference. It will be slightly ugly to special-case bdi->wb.congested to
avoid effectively a cyclic reference of bdi to itself and the reference
gets cleared from bdi_unregister() making it impossible to reference
a freed bdi.
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

b7d680d7

block: Fix bdi assignment to bdev inode when racing with disk delete · 03e26279

由 Jan Kara 提交于 3月 23, 2017

When disk->fops->open() in __blkdev_get() returns -ERESTARTSYS, we
restart the process of opening the block device. However we forget to
switch bdev->bd_bdi back to noop_backing_dev_info and as a result bdev
inode will be pointing to a stale bdi. Fix the problem by setting
bdev->bd_bdi later when __blkdev_get() is already guaranteed to succeed.
Acked-by: NTejun Heo <tj@kernel.org>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

03e26279

22 3月, 2017 6 次提交

block: fix stacked driver stats init and free · a83b576c

由 Jens Axboe 提交于 3月 21, 2017

If a driver allocates a queue for stacked usage, then it does
not currently get stats allocated. This causes the later init
of, eg, writeback throttling to blow up. Move the init to the
queue allocation instead.

Additionally, allow a NULL callback unregistration. This avoids
having the caller check for that, fixing another oops on
removal of a block device that doesn't have poll stats allocated.

Fixes: 34dbad5d ("blk-stat: convert to callback-based statistics reporting")
Signed-off-by: NJens Axboe <axboe@fb.com>

a83b576c

blk-stat: convert to callback-based statistics reporting · 34dbad5d

由 Omar Sandoval 提交于 3月 21, 2017

Currently, statistics are gathered in ~0.13s windows, and users grab the
statistics whenever they need them. This is not ideal for both in-tree
users:

1. Writeback throttling wants its own dynamically sized window of
   statistics. Since the blk-stats statistics are reset after every
   window and the wbt windows don't line up with the blk-stats windows,
   wbt doesn't see every I/O.
2. Polling currently grabs the statistics on every I/O. Again, depending
   on how the window lines up, we may miss some I/Os. It's also
   unnecessary overhead to get the statistics on every I/O; the hybrid
   polling heuristic would be just as happy with the statistics from the
   previous full window.

This reworks the blk-stats infrastructure to be callback-based: users
register a callback that they want called at a given time with all of
the statistics from the window during which the callback was active.
Users can dynamically bucketize the statistics. wbt and polling both
currently use read vs. write, but polling can be extended to further
subdivide based on request size.

The callbacks are kept on an RCU list, and each callback has percpu
stats buffers. There will only be a few users, so the overhead on the
I/O completion side is low. The stats flushing is also simplified
considerably: since the timer function is responsible for clearing the
statistics, we don't have to worry about stale statistics.

wbt is a trivial conversion. After the conversion, the windowing problem
mentioned above is fixed.

For polling, we register an extra callback that caches the previous
window's statistics in the struct request_queue for the hybrid polling
heuristic to use.

Since we no longer have a single stats buffer for the request queue,
this also removes the sysfs and debugfs stats entries. To replace those,
we add a debugfs entry for the poll statistics.
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

34dbad5d

blk-stat: move BLK_RQ_STAT_BATCH definition to blk-stat.c · 4875253f

由 Omar Sandoval 提交于 3月 21, 2017

This is an implementation detail that no-one outside of blk-stat.c uses.
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

4875253f

blk-stat: use READ and WRITE instead of BLK_STAT_{READ,WRITE} · fa2e39cb

由 Omar Sandoval 提交于 3月 21, 2017

The stats buckets will become generic soon, so make the existing users
use the common READ and WRITE definitions instead of one internal to
blk-stat.
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

fa2e39cb

block: remove extra calls to wbt_exit() · 0315b159

由 Omar Sandoval 提交于 3月 21, 2017

We always call wbt_exit() from blk_release_queue(), so these are
unnecessary.
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

0315b159

blk-stat: fix blk_stat_sum() if all samples are batched · 7d8d0014

由 Omar Sandoval 提交于 3月 16, 2017

We need to flush the batch _before_ we check the number of samples,
otherwise we'll miss all of the batched samples.

Fixes: cf43e6be ("block: add scalable completion tracking of requests")
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

7d8d0014

20 3月, 2017 6 次提交

L

Linux 4.11-rc3 · 97da3854
由 Linus Torvalds 提交于 3月 19, 2017

97da3854

mm/swap: don't BUG_ON() due to uninitialized swap slot cache · 452b94b8

由 Linus Torvalds 提交于 3月 19, 2017

This BUG_ON() triggered for me once at shutdown, and I don't see a
reason for the check.  The code correctly checks whether the swap slot
cache is usable or not, so an uninitialized swap slot cache is not
actually problematic afaik.

I've temporarily just switched the BUG_ON() to a WARN_ON_ONCE(), since
I'm not sure why that seemingly pointless check was there.  I suspect
the real fix is to just remove it entirely, but for now we'll warn about
it but not bring the machine down.

Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

452b94b8

Merge tag 'powerpc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · a07a6e41

由 Linus Torvalds 提交于 3月 19, 2017

Pull more powerpc fixes from Michael Ellerman:
 "A couple of minor powerpc fixes for 4.11:

   - wire up statx() syscall

   - don't print a warning on memory hotplug when HPT resizing isn't
     available

  Thanks to: David Gibson, Chandan Rajendra"

* tag 'powerpc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/pseries: Don't give a warning when HPT resizing isn't available
  powerpc: Wire up statx() syscall

a07a6e41

Merge branch 'parisc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 4571bc5a

由 Linus Torvalds 提交于 3月 19, 2017

Pull parisc fixes from Helge Deller:

 - Mikulas Patocka added support for R_PARISC_SECREL32 relocations in
   modules with CONFIG_MODVERSIONS.

 - Dave Anglin optimized the cache flushing for vmap ranges.

 - Arvind Yadav provided a fix for a potential NULL pointer dereference
   in the parisc perf code (and some code cleanups).

 - I wired up the new statx system call, fixed some compiler warnings
   with the access_ok() macro and fixed shutdown code to really halt a
   system at shutdown instead of crashing & rebooting.

* 'parisc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix system shutdown halt
  parisc: perf: Fix potential NULL pointer dereference
  parisc: Avoid compiler warnings with access_ok()
  parisc: Wire up statx system call
  parisc: Optimize flush_kernel_vmap_range and invalidate_kernel_vmap_range
  parisc: support R_PARISC_SECREL32 relocation in modules

4571bc5a

Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending · 8aa34172

由 Linus Torvalds 提交于 3月 19, 2017

Pull SCSI target fixes from Nicholas Bellinger:
 "The bulk of the changes are in qla2xxx target driver code to address
  various issues found during Cavium/QLogic's internal testing (stable
  CC's included), along with a few other stability and smaller
  miscellaneous improvements.

  There are also a couple of different patch sets from Mike Christie,
  which have been a result of his work to use target-core ALUA logic
  together with tcm-user backend driver.

  Finally, a patch to address some long standing issues with
  pass-through SCSI export of TYPE_TAPE + TYPE_MEDIUM_CHANGER devices,
  which will make folks using physical (or virtual) magnetic tape happy"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (28 commits)
  qla2xxx: Update driver version to 9.00.00.00-k
  qla2xxx: Fix delayed response to command for loop mode/direct connect.
  qla2xxx: Change scsi host lookup method.
  qla2xxx: Add DebugFS node to display Port Database
  qla2xxx: Use IOCB interface to submit non-critical MBX.
  qla2xxx: Add async new target notification
  qla2xxx: Export DIF stats via debugfs
  qla2xxx: Improve T10-DIF/PI handling in driver.
  qla2xxx: Allow relogin to proceed if remote login did not finish
  qla2xxx: Fix sess_lock & hardware_lock lock order problem.
  qla2xxx: Fix inadequate lock protection for ABTS.
  qla2xxx: Fix request queue corruption.
  qla2xxx: Fix memory leak for abts processing
  qla2xxx: Allow vref count to timeout on vport delete.
  tcmu: Convert cmd_time_out into backend device attribute
  tcmu: make cmd timeout configurable
  tcmu: add helper to check if dev was configured
  target: fix race during implicit transition work flushes
  target: allow userspace to set state to transitioning
  target: fix ALUA transition timeout handling
  ...

8aa34172

Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · 1b8df619

由 Linus Torvalds 提交于 3月 19, 2017

Pull device-dax fixes from Dan Williams:
 "The device-dax driver was not being careful to handle falling back to
  smaller fault-granularity sizes.

  The driver already fails fault attempts that are smaller than the
  device's alignment, but it also needs to handle the cases where a
  larger page mapping could be established. For simplicity of the
  immediate fix the implementation just signals VM_FAULT_FALLBACK until
  fault-size == device-alignment.

  One fix is for -stable to address pmd-to-pte fallback from the
  original implementation, another fix is for the new (introduced in
  4.11-rc1) pud-to-pmd regression, and a typo fix comes along for the
  ride.

  These have received a build success notification from the kbuild
  robot"

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  device-dax: fix debug output typo
  device-dax: fix pud fault fallback handling
  device-dax: fix pmd/pte fault fallback handling

1b8df619

19 3月, 2017 15 次提交

qla2xxx: Update driver version to 9.00.00.00-k · 6c611d18

由 Himanshu Madhani 提交于 3月 15, 2017

Signed-off-by: NHimanshu Madhani <himanshu.madhani@cavium.com>
signed-off-by: NGiridhar Malavali <giridhar.malavali@cavium.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

6c611d18

qla2xxx: Fix delayed response to command for loop mode/direct connect. · ec7193e2

由 Quinn Tran 提交于 3月 15, 2017

Current driver wait for FW to be in the ready state before
processing in-coming commands. For Arbitrated Loop or
Point-to- Point (not switch), FW Ready state can take a while.
FW will transition to ready state after all Nports have been
logged in. In the mean time, certain initiators have completed
the login and starts IO. Driver needs to start processing all
queues if FW is already started.
Signed-off-by: NQuinn Tran <quinn.tran@cavium.com>
Signed-off-by: NHimanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

ec7193e2

qla2xxx: Change scsi host lookup method. · 482c9dc7

由 Quinn Tran 提交于 3月 15, 2017

For target mode, when new scsi command arrive, driver first performs
a look up of the SCSI Host. The current look up method is based on
the ALPA portion of the NPort ID. For Cisco switch, the ALPA can
not be used as the index. Instead, the new search method is based
on the full value of the Nport_ID via btree lib.
Signed-off-by: NQuinn Tran <quinn.tran@cavium.com>
Signed-off-by: NHimanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

482c9dc7

qla2xxx: Add DebugFS node to display Port Database · c423437e

由 Himanshu Madhani 提交于 3月 15, 2017

Signed-off-by: NHimanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: NGiridhar Malavali <giridhar.malavali@cavium.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

c423437e

qla2xxx: Use IOCB interface to submit non-critical MBX. · 15f30a57

由 Quinn Tran 提交于 3月 15, 2017

The Mailbox interface is currently over subscribed. We like
to reserve the Mailbox interface for the chip managment and
link initialization. Any non essential Mailbox command will
be routed through the IOCB interface. The IOCB interface is
able to absorb more commands.

Following commands are being routed through IOCB interface

- Get ID List (007Ch)
- Get Port DB (0064h)
- Get Link Priv Stats (006Dh)
Signed-off-by: NQuinn Tran <quinn.tran@cavium.com>
Signed-off-by: NHimanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

15f30a57

qla2xxx: Add async new target notification · f1443eeb

由 Quinn Tran 提交于 3月 15, 2017

Signed-off-by: NQuinn Tran <quinn.tran@cavium.com>
Signed-off-by: NHimanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

f1443eeb

qla2xxx: Export DIF stats via debugfs · 54b9993c

由 Anil Gurumurthy 提交于 3月 15, 2017

Signed-off-by: NAnil Gurumurthy <anil.gurumurthy@cavium.com>
Signed-off-by: NHimanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

54b9993c

qla2xxx: Improve T10-DIF/PI handling in driver. · be25152c

由 Quinn Tran 提交于 3月 15, 2017

Add routines to support T10 DIF tag.
Signed-off-by: NQuinn Tran <quinn.tran@cavium.com>
Signed-off-by: NAnil Gurumurthy <anil.gurumurthy@cavium.com>
Signed-off-by: NHimanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

be25152c

qla2xxx: Allow relogin to proceed if remote login did not finish · 5b33469a

由 Quinn Tran 提交于 3月 15, 2017

If the remote port have started the login process, then the
PLOGI and PRLI should be back to back. Driver will allow
the remote port to complete the process. For the case where
the remote port decide to back off from sending PRLI, this
local port sets an expiration timer for the PRLI. Once the
expiration time passes, the relogin retry logic is allowed
to go through and perform login with the remote port.
Signed-off-by: NQuinn Tran <quinn.tran@qlogic.com>
Signed-off-by: NHimanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

5b33469a

qla2xxx: Fix sess_lock & hardware_lock lock order problem. · f159b3c7

由 Quinn Tran 提交于 3月 15, 2017

The main lock that needs to be held for CMD or TMR submission
to upper layer is the sess_lock. The sess_lock is used to
serialize cmd submission and session deletion. The addition
of hardware_lock being held is not necessary. This patch removes
hardware_lock dependency from CMD/TMR submission.

Use hardware_lock only for error response in this case.

Path1
       CPU0                    CPU1
       ----                    ----
  lock(&(&ha->tgt.sess_lock)->rlock);
                               lock(&(&ha->hardware_lock)->rlock);
                               lock(&(&ha->tgt.sess_lock)->rlock);
  lock(&(&ha->hardware_lock)->rlock);

Path2/deadlock
*** DEADLOCK ***
Call Trace:
dump_stack+0x85/0xc2
print_circular_bug+0x1e3/0x250
__lock_acquire+0x1425/0x1620
lock_acquire+0xbf/0x210
_raw_spin_lock_irqsave+0x53/0x70
qlt_sess_work_fn+0x21d/0x480 [qla2xxx]
process_one_work+0x1f4/0x6e0

Cc: <stable@vger.kernel.org>
Cc: Bart Van Assche <Bart.VanAssche@sandisk.com>
Reported-by: NBart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: NQuinn Tran <quinn.tran@cavium.com>
Signed-off-by: NHimanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

f159b3c7

qla2xxx: Fix inadequate lock protection for ABTS. · 8f6fc8d4

由 Quinn Tran 提交于 3月 15, 2017

Normally, ABTS is sent to Target Core as Task MGMT command.
In the case of error, qla2xxx needs to send response, hardware_lock
is required to prevent request queue corruption.

Cc: <stable@vger.kernel.org>
Signed-off-by: NQuinn Tran <quinn.tran@cavium.com>
Signed-off-by: NHimanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

8f6fc8d4

qla2xxx: Fix request queue corruption. · 8b666809

由 Quinn Tran 提交于 3月 15, 2017

When FW notify driver or driver detects low FW resource,
driver tries to send out Busy SCSI Status to tell Initiator
side to back off. During the send process, the lock was not held.

Cc: <stable@vger.kernel.org>
Signed-off-by: NQuinn Tran <quinn.tran@qlogic.com>
Signed-off-by: NHimanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

8b666809

qla2xxx: Fix memory leak for abts processing · ae940f2c

由 Quinn Tran 提交于 3月 15, 2017

Cc: <stable@vger.kernel.org>
Signed-off-by: NQuinn Tran <quinn.tran@cavium.com>
Signed-off-by: NHimanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

ae940f2c

qla2xxx: Allow vref count to timeout on vport delete. · c4a9b538

由 Joe Carnuccio 提交于 3月 15, 2017

Cc: <stable@vger.kernel.org>
Signed-off-by: NJoe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: NHimanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

c4a9b538

tcmu: Convert cmd_time_out into backend device attribute · 7d7a7435

由 Nicholas Bellinger 提交于 3月 18, 2017

Instead of putting cmd_time_out under ../target/core/user_0/foo/control,
which has historically been used by parameters needed for initial
backend device configuration, go ahead and move cmd_time_out into
a backend device attribute.

In order to do this, tcmu_module_init() has been updated to create
a local struct configfs_attribute **tcmu_attrs, that is based upon
the existing passthrough_attrib_attrs along with the new cmd_time_out
attribute.  Once **tcm_attrs has been setup, go ahead and point
it at tcmu_ops->tb_dev_attrib_attrs so it's picked up by target-core.

Also following MNC's previous change, ->cmd_time_out is stored in
milliseconds but exposed via configfs in seconds.  Also, note this
patch restricts the modification of ->cmd_time_out to before +
after the TCMU device has been configured, but not while it has
active fabric exports.

Cc: Mike Christie <mchristi@redhat.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

7d7a7435

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功