Commits · c84b023a4c1461498abf0eda54f60e2fd64a1ca2 · openanolis / cloud-kernel

27 Jun, 2018 1 commit

scsi: read host_busy via scsi_host_busy() · c84b023a

Authored by Ming Lei on Jun 24, 2018

No functional change.

Just introduce scsi_host_busy() and replace the direct read of
scsi_host->host_busy with this new API.

Cc: Omar Sandoval <osandov@fb.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: James Bottomley <james.bottomley@hansenpartnership.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Don Brace <don.brace@microsemi.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

13 Mar, 2018 1 commit

scsi: libsas: defer ata device eh commands to libata · 318aaf34

Authored by Jason Yan on Mar 08, 2018

While an ata device is in EH, some commands that are still attached to
tasks are not passed to libata when the abort or the recovery fails, so
libata never handles them. When these commands complete, the sas task is
freed but the ata qc is not. This leaks the ata qc and triggers a
warning like the one below:

WARNING: CPU: 0 PID: 28512 at drivers/ata/libata-eh.c:4037
ata_eh_finish+0xb4/0xcc
CPU: 0 PID: 28512 Comm: kworker/u32:2 Tainted: G     W  OE 4.14.0#1
......
Call trace:
[<ffff0000088b7bd0>] ata_eh_finish+0xb4/0xcc
[<ffff0000088b8420>] ata_do_eh+0xc4/0xd8
[<ffff0000088b8478>] ata_std_error_handler+0x44/0x8c
[<ffff0000088b8068>] ata_scsi_port_error_handler+0x480/0x694
[<ffff000008875fc4>] async_sas_ata_eh+0x4c/0x80
[<ffff0000080f6be8>] async_run_entry_fn+0x4c/0x170
[<ffff0000080ebd70>] process_one_work+0x144/0x390
[<ffff0000080ec100>] worker_thread+0x144/0x418
[<ffff0000080f2c98>] kthread+0x10c/0x138
[<ffff0000080855dc>] ret_from_fork+0x10/0x18

If too many ata qcs are leaked, ata tag allocation fails and I/O blocks
forever.

As suggested by Dan Williams, defer ata device commands to libata and
merge sas_eh_finish_cmd() with sas_eh_defer_cmd(). libata will handle
ata qcs correctly after this.
Signed-off-by: Jason Yan <yanaijie@huawei.com>
CC: Xiaofei Tan <tanxiaofei@huawei.com>
CC: John Garry <john.garry@huawei.com>
CC: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

11 Jan, 2018 1 commit

scsi: libsas: Disable asynchronous aborts for SATA devices · c9f92600

Authored by Hannes Reinecke on Jan 10, 2018

Handling CD-ROM devices from libsas is decidedly odd, as libata relies
on SCSI EH to be started to figure out that no medium is present.  So we
cannot do asynchronous aborts for SATA devices.

Fixes: 90965761 ("scsi: libsas: allow async aborts")
Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Yves-Alexis Perez <corsac@debian.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

04 Jan, 2018 1 commit

scsi: libsas: remove private hex2bin() implementation · 9ea4e076

Authored by Andy Shevchenko on Dec 19, 2017

The function sas_parse_addr() could be easily substituted by hex2bin()
which is in kernel library code.

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
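
Note: for readers unfamiliar with the conversion this commit relies on, here is a minimal userspace sketch of a hex-string-to-bytes helper. demo_hex2bin() and demo_hex_digit() are hypothetical names written for illustration; this is not the kernel's hex2bin() source, and the -1 error value is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the value of one hex digit, or -1 if c is not a hex digit. */
static int demo_hex_digit(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Convert 2 * count hex characters from src into count bytes at dst.
 * Returns 0 on success, -1 on the first non-hex character (which also
 * covers a string that ends early, since '\0' is not a hex digit).
 */
static int demo_hex2bin(uint8_t *dst, const char *src, size_t count)
{
	assert(dst != NULL && src != NULL);
	while (count--) {
		int hi = demo_hex_digit(*src++);
		if (hi < 0)
			return -1;
		int lo = demo_hex_digit(*src++);
		if (lo < 0)
			return -1;
		*dst++ = (uint8_t)((hi << 4) | lo);
	}
	return 0;
}
```

A SAS address is 8 bytes, so parsing one from its 16-character textual form is a single call with count set to 8.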

22 Nov, 2017 1 commit

treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts · 841b86f3

Authored by Kees Cook on Oct 23, 2017

With all callbacks converted, and the timer callback prototype
switched over, the TIMER_FUNC_TYPE cast is no longer needed,
so remove it. Conversion was done with the following scripts:

    perl -pi -e 's|\(TIMER_FUNC_TYPE\)||g' \
        $(git grep TIMER_FUNC_TYPE | cut -d: -f1 | sort -u)

    perl -pi -e 's|\(TIMER_DATA_TYPE\)||g' \
        $(git grep TIMER_DATA_TYPE | cut -d: -f1 | sort -u)

The now unused macros are also dropped from include/linux/timer.h.
Signed-off-by: Kees Cook <keescook@chromium.org>

02 Nov, 2017 1 commit

scsi: sas: Convert timers to use timer_setup() · 77570eed

Authored by Kees Cook on Aug 22, 2017

In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly. This requires adding a pointer to
hold the timer's target task, as there isn't a link back from slow_task.

Cc: John Garry <john.garry@huawei.com>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Jack Wang <jinpu.wang@profitbricks.com>
Cc: lindar_liu@usish.com
Cc: Jens Axboe <axboe@fb.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
Cc: Baoyou Xie <baoyou.xie@linaro.org>
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: John Garry <john.garry@huawei.com> # for hisi_sas part
Tested-by: John Garry <john.garry@huawei.com> # basic sanity test for hisi_sas
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
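
Note: the callback style this commit moves to can be sketched in userspace C as follows. The callback receives a pointer to the timer itself and recovers the enclosing structure from it, which is why a back-pointer to the target task has to be stored next to the timer. All demo_* names are hypothetical; the macro only imitates the idea behind from_timer() and is not the kernel definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature timer: the callback gets the timer pointer. */
struct demo_timer {
	void (*function)(struct demo_timer *);
};

struct demo_task {
	int aborted;
};

/* The timer is embedded here, so the task must be reachable from here too. */
struct demo_slow_task {
	struct demo_timer timer;
	struct demo_task *task;	/* back-pointer added for the callback */
};

/* Recover the enclosing structure from a pointer to its embedded member. */
#define demo_from_timer(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void demo_timer_setup(struct demo_timer *t,
			     void (*fn)(struct demo_timer *))
{
	t->function = fn;
}

/* Timeout callback: no opaque data argument, only the timer pointer. */
static void demo_timedout(struct demo_timer *t)
{
	struct demo_slow_task *slow =
		demo_from_timer(t, struct demo_slow_task, timer);

	assert(slow->task != NULL);
	slow->task->aborted = 1;
}
```

The design removes the untyped data argument from the callback, so the compiler can check that the callback and the structure it operates on actually match.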

26 Aug, 2017 2 commits

scsi: libsas: move bus_reset_handler() to target_reset_handler() · cc199e78

Authored by Hannes Reinecke on Aug 25, 2017

The bus reset handler is calling I_T Nexus reset, which logically is a
target reset as it needs to specify both the initiator and the target.
So move it to target reset.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: libsas: Remove a set-but-not-used variable · bcba3c22

Authored by Bart Van Assche on Aug 25, 2017

This was detected by building with W=1.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

07 Apr, 2017 2 commits

scsi: make eh_eflags persistent · 8e8c9d01

Authored by Hannes Reinecke on Apr 06, 2017

If a failed command is retried and fails again we need
to enter SCSI EH, otherwise we will never be able to
recover the command.
To detect this situation we must not clear scmd->eh_eflags
when EH finishes but rather make it persistent throughout
the lifetime of the command.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: libsas: allow async aborts · 90965761

Authored by Christoph Hellwig on Apr 06, 2017

We now first try to call ->eh_abort_handler from a work queue, but libsas
was always failing that for no good reason.  Allow async aborts.
Reviewed-by: Johannes Thumshirn <jth@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

07 Feb, 2017 1 commit

scsi: libsas: remove sas_scsi_timed_out · 28917d40

Authored by Christoph Hellwig on Jan 30, 2017

EH_NOT_HANDLED is the default case if no eh_timed_out method is
provided, so there is no need to supply it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

04 Dec, 2014 1 commit

scsi: remove ->change_queue_type method · efc3c1df

Authored by Christoph Hellwig on Nov 24, 2014

Since we got rid of ordered tag support in 2010 the prime use case of
switching on and off ordered tags has been obsolete.  The other function
of enabling/disabling tagging entirely has only been correctly implemented
by the 53c700 driver and isn't generally useful.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>

27 Nov, 2014 1 commit

libsas: remove task_collector mode · 79855d17

Authored by Christoph Hellwig on Nov 05, 2014

The task_collector mode (or "latency_injector", (C) Dan Williams) is an
optional I/O path in libsas that queues up scsi commands instead of
directly sending them to the hardware.  It generally increases latencies
to, in the optimal case, slightly reduce mmio traffic to the hardware.

Only the obsolete aic94xx driver and the mvsas driver allowed to use
it without recompiling the kernel, and most drivers didn't support it
at all.

Remove the giant blob of code to allow better optimizations for scsi-mq
in the future.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Dan Williams <dan.j.williams@intel.com>

24 Nov, 2014 2 commits

scsi: drop reason argument from ->change_queue_depth · db5ed4df

Authored by Christoph Hellwig on Nov 13, 2014

Drop the now unused reason argument from the ->change_queue_depth method.
Also add a return value to scsi_adjust_queue_depth, and rename it to
scsi_change_queue_depth now that it can be used as the default
->change_queue_depth implementation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>

scsi: avoid ->change_queue_depth indirection for queue full tracking · c40ecc12

Authored by Christoph Hellwig on Nov 13, 2014

All drivers use the implementation for ramping the queue up and down, so
instead of overloading the change_queue_depth method call the
implementation directly if the driver opts into it by setting the
track_queue_depth flag in the host template.

Note that a few drivers validated the new queue depth in their
change_queue_depth method, but as we never go over the queue depth
set during slave_configure or the sysfs file this isn't necessary
and can safely be removed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>

12 Nov, 2014 3 commits

scsi: don't force tagged_supported in drivers · ee11560f

Authored by Christoph Hellwig on Nov 03, 2014

Now that we also get proper values in cmd->request->tag for untagged
commands, there is no need to force tagged_supported to on in drivers
that need host-wide tags.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>

scsi: don't set tagging state from scsi_adjust_queue_depth · c8b09f6f

Authored by Christoph Hellwig on Nov 03, 2014

Remove the tagged argument from scsi_adjust_queue_depth, and just let it
handle the queue depth.  For most drivers those two are fairly separate,
given that most modern drivers don't care about the SCSI "tagged" status
of a command at all, and many old drivers allow queuing of multiple
untagged commands in the driver.

Instead we start out with the ->simple_tags flag set before calling
->slave_configure, which is how all drivers actually looking at
->simple_tags, except for one, worked anyway.  The one other case looks
broken, but I've kept the behavior as-is for now.

Except for that we only change ->simple_tags from the ->change_queue_type,
and when rejecting a tag message in a single driver, so keeping this
churn out of scsi_adjust_queue_depth is a clear win.

Now that the usage of scsi_adjust_queue_depth is more obvious we can
also remove all the trivial instances in ->slave_alloc or ->slave_configure
that just set it to the cmd_per_lun default.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: always assign block layer tags if enabled · 2ecb204d

Authored by Christoph Hellwig on Nov 03, 2014

Allow a driver to ask for block layer tags by setting .use_blk_tags in the
host template, in which case it will always see a valid value in
request->tag, similar to the behavior when using blk-mq.  This means even
SCSI "untagged" commands will now have a tag, which is especially useful
when using a host-wide tag map.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>

25 Jul, 2014 1 commit

scsi: convert host_busy to atomic_t · 74665016

Authored by Christoph Hellwig on Jan 22, 2014

Avoid taking the host-wide host_lock to check the per-host queue limit.
Instead we do an atomic_inc_return early on to grab our slot in the queue,
and if necessary decrement it after finishing all checks.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Webb Scales <webbnh@hp.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Robert Elliott <elliott@hp.com>

18 Jul, 2014 1 commit

scsi: use 64-bit LUNs · 9cb78c16

Authored by Hannes Reinecke on Jun 25, 2014

The SCSI standard defines 64-bit values for LUNs, and large arrays
employing large or hierarchical LUN numbers become more and more
common.

So update the linux SCSI stack to use 64-bit LUN numbers.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

16 Mar, 2014 1 commit

[SCSI] libsas: introduce scmd_dbg() to quiet false positive "timeout" messages · 3af74a3c

Authored by Dan Williams on Feb 06, 2014

libsas sometimes short circuits timeouts to force commands into error
recovery.  It is misleading to log that the command timed-out in
sas_scsi_timed_out() when in fact it was just queued for error handling.
It's also redundant in the case of a true timeout as libata eh will
detect and report timeouts via its AC_ERR_TIMEOUT facility.

Given that some environments consider "timeout" errors to be indicative
of impending device failure demote the sas_scsi_timed_out() timeout
message to be disabled by default.  This parallels ata_scsi_timed_out().

[jejb: checkpatch fix]
Reported-by: Xun Ni <xun.ni@intel.com>
Tested-by: Nelson Cheng <nelson.cheng@intel.com>
Acked-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

05 Jun, 2013 1 commit

[SCSI] libsas: implement > 16 byte CDB support · e73823f7

Authored by James Bottomley on May 07, 2013

Remove the arbitrary expectation in libsas that all SCSI commands are 16 bytes
or less.  Instead do all copies via cmd->cmd_len (and use a pointer to this in
the libsas task instead of a copy).  Note that this still doesn't enable > 16
byte CDB support in the underlying drivers because their internal format has
to be fixed and the wire format of > 16 byte CDBs according to the SAS spec is
different.  The libsas drivers (isci, aic94xx, mvsas and pm8xxx) are all
updated for this change.

Cc: Lukasz Dorau <lukasz.dorau@intel.com>
Cc: Maciej Patelczyk <maciej.patelczyk@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Jack Wang <xjtuwjp@gmail.com>
Cc: Lindar Liu <lindar_liu@usish.com>
Cc: Xiangliang Yu <yuxiangl@marvell.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

20 Jul, 2012 6 commits

[SCSI] libsas: trim sas_task of slow path infrastructure · f0bf750c

Authored by Dan Williams on Jun 21, 2012

The timer and the completion are only used for slow path tasks (smp, and
lldd tmfs), yet we incur the allocation space and cpu setup time for
every fast path task.

Cc: Xiangliang Yu <yuxiangl@marvell.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: use ->lldd_I_T_nexus_reset for ->eh_bus_reset_handler · e7db8229

Authored by Dan Williams on Jun 21, 2012

sas_eh_bus_reset_handler() amounts to sas_phy_reset() without
notification of the reset to the lldd.  If this is triggered from
eh-cmnd recovery there may be sas_tasks for the lldd to terminate, so
->lldd_I_T_nexus_reset is warranted.

Cc: Xiangliang Yu <yuxiangl@marvell.com>
Cc: Luben Tuikov <ltuikov@yahoo.com>
Cc: Jack Wang <jack_wang@usish.com>
Reviewed-by: Jacek Danecki <jacek.danecki@intel.com>
[jacek: modify pm8001_I_T_nexus_reset to return -ENODEV]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: add sas_eh_abort_handler · 9524c682

Authored by Dan Williams on Jun 21, 2012

When recovering failed eh-cmnds let the lldd attempt an abort via
scsi_abort_eh_cmnd before escalating.
Reviewed-by: Jacek Danecki <jacek.danecki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: enforce eh strategy handlers only in eh context · 5db45bdc

Authored by Dan Williams on Jun 21, 2012

The strategy handlers may be called in places that are problematic for
libsas (i.e. sata resets outside of domain revalidation filtering /
libata link recovery), or problematic for userspace (non-blocking ioctl
to sleeping reset functions). However, these routines are also called
for eh escalations and recovery of scsi_eh_prep_cmnd(), so permit them
as long as we are running in the host's error handler, otherwise arrange
for them to be triggered in eh_context.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: cleanup spurious calls to scsi_schedule_eh · 36fed498

Authored by Maciej Trela on Jun 21, 2012

eh is woken up automatically by the presence of failed commands,
scsi_schedule_eh is reserved for cases where there are no failed
commands.  This guarantees that host_eh_scheduled is only incremented
when an explicit eh request is made.
Reviewed-by: Jacek Danecki <jacek.danecki@intel.com>
Signed-off-by: Maciej Trela <maciej.trela@intel.com>
[fixed spurious delete of sas_ata_task_abort]
Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libata, libsas: introduce sched_eh and end_eh port ops · e4a9c373

Authored by Dan Williams on Jun 21, 2012

When managing shost->host_eh_scheduled libata assumes that there is a
1:1 shost-to-ata_port relationship.  libsas creates a 1:N relationship
so it needs to manage host_eh_scheduled cumulatively at the host level.
The sched_eh and end_eh port ops allow libsas to track when domain
devices enter/leave the "eh-pending" state under ha->lock (previously
named ha->state_lock, but it is no longer just a lock for ha->state
changes).

Since host_eh_scheduled indicates eh without backing commands pinning
the device it can be deallocated at any time.  Move the taking of the
domain_device reference under the port_lock to guarantee that the
ata_port stays around for the duration of eh.
Reviewed-by: Jacek Danecki <jacek.danecki@intel.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

01 Mar, 2012 7 commits

[SCSI] libsas: don't recover end devices attached to disabled phys · 26a2e68f

Authored by Dan Williams on Jan 30, 2012

If userspace has decided to disable a phy the kernel should honor that
and not inadvertently re-enable the phy via error recovery.  This is
more straightforward in the sata case where link recovery (via
libata-eh) is separate from sas_task cancelling in libsas-eh.  Teach
libsas to accept -ENODEV as a successful response from I_T_nexus_reset
('successful' in terms of not escalating further).

This is a more comprehensive fix than "libsas: don't recover 'gone'
devices in sas_ata_hard_reset()", as it is no longer sata-specific.

aic94xx does check the return value from sas_phy_reset() so if the phy
is disabled we proceed with clearing the I_T_nexus.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: fix lifetime of SAS_HA_FROZEN · 84023474

Authored by Dan Williams on Jan 20, 2012

Until all sas_tasks are known to no longer be in-flight this flag gates late
completions from colliding with error handling.  However, it must be cleared
prior to the submission of scsi_send_eh_cmnd() requests, otherwise those
commands will never be completed correctly.

This was spotted by slub debug:
 =============================================================================
 BUG sas_task: Objects remaining on kmem_cache_close()
 -----------------------------------------------------------------------------

 INFO: Slab 0xffffea001f0eba00 objects=34 used=1 fp=0xffff8807c3aecb00 flags=0x8000000000004080
 Pid: 22919, comm: modprobe Not tainted 3.2.0-isci+ #2
 Call Trace:
  [<ffffffff810fcdcd>] slab_err+0xb0/0xd2
  [<ffffffff810e1c50>] ? free_percpu+0x31/0x117
  [<ffffffff81100122>] ? kzalloc+0x14/0x16
  [<ffffffff81100122>] ? kzalloc+0x14/0x16
  [<ffffffff81100486>] kmem_cache_destroy+0x11d/0x270
  [<ffffffffa0112bdc>] sas_class_exit+0x10/0x12 [libsas]
  [<ffffffff81078fba>] sys_delete_module+0x1c4/0x23c
  [<ffffffff814797ba>] ? sysret_check+0x2e/0x69
  [<ffffffff8126479e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
  [<ffffffff81479782>] system_call_fastpath+0x16/0x1b
 INFO: Object 0xffff8807c3aed280 @offset=21120
 INFO: Allocated in sas_alloc_task+0x22/0x90 [libsas] age=4615311 cpu=2 pid=12966
  __slab_alloc.clone.3+0x1d1/0x234
  kmem_cache_alloc+0x52/0x10d
  sas_alloc_task+0x22/0x90 [libsas]
  sas_queuecommand+0x20e/0x230 [libsas]
  scsi_send_eh_cmnd+0xd1/0x30c
  scsi_eh_try_stu+0x4f/0x6b
  scsi_eh_ready_devs+0xba/0x6ef
  sas_scsi_recover_host+0xa35/0xab1 [libsas]
  scsi_error_handler+0x14b/0x5fa
  kthread+0x9d/0xa5
  kernel_thread_helper+0x4/0x10
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: async ata scanning · 9508a66f

Authored by Dan Williams on Jan 18, 2012

libsas ata error handling is already async but this does not help the
scan case.  Move initial link recovery out from under host->scan_mutex,
and delay synchronization with eh until after all port probe/recovery
work has been queued.

Device ordering is maintained with scan order by still calling
sas_rphy_add() in order of domain discovery.

Since we now scan the domain list when invoking libata-eh we need to be
careful to check for fully initialized ata ports.
Acked-by: Jack Wang <jack_wang@usish.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: fix mixed topology recovery · d230ce69

Authored by Dan Williams on Jan 11, 2012

If we have a domain with sas and sata devices there may still be sas
recovery actions to take after peeling off the commands to send to
libata.
Reported-by: Andrzej Jakowski <andrzej.jakowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: close scsi_remove_target() vs libata-eh race · 8abda4d2

Authored by Dan Williams on Jan 10, 2012

ata_port lifetime in libata follows the host.  In libsas it follows the
scsi_target.  Once scsi_remove_device() has caused all commands to be
completed it allows scsi_remove_target() to immediately proceed to
freeing the ata_port causing bug reports like:

[  848.393333] BUG: spinlock bad magic on CPU#4, kworker/u:2/5107
[  848.400262] general protection fault: 0000 [#1] SMP
[  848.406244] CPU 4
[  848.408310] Modules linked in: nls_utf8 ipv6 uinput i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma dca sg sd_mod sr_mod cdrom ahci libahci isci libsas libata scsi_transport_sas [last unloaded: scsi_wait_scan]
[  848.432060]
[  848.434137] Pid: 5107, comm: kworker/u:2 Not tainted 3.2.0-isci+ #8 Intel Corporation S2600CP/S2600CP
[  848.445310] RIP: 0010:[<ffffffff8126a68c>]  [<ffffffff8126a68c>] spin_dump+0x5e/0x8c
[  848.454787] RSP: 0018:ffff8807f868dca0  EFLAGS: 00010002
[  848.461137] RAX: 0000000000000048 RBX: ffff8807fe86a630 RCX: ffffffff817d0be0
[  848.469520] RDX: 0000000000000000 RSI: ffffffff814af1cf RDI: 0000000000000002
[  848.477959] RBP: ffff8807f868dcb0 R08: 00000000ffffffff R09: 000000006b6b6b6b
[  848.486327] R10: 000000000003fb8c R11: ffffffff81a19448 R12: 6b6b6b6b6b6b6b6b
[  848.494699] R13: ffff8808027dc520 R14: 0000000000000000 R15: 000000000000001e
[  848.503067] FS:  0000000000000000(0000) GS:ffff88083fd00000(0000) knlGS:0000000000000000
[  848.512899] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  848.519710] CR2: 00007ff77d001000 CR3: 00000007f7a5d000 CR4: 00000000000406e0
[  848.528072] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  848.536446] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  848.544831] Process kworker/u:2 (pid: 5107, threadinfo ffff8807f868c000, task ffff8807ff348000)
[  848.555327] Stack:
[  848.557959]  ffff8807fe86a630 ffff8807fe86a630 ffff8807f868dcd0 ffffffff8126a6e0
[  848.567072]  ffffffff817c142f ffff8807fe86a630 ffff8807f868dcf0 ffffffff8126a703
[  848.576190]  ffff8808027dc520 0000000000000286 ffff8807f868dd10 ffffffff814af1bb
[  848.585281] Call Trace:
[  848.588409]  [<ffffffff8126a6e0>] spin_bug+0x26/0x28
[  848.594357]  [<ffffffff8126a703>] do_raw_spin_unlock+0x21/0x88
[  848.601283]  [<ffffffff814af1bb>] _raw_spin_unlock_irqrestore+0x2c/0x65
[  848.609089]  [<ffffffffa001c103>] ata_scsi_port_error_handler+0x548/0x557 [libata]
[  848.618331]  [<ffffffff81061813>] ? async_schedule+0x17/0x17
[  848.625060]  [<ffffffffa004f30f>] async_sas_ata_eh+0x45/0x69 [libsas]
[  848.632655]  [<ffffffff810618aa>] async_run_entry_fn+0x97/0x125
[  848.639670]  [<ffffffff81057439>] process_one_work+0x207/0x38d
[  848.646577]  [<ffffffff8105738c>] ? process_one_work+0x15a/0x38d
[  848.653681]  [<ffffffff810576f7>] worker_thread+0x138/0x21c
[  848.660305]  [<ffffffff810575bf>] ? process_one_work+0x38d/0x38d
[  848.667493]  [<ffffffff8105b098>] kthread+0x9d/0xa5
[  848.673382]  [<ffffffff8106e1bd>] ? trace_hardirqs_on_caller+0x12f/0x166
[  848.681304]  [<ffffffff814b7704>] kernel_thread_helper+0x4/0x10
[  848.688324]  [<ffffffff814af534>] ? retint_restore_args+0x13/0x13
[  848.695530]  [<ffffffff8105affb>] ? __init_kthread_worker+0x5b/0x5b
[  848.702929]  [<ffffffff814b7700>] ? gs_change+0x13/0x13
[  848.709155] Code: 00 00 48 8d 88 38 04 00 00 44 8b 80 84 02 00 00 31 c0 e8 cf 1b 24 00 41 83 c8 ff 44 8b 4b 08 48 c7 c1 e0 0b 7d 81 4d 85 e4 74 10 <45> 8b 84 24 84 02 00 00 49 8d 8c 24 38 04 00 00 8b 53 04 48 89
[  848.732467] RIP  [<ffffffff8126a68c>] spin_dump+0x5e/0x8c
[  848.738905]  RSP <ffff8807f868dca0>
[  848.743743] ---[ end trace 143161646eee8caa ]---

...so arrange for the ata_port to have the same end of life as the domain
device.
Reported-by: Marcin Tomczak <marcin.tomczak@intel.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: pre-clean commands that won the eh vs completion race · 45c73b65

Authored by Dan Williams on Jan 09, 2012

When scrolling forward through the eh list (in a clear_q scenario) it is
possible to encounter commands that won the completion vs eh race. Rather
than sprinkle more "if (!task)" throughout the handler just make a pass
through the list and delete the race winners before handling the rest.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
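
Note: the pre-clean pass this commit describes can be sketched on a plain singly linked list. Commands whose task pointer is already NULL have completed normally (they "won the race"), so they are unlinked up front and the rest of the handler never has to test for a missing task. The demo_cmd type and demo_preclean() are hypothetical; the kernel code works on its own list types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical eh-list entry; task == NULL means it already completed. */
struct demo_cmd {
	void *task;
	struct demo_cmd *next;
};

/*
 * Unlink every command that has no task attached and return how many
 * were removed. Uses a pointer-to-pointer walk so the head and interior
 * nodes are handled by the same code path.
 */
static int demo_preclean(struct demo_cmd **head)
{
	int removed = 0;
	struct demo_cmd **pp = head;

	assert(head != NULL);
	while (*pp != NULL) {
		if ((*pp)->task == NULL) {
			*pp = (*pp)->next;	/* drop the race winner */
			removed++;
		} else {
			pp = &(*pp)->next;
		}
	}
	return removed;
}
```

Doing one filtering pass first keeps the main recovery loop free of scattered "if (!task)" checks, which is the trade-off the commit message argues for.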

[SCSI] libsas: fix sas_find_local_phy(), take phy references · f41a0c44

Authored by Dan Williams on Dec 21, 2011

In the direct-attached case this routine returns the phy on which this
device was first discovered. Which is broken if we want to support
wide-targets, as this phy reference can become stale even though the
port is still active.

In the expander-attached case this routine tries to lookup the phy by
scanning the attached sas addresses of the parent expander, and BUG_ONs
if it can't find it. However since eh and the libsas workqueue run
independently we can still be attempting device recovery via eh after
libsas has recorded the device as detached. This is even easier to hit
now that eh is blocked while device domain rediscovery takes place, and
that libata is fed more timed out commands increasing the chances that
it will try to recover the ata device.

Arrange for dev->phy to always point to a last known good phy, it may be
stale after the port is torn down, but it will catch up for wide port
reconfigurations, and never be NULL.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

20 Feb, 2012 5 commits

[SCSI] libsas: sas_phy_enable via transport_sas_phy_reset · 2a559f4b

Authored by Dan Williams on Dec 04, 2011

Execute the link-reset triggered by sas_phy_enable via
transport_sas_phy_reset so that it can be managed by libata.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: defer SAS_TASK_NEED_DEV_RESET commands to libata · 3a2cdf39

Authored by Dan Williams on Nov 29, 2011

lldds use the SAS_TASK_NEED_DEV_RESET interface to request that eh
perform a reset.  In the sata device case defer the commands that
triggered the reset to libata-eh context so it can perform its pre and
post reset management.

In the sas_ata_post_internal() case the reset request is falling on deaf
ears as the sas_task is immediately destroyed without any reset action.
Since it is currently a nop, and likely superfluous given the conversion
to new-style libata-eh, just drop the request.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: let libata handle command timeouts · 3944f509

Authored by Dan Williams on Nov 29, 2011

libsas-eh if it successfully aborts an ata command will hide the timeout
condition (AC_ERR_TIMEOUT) from libata.  The command likely completes
with the all-zero task->task_status it started with.  Instead, interpret
a TMF_RESP_FUNC_COMPLETE as the end of the sas_task but keep the scmd
around for libata-eh to handle.
Tested-by: Andrzej Jakowski <andrzej.jakowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: fix timeout vs completion race · 9095a64a

Authored by Dan Williams on Nov 28, 2011

Until we have told the lldd to forget a task a timed out operation can
return from the hardware at any time.  Since completion frees the task
we need to make sure that no tasks run their normal completion handler
once eh has decided to manage the task.  Similar to
ata_scsi_cmd_error_handler() freeze completions to let eh judge the
outcome of the race.

Task collector mode is problematic because it presents a situation where
a task can be timed out and aborted before the lldd has even seen it.
For this case we need to guarantee that a task that an lldd has been
told to forget does not get queued after the lldd says "never seen it".
With sas_scsi_timed_out we achieve this with the ->task_queue_flush
mutex, rather than adding more time.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: prevent double completion of scmds from eh · a3a14252

Authored by Dan Williams on Dec 06, 2011

We invoke task->task_done() to free the task in the eh case, but at this
point we are prepared for scsi_eh_flush_done_q() to finish off the scmd.

Introduce sas_end_task() to capture the final response status from the
lldd and free the task.

Also take the opportunity to kill this warning.
drivers/scsi/libsas/sas_scsi_host.c: In function ‘sas_end_task’:
drivers/scsi/libsas/sas_scsi_host.c:102:3: warning: case value ‘2’ not in enumerated type ‘enum exec_status’ [-Wswitch]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
