提交 · a39f7afde358ca89e9fc09a5525d3f8631a98a3a · openeuler / Kernel

19 11月, 2016 9 次提交

md/r5cache: write-out phase and reclaim support · a39f7afd

由 Song Liu 提交于 11月 17, 2016

There are two limited resources, stripe cache and journal disk space.
For better performance, we priotize reclaim of full stripe writes.
To free up more journal space, we free earliest data on the journal.

In current implementation, reclaim happens when:
1. Periodically (every R5C_RECLAIM_WAKEUP_INTERVAL, 30 seconds) reclaim
   if there is no reclaim in the past 5 seconds.
2. when there are R5C_FULL_STRIPE_FLUSH_BATCH (256) cached full stripes,
   or cached stripes is enough for a full stripe (chunk size / 4k)
   (r5c_check_cached_full_stripe)
3. when there is pressure on stripe cache (r5c_check_stripe_cache_usage)
4. when there is pressure on journal space (r5l_write_stripe, r5c_cache_data)

r5c_do_reclaim() contains new logic of reclaim.

For stripe cache:

When stripe cache pressure is high (more than 3/4 stripes are cached,
or there is empty inactive lists), flush all full stripe. If fewer
than R5C_RECLAIM_STRIPE_GROUP (NR_STRIPE_HASH_LOCKS * 2) full stripes
are flushed, flush some paritial stripes. When stripe cache pressure
is moderate (1/2 to 3/4 of stripes are cached), flush all full stripes.

For log space:

To avoid deadlock due to log space, we need to reserve enough space
to flush cached data. The size of required log space depends on total
number of cached stripes (stripe_in_journal_count). In current
implementation, the writing-out phase automatically include pending
data writes with parity writes (similar to write through case).
Therefore, we need up to (conf->raid_disks + 1) pages for each cached
stripe (1 page for meta data, raid_disks pages for all data and
parity). r5c_log_required_to_flush_cache() calculates log space
required to flush cache. In the following, we refer to the space
calculated by r5c_log_required_to_flush_cache() as
reclaim_required_space.

Two flags are added to r5conf->cache_state: R5C_LOG_TIGHT and
R5C_LOG_CRITICAL. R5C_LOG_TIGHT is set when free space on the log
device is less than 3x of reclaim_required_space. R5C_LOG_CRITICAL
is set when free space on the log device is less than 2x of
reclaim_required_space.

r5c_cache keeps all data in cache (not fully committed to RAID) in
a list (stripe_in_journal_list). These stripes are in the order of their
first appearance on the journal. So the log tail (last_checkpoint)
should point to the journal_start of the first item in the list.

When R5C_LOG_TIGHT is set, r5l_reclaim_thread starts flushing out
stripes at the head of stripe_in_journal. When R5C_LOG_CRITICAL is
set, the state machine only writes data that are already in the
log device (in stripe_in_journal_list).

This patch includes a fix to improve performance by
Shaohua Li <shli@fb.com>.
Signed-off-by: NSong Liu <songliubraving@fb.com>
Signed-off-by: NShaohua Li <shli@fb.com>

a39f7afd

md/r5cache: caching phase of r5cache · 1e6d690b

由 Song Liu 提交于 11月 17, 2016

As described in previous patch, write back cache operates in two
phases: caching and writing-out. The caching phase works as:
1. write data to journal
   (r5c_handle_stripe_dirtying, r5c_cache_data)
2. call bio_endio
   (r5c_handle_data_cached, r5c_return_dev_pending_writes).

Then the writing-out phase is as:
1. Mark the stripe as write-out (r5c_make_stripe_write_out)
2. Calcualte parity (reconstruct or RMW)
3. Write parity (and maybe some other data) to journal device
4. Write data and parity to RAID disks

This patch implements caching phase. The cache is integrated with
stripe cache of raid456. It leverages code of r5l_log to write
data to journal device.

Writing-out phase of the cache is implemented in the next patch.

With r5cache, write operation does not wait for parity calculation
and write out, so the write latency is lower (1 write to journal
device vs. read and then write to raid disks). Also, r5cache will
reduce RAID overhead (multipile IO due to read-modify-write of
parity) and provide more opportunities of full stripe writes.

This patch adds 2 flags to stripe_head.state:
 - STRIPE_R5C_PARTIAL_STRIPE,
 - STRIPE_R5C_FULL_STRIPE,

Instead of inactive_list, stripes with cached data are tracked in
r5conf->r5c_full_stripe_list and r5conf->r5c_partial_stripe_list.
STRIPE_R5C_FULL_STRIPE and STRIPE_R5C_PARTIAL_STRIPE are flags for
stripes in these lists. Note: stripes in r5c_full/partial_stripe_list
are not considered as "active".

For RMW, the code allocates an extra page for each data block
being updated.  This is stored in r5dev->orig_page and the old data
is read into it.  Then the prexor calculation subtracts ->orig_page
from the parity block, and the reconstruct calculation adds the
->page data back into the parity block.

r5cache naturally excludes SkipCopy. When the array has write back
cache, async_copy_data() will not skip copy.

There are some known limitations of the cache implementation:

1. Write cache only covers full page writes (R5_OVERWRITE). Writes
   of smaller granularity are write through.
2. Only one log io (sh->log_io) for each stripe at anytime. Later
   writes for the same stripe have to wait. This can be improved by
   moving log_io to r5dev.
3. With writeback cache, read path must enter state machine, which
   is a significant bottleneck for some workloads.
4. There is no per stripe checkpoint (with r5l_payload_flush) in
   the log, so recovery code has to replay more than necessary data
   (sometimes all the log from last_checkpoint). This reduces
   availability of the array.

This patch includes a fix proposed by ZhengYuan Liu
<liuzhengyuan@kylinos.cn>
Signed-off-by: NSong Liu <songliubraving@fb.com>
Signed-off-by: NShaohua Li <shli@fb.com>

1e6d690b

md/r5cache: State machine for raid5-cache write back mode · 2ded3703

由 Song Liu 提交于 11月 17, 2016

This patch adds state machine for raid5-cache. With log device, the
raid456 array could operate in two different modes (r5c_journal_mode):
  - write-back (R5C_MODE_WRITE_BACK)
  - write-through (R5C_MODE_WRITE_THROUGH)

Existing code of raid5-cache only has write-through mode. For write-back
cache, it is necessary to extend the state machine.

With write-back cache, every stripe could operate in two different
phases:
  - caching
  - writing-out

In caching phase, the stripe handles writes as:
  - write to journal
  - return IO

In writing-out phase, the stripe behaviors as a stripe in write through
mode R5C_MODE_WRITE_THROUGH.

STRIPE_R5C_CACHING is added to sh->state to differentiate caching and
writing-out phase.

Please note: this is a "no-op" patch for raid5-cache write-through
mode.

The following detailed explanation is copied from the raid5-cache.c:

/*
 * raid5 cache state machine
 *
 * With rhe RAID cache, each stripe works in two phases:
 *      - caching phase
 *      - writing-out phase
 *
 * These two phases are controlled by bit STRIPE_R5C_CACHING:
 *   if STRIPE_R5C_CACHING == 0, the stripe is in writing-out phase
 *   if STRIPE_R5C_CACHING == 1, the stripe is in caching phase
 *
 * When there is no journal, or the journal is in write-through mode,
 * the stripe is always in writing-out phase.
 *
 * For write-back journal, the stripe is sent to caching phase on write
 * (r5c_handle_stripe_dirtying). r5c_make_stripe_write_out() kicks off
 * the write-out phase by clearing STRIPE_R5C_CACHING.
 *
 * Stripes in caching phase do not write the raid disks. Instead, all
 * writes are committed from the log device. Therefore, a stripe in
 * caching phase handles writes as:
 *      - write to log device
 *      - return IO
 *
 * Stripes in writing-out phase handle writes as:
 *      - calculate parity
 *      - write pending data and parity to journal
 *      - write data and parity to raid disks
 *      - return IO for pending writes
 */
Signed-off-by: NSong Liu <songliubraving@fb.com>
Signed-off-by: NShaohua Li <shli@fb.com>

2ded3703

md/r5cache: move some code to raid5.h · 937621c3

由 Song Liu 提交于 11月 17, 2016

Move some define and inline functions to raid5.h, so they can be
used in raid5-cache.c
Signed-off-by: NSong Liu <songliubraving@fb.com>
Signed-off-by: NShaohua Li <shli@fb.com>

937621c3

md/r5cache: Check array size in r5l_init_log · c757ec95

由 Song Liu 提交于 11月 17, 2016

Currently, r5l_write_stripe checks meta size for each stripe write,
which is not necessary.

With this patch, r5l_init_log checks maximal meta size of the array,
which is (r5l_meta_block + raid_disks x r5l_payload_data_parity).
If this is too big to fit in one page, r5l_init_log aborts.

With current meta data, r5l_log support raid_disks up to 203.
Signed-off-by: NSong Liu <songliubraving@fb.com>
Signed-off-by: NShaohua Li <shli@fb.com>

c757ec95

md: add blktrace event for writes to superblock · 504634f6

由 Shaohua Li 提交于 11月 18, 2016

superblock write is an expensive operation. With raid5-cache, it can be called
regularly. Tracing to help performance debug.
Signed-off-by: NShaohua Li <shli@fb.com>
Cc: NeilBrown <neilb@suse.com>

504634f6

md/raid1, raid10: add blktrace records when IO is delayed · 578b54ad

由 NeilBrown 提交于 11月 14, 2016

Both raid1 and raid10 will sometimes delay handling an IO request,
such as when resync is happening or there are too many requests queued.

Add some blktrace messsages so we can see when that is happening when
looking for performance artefacts.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

578b54ad

md/bitmap: add blktrace event for writes to the bitmap · 581dbd94

由 NeilBrown 提交于 11月 14, 2016

We trace wheneven bitmap_unplug() finds that it needs to write
to the bitmap, or when bitmap_daemon_work() find there is work
to do.

This makes it easier to correlate bitmap updates with data writes.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

581dbd94

md: add block tracing for bio_remapping · 109e3765

由 NeilBrown 提交于 11月 18, 2016

The block tracing infrastructure (accessed with blktrace/blkparse)
supports the tracing of mapping bios from one device to another.
This is currently used when a bio in a partition is mapped to the
whole device, when bios are mapped by dm, and for mapping in md/raid5.
Other md personalities do not include this tracing yet, so add it.

When a read-error is detected we redirect the request to a different device.
This could justifiably be seen as a new mapping for the originial bio,
or a secondary mapping for the bio that errors.  This patch uses
the second option.

When md is used under dm-raid, the mappings are not traced as we do
not have access to the block device number of the parent.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

109e3765

18 11月, 2016 1 次提交

raid5-cache: fix lockdep warning · 354b445b

由 Shaohua Li 提交于 11月 16, 2016

lockdep reports warning of the rcu_dereference usage. Using normal rdev
access pattern to avoid the warning.
Signed-off-by: NShaohua Li <shli@fb.com>

354b445b

10 11月, 2016 3 次提交

md: remove md_super_wait() call after bitmap_flush() · 6119e679

由 NeilBrown 提交于 11月 09, 2016

bitmap_flush() finishes with bitmap_update_sb(), and that finishes
with write_page(..., 1), so write_page() will wait for all writes
to complete.  So there is no point calling md_super_wait()
immediately afterwards.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

6119e679

md: define mddev flags, recovery flags and r1bio state bits using enums · be306c29

由 NeilBrown 提交于 11月 09, 2016

This is less error prone than using individual #defines.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

be306c29

md/raid1: fix: IO can block resync indefinitely · f2c771a6

由 NeilBrown 提交于 11月 09, 2016

While performing a resync/recovery, raid1 divides the
array space into three regions:
 - before the resync
 - at or shortly after the resync point
 - much further ahead of the resync point.

Write requests to the first or third do not need to wait.  Write
requests to the middle region do need to wait if resync requests are
pending.

If there are any active write requests in the middle region, resync
will wait for them.

Due to an accounting error, there is a small range of addresses,
between conf->next_resync and conf->start_next_window, where write
requests will *not* be blocked, but *will* be counted in the middle
region.  This can effectively block resync indefinitely if filesystem
writes happen repeatedly to this region.

As ->next_window_requests is incremented when the sector is after
  conf->start_next_window + NEXT_NORMALIO_DISTANCE
the same boundary should be used for determining when write requests
should wait.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

f2c771a6

08 11月, 2016 19 次提交

md/bitmap: Don't write bitmap while earlier writes might be in-flight · 85c9ccd4

由 NeilBrown 提交于 11月 04, 2016

As we don't wait for writes to complete in bitmap_daemon_work, they
could still be in-flight when bitmap_unplug writes again.  Or when
bitmap_daemon_work tries to write again.
This can be confusing and could risk the wrong data being written last.

So make sure we wait for old writes to complete before new writes start.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

85c9ccd4

md/raid10: abort delayed writes when device fails. · a9ae93c8

由 NeilBrown 提交于 11月 04, 2016

When writing to an array with a bitmap enabled, the writes are grouped
in batches which are preceded by an update to the bitmap.

It is quite likely if that a drive develops a problem which is not
media related, that the bitmap write will be the first to report an
error and cause the device to be marked faulty (as the bitmap write is
at the start of a batch).

In this case, there is point submiting the subsequent writes to the
failed device - that just wastes times.

So re-check the Faulty state of a device before submitting a
delayed write.

This requires that we keep the 'rdev', rather than the 'bdev' in the
bio, then swap in the bdev just before final submission.
Reported-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

a9ae93c8

md/raid1: abort delayed writes when device fails. · 5e2c7a36

由 NeilBrown 提交于 11月 04, 2016

When writing to an array with a bitmap enabled, the writes are grouped
in batches which are preceded by an update to the bitmap.

It is quite likely if that a drive develops a problem which is not
media related, that the bitmap write will be the first to report an
error and cause the device to be marked faulty (as the bitmap write is
at the start of a batch).

In this case, there is point submiting the subsequent writes to the
failed device - that just wastes times.

So re-check the Faulty state of a device before submitting a
delayed write.

This requires that we keep the 'rdev', rather than the 'bdev' in the
bio, then swap in the bdev just before final submission.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

5e2c7a36

md: perform async updates for metadata where possible. · 060b0689

由 NeilBrown 提交于 11月 04, 2016

When adding devices to, or removing device from, an array we need to
update the metadata.  However we don't need to do it synchronously as
data integrity doesn't depend on these changes being recorded
instantly.  So avoid the synchronous call to md_update_sb and just set
a flag so that the thread will do it.

This can reduce the number of updates performed when lots of devices
are being added or removed.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

060b0689

raid5-cache: restrict the use area of the log_offset variable · 3fd880af

由 JackieLiu 提交于 11月 02, 2016

We can calculate this offset by using ctx->meta_total_blocks,
without passing in from the function
Signed-off-by: NJackieLiu <liuyun01@kylinos.cn>
Signed-off-by: NShaohua Li <shli@fb.com>

3fd880af

N
md/raid5: change printk() to pr_*() · cc6167b4
由 NeilBrown 提交于 11月 02, 2016
```
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>
```
cc6167b4
N
md/raid10: change printk() to pr_*() · 08464e09
由 NeilBrown 提交于 11月 02, 2016
```
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>
```
08464e09
N
md/raid1: change printk() to pr_*() · 1d41c216
由 NeilBrown 提交于 11月 02, 2016
```
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>
```
1d41c216

md/raid0: replace printk() with pr_*() · 76603884

由 NeilBrown 提交于 11月 02, 2016

This makes md/raid0 much less verbose as the messages about
the array geometry are now pr_debug()
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

76603884

md/multipath: replace printk() with pr_*() · 7279694d

由 NeilBrown 提交于 11月 02, 2016

Also remove all messages about memory allocation failure.
page_alloc() reports those.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

7279694d

N
md/linear: replace printk() with pr_*() · a2e202af
由 NeilBrown 提交于 11月 02, 2016
```
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>
```
a2e202af

md/bitmap: change all printk() to pr_*() · ec0cc226

由 NeilBrown 提交于 11月 02, 2016

Follow err/warn distinction introduced in md.c
Join multi-part strings into single string.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

ec0cc226

md: change all printk() to pr_err() or pr_warn() etc. · 9d48739e

由 NeilBrown 提交于 11月 02, 2016

1/ using pr_debug() for a number of messages reduces the noise of
   md, but still allows them to be enabled when needed.
2/ try to be consistent in the usage of pr_err() and pr_warn(), and
   document the intention
3/ When strings have been split onto multiple lines, rejoin into
   a single string.
   The cost of having lines > 80 chars is less than the cost of not
   being able to easily search for a particular message.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

9d48739e

md: fix some issues with alloc_disk_sb() · 7f0f0d87

由 NeilBrown 提交于 11月 02, 2016

1/ don't print a warning if allocation fails.
 page_alloc() does that already.
2/ always check return status for error.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

7f0f0d87

md/bitmap: call bitmap_file_unmap once bitmap_storage_alloc returns -ENOMEM · cbb38732

由 Guoqing Jiang 提交于 10月 31, 2016

It is possible that bitmap_storage_alloc could return -ENOMEM,
and some member inside store could be allocated such as filemap.

To avoid memory leak, we need to call bitmap_file_unmap to free
those members in the bitmap_resize.
Reviewed-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NGuoqing Jiang <gqjiang@suse.com>
Signed-off-by: NShaohua Li <shli@fb.com>

cbb38732

raid5: revert commit · 7adb072c

由 Tomasz Majchrzak 提交于 10月 26, 2016

Revert commit 11367799 ("md: Prevent IO hold during accessing to faulty
raid5 array") as it doesn't comply with commit c3cce6cd ("md/raid5:
ensure device failure recorded before write request returns."). That change
is not required anymore as the problem is resolved by commit 16f88949
("md: report 'write_pending' state when array in sync") - read request is
stuck as array state is not reported correctly via sysfs attribute.
Signed-off-by: NTomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: NShaohua Li <shli@fb.com>

7adb072c

md: wake up personality thread after array state update · 91a6c4ad

由 Tomasz Majchrzak 提交于 10月 25, 2016

When raid1/raid10 array fails to write to one of the drives, the request
is added to bio_end_io_list and finished by personality thread. The
thread doesn't handle it as long as MD_CHANGE_PENDING flag is set. In
case of external metadata this flag is cleared, however the thread is
not woken up. It causes request to be blocked for few seconds (until
another action on the array wakes up the thread) or to get stuck
indefinitely.

Wake up personality thread once MD_CHANGE_PENDING has been cleared.
Moving 'restart_array' call after the flag is cleared it not a solution
because in read-write mode the call doesn't wake up the thread.
Signed-off-by: NTomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: NShaohua Li <shli@fb.com>

91a6c4ad

md: don't fail an array if there are unacknowledged bad blocks · dcbcb486

由 Tomasz Majchrzak 提交于 10月 21, 2016

If external metadata handler supports bad blocks and unacknowledged bad
blocks are present, don't report disk via sysfs as faulty. Such
situation can be still handled so disk just has to be blocked for a
moment. It makes it consistent with kernel state as corresponding rdev
flag is also not set.

When the disk in being unblocked there are few cases:
1. Disk has been in blocked and faulty state, it is being unblocked but
it still remains in faulty state. Metadata handler will remove it from
array in the next call.
2. There is no bad block support in external metadata handler and bad
blocks are present - put the disk in blocked and faulty state (see
case 1).
3. There is bad block support in external metadata handler and all bad
blocks are acknowledged - clear all flags, continue.
4. There is bad block support in external metadata handler but there are
still unacknowledged bad blocks - clear all flags, continue. It is fine
to clear Blocked flag because it was probably not set anyway (if it was
it is case 1). BlockedBadBlocks flag can also be cleared because the
request waiting for it will set it again when it finds out that some bad
block is still not acknowledged. Recovery is not necessary but there are
no problems if the flag is set. Sysfs rdev state is still reported as
blocked (due to unacknowledged bad blocks) so metadata handler will
process remaining bad blocks and unblock disk again.
Signed-off-by: NTomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: NShaohua Li <shli@fb.com>

dcbcb486

md: add bad block support for external metadata · 35b785f7

由 Tomasz Majchrzak 提交于 10月 21, 2016

Add new rdev flag which external metadata handler can use to switch
on/off bad block support. If new bad block is encountered, notify it via
rdev 'unacknowledged_bad_blocks' sysfs file. If bad block has been
cleared, notify update to rdev 'bad_blocks' sysfs file.

When bad blocks support is being removed, just clear rdev flag. It is
not necessary to reset badblocks->shift field. If there are bad blocks
cleared or added at the same time, it is ok for those changes to be
applied to the structure. The array is in blocked state and the drive
which cannot handle bad blocks any more will be removed from the array
before it is unlocked.

Simplify state_show function by adding a separator at the end of each
string and overwrite last separator with new line.
Signed-off-by: NTomasz Majchrzak <tomasz.majchrzak@intel.com>
Reviewed-by: NArtur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NShaohua Li <shli@fb.com>

35b785f7

05 11月, 2016 7 次提交

HID: sensor: fix attributes in HID sensor interface · 4c4480aa

由 Ooi, Joyce 提交于 11月 03, 2016

User is unable to access to input-X-yyy and feature-X-yyy where
X is a hex value and more than 9 (e.g. input-a-yyy, feature-b-yyy) in HID
sensor custom sysfs interface.
This is because when creating the attribute, the attribute index is
written to using %x (hex). However, when reading and writing values into
the attribute, the attribute index is scanned using %d (decimal). Hence,
user is unable to access to attributes with index in hex values
(e.g. 'a', 'b', 'c') but able to access to attributes with index in
decimal values (e.g. 1, 2, 3,..).
This fix will change input-%d-%x-%s and feature-%d-%x-%s to input-%x-%x-%s
and feature-%x-%x-%s in show_values() and store_values() accordingly.
Signed-off-by: NOoi, Joyce <joyce.ooi@intel.com>
Reviewed-by: NBenjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

4c4480aa

HID: intel-ish-hid: request_irq failure · 021afd55

由 Srinivas Pandruvada 提交于 10月 21, 2016

On some platforms ISH interrupt is shared, which causes request_irq to
fail. This requires IRQF_SHARED irq flag.

But IRQF_NO_SUSPEND and IRQF_SHARED should not be used together, so
removed IRQF_NO_SUSPEND flag. Anyway this driver doesn't require
IRQF_NO_SUSPEND, as this interrupt is not required during "noirq" phases
of suspending and resuming devices as well as during the time when
nonboot CPUs are taken offline and brought back online.
Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

021afd55

HID: intel-ish-hid: Fix driver reinit failure · 2a1e3b93

由 Even Xu 提交于 10月 21, 2016

When built as a module, modprobe followed by rmmod can fail because
DMA was still active. So to fix this, DMA needs to be disabled during
module exit.

This change disables DMA during modules exit and change the ISH PCI
device status to D3.
Signed-off-by: NEven Xu <even.xu@intel.com>
Acked-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

2a1e3b93

HID: intel-ish-hid: Move DMA disable code to new function · 8b2979fe

由 Even Xu 提交于 10月 21, 2016

Add a new function ish_disable_dma() and move DMA disable operations
here, so that this functionality can be reused.
Signed-off-by: NEven Xu <even.xu@intel.com>
Acked-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

8b2979fe

HID: intel-ish-hid: consolidate ish wake up operation · c2ed83f5

由 Even Xu 提交于 10月 21, 2016

Same operations are done in ish_hw_start() and _ish_hw_reset() to
wakeup ISH device. Consolidate them by introducing a new function
ish_wakeup() and move the code there.
Signed-off-by: NEven Xu <even.xu@intel.com>
Acked-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

c2ed83f5

PCI: designware: Check for iATU unroll support after initializing host · 416379f9

由 Niklas Cassel 提交于 10月 14, 2016

dw_pcie_iatu_unroll_enabled() reads a dbi_base register.  Reading any
dbi_base register before pp->ops->host_init has been called causes
"imprecise external abort" on platforms like ARTPEC-6, where the PCIe
module is disabled at boot and first enabled in pp->ops->host_init.  Move
dw_pcie_iatu_unroll_enabled() to dw_pcie_setup_rc(), since it is after
pp->ops->host_init, but before pp->iatu_unroll_enabled is actually used.

Fixes: a0601a47 ("PCI: designware: Add iATU Unroll feature")
Tested-by: NJames Le Cuirot <chewi@gentoo.org>
Signed-off-by: NNiklas Cassel <niklas.cassel@axis.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Acked-by: NJoao Pinto <jpinto@synopsys.com>
Acked-by: NOlof Johansson <olof@lixom.net>

416379f9

i2c: core: fix NULL pointer dereference under race condition · 147b36d5

由 Vladimir Zapolskiy 提交于 10月 31, 2016

Race condition between registering an I2C device driver and
deregistering an I2C adapter device which is assumed to manage that
I2C device may lead to a NULL pointer dereference due to the
uninitialized list head of driver clients.

The root cause of the issue is that the I2C bus may know about the
registered device driver and thus it is matched by bus_for_each_drv(),
but the list of clients is not initialized and commonly it is NULL,
because I2C device drivers define struct i2c_driver as static and
clients field is expected to be initialized by I2C core:

  i2c_register_driver()             i2c_del_adapter()
    driver_register()                 ...
      bus_add_driver()                ...
        ...                           bus_for_each_drv(..., __process_removed_adapter)
      ...                               i2c_do_del_adapter()
    ...                                   list_for_each_entry_safe(..., &driver->clients, ...)
    INIT_LIST_HEAD(&driver->clients);

To solve the problem it is sufficient to do clients list head
initialization before calling driver_register().

The problem was found while using an I2C device driver with a sluggish
registration routine on a bus provided by a physically detachable I2C
master controller, but practically the oops may be reproduced under
the race between arbitraty I2C device driver registration and managing
I2C bus device removal e.g. by unbinding the latter over sysfs:

% echo 21a4000.i2c > /sys/bus/platform/drivers/imx-i2c/unbind
  Unable to handle kernel NULL pointer dereference at virtual address 00000000
  Internal error: Oops: 17 [#1] SMP ARM
  CPU: 2 PID: 533 Comm: sh Not tainted 4.9.0-rc3+ #61
  Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
  task: e5ada400 task.stack: e4936000
  PC is at i2c_do_del_adapter+0x20/0xcc
  LR is at __process_removed_adapter+0x14/0x1c
  Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
  Control: 10c5387d  Table: 35bd004a  DAC: 00000051
  Process sh (pid: 533, stack limit = 0xe4936210)
  Stack: (0xe4937d28 to 0xe4938000)
  Backtrace:
  [<c0667be0>] (i2c_do_del_adapter) from [<c0667cc0>] (__process_removed_adapter+0x14/0x1c)
  [<c0667cac>] (__process_removed_adapter) from [<c0516998>] (bus_for_each_drv+0x6c/0xa0)
  [<c051692c>] (bus_for_each_drv) from [<c06685ec>] (i2c_del_adapter+0xbc/0x284)
  [<c0668530>] (i2c_del_adapter) from [<bf0110ec>] (i2c_imx_remove+0x44/0x164 [i2c_imx])
  [<bf0110a8>] (i2c_imx_remove [i2c_imx]) from [<c051a838>] (platform_drv_remove+0x2c/0x44)
  [<c051a80c>] (platform_drv_remove) from [<c05183d8>] (__device_release_driver+0x90/0x12c)
  [<c0518348>] (__device_release_driver) from [<c051849c>] (device_release_driver+0x28/0x34)
  [<c0518474>] (device_release_driver) from [<c0517150>] (unbind_store+0x80/0x104)
  [<c05170d0>] (unbind_store) from [<c0516520>] (drv_attr_store+0x28/0x34)
  [<c05164f8>] (drv_attr_store) from [<c0298acc>] (sysfs_kf_write+0x50/0x54)
  [<c0298a7c>] (sysfs_kf_write) from [<c029801c>] (kernfs_fop_write+0x100/0x214)
  [<c0297f1c>] (kernfs_fop_write) from [<c0220130>] (__vfs_write+0x34/0x120)
  [<c02200fc>] (__vfs_write) from [<c0221088>] (vfs_write+0xa8/0x170)
  [<c0220fe0>] (vfs_write) from [<c0221e74>] (SyS_write+0x4c/0xa8)
  [<c0221e28>] (SyS_write) from [<c0108a20>] (ret_fast_syscall+0x0/0x1c)
Signed-off-by: NVladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Signed-off-by: NWolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org

147b36d5

04 11月, 2016 1 次提交

HID: usbhid: add ATEN CS962 to list of quirky devices · cf0ea4da

由 Oliver Neukum 提交于 11月 03, 2016

Like many similar devices it needs a quirk to work.
Issuing the request gets the device into an irrecoverable state.
Signed-off-by: NOliver Neukum <oneukum@suse.com>
CC: stable@vger.kernel.org
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

cf0ea4da

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功