1. 24 Oct, 2015 (3 commits)
    • md/raid10: fix the 'new' raid10 layout to work correctly. · 8bce6d35
      Committed by NeilBrown
      In Linux 3.9 we introduced a new 'far' layout for RAID10 which was
      supposed to rotate the replicas differently and so provide better
      resilience.  In particular it could survive more combinations of 2
      drive failures.
      
      Unfortunately, due to a coding error, this sometimes did what was
      wanted, sometimes improved resilience less than we hoped, and
      sometimes - in very unlikely circumstances - put multiple replicas
      on the same device, so the redundancy was harmed.
      
      No public user-space tool has created arrays using this layout so it
      is very unlikely that zero-redundancy arrays actually exist.  Probably
      no arrays using any form of the new layout exist.  But we cannot be
      certain.
      
      So use another bit in the 'layout' number and introduce a bug-fixed
      version of the layout.
      Also, when assembling an array, if it has a zero-redundancy layout,
      give a warning.
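      
      As a rough illustration of what "use another bit" means here (field
      positions in the sketch below are assumptions for illustration, not
      quoted from the patch), the layout word encodes the copy counts in
      its low bytes and selects the rotation scheme with the high bits:
      
        /*
         * Sketch only: decode of the raid10 'layout' word as described
         * above.  Field positions are assumed, not quoted from the patch.
         */
        static int raid10_rotation_scheme(int layout)
        {
                int near_copies = layout & 255;
                int far_copies  = (layout >> 8) & 255;
                int far_offset  = layout & (1 << 16);
        
                (void)near_copies; (void)far_copies; (void)far_offset;
        
                switch (layout >> 17) {
                case 0: return 0;   /* original 'far' rotation             */
                case 1: return 1;   /* buggy "improved" rotation: assemble,
                                       but warn of possible zero redundancy */
                case 2: return 2;   /* new bit: the bug-fixed rotation      */
                default: return -1; /* not a valid layout                   */
                }
        }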
      Reported-by: Heinz Mauelshagen <heinzm@redhat.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md/raid10: don't clear bitmap bit when bad-block-list write fails. · c340702c
      Committed by NeilBrown
      When a write fails and a bad-block-list is present, we can
      update the bad-block-list instead of writing the data.  If
      this succeeds then it is OK to clear the relevant bitmap-bit as
      no further 'sync' of the block is needed.
      
      However if writing the bad-block-list fails then we need to
      treat the write as failed and particularly must not clear
      the bitmap bit.  Otherwise the device can be re-added (after
      any hardware connection issues are resolved) and because the
      relevant bit in the bitmap is clear, that block will not be
      resynced.  This leads to data corruption.
      
      We already delay the final bio_endio() on the write until
      the bad-block-list is written so that when the write
      returns: either that data is safe, the bad-block record is
      safe, or the fact that the device is faulty is safe.
      However we *don't* delay the clearing of the bitmap, so the
      bitmap bit can be recorded as cleared before we know if the
      bad-block-list was written safely.
      
      So: delay that until the write really is safe.
      i.e. move the call to close_write() until just before
      calling bio_endio(), and recheck the 'is array degraded'
      status before making that call.
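      
      A rough sketch of the ordering being described (names are taken from
      the commit text, but the body is simplified and is not the actual
      diff):
      
        /*
         * Rough sketch, not the actual diff.  close_write() is what
         * clears the bitmap bit, so it must not run until the write
         * (or its bad-block record) is known to be safe.
         */
        static void raid_end_bio_io(struct r10bio *r10_bio)
        {
                struct bio *bio = r10_bio->master_bio;
        
                if (!test_bit(R10BIO_WriteError, &r10_bio->state))
                        close_write(r10_bio);  /* safe: clear bitmap bit */
                bio_endio(bio);
        }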
      
      This bug goes back to v3.1 when bad-block-lists were
      introduced, though it only affects arrays created with
      mdadm-3.3 or later as only those have bad-block lists.
      
      Backports will require at least
      Commit: 95af587e ("md/raid10: ensure device failure recorded before write request returns.")
      as well.  I'll send that to 'stable' separately.
      
      Note that of the two tests of R10BIO_WriteError that this
      patch adds, the first is certain to fail and the second is
      certain to succeed.  However doing it this way makes the
      patch more obviously correct.  I will tidy the code up in a
      future merge window.
      Reported-by: Nate Dailey <nate.dailey@stratus.com>
      Fixes: bd870a16 ("md/raid10: Handle write errors by updating badblock log.")
      Signed-off-by: NeilBrown <neilb@suse.com>
    • md/raid1: don't clear bitmap bit when bad-block-list write fails. · bd8688a1
      Committed by NeilBrown
      When a write fails and a bad-block-list is present, we can
      update the bad-block-list instead of writing the data.  If
      this succeeds then it is OK to clear the relevant bitmap-bit as
      no further 'sync' of the block is needed.
      
      However if writing the bad-block-list fails then we need to
      treat the write as failed and particularly must not clear
      the bitmap bit.  Otherwise the device can be re-added (after
      any hardware connection issues are resolved) and because the
      relevant bit in the bitmap is clear, that block will not be
      resynced.  This leads to data corruption.
      
      We already delay the final bio_endio() on the write until
      the bad-block-list is written so that when the write
      returns: either that data is safe, the bad-block record is
      safe, or the fact that the device is faulty is safe.
      However we *don't* delay the clearing of the bitmap, so the
      bitmap bit can be recorded as cleared before we know if the
      bad-block-list was written safely.
      
      So: delay that until the write really is safe.
      i.e. move the call to close_write() until just before
      calling bio_endio(), and recheck the 'is array degraded'
      status before making that call.
      
      This bug goes back to v3.1 when bad-block-lists were
      introduced, though it only affects arrays created with
      mdadm-3.3 or later as only those have bad-block lists.
      
      Backports will require at least
      Commit: 55ce74d4 ("md/raid1: ensure device failure recorded before write request returns.")
      as well.  I'll send that to 'stable' separately.
      
      Note that of the two tests of R1BIO_WriteError that this
      patch adds, the first is certain to fail and the second is
      certain to succeed.  However doing it this way makes the
      patch more obviously correct.  I will tidy the code up in a
      future merge window.
      Reported-and-tested-by: Nate Dailey <nate.dailey@stratus.com>
      Cc: Jes Sorensen <Jes.Sorensen@redhat.com>
      Fixes: cd5ff9a1 ("md/raid1: Handle write errors by updating badblock log.")
      Signed-off-by: NeilBrown <neilb@suse.com>
  2. 21 Oct, 2015 (2 commits)
  3. 18 Oct, 2015 (1 commit)
    • i2c: designware: Do not use parameters from ACPI on Dell Inspiron 7348 · 56d4b8a2
      Committed by Mika Westerberg
      ACPI SSCN/FMCN methods were originally added so that the platform can
      provide the most accurate HCNT/LCNT values to the driver. However, this
      turns out not to hold for the Dell Inspiron 7348, where using these
      values causes the touchpad to fail during boot:
      
        i2c_hid i2c-DLL0675:00: failed to retrieve report from device.
        i2c_designware INT3433:00: i2c_dw_handle_tx_abort: lost arbitration
        i2c_hid i2c-DLL0675:00: failed to retrieve report from device.
        i2c_designware INT3433:00: controller timed out
      
      The values received from ACPI are (in fast mode):
      
        HCNT: 72
        LCNT: 160
      
      This translates to the following timings (input clock is 100MHz on Broadwell):
      
        tHIGH: 720 ns (spec min 600 ns)
        tLOW: 1600 ns (spec min 1300 ns)
        Bus period: 2920 ns (assuming 300 ns tf and tr)
        Bus speed: 342.5 kHz
      
      Both tHIGH and tLOW are within the I2C specification.
      
      The calculated values when ACPI parameters are not used are (in fast mode):
      
        HCNT: 87
        LCNT: 159
      
      which translates to:
      
        tHIGH: 870 ns (spec min 600 ns)
        tLOW: 1590 ns (spec min 1300 ns)
        Bus period 3060 ns (assuming 300 ns tf and tr)
        Bus speed 326.8 kHz
      
      These values are also within the I2C specification.
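      
      The arithmetic is easy to check: at a 100 MHz input clock, one count
      is 10 ns, and the bus period adds the assumed 300 ns tf and tr.  A
      quick sketch (illustrative only; the real driver also applies
      internal offsets to HCNT/LCNT):
      
        /* Illustrative only; 10 ns per count at a 100 MHz input clock. */
        static unsigned int bus_speed_hz(unsigned int hcnt, unsigned int lcnt)
        {
                unsigned int t_high = hcnt * 10;                  /* ns */
                unsigned int t_low  = lcnt * 10;                  /* ns */
                unsigned int period = t_high + t_low + 300 + 300; /* +tf+tr */
        
                return 1000000000u / period;  /* 72/160 -> 342465 Hz,
                                                 87/159 -> 326797 Hz */
        }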
      
      Since both the ACPI and the calculated values meet the I2C specification
      timing requirements, it is hard to say why the touchpad does not function
      properly with the ACPI values, except that the bus speed is higher in
      that case (but still well below the 400 kHz maximum).
      
      Solve this by adding a DMI quirk to the driver that disables using ACPI
      parameters on this particular machine.
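      
      A minimal sketch of such a DMI quirk (the table name and match
      strings below are illustrative assumptions, not quoted from the
      patch):
      
        #include <linux/dmi.h>
        
        /* Illustrative quirk table; name and match strings assumed. */
        static const struct dmi_system_id i2c_dw_no_acpi_params[] = {
                {
                        .ident = "Dell Inspiron 7348",
                        .matches = {
                                DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
                                DMI_MATCH(DMI_PRODUCT_NAME, "Inspiron 7348"),
                        },
                },
                { }
        };
        
        /* At probe time: skip evaluating SSCN/FMCN when the quirk matches,
         * falling back to the driver's calculated HCNT/LCNT values. */
        static bool i2c_dw_should_use_acpi_params(void)
        {
                return !dmi_check_system(i2c_dw_no_acpi_params);
        }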
      Reported-by: Pavel Roskin <plroskin@gmail.com>
      Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Tested-by: Pavel Roskin <plroskin@gmail.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Cc: stable@kernel.org
  4. 17 Oct, 2015 (1 commit)
    • mm, fs: obey gfp_mapping for add_to_page_cache() · 063d99b4
      Committed by Michal Hocko
      Commit 6afdb859 ("mm: do not ignore mapping_gfp_mask in page cache
      allocation paths") has caught some users of hardcoded GFP_KERNEL used in
      the page cache allocation paths.  This, however, wasn't complete and
      there were others which went unnoticed.
      
      Dave Chinner has reported the following deadlock for xfs on loop device:
      : With the recent merge of the loop device changes, I'm now seeing
      : XFS deadlock on my single CPU, 1GB RAM VM running xfs/073.
      :
      : The deadlock is as follows:
      :
      : kloopd1: loop_queue_read_work
      :       xfs_file_iter_read
      :       lock XFS inode XFS_IOLOCK_SHARED (on image file)
      :       page cache read (GFP_KERNEL)
      :       radix tree alloc
      :       memory reclaim
      :       reclaim XFS inodes
      :       log force to unpin inodes
      :       <wait for log IO completion>
      :
      : xfs-cil/loop1: <does log force IO work>
      :       xlog_cil_push
      :       xlog_write
      :       <loop issuing log writes>
      :               xlog_state_get_iclog_space()
      :               <blocks due to all log buffers under write io>
      :               <waits for IO completion>
      :
      : kloopd1: loop_queue_write_work
      :       xfs_file_write_iter
      :       lock XFS inode XFS_IOLOCK_EXCL (on image file)
      :       <wait for inode to be unlocked>
      :
      : i.e. the kloopd, with its split read and write work queues, has
      : introduced a dependency through memory reclaim, i.e. writes
      : need to be able to progress for reads to make progress.
      :
      : The problem, fundamentally, is that mpage_readpages() does a
      : GFP_KERNEL allocation, rather than paying attention to the inode's
      : mapping gfp mask, which is set to GFP_NOFS.
      :
      : This didn't use to happen, because the loop device used to issue
      : reads through the splice path and that does:
      :
      :       error = add_to_page_cache_lru(page, mapping, index,
      :                       GFP_KERNEL & mapping_gfp_mask(mapping));
      
      This was changed by commit aa4d8616 ("block: loop: switch to VFS
      ITER_BVEC").
      
      This patch changes mpage_readpage{s} to follow the gfp mask set for
      the mapping.  There are, however, other places which are doing
      basically the same thing.
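      
      The pattern being applied is essentially the one quoted from the old
      splice path above; sketched here in a hypothetical helper (the
      function name is illustrative, not from the patch):
      
        /* Hypothetical helper sketching the pattern: honour the mapping's
         * gfp mask instead of hardcoding GFP_KERNEL for page cache pages. */
        static struct page *alloc_readahead_page(struct address_space *mapping,
                                                 pgoff_t index)
        {
                gfp_t gfp = GFP_KERNEL & mapping_gfp_mask(mapping);
                struct page *page = __page_cache_alloc(gfp);
        
                if (page && add_to_page_cache_lru(page, mapping, index, gfp)) {
                        page_cache_release(page);  /* insertion failed */
                        page = NULL;
                }
                return page;
        }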
      
      lustre:ll_dir_filler is doing GFP_KERNEL from a function which
      apparently uses GFP_NOFS for other allocations, so let's make this
      consistent.
      
      cifs:readpages_get_pages is called from cifs_readpages, and
      __cifs_readpages_from_fscache, called from the same path, already
      obeys the mapping gfp.
      
      ramfs_nommu_expand_for_mapping hardcodes GFP_KERNEL as well, even
      though it uses mapping_gfp_mask for the page allocation.
      
      ext4_mpage_readpages is called from the page cache allocation path,
      the same as read_pages and read_cache_pages.
      
      As I noted in my previous post, I cannot say I would be happy about
      sprinkling mapping_gfp_mask all over the place.  It sounds like we
      should drop the gfp_mask argument altogether and use it internally in
      __add_to_page_cache_locked, but that would require all the filesystems
      to use the mapping gfp consistently, which I am not sure is the case
      here.  From a quick glance it seems that some filesystems use it all
      the time while others are selective.
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Reported-by: Dave Chinner <david@fromorbit.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Ming Lei <ming.lei@canonical.com>
      Cc: Andreas Dilger <andreas.dilger@intel.com>
      Cc: Oleg Drokin <oleg.drokin@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 16 Oct, 2015 (5 commits)
  6. 15 Oct, 2015 (10 commits)
  7. 14 Oct, 2015 (5 commits)
  8. 13 Oct, 2015 (1 commit)
  9. 12 Oct, 2015 (5 commits)
  10. 10 Oct, 2015 (1 commit)
  11. 09 Oct, 2015 (6 commits)
    • iommu/amd: Fix NULL pointer deref on device detach · 5adad991
      Committed by Joerg Roedel
      When a device group is detached from its domain, the iommu
      core code calls into the iommu driver to detach each device
      individually.
      
      Before this functionality went into the iommu core code, it
      was implemented in the drivers, including in the AMD IOMMU
      driver as the device alias handling code.
      
      This code is still present, as there might be aliases that
      don't exist as real PCI devices (and are therefore invisible
      to the iommu core code).
      
      Unfortunately it can now happen that a device is unbound from
      its domain multiple times, first by the alias handling code
      and then by the iommu core code (or vice versa).
      
      This ends up in the do_detach function which dereferences
      the dev_data->domain pointer. When the device is already
      detached, this pointer is NULL and we get a kernel oops.
      
      Removing the alias code completely is not an option, as that
      would also remove the code which handles invisible aliases.
      The code could be simplified, but this is too big of a
      change outside the merge window.
      
      For now, just check the dev_data->domain pointer in
      do_detach and bail out if it is NULL.
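      
      The guard being described, sketched (the real function carries more
      context than shown here):
      
        /* Sketch of the minimal fix described above. */
        static void do_detach(struct iommu_dev_data *dev_data)
        {
                /* Already detached by the other path (alias vs. core)? */
                if (WARN_ON(!dev_data->domain))
                        return;
        
                /* ... actual unbinding of the device from its domain ... */
        }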
      Reported-by: Andreas Hartmann <andihartmann@freenet.de>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    • iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices · cbbc00be
      Committed by Jiang Liu
      The AMD IOMMU driver makes use of the IOMMU PCI devices, so prevent
      other PCI drivers from binding to them.
      
      This fixes a bug reported by Boris where system suspend/resume gets
      broken on AMD platforms. For more information, please refer to:
      	https://lkml.org/lkml/2015/9/26/89
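      
      One way to express "prevent binding", sketched below; treating the
      pci_dev match_driver flag as the mechanism is an assumption here,
      not a quote from the patch:
      
        #include <linux/pci.h>
        
        /* Sketch: mark the IOMMU's PCI device so the PCI core never
         * matches another driver against it (mechanism assumed). */
        static void amd_iommu_claim_pci_dev(struct pci_dev *pdev)
        {
                pdev->match_driver = false;
        }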
      
      Fixes: 991de2e5 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()")
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    • dm cache: fix NULL pointer when switching from cleaner policy · 2bffa150
      Committed by Joe Thornber
      The cleaner policy doesn't make use of the per cache block hint space in
      the metadata (unlike the other policies).  When switching from the
      cleaner policy to mq or smq a NULL pointer crash (in dm_tm_new_block)
      was observed.  The crash was caused by bugs in dm-cache-metadata.c
      when trying to skip creation of the hint btree.
      
      The minimal fix is to change the hint size for the cleaner policy to
      4 bytes (the only hint size supported).
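      
      Sketched against the policy registration structure (field names are
      assumed from the dm-cache policy API for illustration):
      
        /* Sketch of the minimal fix: advertise a 4-byte per-block hint
         * size instead of none.  Field names assumed for illustration. */
        static struct dm_cache_policy_type cleaner_policy_type = {
                .name      = "cleaner",
                .hint_size = 4,  /* the only size the metadata supports */
                /* .create = cleaner_create, ... */
        };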
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org
    • drm: Fix locking for sysfs dpms file · 621bd0f6
      Committed by Daniel Vetter
      With atomic drivers we need to make sure that (at least in general)
      property reads hold the right locks. But the legacy dpms property is
      special and can be read locklessly. Since userspace loves to just
      randomly look at it all the time (like it does with "status"), read
      it locklessly.
      
      To make it clear that we play tricks, use the READ_ONCE compiler
      barrier (and also out of paranoia).
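      
      The lockless read then looks roughly like this (the sysfs callback
      shape is assumed; only the READ_ONCE is the point):
      
        /* Sketch of the lockless legacy-dpms read; context assumed. */
        static ssize_t dpms_show(struct device *device,
                                 struct device_attribute *attr, char *buf)
        {
                struct drm_connector *connector = to_drm_connector(device);
                int dpms = READ_ONCE(connector->dpms); /* no locks, by design */
        
                return snprintf(buf, PAGE_SIZE, "%s\n",
                                drm_get_dpms_name(dpms));
        }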
      
      Note that there's not really anything bad going on since even with the
      new atomic paths we eventually end up not chasing any pointers (and
      hence possibly freed memory and other fun stuff). The locking WARNING
      was added in
      
      commit 88a48e29
      Author: Rob Clark <robdclark@gmail.com>
      Date:   Thu Dec 18 16:01:50 2014 -0500
      
          drm: add atomic properties
      
      but since drivers are still being converted, not everyone will have
      seen this from the start.
      
      Jens reported this and submitted a patch to just grab the
      mode_config.connection_mutex, but we can do a bit better.
      
      v2: Remove unused variables I failed to git add for real.
      
      Reference: http://mid.gmane.org/20150928194822.GA3930@kernel.dk
      Reported-by: Jens Axboe <axboe@fb.com>
      Tested-by: Jens Axboe <axboe@fb.com>
      Cc: Rob Clark <robdclark@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • crash in md-raid1 and md-raid10 due to incorrect list manipulation · a452744b
      Committed by Mikulas Patocka
      The commit 55ce74d4 (md/raid1: ensure
      device failure recorded before write request returns) is causing a
      crash in the LVM2 testsuite test shell/lvchange-raid.sh. For me the
      crash is 100% reproducible.
      
      The reason for the crash is that the newly added code in raid1d moves the
      list from conf->bio_end_io_list to tmp, then tests if tmp is non-empty and
      then incorrectly pops the bio from conf->bio_end_io_list (which is empty
      because the list was already moved).
      
      RAID10 has a similar bug.
      
      Kernel Fault: Code=15 regs=000000006ccb8640 (Addr=0000000100000000)
      CPU: 3 PID: 1930 Comm: mdX_raid1 Not tainted 4.2.0-rc5-bisect+ #35
      task: 000000006cc1f258 ti: 000000006ccb8000 task.ti: 000000006ccb8000
      
           YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
      PSW: 00001000000001001111111000001111 Not tainted
      r00-03  000000ff0804fe0f 000000001059d000 000000001059f818 000000007f16be38
      r04-07  000000001059d000 000000007f16be08 0000000000200200 0000000000000001
      r08-11  000000006ccb8260 000000007b7934d0 0000000000000001 0000000000000000
      r12-15  000000004056f320 0000000000000000 0000000000013dd0 0000000000000000
      r16-19  00000000f0d00ae0 0000000000000000 0000000000000000 0000000000000001
      r20-23  000000000800000f 0000000042200390 0000000000000000 0000000000000000
      r24-27  0000000000000001 000000000800000f 000000007f16be08 000000001059d000
      r28-31  0000000100000000 000000006ccb8560 000000006ccb8640 0000000000000000
      sr00-03  0000000000249800 0000000000000000 0000000000000000 0000000000249800
      sr04-07  0000000000000000 0000000000000000 0000000000000000 0000000000000000
      
      IASQ: 0000000000000000 0000000000000000 IAOQ: 000000001059f61c 000000001059f620
       IIR: 0f8010c6    ISR: 0000000000000000  IOR: 0000000100000000
       CPU:        3   CR30: 000000006ccb8000 CR31: 0000000000000000
       ORIG_R28: 000000001059d000
       IAOQ[0]: call_bio_endio+0x34/0x1a8 [raid1]
       IAOQ[1]: call_bio_endio+0x38/0x1a8 [raid1]
       RP(r2): raid_end_bio_io+0x88/0x168 [raid1]
      Backtrace:
       [<000000001059f818>] raid_end_bio_io+0x88/0x168 [raid1]
       [<00000000105a4f64>] raid1d+0x144/0x1640 [raid1]
       [<000000004017fd5c>] kthread+0x144/0x160
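      
      Per the description above, the correct pattern drains the spliced-off
      list rather than the now-empty source list (sketched; the wrapper
      function name is illustrative):
      
        /*
         * Corrected pattern, sketched: drain 'tmp', which now holds the
         * entries, not conf->bio_end_io_list, which list_splice_init()
         * just emptied (popping from the latter was the bug).
         */
        static void flush_bio_end_io_list(struct r1conf *conf)
        {
                unsigned long flags;
                LIST_HEAD(tmp);
        
                spin_lock_irqsave(&conf->device_lock, flags);
                list_splice_init(&conf->bio_end_io_list, &tmp);
                spin_unlock_irqrestore(&conf->device_lock, flags);
        
                while (!list_empty(&tmp)) {
                        struct r1bio *r1_bio = list_first_entry(&tmp,
                                        struct r1bio, retry_list);
                        list_del(&r1_bio->retry_list);
                        raid_end_bio_io(r1_bio);
                }
        }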
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Fixes: 55ce74d4 ("md/raid1: ensure device failure recorded before write request returns.")
      Fixes: 95af587e ("md/raid10: ensure device failure recorded before write request returns.")
      Signed-off-by: NeilBrown <neilb@suse.com>
    • cpufreq: prevent lockup on reading scaling_available_frequencies · 55582bcc
      Committed by Srinivas Pandruvada
      When scaling_available_frequencies is read for an offlined CPU, either
      a lockup occurs or junk values are displayed. This is caused by the
      freed freq_table which the policy is still using.
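      
      A hypothetical guard illustrating the failure mode (this is a sketch
      of the idea only; whether the actual patch takes this form is an
      assumption):
      
        /* Hypothetical sketch; the actual patch may differ. */
        static ssize_t show_available_freqs(struct cpufreq_policy *policy,
                                            char *buf)
        {
                struct cpufreq_frequency_table *table = policy->freq_table;
        
                /* The table is freed when the CPU goes offline: bail out
                 * instead of walking freed memory (lockup / junk values). */
                if (!table)
                        return -ENODEV;
        
                /* ... format the frequencies from 'table' into buf ... */
                return 0;
        }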
      Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>