1. 08 8月, 2018 3 次提交
    • J
      dm kcopyd: avoid softlockup in run_complete_job · 784c9a29
      John Pittman 提交于
      It was reported that softlockups occur when using dm-snapshot ontop of
      slow (rbd) storage.  E.g.:
      
      [ 4047.990647] watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [kworker/10:23:26177]
      ...
      [ 4048.034151] Workqueue: kcopyd do_work [dm_mod]
      [ 4048.034156] RIP: 0010:copy_callback+0x41/0x160 [dm_snapshot]
      ...
      [ 4048.034190] Call Trace:
      [ 4048.034196]  ? __chunk_is_tracked+0x70/0x70 [dm_snapshot]
      [ 4048.034200]  run_complete_job+0x5f/0xb0 [dm_mod]
      [ 4048.034205]  process_jobs+0x91/0x220 [dm_mod]
      [ 4048.034210]  ? kcopyd_put_pages+0x40/0x40 [dm_mod]
      [ 4048.034214]  do_work+0x46/0xa0 [dm_mod]
      [ 4048.034219]  process_one_work+0x171/0x370
      [ 4048.034221]  worker_thread+0x1fc/0x3f0
      [ 4048.034224]  kthread+0xf8/0x130
      [ 4048.034226]  ? max_active_store+0x80/0x80
      [ 4048.034227]  ? kthread_bind+0x10/0x10
      [ 4048.034231]  ret_from_fork+0x35/0x40
      [ 4048.034233] Kernel panic - not syncing: softlockup: hung tasks
      
      Fix this by calling cond_resched() after run_complete_job()'s callout to
      the dm_kcopyd_notify_fn (which is dm-snap.c:copy_callback in the above
      trace).
      Signed-off-by: NJohn Pittman <jpittman@redhat.com>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      784c9a29
    • M
      dm cache metadata: save in-core policy_hint_size to on-disk superblock · fd2fa954
      Mike Snitzer 提交于
      policy_hint_size starts as 0 during __write_initial_superblock().  It
      isn't until the policy is loaded that policy_hint_size is set in-core
      (cmd->policy_hint_size).  But it never got recorded in the on-disk
      superblock because __commit_transaction() didn't deal with transfering
      the in-core cmd->policy_hint_size to the on-disk superblock.
      
      The in-core cmd->policy_hint_size gets initialized by metadata_open()'s
      __begin_transaction_flags() which re-reads all superblock fields.
      Because the superblock's policy_hint_size was never properly stored, when
      the cache was created, hints_array_available() would always return false
      when re-activating a previously created cache.  This means
      __load_mappings() always considered the hints invalid and never made use
      of the hints (these hints served to optimize).
      
      Another detremental side-effect of this oversight is the cache_check
      utility would fail with: "invalid hint width: 0"
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      fd2fa954
    • H
      dm thin: stop no_space_timeout worker when switching to write-mode · 75294442
      Hou Tao 提交于
      Now both check_for_space() and do_no_space_timeout() will read & write
      pool->pf.error_if_no_space.  If these functions run concurrently, as
      shown in the following case, the default setting of "queue_if_no_space"
      can get lost.
      
      precondition:
          * error_if_no_space = false (aka "queue_if_no_space")
          * pool is in Out-of-Data-Space (OODS) mode
          * no_space_timeout worker has been queued
      
      CPU 0:                          CPU 1:
      // delete a thin device
      process_delete_mesg()
      // check_for_space() invoked by commit()
      set_pool_mode(pool, PM_WRITE)
          pool->pf.error_if_no_space = \
           pt->requested_pf.error_if_no_space
      
      				// timeout, pool is still in OODS mode
      				do_no_space_timeout
      				    // "queue_if_no_space" config is lost
      				    pool->pf.error_if_no_space = true
          pool->pf.mode = new_mode
      
      Fix it by stopping no_space_timeout worker when switching to write mode.
      
      Fixes: bcc696fa ("dm thin: stay in out-of-data-space mode once no_space_timeout expires")
      Cc: stable@vger.kernel.org
      Signed-off-by: NHou Tao <houtao1@huawei.com>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      75294442
  2. 01 8月, 2018 1 次提交
  3. 30 7月, 2018 1 次提交
    • A
      dm thin: include metadata_low_watermark threshold in pool status · 63c8ecb6
      Andy Grover 提交于
      The metadata low watermark threshold is set by the kernel.  But the
      kernel depends on userspace to extend the thinpool metadata device when
      the threshold is crossed.
      
      Since the metadata low watermark threshold is not visible to userspace,
      upon receiving an event, userspace cannot tell that the kernel wants the
      metadata device extended, instead of some other eventing condition.
      Making it visible (but not settable) enables userspace to affirmatively
      know the kernel is asking for a metadata device extension, by comparing
      metadata_low_watermark against nr_free_blocks_metadata, also reported in
      status.
      
      Current solutions like dmeventd have their own thresholds for extending
      the data and metadata devices, and both devices are checked against
      their thresholds on each event.  This lessens the value of the kernel-set
      threshold, since userspace will either extend the metadata device sooner,
      when receiving another event; or will receive the metadata lowater event
      and do nothing, if dmeventd's threshold is less than the kernel's.
      (This second case is dangerous. The metadata lowater event will not be
      re-sent, so no further event will be generated before the metadata
      device is out if space, unless some other event causes userspace to
      recheck its thresholds.)
      Signed-off-by: NAndy Grover <agrover@redhat.com>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      63c8ecb6
  4. 28 7月, 2018 16 次提交
  5. 23 7月, 2018 4 次提交
    • L
      Linux 4.18-rc6 · d72e90f3
      Linus Torvalds 提交于
      d72e90f3
    • L
      Merge tag 'nvme-for-4.18' of git://git.infradead.org/nvme · 74413084
      Linus Torvalds 提交于
      Pull NVMe fixes from Christoph Hellwig:
      
       - fix a regression in 4.18 that causes a memory leak on probe failure
         (Keith Bush)
      
       - fix a deadlock in the passthrough ioctl code (Scott Bauer)
      
       - don't enable AENs if not supported (Weiping Zhang)
      
       - fix an old regression in metadata handling in the passthrough ioctl
         code (Roland Dreier)
      
      * tag 'nvme-for-4.18' of git://git.infradead.org/nvme:
        nvme: fix handling of metadata_len for NVME_IOCTL_IO_CMD
        nvme: don't enable AEN if not supported
        nvme: ensure forward progress during Admin passthru
        nvme-pci: fix memory leak on probe failure
      74413084
    • L
      Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 165ea0d1
      Linus Torvalds 提交于
      Pull vfs fixes from Al Viro:
       "Fix several places that screw up cleanups after failures halfway
        through opening a file (one open-coding filp_clone_open() and getting
        it wrong, two misusing alloc_file()). That part is -stable fodder from
        the 'work.open' branch.
      
        And Christoph's regression fix for uapi breakage in aio series;
        include/uapi/linux/aio_abi.h shouldn't be pulling in the kernel
        definition of sigset_t, the reason for doing so in the first place had
        been bogus - there's no need to expose struct __aio_sigset in
        aio_abi.h at all"
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        aio: don't expose __aio_sigset in uapi
        ocxlflash_getfile(): fix double-iput() on alloc_file() failures
        cxl_getfile(): fix double-iput() on alloc_file() failures
        drm_mode_create_lease_ioctl(): fix open-coded filp_clone_open()
      165ea0d1
    • A
      alpha: fix osf_wait4() breakage · f88a333b
      Al Viro 提交于
      kernel_wait4() expects a userland address for status - it's only
      rusage that goes as a kernel one (and needs a copyout afterwards)
      
      [ Also, fix the prototype of kernel_wait4() to have that __user
        annotation   - Linus ]
      
      Fixes: 92ebce5a ("osf_wait4: switch to kernel_wait4()")
      Cc: stable@kernel.org # v4.13+
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f88a333b
  6. 22 7月, 2018 15 次提交