1. 31 8月, 2010 1 次提交
  2. 28 8月, 2010 1 次提交
    • J
      writeback: Fix lost wake-up shutting down writeback thread · b76b4014
      J. Bruce Fields 提交于
      Setting the task state here may cause us to miss the wake up from
      kthread_stop(), so we need to recheck kthread_should_stop() or risk
      sleeping forever in the following schedule().
      
      Symptom was an indefinite hang on an NFSv4 mount.  (NFSv4 may create
      multiple mounts in a temporary namespace while traversing the mount
      path, and since the temporary namespace is immediately destroyed, it may
      end up destroying a mount very soon after it was created, possibly
      making this race more likely.)
      
      INFO: task mount.nfs4:4314 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      mount.nfs4    D 0000000000000000  2880  4314   4313 0x00000000
       ffff88001ed6da28 0000000000000046 ffff88001ed6dfd8 ffff88001ed6dfd8
       ffff88001ed6c000 ffff88001ed6c000 ffff88001ed6c000 ffff88001e5003a0
       ffff88001ed6dfd8 ffff88001e5003a8 ffff88001ed6c000 ffff88001ed6dfd8
      Call Trace:
       [<ffffffff8196090d>] schedule_timeout+0x1cd/0x2e0
       [<ffffffff8106a31c>] ? mark_held_locks+0x6c/0xa0
       [<ffffffff819639a0>] ? _raw_spin_unlock_irq+0x30/0x60
       [<ffffffff8106a5fd>] ? trace_hardirqs_on_caller+0x14d/0x190
       [<ffffffff819671fe>] ? sub_preempt_count+0xe/0xd0
       [<ffffffff8195fc80>] wait_for_common+0x120/0x190
       [<ffffffff81033c70>] ? default_wake_function+0x0/0x20
       [<ffffffff8195fdcd>] wait_for_completion+0x1d/0x20
       [<ffffffff810595fa>] kthread_stop+0x4a/0x150
       [<ffffffff81061a60>] ? thaw_process+0x70/0x80
       [<ffffffff810cc68a>] bdi_unregister+0x10a/0x1a0
       [<ffffffff81229dc9>] nfs_put_super+0x19/0x20
       [<ffffffff810ee8c4>] generic_shutdown_super+0x54/0xe0
       [<ffffffff810ee9b6>] kill_anon_super+0x16/0x60
       [<ffffffff8122d3b9>] nfs4_kill_super+0x39/0x90
       [<ffffffff810eda45>] deactivate_locked_super+0x45/0x60
       [<ffffffff810edfb9>] deactivate_super+0x49/0x70
       [<ffffffff81108294>] mntput_no_expire+0x84/0xe0
       [<ffffffff811084ef>] release_mounts+0x9f/0xc0
       [<ffffffff81108575>] put_mnt_ns+0x65/0x80
       [<ffffffff8122cc56>] nfs_follow_remote_path+0x1e6/0x420
       [<ffffffff8122cfbf>] nfs4_try_mount+0x6f/0xd0
       [<ffffffff8122d0c2>] nfs4_get_sb+0xa2/0x360
       [<ffffffff810edcb8>] vfs_kern_mount+0x88/0x1f0
       [<ffffffff810ede92>] do_kern_mount+0x52/0x130
       [<ffffffff81963d9a>] ? _lock_kernel+0x6a/0x170
       [<ffffffff81108e9e>] do_mount+0x26e/0x7f0
       [<ffffffff81106b3a>] ? copy_mount_options+0xea/0x190
       [<ffffffff811094b8>] sys_mount+0x98/0xf0
       [<ffffffff810024d8>] system_call_fastpath+0x16/0x1b
      1 lock held by mount.nfs4/4314:
       #0:  (&type->s_umount_key#24){+.+...}, at: [<ffffffff810edfb1>] deactivate_super+0x41/0x70
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      Acked-by: NArtem Bityutskiy <Artem.Bityutskiy@nokia.com>
      b76b4014
  3. 27 8月, 2010 1 次提交
    • A
      writeback: do not lose wakeup events when forking bdi threads · 6628bc74
      Artem Bityutskiy 提交于
      This patch fixes the following issue:
      
      INFO: task mount.nfs4:1120 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      mount.nfs4    D 00000000fffc6a21     0  1120   1119 0x00000000
       ffff880235643948 0000000000000046 ffffffff00000000 ffffffff00000000
       ffff880235643fd8 ffff880235314760 00000000001d44c0 ffff880235643fd8
       00000000001d44c0 00000000001d44c0 00000000001d44c0 00000000001d44c0
      Call Trace:
       [<ffffffff813bc747>] schedule_timeout+0x34/0xf1
       [<ffffffff813bc530>] ? wait_for_common+0x3f/0x130
       [<ffffffff8106b50b>] ? trace_hardirqs_on+0xd/0xf
       [<ffffffff813bc5c3>] wait_for_common+0xd2/0x130
       [<ffffffff8104159c>] ? default_wake_function+0x0/0xf
       [<ffffffff813beaa0>] ? _raw_spin_unlock+0x26/0x2a
       [<ffffffff813bc6bb>] wait_for_completion+0x18/0x1a
       [<ffffffff81101a03>] sync_inodes_sb+0xca/0x1bc
       [<ffffffff811056a6>] __sync_filesystem+0x47/0x7e
       [<ffffffff81105798>] sync_filesystem+0x47/0x4b
       [<ffffffff810e7ffd>] generic_shutdown_super+0x22/0xd2
       [<ffffffff810e80f8>] kill_anon_super+0x11/0x4f
       [<ffffffffa00d06d7>] nfs4_kill_super+0x3f/0x72 [nfs]
       [<ffffffff810e7b68>] deactivate_locked_super+0x21/0x41
       [<ffffffff810e7fd6>] deactivate_super+0x40/0x45
       [<ffffffff810fc66c>] mntput_no_expire+0xb8/0xed
       [<ffffffff810fc73b>] release_mounts+0x9a/0xb0
       [<ffffffff810fc7bb>] put_mnt_ns+0x6a/0x7b
       [<ffffffffa00d0fb2>] nfs_follow_remote_path+0x19a/0x296 [nfs]
       [<ffffffffa00d11ca>] nfs4_try_mount+0x75/0xaf [nfs]
       [<ffffffffa00d1790>] nfs4_get_sb+0x276/0x2ff [nfs]
       [<ffffffff810e7dba>] vfs_kern_mount+0xb8/0x196
       [<ffffffff810e7ef6>] do_kern_mount+0x48/0xe8
       [<ffffffff810fdf68>] do_mount+0x771/0x7e8
       [<ffffffff810fe062>] sys_mount+0x83/0xbd
       [<ffffffff810089c2>] system_call_fastpath+0x16/0x1b
      
      The reason of this hang was a race condition: when the flusher thread is
      forking a bdi thread, we use 'kthread_run()', so we run it _before_ we make it
      visible in 'bdi->wb.task'. The bdi thread runs, does all works, and goes sleep.
      'bdi->wb.task' is still NULL. And this is a dangerous time window.
      
      If at this time someone queues a work for this bdi, he does not see the bdi
      thread and wakes up the forker thread instead! But the forker has already
      forked this bdi thread, but just did not make it visible yet!
      
      The result is that we lose the wake up event for this bdi thread and the NFS4
      code waits forever.
      
      To fix the problem, we should use 'ktrhead_create()' for creating bdi threads,
      then make them visible in 'bdi->wb.task', and only after this wake them up.
      This is exactly what this patch does.
      Signed-off-by: NArtem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      6628bc74
  4. 26 8月, 2010 1 次提交
  5. 23 8月, 2010 22 次提交
  6. 22 8月, 2010 11 次提交
  7. 21 8月, 2010 3 次提交