    md/raid10: fix task hung in raid10d · 4727ec78
    Li Nan committed
    hulk inclusion
    category: bugfix
    bugzilla: 188380, https://gitee.com/openeuler/kernel/issues/I6GISC
    CVE: NA
    
    --------------------------------
    
    commit fe630de0 ("md/raid10: avoid deadlock on recovery.") allowed
    normal io and sync io to exist at the same time. A task hang can then
    occur as follows:
    
    T1                      T2		T3		T4
    raid10d
     handle_read_error
      allow_barrier
       conf->nr_pending--
        -> 0
                            //submit sync io
                            raid10_sync_request
                             raise_barrier
    			  ->will not be blocked
    			  ...
    			//submit to drivers
      raid10_read_request
       wait_barrier
        conf->nr_pending++
         -> 1
    					//retry read fail
    					raid10_end_read_request
    					 reschedule_retry
    					  add to retry_list
    					  conf->nr_queued++
    					   -> 1
    							//sync io fail
    							end_sync_read
    							 __end_sync_read
    							  reschedule_retry
    							   add to retry_list
    					                    conf->nr_queued++
    							     -> 2
     ...
     handle_read_error
      freeze_array
       wait nr_pending == nr_queued+1
            ->1	      ->3
       //task hung
    
    If the retried read and the sync io both fail, both are added to
    retry_list (nr_queued -> 2). raid10d() then calls handle_read_error()
    and hangs in freeze_array(): nr_queued cannot decrease because raid10d
    is blocked, and nr_pending cannot increase because conf->barrier is
    not released.
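
The counter arithmetic in the trace above can be replayed in a small user-space sketch. This is not kernel code: `struct counters`, `freeze_condition_met()` and `replay_trace()` are hypothetical names for illustration. The starting value nr_pending = 1 is an assumption inferred from the trace (the first allow_barrier brings it to 0), and the `extra` argument stands for the "+1" in the freeze_array wait condition.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the two counters named in the trace. */
struct counters {
    int nr_pending;
    int nr_queued;
};

/* Condition freeze_array waits for in the trace: nr_pending == nr_queued + extra. */
static bool freeze_condition_met(const struct counters *c, int extra)
{
    return c->nr_pending == c->nr_queued + extra;
}

/* Replay the counter updates from the trace and return the final state. */
static struct counters replay_trace(void)
{
    /* Assumption: one read is pending when raid10d starts handling its error. */
    struct counters c = { .nr_pending = 1, .nr_queued = 0 };

    c.nr_pending--; /* T1: allow_barrier                       -> 0 */
    c.nr_pending++; /* T1: wait_barrier in raid10_read_request -> 1 */
    c.nr_queued++;  /* T3: retry read fails, reschedule_retry  -> 1 */
    c.nr_queued++;  /* T4: sync io fails, reschedule_retry     -> 2 */
    return c;
}
```

Calling `freeze_condition_met(&c, 1)` on the state returned by `replay_trace()` compares 1 against 3. Since neither counter can change while raid10d is blocked, the wait never finishes in this model.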
    
    Fix it by moving allow_barrier() after raid10_read_request().
    raise_barrier() then waits for nr_waiting to become 0, so sync io and
    regular io are no longer issued at the same time.
    
    Also remove the check of nr_queued: nr_queued can be 0 without the
    caller needing to be blocked. The check of MD_RECOVERY_RUNNING is
    removed too, because after this patch it is always set there, as all
    sync io is waiting in raise_barrier().
    
    Fixes: fe630de0 ("md/raid10: avoid deadlock on recovery.")
    Signed-off-by: Li Nan <linan122@huawei.com>
    Reviewed-by: Hou Tao <houtao1@huawei.com>
    (cherry picked from commit 1fe782f0)
raid10.c 135.6 KB