    dm thin: Fix ABBA deadlock by resetting dm_bufio_client · 890e730d
    Li Lingfeng committed
    hulk inclusion
    category: bugfix
    bugzilla: https://gitee.com/openeuler/kernel/issues/I79ZEK
    CVE: NA
    
    --------------------------------
    
    As described in commit d0dcee7d ("dm thin: Fix ABBA deadlock between
    shrink_slab and dm_pool_abort_metadata"), an ABBA deadlock can be
    triggered because shrinker_rwsem must be taken when an operation on the
    dm pool metadata fails.
    
    We have noticed the following three problem scenarios:
    1) Described by commit d0dcee7d ("dm thin: Fix ABBA deadlock between
    shrink_slab and dm_pool_abort_metadata")
    
    2) shrinker_rwsem and throttle->lock
              P1(drop cache)                        P2(kworker)
    drop_caches_sysctl_handler
     drop_slab
      shrink_slab
       down_read(&shrinker_rwsem)  - LOCK A
       do_shrink_slab
        super_cache_scan
         prune_icache_sb
          dispose_list
           evict
            ext4_evict_inode
             ext4_clear_inode
              ext4_discard_preallocations
               ext4_mb_load_buddy_gfp
                ext4_mb_init_cache
                 ext4_wait_block_bitmap
                  __ext4_error
                   ext4_handle_error
                    ext4_commit_super
                     ...
                     dm_submit_bio
                                         do_worker
                                          throttle_work_update
                                           down_write(&t->lock) -- LOCK B
                                          process_deferred_bios
                                           commit
                                            metadata_operation_failed
                                             dm_pool_abort_metadata
                                              dm_block_manager_create
                                               dm_bufio_client_create
                                                register_shrinker
                                                 down_write(&shrinker_rwsem)
                                                 -- LOCK A
                     thin_map
                      thin_bio_map
                       thin_defer_bio_with_throttle
                        throttle_lock
                         down_read(&t->lock)  - LOCK B
    
    3) shrinker_rwsem and wait_on_buffer
              P1(drop cache)                            P2(kworker)
    drop_caches_sysctl_handler
     drop_slab
      shrink_slab
       down_read(&shrinker_rwsem)  - LOCK A
       do_shrink_slab
       ...
        ext4_wait_block_bitmap
         __ext4_error
          ext4_handle_error
           jbd2_journal_abort
            jbd2_journal_update_sb_errno
             jbd2_write_superblock
              submit_bh
               // LOCK B
               // RELEASE B
                                 do_worker
                                  throttle_work_update
                                   down_write(&t->lock) - LOCK B
                                  process_deferred_bios
                                   process_bio
                                   commit
                                    metadata_operation_failed
                                     dm_pool_abort_metadata
                                      dm_block_manager_create
                                       dm_bufio_client_create
                                        register_shrinker
                                         register_shrinker_prepared
                                          down_write(&shrinker_rwsem)  - LOCK A
                                   bio_endio
          wait_on_buffer
           __wait_on_buffer
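
The lock-order inversion in scenario 2 can be replayed in user space with a toy model. This is only a sketch: `toy_rwsem`, `toy_try_read`, `toy_try_write` and `scenario2_is_deadlocked` are hypothetical names invented for illustration and are not kernel APIs. The helpers never block; they only report whether an acquisition could succeed, so the cycle can be checked deterministically.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy reader/writer semaphore: tracks holders only, never blocks. */
struct toy_rwsem {
    int readers;   /* number of current read holders */
    bool writer;   /* true while a writer holds the semaphore */
};

/* A reader may enter unless a writer currently holds the semaphore. */
static bool toy_try_read(struct toy_rwsem *s)
{
    if (s->writer)
        return false;
    s->readers++;
    return true;
}

/* A writer may enter only when there are no readers and no writer. */
static bool toy_try_write(struct toy_rwsem *s)
{
    if (s->writer || s->readers > 0)
        return false;
    s->writer = true;
    return true;
}

/* Replays scenario 2; returns true when neither side can make progress. */
static bool scenario2_is_deadlocked(void)
{
    struct toy_rwsem shrinker_rwsem = { 0, false }; /* LOCK A */
    struct toy_rwsem throttle_lock = { 0, false };  /* LOCK B */

    /* P1 (drop cache): shrink_slab takes A for read. */
    bool p1_has_a = toy_try_read(&shrinker_rwsem);
    /* P2 (kworker): throttle_work_update takes B for write. */
    bool p2_has_b = toy_try_write(&throttle_lock);
    assert(p1_has_a && p2_has_b);

    /* P2: register_shrinker needs A for write, but P1 holds it for read. */
    bool p2_gets_a = toy_try_write(&shrinker_rwsem);
    /* P1: throttle_lock needs B for read, but P2 holds it for write. */
    bool p1_gets_b = toy_try_read(&throttle_lock);

    return !p2_gets_a && !p1_gets_b;
}
```

In this model `scenario2_is_deadlocked()` returns true: each side holds the lock the other one needs, which is the A-then-B versus B-then-A ordering shown in the trace.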
    
    Fix these by resetting dm_bufio_client without holding shrinker_rwsem.
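
The idea behind the fix can be sketched in user space as follows. This is a toy model under stated assumptions, not the patch itself: the message does not show the actual change, and `toy_bufio_client`, `toy_client_create`, `toy_client_cache`, `toy_client_reset` and the `shrinker_list_writes` counter are hypothetical names. The counter stands in for every write acquisition of shrinker_rwsem (LOCK A) made while registering a shrinker.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_SLOTS 4

/* Counts how often the (modelled) shrinker list is modified under LOCK A. */
static int shrinker_list_writes;

/* Toy stand-in for a buffered I/O client with a small block cache. */
struct toy_bufio_client {
    int cached[TOY_CACHE_SLOTS];
    size_t nr_cached;
};

/* Models register_shrinker(): this is the step that needs LOCK A. */
static void toy_register_shrinker(void)
{
    shrinker_list_writes++;
}

/* Creating a client registers a shrinker, so it touches LOCK A. */
static void toy_client_create(struct toy_bufio_client *c)
{
    c->nr_cached = 0;
    toy_register_shrinker();
}

/* Remember one block number in the client's cache. */
static void toy_client_cache(struct toy_bufio_client *c, int block)
{
    assert(c->nr_cached < TOY_CACHE_SLOTS);
    c->cached[c->nr_cached++] = block;
}

/*
 * Resetting drops the cached state in place. The shrinker registered at
 * creation time is kept, so LOCK A is never taken on this path.
 */
static void toy_client_reset(struct toy_bufio_client *c)
{
    c->nr_cached = 0;
}
```

In the model, the abort path calls `toy_client_reset()` on the existing client instead of creating a new one, so the kworker holding LOCK B no longer requests LOCK A and the cycles in scenarios 2 and 3 cannot form.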
    Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
dm-thin-metadata.c 48.1 KB