    expose a hook to skip tables during iteration · 7891af8b
    Committed by Nikhil Benesch
    Summary:
    As discussed on the mailing list (["Skipping entire SSTs while iterating"](https://groups.google.com/forum/#!topic/rocksdb/ujHCJVLrHlU)), this patch adds a `table_filter` to `ReadOptions` that allows specifying a callback to be executed during iteration before each table in the database is scanned. The callback is passed the table's properties; the table is scanned iff the callback returns true.
    
    This can be used in conjunction with a `TablePropertiesCollector` to dramatically speed up scans by skipping tables that are known to contain irrelevant data for the scan at hand.
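
A minimal sketch of how a caller might pair a collected property with `table_filter`. To stay self-contained it does not link against RocksDB: `TableProperties` and `ReadOptions` here are reduced stand-ins for the real structs, and the property name `example.max_timestamp`, the helper `NewerThan`, and the function `TablesToScan` are hypothetical names introduced only for illustration. It assumes a `TablePropertiesCollector` has stored the newest timestamp in each table as a decimal string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Stand-in for rocksdb::TableProperties; only the user-collected map is modeled.
struct TableProperties {
  std::map<std::string, std::string> user_collected_properties;
};

// Stand-in for rocksdb::ReadOptions, reduced to the field this patch adds.
struct ReadOptions {
  // Empty by default, in which case every table is scanned.
  std::function<bool(const TableProperties&)> table_filter;
};

// Hypothetical property name that a TablePropertiesCollector might record.
const std::string kMaxTimestampProp = "example.max_timestamp";

// Builds a filter that keeps only tables that may hold data newer than `since`.
std::function<bool(const TableProperties&)> NewerThan(std::uint64_t since) {
  return [since](const TableProperties& props) {
    auto it = props.user_collected_properties.find(kMaxTimestampProp);
    // Without the property the table cannot be ruled out, so scan it.
    if (it == props.user_collected_properties.end()) return true;
    return std::stoull(it->second) > since;
  };
}

// Mimics the per-table decision described above: a table is scanned iff the
// filter is empty or returns true. Returns the indices of scanned tables.
std::vector<std::size_t> TablesToScan(const ReadOptions& opts,
                                      const std::vector<TableProperties>& tables) {
  std::vector<std::size_t> scanned;
  for (std::size_t i = 0; i < tables.size(); ++i) {
    if (!opts.table_filter || opts.table_filter(tables[i])) {
      scanned.push_back(i);
    }
  }
  return scanned;
}
```

With real RocksDB the same lambda would be assigned to `ReadOptions::table_filter` before creating an iterator. Treating a missing property as "scan the table" keeps the filter conservative: it can only skip tables it can prove irrelevant.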
    
    We're using this [downstream in CockroachDB](https://github.com/cockroachdb/cockroach/blob/master/pkg/storage/engine/db.cc#L2009-L2022) already. With this feature, under ideal conditions, we can reduce the time of an incremental backup from hours to seconds.
    
    FYI, the first commit in this PR fixes a segfault that I unfortunately have not figured out how to reproduce outside of CockroachDB. I'm hoping you accept it on the grounds that it is not correct to return 8-byte aligned memory from a call to `malloc` on some 64-bit platforms; one correct approach is to infer the necessary alignment from `std::max_align_t`, as done here. As noted in the first commit message, the bug is tickled by having a `std::function` in `struct ReadOptions`. That is, the following patch alone is enough to cause RocksDB to segfault when run from CockroachDB on Darwin.
    
    ```diff
    --- a/include/rocksdb/options.h
    +++ b/include/rocksdb/options.h
    @@ -1546,6 +1546,13 @@ struct ReadOptions {
       // Default: false
       bool ignore_range_deletions;
    
    +  // A callback to determine whether relevant keys for this scan exist in a
    +  // given table based on the table's properties. The callback is passed the
    +  // properties of each table during iteration. If the callback returns false,
    +  // the table will not be scanned.
    +  // Default: empty (every table will be scanned)
    +  std::function<bool(const TableProperties&)> table_filter;
    +
       ReadOptions();
       ReadOptions(bool cksum, bool cache);
     };
    ```
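
The alignment point made before the diff can be illustrated with a small sketch. This is not the allocator code touched by the first commit; `BumpArena`, `PaddingFor`, and `kAlignUnit` are hypothetical names used only to show the idea of deriving the alignment from `std::max_align_t` instead of hard-coding 8 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Alignment suitable for any scalar type on this platform
// (commonly 16 on x86-64, where a hard-coded 8 would be too small).
constexpr std::size_t kAlignUnit = alignof(std::max_align_t);

static_assert((kAlignUnit & (kAlignUnit - 1)) == 0,
              "alignment must be a power of two");

// Number of bytes to skip so that `p` lands on a kAlignUnit boundary.
inline std::size_t PaddingFor(const void* p) {
  const auto addr = reinterpret_cast<std::uintptr_t>(p);
  const std::size_t mod = addr & (kAlignUnit - 1);
  return mod == 0 ? 0 : kAlignUnit - mod;
}

// Hypothetical bump allocator over a caller-provided buffer. Every pointer
// it hands out is aligned to kAlignUnit, matching what malloc guarantees.
class BumpArena {
 public:
  BumpArena(char* buf, std::size_t size) : cur_(buf), end_(buf + size) {}

  // Returns an aligned block of `bytes` bytes, or nullptr if it does not fit.
  void* AllocateAligned(std::size_t bytes) {
    const std::size_t pad = PaddingFor(cur_);
    if (static_cast<std::size_t>(end_ - cur_) < pad + bytes) return nullptr;
    char* result = cur_ + pad;
    cur_ = result + bytes;
    return result;
  }

 private:
  char* cur_;
  char* end_;
};
```

An object such as `std::function` may be laid out assuming the full `std::max_align_t` alignment, so an allocator that only guarantees 8 bytes can hand back storage that is misaligned for it on platforms where the two values differ.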
    
    /cc danhhz
    Closes https://github.com/facebook/rocksdb/pull/2265
    
    Differential Revision: D5054262
    
    Pulled By: yiwu-arbug
    
    fbshipit-source-id: dd6b28f2bba6cb8466250d8c5c542d3c92785476
table_cache.cc 16.9 KB