1. 11 May 2017, 5 commits
  2. 10 May 2017, 5 commits
    • A
      Fixes the CentOS 5 cross-building of RocksJava · e7cea86f
      Committed by Adam Retter
      Summary:
      Updates to CentOS 5 have been archived since CentOS 5 is EOL, so we now pull the updates from the vault. This is a stopgap solution; I will send a PR in a couple of days which uses fixed Docker containers (with the updates pre-installed) instead.
      
      sagar0 Here you go :-)
      Closes https://github.com/facebook/rocksdb/pull/2270
      
      Differential Revision: D5033637
      
      Pulled By: sagar0
      
      fbshipit-source-id: a9312dd1bc18bfb8653f06ffa0a1512b4415720d
      e7cea86f
    • A
      unbiase readamp bitmap · 259a00ea
      Committed by Aaron Gao
      Summary:
      Consider BlockReadAmpBitmap with bytes_per_bit = 32. Suppose bytes [a, b) were used, while bytes [a-32, a) and [b+1, b+33) weren't; more formally, the union of ranges passed to BlockReadAmpBitmap::Mark() contains [a, b) and doesn't intersect with [a-32, a) or [b+1, b+33). Then bits [floor(a/32), ceil(b/32)] will be set, and so the number of useful bytes will be estimated as (ceil(b/32) - floor(a/32)) * 32, which is on average equal to b-a+31.
      
      An extreme example: if we use 1 byte from each block, it'll be counted as 32 bytes from each block.
      
      It's easy to remove this bias by slightly changing the semantics of the bitmap. Currently each bit represents a byte range [i*32, (i+1)*32).
      
      This diff makes each bit represent a single byte: i*32 + X, where X is a random number in [0, 31] generated when the bitmap is created. So, e.g., if you read a single byte at random, with probability 31/32 it won't be counted at all, and with probability 1/32 it will be counted as 32 bytes; so, on average it's counted as 1 byte.
      
      *But there is one exception: the last bit is always set, as in the old scheme.*
      
      (*) - assuming read_amp_bytes_per_bit = 32.
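      The sampling idea can be sketched in a few lines (a toy model for illustration only, not the actual BlockReadAmpBitmap code; the function names and pure-Python structure are mine):

      ```python
      BYTES_PER_BIT = 32

      def old_marked_bytes(a, b):
          # Old scheme: every bit whose 32-byte range intersects [a, b) is set,
          # and each set bit is counted as 32 useful bytes.
          first_bit = a // BYTES_PER_BIT
          last_bit = (b - 1) // BYTES_PER_BIT
          return (last_bit - first_bit + 1) * BYTES_PER_BIT

      def new_marked_bytes(a, b, offset):
          # New scheme: bit i represents the single byte i*32 + offset, where
          # offset is a random value in [0, 31] chosen when the bitmap is
          # created. A bit is set only if its sampled byte falls in [a, b).
          counted = 0
          for i in range(a // BYTES_PER_BIT, (b - 1) // BYTES_PER_BIT + 1):
              if a <= i * BYTES_PER_BIT + offset < b:
                  counted += BYTES_PER_BIT
          return counted

      # A 1-byte read at offset 100: the old scheme always counts 32 bytes,
      # while the new scheme averages to 1 byte across all 32 offsets.
      average = sum(new_marked_bytes(100, 101, off) for off in range(32)) / 32
      ```

      Averaging over the random offset is what removes the bias: a single sampled byte is either missed entirely or counted as a full 32 bytes, and the two cases balance out in expectation.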
      Closes https://github.com/facebook/rocksdb/pull/2259
      
      Differential Revision: D5035652
      
      Pulled By: lightmark
      
      fbshipit-source-id: bd98b1b9b49fbe61f9e3781d07f624e3cbd92356
      259a00ea
    • J
      port: updated PhysicalCoreID() · a6209669
      Committed by Jos Collin
      Summary:
      Updated PhysicalCoreID() to use sched_getcpu() on x86_64 for glibc >= 2.22. Added a new function named GetCPUID() that calls sched_getcpu(), to avoid repeated code. This change was made per the review comments on PR https://github.com/facebook/rocksdb/pull/2230.
      Signed-off-by: Jos Collin <jcollin@redhat.com>
      Closes https://github.com/facebook/rocksdb/pull/2260
      
      Differential Revision: D5025734
      
      Pulled By: ajkr
      
      fbshipit-source-id: f4cca68c12573cafcf8531e7411a1e733bbf8eef
      a6209669
    • A
      Print compaction_options_universal.stop_style in LOG file · df035b68
      Committed by Aaron Gao
      Summary:
      Print compaction_options_universal.stop_style in the LOG file.
      Test plan: run `./db_bench --benchmarks=fillseq` and read the log.
      Closes https://github.com/facebook/rocksdb/pull/2268
      
      Differential Revision: D5032438
      
      Pulled By: lightmark
      
      fbshipit-source-id: 0e72fcd96a1caaf3cab20e86d39c75fbebf5ce37
      df035b68
    • I
      dont skip IO for filter blocks · 4897eb25
      Committed by Islam AbdelRahman
      Summary:
      Based on my experience with linkbench, we should not skip loading bloom filter blocks when they are not available in the block cache when using Iterator::Seek.
      
      Actually, I am not sure why this behavior existed in the first place.
      Closes https://github.com/facebook/rocksdb/pull/2255
      
      Differential Revision: D5010721
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 0af545a06ac4baeecb248706ec34d009c2480ca4
      4897eb25
  3. 09 May 2017, 2 commits
  4. 08 May 2017, 1 commit
    • Y
      Add bulk create/drop column family API · 2cd00773
      Committed by Yi Wu
      Summary:
      Adding DB::CreateColumnFamilies() and DB::DropColumnFamilies() to bulk create/drop column families. This addresses the problem that creating/dropping 1k column families takes minutes. The bottleneck is that we persist the options file for every single column family create/drop, and we parse the persisted options file for verification, which takes a lot of CPU time.
      
      The new APIs still create/drop the column families individually, but persist the options file only once, at the end. This brings creating 1k column families down to within ~0.1s. A further improvement would be to merge the manifest writes into one IO.
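      The effect of batching the options-file persist can be illustrated with a toy model (hypothetical names and structure; the real API takes column family descriptors and returns handles):

      ```python
      class ToyDB:
          """Toy sketch: one options-file write per bulk call instead of
          one per column family."""

          def __init__(self):
              self.column_families = []
              self.options_file_writes = 0

          def create_column_family(self, name):
              # Old path: every single create persists (and re-parses)
              # the options file.
              self.create_column_families([name])

          def create_column_families(self, names):
              # New path: create the CFs individually, persist the options
              # file once for the whole batch.
              self.column_families.extend(names)
              self.options_file_writes += 1

      db = ToyDB()
      db.create_column_families(["cf%d" % i for i in range(1000)])
      ```

      With the one-at-a-time API the same workload would have done 1000 options-file writes; the bulk call does one.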
      Closes https://github.com/facebook/rocksdb/pull/2248
      
      Differential Revision: D5001578
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: d4e00bda671451e0b314c13e12ad194b1704aa03
      2cd00773
  5. 06 May 2017, 4 commits
    • M
      Object lifetime in cache · 40af2381
      Committed by Maysam Yabandeh
      Summary:
      Any non-raw-data-dependent object must be destructed before the table closes. There was a bug where this was not done for the filter object. This patch fixes the bug and adds a unit test to prevent such bugs in the future.
      Closes https://github.com/facebook/rocksdb/pull/2246
      
      Differential Revision: D5001318
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 6d8772e58765485868094b92964da82ef9730b6d
      40af2381
    • T
      travis: add Windows cross-compilation · fdaefa03
      Committed by Tamir Duberstein
      Summary:
      - downcase includes for case-sensitive filesystems
      - give targets the same name (librocksdb) on all platforms
      
      With this patch it is possible to cross-compile RocksDB for Windows
      from a Linux host using mingw.
      
      cc yuslepukhin orgads
      Closes https://github.com/facebook/rocksdb/pull/2107
      
      Differential Revision: D4849784
      
      Pulled By: siying
      
      fbshipit-source-id: ad26ed6b4d393851aa6551e6aa4201faba82ef60
      fdaefa03
    • A
      do not read next datablock if upperbound is reached · a30a6960
      Committed by Aaron Gao
      Summary:
      Currently, if iterate_upper_bound is set, we continue reading until we get a key >= upper_bound. In many cases neighboring data blocks have a user-key gap between them, and our index key will be a user key in the middle, chosen to get a shorter size. For example, if we have blocks:
      [a b c d][f g h]
      then the index key for the first block will be 'e'.
      Now if the upper bound is any key between 'd' and 'e' (for example d1, d2, ..., d99999999999), we don't have to read the second block: we already know the iteration is done once we reach the last key smaller than the upper bound.
      
      This diff can reduce read amplification in most cases.
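      The stopping condition amounts to a comparison against the block's index (separator) key; a minimal sketch, with illustrative names (the real check lives inside the block-based table iterator):

      ```python
      def must_read_next_block(separator_key, upper_bound):
          # separator_key is >= every user key in the current block and
          # < every key in the next block. If the upper bound does not
          # reach past the separator, the next block cannot contain any
          # key below the bound, so iteration can stop without reading it.
          return separator_key < upper_bound

      # Blocks [a b c d][f g h] with separator 'e' for the first block:
      skip = not must_read_next_block("e", "d1")  # bound between 'd' and 'e'
      read = must_read_next_block("e", "z")       # bound past the separator
      ```

      With the bound `d1` the second block is never touched, which is exactly the saved read the summary describes.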
      Closes https://github.com/facebook/rocksdb/pull/2239
      
      Differential Revision: D4990693
      
      Pulled By: lightmark
      
      fbshipit-source-id: ab30ea2e3c6edf3fddd5efed3c34fcf7739827ff
      a30a6960
    • A
      Roundup read bytes in ReadaheadRandomAccessFile · 2d42cf5e
      Committed by Aaron Gao
      Summary:
      Fix alignment in ReadaheadRandomAccessFile
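      Rounding a read size up to an alignment boundary is the usual power-of-two trick; a sketch under the assumption of a power-of-two alignment (as with direct-IO sector sizes), not the actual RocksDB code:

      ```python
      def roundup(x, alignment):
          # Round x up to the next multiple of alignment.
          # alignment must be a power of two for the mask to be valid.
          return (x + alignment - 1) & ~(alignment - 1)

      aligned = roundup(1000, 512)          # rounds up to 1024
      already_aligned = roundup(1024, 512)  # multiples are unchanged
      ```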
      Closes https://github.com/facebook/rocksdb/pull/2253
      
      Differential Revision: D5012336
      
      Pulled By: lightmark
      
      fbshipit-source-id: 10d2c829520cb787227ef653ef63d5d701725778
      2d42cf5e
  6. 05 May 2017, 4 commits
    • S
      Allow IntraL0 compaction in FIFO Compaction · 264d3f54
      Committed by Siying Dong
      Summary:
      Add an option for users to do some compaction in FIFO compaction, paying some write amplification for a smaller number of files.
      Closes https://github.com/facebook/rocksdb/pull/2163
      
      Differential Revision: D4895953
      
      Pulled By: siying
      
      fbshipit-source-id: a1ab608dd0627211f3e1f588a2e97159646e1231
      264d3f54
    • A
      Set lower-bound on dynamic level sizes · 8c3a180e
      Committed by Andrew Kryczka
      Summary:
      Changed dynamic leveling to stop setting the base level's size bound below `max_bytes_for_level_base`.
      
      Behavior for config where `max_bytes_for_level_base == level0_file_num_compaction_trigger * write_buffer_size` and same amount of data in L0 and base-level:
      
      - Before #2027, compaction scoring would favor base-level due to dividing by size smaller than `max_bytes_for_level_base`.
      - After #2027, L0 and Lbase get equal scores. The disadvantage is L0 is often compacted before reaching the num files trigger since `write_buffer_size` can be bigger than the dynamically chosen base-level size. This increases write-amp.
      - After this diff, L0 and Lbase still get equal scores. Now it takes `level0_file_num_compaction_trigger` files of size `write_buffer_size` to trigger L0 compaction by size, fixing the write-amp problem above.
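      The change to the base level's size bound amounts to a clamp; a simplified sketch of the sizing rule (variable names are illustrative, not the actual option plumbing):

      ```python
      def clamped_base_level_size(dynamic_size, max_bytes_for_level_base):
          # After this diff, the dynamically chosen base-level size is
          # never set below max_bytes_for_level_base.
          return max(dynamic_size, max_bytes_for_level_base)

      MB = 1 << 20
      clamped = clamped_base_level_size(64 * MB, 256 * MB)     # raised to 256 MB
      unchanged = clamped_base_level_size(512 * MB, 256 * MB)  # stays 512 MB
      ```

      The clamp restores the old behavior where it takes `level0_file_num_compaction_trigger` full memtable flushes to trigger an L0 size compaction.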
      Closes https://github.com/facebook/rocksdb/pull/2123
      
      Differential Revision: D4861570
      
      Pulled By: ajkr
      
      fbshipit-source-id: 467ddef56ed1f647c14d86bb018bcb044c39b964
      8c3a180e
    • A
      Avoid calling fallocate with UINT64_MAX · 7c1c8ce5
      Committed by Andrew Kryczka
      Summary:
      When the user doesn't set a limit on compaction output file size, use the sum of the input files' sizes instead. This avoids passing UINT64_MAX as fallocate()'s length. Reported in #2249.
      
      Test setup:
      - command: `TEST_TMPDIR=/data/rocksdb-test/ strace -e fallocate ./db_compaction_test --gtest_filter=DBCompactionTest.ManualCompactionUnknownOutputSize`
      - filesystem: xfs
      
      before this diff:
      `fallocate(10, 01, 0, 1844674407370955160) = -1 ENOSPC (No space left on device)`
      
      after this diff:
      `fallocate(10, 01, 0, 1977)              = 0`
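      The length selection can be sketched as follows (an illustrative model with hypothetical names; the real code passes the result through to fallocate()):

      ```python
      UINT64_MAX = 2**64 - 1

      def fallocate_length(max_output_file_size, input_file_sizes):
          # With no user limit (the UINT64_MAX sentinel), hint fallocate()
          # with the sum of the input files' sizes instead of the sentinel.
          if max_output_file_size == UINT64_MAX:
              return sum(input_file_sizes)
          return max_output_file_size

      length = fallocate_length(UINT64_MAX, [1000, 977])  # 1977, as in the strace above
      ```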
      Closes https://github.com/facebook/rocksdb/pull/2252
      
      Differential Revision: D5007275
      
      Pulled By: ajkr
      
      fbshipit-source-id: 4491404a6ae8a41328aede2e2d6f4d9ac3e38880
      7c1c8ce5
    • L
      max_open_files dynamic set, follow up · a45e98a5
      Committed by Leonidas Galanis
      Summary:
      Follow-up to make 0x40000 a TableCache constant that indicates infinite capacity.
      Closes https://github.com/facebook/rocksdb/pull/2247
      
      Differential Revision: D5001349
      
      Pulled By: lgalanis
      
      fbshipit-source-id: ce7bd2e54b0975bb9f8680fdaa0f8bb0e7ae81a2
      a45e98a5
  7. 04 May 2017, 4 commits
  8. 03 May 2017, 4 commits
  9. 02 May 2017, 4 commits
    • M
      Delete filter before closing the table · 89833577
      Committed by Maysam Yabandeh
      Summary:
      Some filters, such as the partitioned filter, hold pointers to the table for which they were created. Therefore, if they are stored in the block cache, they should be forcibly erased from the block cache before closing the table, which results in deleting the object. Otherwise the destructor will be called later, when the cache lazily erases the object; with the parent table no longer existing, that could result in undefined behavior.
      
      Update: there will still be cases where the filter is not removed from the cache, since the table has not kept a pointer to the cache handle and so cannot forcibly release it later. We make sure that the filter destructor does not access the table pointer, to get around such cases.
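      The lifetime invariant can be illustrated with a toy cache (a hypothetical structure; the real fix erases the filter block's cache entry on the table-close path):

      ```python
      class ToyTable:
          """Toy sketch: a cached filter holds a back-pointer to its table,
          so it must be evicted before the table is destroyed."""

          def __init__(self, block_cache):
              self.block_cache = block_cache
              # The cached filter keeps a reference back into this table.
              self.block_cache["filter"] = {"owner": self}

          def close(self):
              # Forcibly erase (and thereby destroy) any cache entry that
              # points back into the table before the table goes away;
              # a lazy eviction later would dereference a dead table.
              self.block_cache.pop("filter", None)

      cache = {}
      table = ToyTable(cache)
      table.close()
      ```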
      Closes https://github.com/facebook/rocksdb/pull/2207
      
      Differential Revision: D4941591
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 56fbab2a11cf447e1aa67caa30b58d7bd7ce5bbd
      89833577
    • M
      Avoid pinning when row cache is accessed · 47a09b0a
      Committed by Maysam Yabandeh
      Summary:
      With the row cache enabled, the table cache takes a short-circuit path for reading data. This path needs to be updated to take advantage of pinnable slice; in the meantime we disable pinning in this path.
      Closes https://github.com/facebook/rocksdb/pull/2237
      
      Differential Revision: D4982389
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 542630d0cf23cfb1f0c397da82e7053df7966591
      47a09b0a
    • S
      Remove an assert that causes TSAN failure. · aeaba07b
      Committed by Siying Dong
      Summary:
      ColumnFamilyData::ConstructNewMemtable is called outside the DB mutex, and it asserts that current_ is not empty, but current_ should only be accessed under the DB mutex. Remove this assert to make TSAN happy.
      Closes https://github.com/facebook/rocksdb/pull/2235
      
      Differential Revision: D4978531
      
      Pulled By: siying
      
      fbshipit-source-id: 423685a7dae88ed3faaa9e1b9ccb3427ac704a4b
      aeaba07b
    • S
      Set VALGRIND_VER · 0b90aa95
      Committed by Siying Dong
      Summary:
      VALGRIND_VER was left empty after moving the environment to GCC-5. Set it back.
      Closes https://github.com/facebook/rocksdb/pull/2234
      
      Differential Revision: D4978534
      
      Pulled By: siying
      
      fbshipit-source-id: f0640d58e8f575f75fb3f8b92e686c9e0b6a59bb
      0b90aa95
  10. 29 April 2017, 2 commits
  11. 28 April 2017, 4 commits
  12. 27 April 2017, 1 commit