1. 08 April, 2017 - 1 commit
2. 07 April, 2017 - 1 commit
• CMake: more MinGW fixes · 107c5f6a
  Committed by Tamir Duberstein
      Summary:
siying, this is a resubmission of #2081 with the 4th commit fixed. From that commit message:
      
> Note that the previous use of quotes in PLATFORM_{CC,CXX}FLAGS was
> incorrect and caused GCC to produce the incorrect define:
>
>   #define ROCKSDB_JEMALLOC -DJEMALLOC_NO_DEMANGLE 1
>
> This was the cause of the Linux build failure on the previous version
> of this change.
      
      I've tested this locally, and the Linux build succeeds now.
      Closes https://github.com/facebook/rocksdb/pull/2097
      
      Differential Revision: D4839964
      
      Pulled By: siying
      
      fbshipit-source-id: cc51322
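For illustration, a hedged sketch (not RocksDB source) of what the fix is about: the jemalloc flags must reach the compiler as separate -D options so that each macro is defined on its own, whereas the broken quoting defined ROCKSDB_JEMALLOC with the literal value "-DJEMALLOC_NO_DEMANGLE 1" and never defined JEMALLOC_NO_DEMANGLE at all.

```
// Toy check of the intended outcome of the build flags; it compiles with or
// without the macros, so it only illustrates the distinction.
#include <cstdio>

int main() {
#if defined(ROCKSDB_JEMALLOC) && defined(JEMALLOC_NO_DEMANGLE)
  std::puts("both macros are defined separately (intended result)");
#else
  std::puts("macros missing or fused into one (what the bad quoting produced)");
#endif
  return 0;
}
```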
3. 06 April, 2017 - 2 commits
4. 05 April, 2017 - 3 commits
5. 04 April, 2017 - 2 commits
6. 31 March, 2017 - 1 commit
• Option to fail a request as incomplete when skipping too many internal keys · c6d04f2e
  Committed by Sagar Vemuri
      Summary:
Operations like Seek/Next/Prev sometimes take too long to complete when many internal keys have to be skipped. This adds an option, max_skippable_internal_keys, which sets a threshold on the maximum number of internal keys that can be skipped; once the threshold is exceeded, it is much better to fail the request (as incomplete) than to wait a considerable time for it to complete.

This feature (failing an iterator request as incomplete) is disabled by default, when max_skippable_internal_keys = 0, and is enabled only when max_skippable_internal_keys > 0. A usage sketch follows this entry.
      
      This feature is based on the discussion mentioned in the PR https://github.com/facebook/rocksdb/pull/1084.
      Closes https://github.com/facebook/rocksdb/pull/2000
      
      Differential Revision: D4753223
      
      Pulled By: sagar0
      
      fbshipit-source-id: 1c973f7
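A minimal usage sketch of the option described above, assuming the ReadOptions field and the Incomplete status behaviour added by this PR (check ReadOptions in your RocksDB version):

```
#include <memory>

#include <rocksdb/db.h>
#include <rocksdb/iterator.h>
#include <rocksdb/options.h>

void ScanWithSkipLimit(rocksdb::DB* db) {
  rocksdb::ReadOptions read_opts;
  read_opts.max_skippable_internal_keys = 10000;  // 0 (the default) disables the limit

  std::unique_ptr<rocksdb::Iterator> it(db->NewIterator(read_opts));
  for (it->SeekToFirst(); it->Valid(); it->Next()) {
    // consume it->key() / it->value()
  }
  if (it->status().IsIncomplete()) {
    // the iterator gave up after skipping too many internal keys
    // (e.g. tombstones) instead of stalling for a long time
  }
}
```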
7. 23 March, 2017 - 1 commit
8. 22 March, 2017 - 1 commit
9. 21 March, 2017 - 1 commit
10. 17 March, 2017 - 1 commit
• Break stalls when no bg work is happening · d52f334c
  Committed by Islam AbdelRahman
      Summary:
Currently the stall will keep sleeping even if there are no flushes/compactions to wait for; I changed the logic to break the stall if we are not flushing or compacting. (A programmatic sketch of the delayed-write throttle option follows this entry.)

db_bench command used:
      ```
      # fillrandom
      # memtable size = 10MB
      # value size = 1 MB
      # num = 1000
      # use /dev/shm
      ./db_bench --benchmarks="fillrandom,stats" --value_size=1048576 --write_buffer_size=10485760 --num=1000 --delayed_write_rate=XXXXX  --db="/dev/shm/new_stall" | grep "Cumulative stall"
      ```
      
      ```
      Current results
      
      # delayed_write_rate = 1000 Kb/sec
      Cumulative stall: 00:00:9.031 H:M:S
      
      # delayed_write_rate = 200 Kb/sec
      Cumulative stall: 00:00:22.314 H:M:S
      
      # delayed_write_rate = 100 Kb/sec
      Cumulative stall: 00:00:42.784 H:M:S
      
      # delayed_write_rate = 50 Kb/sec
      Cumulative stall: 00:01:23.785 H:M:S
      
      # delayed_write_rate = 25 Kb/sec
      Cumulative stall: 00:02:45.702 H:M:S
      ```
      
      ```
      New results
      
      # delayed_write_rate = 1000 Kb/sec
      Cumulative stall: 00:00:9.017 H:M:S
      
      # delayed_write_rate = 200 Kb/sec
      Cumulative stall: 00
      Closes https://github.com/facebook/rocksdb/pull/1884
      
      Differential Revision: D4585439
      
      Pulled By: IslamAbdelRahman
      
      fbshipit-source-id: aed2198
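For reference, a hedged sketch of configuring the same throttle programmatically; Options::delayed_write_rate is in bytes per second, and the 10 MB write buffer matches the benchmark setup above:

```
#include <rocksdb/options.h>

rocksdb::Options MakeThrottledOptions() {
  rocksdb::Options options;
  options.write_buffer_size = 10 << 20;      // 10 MB memtable, as in the db_bench run
  options.delayed_write_rate = 200 * 1024;   // ~200 KB/s delayed-write throttle
  return options;
}
```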
11. 16 March, 2017 - 1 commit
• Add macros to include file name and line number during Logging · e1916368
  Committed by Islam AbdelRahman
      Summary:
Current logging:
      ```
      2017/03/14-14:20:30.393432 7fedde9f5700 (Original Log Time 2017/03/14-14:20:30.393414) [default] Level summary: base level 1 max bytes base 268435456 files[1 0 0 0 0 0 0] max score 0.25
      2017/03/14-14:20:30.393438 7fedde9f5700 [JOB 2] Try to delete WAL files size 61417909, prev total WAL file size 73820858, number of live WAL files 2.
      2017/03/14-14:20:30.393464 7fedde9f5700 [DEBUG] [JOB 2] Delete /dev/shm/old_logging//MANIFEST-000001 type=3 #1 -- OK
      2017/03/14-14:20:30.393472 7fedde9f5700 [DEBUG] [JOB 2] Delete /dev/shm/old_logging//000003.log type=0 #3 -- OK
      2017/03/14-14:20:31.427103 7fedd49f1700 [default] New memtable created with log file: #9. Immutable memtables: 0.
      2017/03/14-14:20:31.427179 7fedde9f5700 [JOB 3] Syncing log #6
      2017/03/14-14:20:31.427190 7fedde9f5700 (Original Log Time 2017/03/14-14:20:31.427170) Calling FlushMemTableToOutputFile with column family [default], flush slots available 1, compaction slots allowed 1, compaction slots scheduled 1
      2017/03/14-14:20:31.
      Closes https://github.com/facebook/rocksdb/pull/1990
      
      Differential Revision: D4708695
      
      Pulled By: IslamAbdelRahman
      
      fbshipit-source-id: cb8968f
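The general technique is stamping __FILE__ and __LINE__ at the call site through a macro. A hedged, stand-alone sketch follows; the macro names and formatting RocksDB actually adopted are in the PR, so LOG_WITH_LOCATION here is purely hypothetical:

```
#include <cstdio>

// Hypothetical macro for illustration: prepends the call site so every log
// line carries file:line without the caller having to spell it out.
// (##__VA_ARGS__ is a GCC/Clang extension that recent MSVC also accepts.)
#define LOG_WITH_LOCATION(fmt, ...) \
  std::fprintf(stderr, "[%s:%d] " fmt "\n", __FILE__, __LINE__, ##__VA_ARGS__)

int main() {
  LOG_WITH_LOCATION("[JOB %d] Delete %s type=%d -- OK", 2, "MANIFEST-000001", 3);
  return 0;
}
```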
12. 14 March, 2017 - 1 commit
• Pinnableslice (2nd attempt) · 11526252
  Committed by Maysam Yabandeh
      Summary:
      PinnableSlice
      
          Summary:
    Currently the point lookup values are copied to a string provided by the
    user. This incurs an extra memcpy cost. This patch allows doing point lookups
    via a PinnableSlice, which pins the source memory location (instead of
    copying its content) and releases it after the content is consumed
    by the user. The old Get(string) API is translated to the new API
    underneath.
      
          Here is the summary for improvements:
      
          value 100 byte: 1.8% regular, 1.2% merge values
          value 1k byte: 11.5% regular, 7.5% merge values
          value 10k byte: 26% regular, 29.9% merge values
    The improvement for merge could be greater if we extend this approach to
    pin the merge output and delay the full merge operation until the user
    actually needs it. We have left that for future work.
      
          PS:
          Sometimes we observe a small decrease in performance when switching from
          t5452014 to this patch but with the old Get(string) API. The d
      Closes https://github.com/facebook/rocksdb/pull/1756
      
      Differential Revision: D4391738
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 6f3edd3
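A minimal sketch of the pinned lookup path (the Get overload taking a PinnableSlice introduced by this PR):

```
#include <rocksdb/db.h>
#include <rocksdb/slice.h>

// The value stays pinned (no extra memcpy) until the PinnableSlice is Reset()
// or destroyed.
rocksdb::Status GetPinned(rocksdb::DB* db, const rocksdb::Slice& key) {
  rocksdb::PinnableSlice pinnable_val;
  rocksdb::Status s = db->Get(rocksdb::ReadOptions(),
                              db->DefaultColumnFamily(), key, &pinnable_val);
  if (s.ok()) {
    // use pinnable_val.data() / pinnable_val.size() directly
  }
  return s;  // pinned memory is released when pinnable_val goes out of scope
}
```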
13. 08 March, 2017 - 2 commits
• Add a memtable-only iterator · 97edc72d
  Committed by Sagar Vemuri
      Summary:
This PR adds a way to iterate over all the keys that are only in memtables (a usage sketch follows this entry).
      Closes https://github.com/facebook/rocksdb/pull/1953
      
      Differential Revision: D4663500
      
      Pulled By: sagar0
      
      fbshipit-source-id: 144e177
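A hedged usage sketch, assuming the memtable-only mode is exposed through ReadOptions::read_tier as kMemtableTier (check the ReadTier enum in options.h for your version):

```
#include <memory>

#include <rocksdb/db.h>
#include <rocksdb/options.h>

void ScanMemtablesOnly(rocksdb::DB* db) {
  rocksdb::ReadOptions ro;
  ro.read_tier = rocksdb::kMemtableTier;  // only keys still in memtables are visited
  std::unique_ptr<rocksdb::Iterator> it(db->NewIterator(ro));
  for (it->SeekToFirst(); it->Valid(); it->Next()) {
    // it->key() / it->value() come from memtables; no SST files are read
  }
}
```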
• fix db_sst_test flakiness · 72202962
  Committed by Leonidas Galanis
      Summary:
db_sst_test has occasionally been flaky in the following way: reached_max_space_on_compaction can, in very rare cases, be 0. This happens when the limit on maximum allowable space set using SetMaxAllowedSpaceUsage is hit during flush for all test db sizes (1, 2, 4, 8 and 10 MB). The fix clears the error returned when the space limit is reached during flush, which ensures that the compaction callback will always be called. The runtime increases slightly because the 1 MB loop writes more data and hits the limit during multiple flushes until a compaction is scheduled. (A sketch of the space-limit setup follows this entry.)
      Closes https://github.com/facebook/rocksdb/pull/1861
      
      Differential Revision: D4557396
      
      Pulled By: lgalanis
      
      fbshipit-source-id: ff778d1
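For context, a hedged sketch of the space-limit setup the test exercises (an SstFileManager with SetMaxAllowedSpaceUsage); the helper name and values are placeholders:

```
#include <cstdint>
#include <memory>

#include <rocksdb/env.h>
#include <rocksdb/options.h>
#include <rocksdb/sst_file_manager.h>

rocksdb::Options MakeSpaceLimitedOptions(uint64_t max_bytes) {
  rocksdb::Options options;
  options.sst_file_manager.reset(
      rocksdb::NewSstFileManager(rocksdb::Env::Default()));
  options.sst_file_manager->SetMaxAllowedSpaceUsage(max_bytes);  // e.g. 1 MB in the test
  return options;
}
```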
14. 07 March, 2017 - 1 commit
• Set logs as getting flushed before releasing lock, race condition fix · 58b12dfe
  Committed by Reid Horuff
      Summary:
      Relating to #1903:
      
      In MaybeFlushColumnFamilies() we want to modify the 'getting_flushed' flag before releasing the db mutex when SwitchMemtable() is called.
      
Each of the following two sequences of steps needs to be atomic in MaybeFlushColumnFamilies() (a simplified sketch of the claim-under-lock idea follows this entry):
      - getting_flushed is false on oldest log
      - we determine that all CFs can be flushed to successfully release oldest log
      - we set getting_flushed = true on the oldest log.
      -------
      - getting_flushed is false on oldest log
      - we determine that all CFs can NOT be flushed to successfully release oldest log
      - we set unable_to_flush_oldest_log_ = true on the oldest log.
      
      #### In the 2pc case:
      
      T1 enters function but is unable to flush all CFs to release log
      T1 sets unable_to_flush_oldest_log_ = true
      T1 begins flushing all CFs possible
      
      T2 enters function but is unable to flush all CFs to release log
T2 sees unable_to_flush_oldest_log_ has been set, so it exits
      
      T3 enters function and will be able to flush all CFs to release oldest log
      T3 sets getting_flushed = true on oldes
      Closes https://github.com/facebook/rocksdb/pull/1909
      
      Differential Revision: D4646235
      
      Pulled By: reidHoruff
      
      fbshipit-source-id: c8d0447
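A highly simplified sketch of the claim-under-lock pattern described above; a plain std::mutex and bool stand in for the db mutex and the per-log flag, so this is not the actual DBImpl code:

```
#include <mutex>

void ClaimOldestLogThenFlush(std::mutex& db_mutex, bool& getting_flushed) {
  std::unique_lock<std::mutex> lock(db_mutex);
  if (!getting_flushed) {
    getting_flushed = true;  // claim the oldest log while still holding the lock
    lock.unlock();           // other threads now see the claim and skip the work
    // ... schedule the flushes that will let the oldest log be released ...
  }
}
```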
15. 03 March, 2017 - 1 commit
16. 01 March, 2017 - 1 commit
17. 28 February, 2017 - 1 commit
18. 24 February, 2017 - 1 commit
19. 22 February, 2017 - 1 commit
• Fix interference between max_total_wal_size and db_write_buffer_size checks · 18eeb7b9
  Committed by Mike Kolupaev
      Summary:
This is a trivial fix for OOMs we saw a few days ago in logdevice.

RocksDB got into the following state:
      (1) Write throughput is too high for flushes to keep up. Compactions are out of the picture - automatic compactions are disabled, and for manual compactions we don't care that much if they fall behind. We write to many CFs, with only a few L0 sst files in each, so compactions are not needed most of the time.
      (2) total_log_size_ is consistently greater than GetMaxTotalWalSize(). It doesn't get smaller since flushes are falling ever further behind.
      (3) Total size of memtables is way above db_write_buffer_size and keeps growing. But the write_buffer_manager_->ShouldFlush() is not checked because (2) prevents it (for no good reason, afaict; this is what this commit fixes).
      (4) Every call to WriteImpl() hits the MaybeFlushColumnFamilies() path. This keeps flushing the memtables one by one in order of increasing log file number.
      (5) No write stalling trigger is hit. We rely on max_write_buffer_number
      Closes https://github.com/facebook/rocksdb/pull/1893
      
      Differential Revision: D4593590
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: af79c5f
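For reference, a minimal sketch of the two limits whose checks were interfering; the values are placeholders:

```
#include <rocksdb/options.h>

rocksdb::Options MakeBoundedMemoryOptions() {
  rocksdb::Options options;
  options.db_write_buffer_size = 512ull << 20;  // cap on total memtable memory across CFs
  options.max_total_wal_size   = 1ull << 30;    // cap on total WAL size before forcing flushes
  return options;
}
```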
20. 18 February, 2017 - 2 commits
21. 14 February, 2017 - 2 commits
• Make DBImpl::has_unpersisted_data_ atomic · c2247dc1
  Committed by Yi Wu
      Summary:
It seems to me that `has_unpersisted_data_` is read from the read thread and
written from the write thread concurrently, without synchronization. This
makes it an atomic (see the sketch after this entry).

I updated the logic not because I saw any problem with it, but because it
just felt confusing.
      Closes https://github.com/facebook/rocksdb/pull/1869
      
      Differential Revision: D4555837
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: eff2ab8
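A minimal sketch of the change in isolation (a hypothetical stand-in class, not DBImpl itself): a shared bool becomes std::atomic<bool> so concurrent reads and writes are well-defined:

```
#include <atomic>

class DBImplSketch {  // illustration only
 public:
  void MarkUnpersisted() {
    has_unpersisted_data_.store(true, std::memory_order_release);
  }
  bool HasUnpersistedData() const {
    return has_unpersisted_data_.load(std::memory_order_acquire);
  }

 private:
  std::atomic<bool> has_unpersisted_data_{false};
};
```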
• Remove disableDataSync option · eb912a92
  Committed by Sagar Vemuri
      Summary:
Remove disableDataSync and the similarly named disable_data_sync option.
This is being done to simplify options, and also because the performance gains of this feature can be achieved by other methods.
      Closes https://github.com/facebook/rocksdb/pull/1859
      
      Differential Revision: D4541292
      
      Pulled By: sagar0
      
      fbshipit-source-id: 5b3a6ca
22. 07 February, 2017 - 1 commit
23. 04 February, 2017 - 1 commit
24. 03 February, 2017 - 1 commit
25. 27 January, 2017 - 1 commit
26. 26 January, 2017 - 2 commits
27. 25 January, 2017 - 1 commit
28. 21 January, 2017 - 3 commits
29. 20 January, 2017 - 2 commits