Unverified commit 7cc21232, authored by Sandy Xu, committed by GitHub

doc: add benchmark result for different meta engines (#445)

* doc: add benchmark result for different meta engines

* doc(zh_cn): add benchmark result for different meta engines

* address comments
Parent 6e8e11b1
@@ -43,6 +43,7 @@ JuiceFS can simply and conveniently connect massive cloud storage directly to bi
- **Advanced Topic**
- [Redis best practices](redis_best_practices.md)
- [JuiceFS benchmark](benchmark.md)
- [JuiceFS metadata engines benchmark](metadata_engines_benchmark.md)
- [POSIX Compatibility](posix_compatibility.md)
- [JuiceFS cache management](cache_management.md)
- [JuiceFS operations profiling](operations_profiling.md)
@@ -8,7 +8,7 @@ Metadata and data are equally important. The metadata records the detailed infor
The metadata storage of JuiceFS uses a multi-engine design. To build an ultra-high-performance cloud-native file system, JuiceFS first supported [Redis](https://redis.io), an in-memory key-value database, which gives JuiceFS roughly ten times the performance of Amazon [EFS](https://aws.amazon.com/efs) and [S3FS](https://github.com/s3fs-fuse/s3fs-fuse) ([view test results](benchmark.md)).
Through active interaction with community users, we found that many application scenarios do not strictly depend on high performance. Sometimes users just want a convenient tool to reliably migrate data on the cloud, or simply want to mount object storage locally for small-scale use. Therefore, JuiceFS has successively added support for more databases such as MySQL/MariaDB and SQLite.
Through active interaction with community users, we found that many application scenarios do not strictly depend on high performance. Sometimes users just want a convenient tool to reliably migrate data on the cloud, or simply want to mount object storage locally for small-scale use. Therefore, JuiceFS has successively added support for more databases such as MySQL/MariaDB and SQLite (some performance comparisons are recorded [here](metadata_engines_benchmark.md)).
**Please pay special attention**: no matter which database you choose to store metadata, **make sure to keep the metadata safe**! Once the metadata is damaged or lost, the corresponding data will be completely damaged or lost as well, and in serious cases the entire file system may be destroyed.
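For illustration, the metadata engine is selected by the database URL passed to `juicefs format` (and later `juicefs mount`). A minimal sketch, assuming an S3 bucket and local databases; the bucket, credentials, database names and the volume name `myjfs` below are placeholders:
```bash
# Redis as the metadata engine (database 1 on localhost)
$ juicefs format --storage s3 --bucket https://mybucket.s3.amazonaws.com redis://localhost:6379/1 myjfs

# MySQL as the metadata engine
$ juicefs format --storage s3 --bucket https://mybucket.s3.amazonaws.com "mysql://user:password@(localhost:3306)/juicefs" myjfs

# SQLite as the metadata engine (handy for single-machine, small-scale use)
$ juicefs format sqlite3:///home/herald/my-jfs.db myjfs
```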
@@ -185,4 +185,4 @@ Coming soon...
## PostgreSQL
Coming soon...
\ No newline at end of file
Coming soon...
# Metadata Engines Benchmark
Conclusions first:
- For pure metadata operations, MySQL costs about 3 to 5 times as much time as Redis
- For small I/O (~100 KiB) workloads, the total time cost with MySQL is about 1 to 3 times that with Redis
- For large I/O (~4 MiB) workloads, the total time cost shows no obvious difference across metadata engines (the object storage becomes the bottleneck)
Details are provided below. Note that all tests use the same object storage (to store the data), client hosts and metadata hosts; only the metadata engines differ.
## Environment
### Object Storage
Amazon S3.
### Client Hosts
- Amazon c5.xlarge: 4 vCPUs, 8 GiB Memory, Up to 10 Gigabit Network
- Ubuntu 18.04.4 LTS
### Meta Hosts
- Amazon c5d.xlarge: 4 vCPUs, 8 GiB Memory, Up to 10 Gigabit Network, 100 GB SSD (local storage for metadata engines)
- Ubuntu 18.04.4 LTS
- The SSD is formatted as ext4 and mounted at `/data` (see the sketch below)
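A minimal sketch of preparing the local disk; the device name `/dev/nvme1n1` is an assumption (check `lsblk` on the actual instance):
```bash
# Format the c5d instance-store SSD as ext4 and mount it at /data
$ sudo mkfs.ext4 /dev/nvme1n1
$ sudo mkdir -p /data
$ sudo mount /dev/nvme1n1 /data
```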
### Meta Engines
#### Redis
- Version: [6.2.3](https://download.redis.io/releases/redis-6.2.3.tar.gz)
- Configuration (see the sketch after this list):
- appendonly: yes
- appendfsync: everysec
- dir: `/data/redis`
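The same settings can be passed directly to the server on start-up; a minimal sketch, assuming every other option keeps its default:
```bash
# AOF persistence, fsynced once per second, data directory on the local SSD
$ redis-server --appendonly yes --appendfsync everysec --dir /data/redis
```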
#### MySQL
- Version: 8.0.25
- `/var/lib/mysql` is bind mounted on `/data/mysql` (see the sketch below)
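A sketch of how such a bind mount can be set up; it assumes MySQL is stopped first and that the systemd service is named `mysql`:
```bash
# Relocate the MySQL data directory onto the local SSD via a bind mount
$ sudo systemctl stop mysql
$ sudo mkdir -p /data/mysql
$ sudo rsync -a /var/lib/mysql/ /data/mysql/   # copy existing data, preserving ownership
$ sudo mount --bind /data/mysql /var/lib/mysql
$ sudo systemctl start mysql
```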
## Tools
All the following tests are run for each metadata engine.
### Golang Benchmark
Simple benchmarks within the source code: `pkg/meta/benchmarks_test.go`.
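They can be run with the standard Go tooling from the JuiceFS source tree; a sketch, assuming the default bench pattern and that any backing services the benchmarks expect (e.g. a local Redis) are available:
```bash
# Run only the metadata micro-benchmarks (the -run pattern skips regular tests)
$ go test ./pkg/meta/ -bench . -run '^$'
```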
### JuiceFS Bench
JuiceFS provides a basic benchmark command:
```bash
$ ./juicefs bench /mnt/jfs
```
### mdtest
- Version: mdtest-3.4.0+dev
Run parallel tests on 3 client nodes:
```bash
$ cat myhost
client1 slots=4
client2 slots=4
client3 slots=4
```
Test commands:
```bash
# metadata only: a directory tree with branching factor 3 (-b), depth 1 (-z), 100 items per directory (-I)
$ mpirun --use-hwthread-cpus --allow-run-as-root -np 12 --hostfile myhost --map-by slot /root/mdtest -b 3 -z 1 -I 100 -d /mnt/jfs
# 12000 * 100 KiB files: files only (-F), 12 processes x 1000 files each (-I 1000), 100 KiB written per file (-w 102400)
$ mpirun --use-hwthread-cpus --allow-run-as-root -np 12 --hostfile myhost --map-by slot /root/mdtest -F -w 102400 -I 1000 -z 0 -d /mnt/jfs
```
### fio
- Version: fio-3.1
```bash
# 4 concurrent jobs, each sequentially writing 4 GiB in 4 MiB blocks, with an fsync at the end of each job
fio --name=big-write --directory=/mnt/jfs --rw=write --refill_buffers --bs=4M --size=4G --numjobs=4 --end_fsync=1 --group_reporting
```
## Results
### Golang Benchmark
- Shows time cost (µs/op); smaller is better
- Numbers in parentheses are the ratio of the MySQL cost to the Redis cost
| | Redis | MySQL |
| ---- | ----- | ----- |
| mkdir | 421 | 1820 (4.3) |
| mvdir | 586 | 2872 (4.9) |
| rmdir | 504 | 2248 (4.5) |
| readdir_10 | 220 | 1047 (4.8) |
| readdir_1k | 1506 | 14354 (9.5) |
| mknod | 442 | 1821 (4.1) |
| create | 437 | 1768 (4.0) |
| rename | 580 | 2840 (4.9) |
| unlink | 456 | 2525 (5.5) |
| lookup | 76 | 310 (4.1) |
| getattr | 69 | 269 (3.9) |
| setattr | 283 | 1023 (3.6) |
| access | 69 | 269 (3.9) |
| setxattr | 71 | 921 (13.0) |
| getxattr | 68 | 242 (3.6) |
| removexattr | 76 | 711 (9.4) |
| listxattr_1 | 68 | 259 (3.8) |
| listxattr_10 | 70 | 290 (4.1) |
| link | 360 | 2058 (5.7) |
| symlink | 429 | 2013 (4.7) |
| newchunk | 69 | 0 (0.0) |
| write | 368 | 2720 (7.4) |
| read_1 | 71 | 236 (3.3) |
| read_10 | 87 | 301 (3.5) |
### JuiceFS Bench
| | Redis | MySQL |
| -------------- | -------------- | -------------- |
| Write big | 318.84 MiB/s | 306.77 MiB/s |
| Read big | 469.94 MiB/s | 507.13 MiB/s |
| Write small | 23.4 files/s | 24.6 files/s |
| Read small | 2155.4 files/s | 1714.7 files/s |
| Stat file | 6015.8 files/s | 2867.9 files/s |
| FUSE operation | 0.4 ms | 0.4 ms |
| Update meta | 0.9 ms | 2.5 ms |
### mdtest
- Shows rate (ops/sec); higher is better
| | Redis | MySQL |
| ------------------ | --------- | ----- |
| EMPTY FILES | | |
| Directory creation | 282.694 | 215.366 |
| Directory stat | 47474.718 | 12632.878 |
| Directory removal | 330.430 | 198.588 |
| File creation | 222.603 | 226.587 |
| File stat | 45960.505 | 13012.763 |
| File read | 49088.346 | 15622.533 |
| File removal | 334.759 | 195.183 |
| Tree creation | 956.797 | 390.026 |
| Tree removal | 295.399 | 284.733 |
| SMALL FILES | | |
| File creation | 255.077 | 245.659 |
| File stat | 51799.065 | 14191.255 |
| File read | 47091.975 | 16794.314 |
| File removal | 631.046 | 194.810 |
| Tree creation | 749.869 | 339.375 |
| Tree removal | 282.643 | 165.118 |
### fio
| | Redis | MySQL |
| --------------- | --------- | --------- |
| Write bandwidth | 350 MiB/s | 360 MiB/s |
@@ -41,6 +41,7 @@ JuiceFS is a high-performance [POSIX](https://en.wikipedia.org/wiki/POSIX) file
- Advanced Topics
- [Redis best practices](redis_best_practices.md)
- [JuiceFS benchmark](benchmark.md)
- [JuiceFS metadata engines benchmark](metadata_engines_benchmark.md)
- [POSIX compatibility](posix_compatibility.md)
- [JuiceFS cache management](cache_management.md)
- [JuiceFS operations profiling](operations_profiling.md)
@@ -8,7 +8,7 @@
The metadata storage of JuiceFS uses a multi-engine design. To build an ultra-high-performance cloud-native file system, JuiceFS first supported [Redis](https://redis.io), an in-memory key-value database, which gives JuiceFS roughly ten times the performance of Amazon [EFS](https://aws.amazon.com/efs) and [S3FS](https://github.com/s3fs-fuse/s3fs-fuse) ([view test results](benchmark.md)).
Through active interaction with community users, we found that many application scenarios do not strictly depend on high performance. Sometimes users just want a convenient tool to reliably migrate data on the cloud, or simply want to mount object storage locally for small-scale use. Therefore, JuiceFS has successively added support for more databases such as MySQL/MariaDB and SQLite.
Through active interaction with community users, we found that many application scenarios do not strictly depend on high performance. Sometimes users just want a convenient tool to reliably migrate data on the cloud, or simply want to mount object storage locally for small-scale use. Therefore, JuiceFS has successively added support for more databases such as MySQL/MariaDB and SQLite (performance comparisons are available [here](metadata_engines_benchmark.md)).
**Please pay special attention**: no matter which database you choose to store metadata, **make sure to keep the metadata safe**! Once the metadata is damaged or lost, the corresponding data will be completely damaged or lost as well, and in serious cases the entire file system may be destroyed.
@@ -185,4 +185,4 @@ $ sudo juicefs mount -d sqlite3:///home/herald/my-jfs.db /mnt/jfs/
## PostgreSQL
Coming soon...
\ No newline at end of file
Coming soon...
# Metadata Engines Benchmark
Conclusions first:
- For pure metadata operations, MySQL costs about 3 to 5 times as much time as Redis
- For small I/O (~100 KiB) workloads, the total time cost with MySQL is about 1 to 3 times that with Redis
- For large I/O (~4 MiB) workloads, the total time cost shows no obvious difference across metadata engines (the object storage becomes the bottleneck)
Details are provided below. Note that all tests use the same object storage (to store the data), client hosts and metadata hosts; only the metadata engines differ.
## Environment
### Object Storage
Amazon S3.
### Client Hosts
- Amazon c5.xlarge: 4 vCPUs, 8 GiB Memory, Up to 10 Gigabit Network
- Ubuntu 18.04.4 LTS
### Meta Hosts
- Amazon c5d.xlarge: 4 vCPUs, 8 GiB Memory, Up to 10 Gigabit Network, 100 GB SSD (local storage for metadata engines)
- Ubuntu 18.04.4 LTS
- The SSD is formatted as ext4 and mounted at `/data`
### Meta Engines
#### Redis
- Version: [6.2.3](https://download.redis.io/releases/redis-6.2.3.tar.gz)
- Configuration:
- appendonly: yes
- appendfsync: everysec
- dir: `/data/redis`
#### MySQL
- Version: 8.0.25
- `/var/lib/mysql` is bind mounted on `/data/mysql`
## Tools
All the following tests are run for each metadata engine.
### Golang Benchmark
Simple benchmarks within the source code: `pkg/meta/benchmarks_test.go`
### JuiceFS Bench
JuiceFS provides a basic benchmark command:
```bash
$ ./juicefs bench /mnt/jfs
```
### mdtest
- Version: mdtest-3.4.0+dev
Run parallel tests on 3 client nodes:
```bash
$ cat myhost
client1 slots=4
client2 slots=4
client3 slots=4
```
Test commands:
```bash
# meta only
$ mpirun --use-hwthread-cpus --allow-run-as-root -np 12 --hostfile myhost --map-by slot /root/mdtest -b 3 -z 1 -I 100 -d /mnt/jfs
# 12000 * 100KiB files
$ mpirun --use-hwthread-cpus --allow-run-as-root -np 12 --hostfile myhost --map-by slot /root/mdtest -F -w 102400 -I 1000 -z 0 -d /mnt/jfs
```
### fio
- Version: fio-3.1
```bash
fio --name=big-write --directory=/mnt/jfs --rw=write --refill_buffers --bs=4M --size=4G --numjobs=4 --end_fsync=1 --group_reporting
```
## Results
### Golang Benchmark
- Shows time cost (µs/op); smaller is better
- Numbers in parentheses are the ratio of the MySQL cost to the Redis cost
| | Redis | MySQL |
| ---- | ----- | ----- |
| mkdir | 421 | 1820 (4.3) |
| mvdir | 586 | 2872 (4.9) |
| rmdir | 504 | 2248 (4.5) |
| readdir_10 | 220 | 1047 (4.8) |
| readdir_1k | 1506 | 14354 (9.5) |
| mknod | 442 | 1821 (4.1) |
| create | 437 | 1768 (4.0) |
| rename | 580 | 2840 (4.9) |
| unlink | 456 | 2525 (5.5) |
| lookup | 76 | 310 (4.1) |
| getattr | 69 | 269 (3.9) |
| setattr | 283 | 1023 (3.6) |
| access | 69 | 269 (3.9) |
| setxattr | 71 | 921 (13.0) |
| getxattr | 68 | 242 (3.6) |
| removexattr | 76 | 711 (9.4) |
| listxattr_1 | 68 | 259 (3.8) |
| listxattr_10 | 70 | 290 (4.1) |
| link | 360 | 2058 (5.7) |
| symlink | 429 | 2013 (4.7) |
| newchunk | 69 | 0 (0.0) |
| write | 368 | 2720 (7.4) |
| read_1 | 71 | 236 (3.3) |
| read_10 | 87 | 301 (3.5) |
### JuiceFS Bench
| | Redis | MySQL |
| -------------- | -------------- | -------------- |
| Write big | 318.84 MiB/s | 306.77 MiB/s |
| Read big | 469.94 MiB/s | 507.13 MiB/s |
| Write small | 23.4 files/s | 24.6 files/s |
| Read small | 2155.4 files/s | 1714.7 files/s |
| Stat file | 6015.8 files/s | 2867.9 files/s |
| FUSE operation | 0.4 ms | 0.4 ms |
| Update meta | 0.9 ms | 2.5 ms |
### mdtest
- Shows rate (ops/sec); higher is better
| | Redis | MySQL |
| ------------------ | --------- | ----- |
| EMPTY FILES | | |
| Directory creation | 282.694 | 215.366 |
| Directory stat | 47474.718 | 12632.878 |
| Directory removal | 330.430 | 198.588 |
| File creation | 222.603 | 226.587 |
| File stat | 45960.505 | 13012.763 |
| File read | 49088.346 | 15622.533 |
| File removal | 334.759 | 195.183 |
| Tree creation | 956.797 | 390.026 |
| Tree removal | 295.399 | 284.733 |
| SMALL FILES | | |
| File creation | 255.077 | 245.659 |
| File stat | 51799.065 | 14191.255 |
| File read | 47091.975 | 16794.314 |
| File removal | 631.046 | 194.810 |
| Tree creation | 749.869 | 339.375 |
| Tree removal | 282.643 | 165.118 |
### fio
| | Redis | MySQL |
| --------------- | --------- | --------- |
| Write bandwidth | 350 MiB/s | 360 MiB/s |