Unverified · Commit 6078a35e · Author: Herald Yu · Committer: GitHub

Docs: add en/juicefs_vs_s3ql.md (#825)

* Docs: add en/juicefs_vs_s3ql.md

* Update
Co-authored-by: Changjian Gao <gcj@juicedata.io>
Parent e831b7b3
# JuiceFS vs. S3QL
Like JuiceFS, [S3QL](https://github.com/s3ql/s3ql) is an open source network file system backed by object storage and a database. Data is split into blocks and stored in an object storage service such as Amazon S3, Backblaze B2, or OpenStack Swift, while the corresponding metadata is stored in the database.
## Similarities
- Both support the standard POSIX file system interface through the FUSE module, so massive cloud storage can be mounted locally and used like local storage.
- Both provide standard file system features: hard links, symbolic links, extended attributes, and file permissions.
- Both support data compression and encryption, though they use different algorithms.
## Differences
- S3QL supports only SQLite as its metadata database, whereas JuiceFS supports several, including Redis, TiKV, MySQL, PostgreSQL, and SQLite.
- S3QL has no distributed capability and does not support mounting the same file system on multiple hosts. JuiceFS is a typical distributed file system: when it uses a network-based database, multiple hosts can mount the file system and read and write concurrently.
- S3QL provides data deduplication: only one copy of identical data is stored, which reduces storage usage but adds performance overhead. JuiceFS prioritizes performance, and deduplicating large-scale data is too expensive, so it does not provide this feature for now.
- S3QL provides remote backup of metadata: the SQLite database holding the metadata is backed up asynchronously to object storage. JuiceFS mainly uses network databases such as Redis and MySQL and does not directly provide a synchronized backup of the SQLite database. However, JuiceFS supports metadata import and export as well as synchronization between various storage backends, so users can easily back up metadata to object storage and migrate between different databases.
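
The metadata export and import mentioned in the last point can be sketched with the `juicefs dump` and `juicefs load` subcommands. The wrapper names `backup_meta` and `migrate_meta` are assumptions made for this illustration and are not part of JuiceFS; check `juicefs dump --help` for the options available in your version.

```shell
# backup_meta META_URL DUMP_FILE
# Hypothetical wrapper: exports the metadata of a JuiceFS volume to a JSON file.
backup_meta() {
    juicefs dump "$1" "$2"
}

# migrate_meta DUMP_FILE NEW_META_URL
# Hypothetical wrapper: imports a previously exported dump into another
# metadata engine. The target database is expected to be empty.
migrate_meta() {
    juicefs load "$2" "$1"
}
```

For example, `backup_meta sqlite3://myjfs.db myjfs-meta.json` would write the dump to a local file (the file name is arbitrary), which can then be uploaded to object storage or passed to `migrate_meta` together with the URL of a different metadata engine.
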
| | **S3QL** | **JuiceFS** |
| ------------------------- | :-------------------: | :---------------------------: |
| Metadata engine | SQLite | Redis, TiKV, MySQL, PostgreSQL, SQLite |
| Storage engine | Object Storage, Local | Object Storage, HDFS, WebDAV, Local |
| Operating system | Unix-like | Linux, macOS, Windows |
| Compression algorithm | LZMA, bzip2, gzip | lz4, zstd |
| Encryption algorithm | AES-256 | AES-GCM, RSA |
| POSIX compatible | ✓ | ✓ |
| Hard link | ✓ | ✓ |
| Symbolic link | ✓ | ✓ |
| Extended attributes | ✓ | ✓ |
| Standard Unix permissions | ✓ | ✓ |
| Data block | ✓ | ✓ |
| Local cache | ✓ | ✓ |
| Elastic storage | ✓ | ✓ |
| Metadata backup | ✓ | ✓ |
| Data deduplication | ✓ | ✕ |
| Immutable trees | ✓ | ✕ |
| Snapshots | ✓ | ✕ |
| Shared mount | ✕ | ✓ |
| Hadoop SDK | ✕ | ✓ |
| Kubernetes CSI Driver | ✕ | ✓ |
| S3 gateway | ✕ | ✓ |
| Language | Python | Go |
| Open source license | GPLv3 | AGPLv3 |
| Open source date | 2011 | 2021.1 |
## Usage
This section compares how easy the two products are to install and use.
### Installation
The installation was performed on Rocky Linux 8.4 (kernel version 4.18.0-305.12.1.el8_4.x86_64).
#### S3QL
S3QL is developed in Python and requires python-devel 3.7 or later. At least the following dependencies must also be satisfied: fuse3-devel, gcc, pyfuse3, sqlite-devel, cryptography, defusedxml, apsw, and dugong. Pay special attention to Python package dependencies and install locations.
S3QL installs 12 binary programs on the system, each providing an independent function, as shown in the figure below.
![](../../images/s3ql-bin.jpg)
#### JuiceFS
JuiceFS is developed in Go and can be used right after downloading the pre-compiled binary. The JuiceFS client consists of a single binary program, `juicefs`, which can be copied to any directory on the system's executable path, for example `/usr/local/bin`.
### Create and mount a file system
Both S3QL and JuiceFS use a database to store metadata. S3QL supports only SQLite, while JuiceFS supports databases such as Redis, TiKV, MySQL, MariaDB, PostgreSQL, and SQLite.
Here we use a locally deployed MinIO object storage service and create a file system with each tool:
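
As a rough sketch of that copy step, the helper below installs an already downloaded binary into a directory on `PATH`. The function name `install_juicefs` and its default arguments are assumptions made for this example, not part of JuiceFS, and writing to `/usr/local/bin` typically requires root privileges.

```shell
# install_juicefs [SRC] [DEST_DIR]
# Hypothetical helper: copies a downloaded `juicefs` binary into DEST_DIR
# (default: /usr/local/bin) and marks it executable.
install_juicefs() {
    src="${1:-./juicefs}"
    dest_dir="${2:-/usr/local/bin}"
    # `install` copies the file and sets its mode in one step.
    install -m 755 "$src" "$dest_dir/juicefs"
}
```

Calling `install_juicefs ./juicefs` from a shell with sufficient privileges places the binary at `/usr/local/bin/juicefs`, after which `juicefs --version` should confirm that it is reachable.
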
#### S3QL
S3QL uses `mkfs.s3ql` to create a file system:
```shell
$ mkfs.s3ql --plain --backend-options no-ssl -L s3ql s3c://127.0.0.1:9000/s3ql/
```
Mount a file system using `mount.s3ql`:
```shell
$ mount.s3ql --compress none --backend-options no-ssl s3c://127.0.0.1:9000/s3ql/ mnt-s3ql
```
S3QL prompts interactively on the command line for the access key of the object storage API both when creating and when mounting a file system.
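
If the repeated prompts become tedious, S3QL can also read credentials from an authentication file. The sketch below assumes the default location `~/.s3ql/authinfo2` and reuses the MinIO default credentials `minioadmin` from the JuiceFS example; the section name `minio` is an arbitrary label chosen here. Treat it as an illustration and consult the S3QL documentation for your version.

```shell
# Assumption: S3QL looks for credentials in ~/.s3ql/authinfo2 by default.
# Note: this overwrites any existing authentication file.
mkdir -p "$HOME/.s3ql"
cat > "$HOME/.s3ql/authinfo2" <<'EOF'
[minio]
storage-url: s3c://127.0.0.1:9000/s3ql/
backend-login: minioadmin
backend-password: minioadmin
EOF
# The file contains secrets, so it should not be readable by other users.
chmod 600 "$HOME/.s3ql/authinfo2"
```

With such a file in place, the credentials are matched against the storage URL given to `mkfs.s3ql` and `mount.s3ql`.
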
#### JuiceFS
JuiceFS uses the `format` subcommand to create a file system:
```shell
$ juicefs format --storage minio \
    --bucket http://127.0.0.1:9000/myjfs \
    --access-key minioadmin \
    --secret-key minioadmin \
    sqlite3://myjfs.db \
    myjfs
```
Mount the file system using the `mount` subcommand:
```shell
$ sudo juicefs mount -d sqlite3://myjfs.db mnt-juicefs
```
JuiceFS sets the object storage API access key only when the file system is created, and the relevant information is written into the metadata engine. Once the file system has been created, there is no need to provide the object storage URL, access key, or other information again.
## Summary
**S3QL** combines object storage with SQLite. Storing data in blocks improves file read and write efficiency and reduces the resource overhead of modifying files. Advanced features such as snapshots, data deduplication, and data retention, together with compression and encryption enabled by default, make S3QL well suited to individuals who want to store files in cloud storage more securely and at lower cost.
**JuiceFS** supports object storage, HDFS, WebDAV, and local disks as data storage engines, and popular databases such as Redis, TiKV, MySQL, MariaDB, PostgreSQL, and SQLite as metadata storage engines. It provides a standard POSIX file system interface through FUSE as well as a Java API that can directly replace HDFS as the storage for Hadoop. It also provides a [Kubernetes CSI Driver](https://github.com/juicedata/juicefs-csi-driver), so it can serve as the persistent storage layer for Kubernetes. JuiceFS is a file system designed for enterprise-level distributed data storage and is widely used for big data analysis, machine learning, shared storage for containers, data sharing, and backup.
......@@ -3,3 +3,4 @@
The following is a comparison between JuiceFS and similar technologies. Community users are welcome to improve and supplement together.
- [JuiceFS vs. Alluxio](comparison/juicefs_vs_alluxio.md)
- [JuiceFS vs. S3QL](comparison/juicefs_vs_s3ql.md)
......@@ -2,13 +2,13 @@
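Like JuiceFS, [S3QL](https://github.com/s3ql/s3ql) is an open source network file system driven by a combination of object storage and a database. All stored data is split into blocks and saved to mainstream object storage such as Amazon S3, Backblaze B2, or OpenStack Swift, and the corresponding metadata is stored in the database.
### Similarities
## Similarities
- Both support the standard POSIX file system interface through the FUSE module, so massive cloud storage can be mounted locally and used like local storage.
- Both provide standard file system features: hard links, symbolic links, extended attributes, and file permissions.
- Both support data compression and encryption, though they use different algorithms.
### Differences
## Differences
- S3QL supports only SQLite, whereas JuiceFS supports Redis, TiKV, MySQL, PostgreSQL, and other databases in addition to SQLite.
- S3QL has no distributed capability and cannot be mounted on multiple hosts at the same time. JuiceFS is a typical distributed file system: when it uses a network-based database, multiple hosts can mount it and read and write concurrently.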
......@@ -50,7 +50,7 @@
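The installation was performed on Rocky Linux 8.4 (kernel version 4.18.0-305.12.1.el8_4.x86_64).
**S3QL**
#### S3QL
S3QL is developed in Python and requires python-devel 3.7 or later at installation time. At least the following dependencies must also be satisfied: fuse3-devel, gcc, pyfuse3, sqlite-devel, cryptography, defusedxml, apsw, and dugong. Pay special attention to Python package dependencies and install locations.
......@@ -58,7 +58,7 @@ S3QL installs 12 binary programs on the system, and each program provides an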
![](../../images/s3ql-bin.jpg)
**JuiceFS**
#### JuiceFS
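The JuiceFS client is developed in Go and can be used right after downloading the pre-compiled binary. It consists of a single binary program, `juicefs`, which can be copied to any directory on the system's executable path, for example `/usr/local/bin`.
......@@ -68,7 +68,7 @@ Both S3QL and JuiceFS use a database to store metadata. S3QL only supports SQLite
Here we use a locally deployed MinIO object storage service and create a file system with each of the two tools:
**S3QL**
#### S3QL
S3QL uses the `mkfs.s3ql` tool to create a file system:
......@@ -84,17 +84,17 @@ $ mount.s3ql --compress none --backend-options no-ssl s3c://127.0.0.1:9000/s3ql/
S3QL prompts interactively on the command line for the access key of the object storage API both when creating and when mounting a file system.
**JuiceFS**
#### JuiceFS
JuiceFS uses the `format` subcommand to create a file system: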
```shell
$ juicefs format --storage minio \
--bucket http://127.0.0.1:9000/myjfs \
--access-key minioadmin \
--secret-key minioadmin \
sqlite3://myjfs.db \
myjfs
--bucket http://127.0.0.1:9000/myjfs \
--access-key minioadmin \
--secret-key minioadmin \
sqlite3://myjfs.db \
myjfs
```
Mount the file system using the `mount` subcommand:
......@@ -103,10 +103,10 @@ myjfs
$ sudo juicefs mount -d sqlite3://myjfs.db mnt-juicefs
```
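JuiceFS sets the object storage API access key only when the file system is created, and the relevant information is written into the configuration file. Later mounts do not need the object storage address, access key, or other information again.
JuiceFS sets the object storage API access key only when the file system is created, and the relevant information is written into the metadata engine. Later mounts do not need the object storage address, access key, or other information again.
## Summary
**S3QL** combines object storage with SQLite. Storing data in blocks improves file read and write efficiency and reduces the resource overhead of modifying files. It thoughtfully provides advanced features such as snapshots, data deduplication, and data retention, which, together with compression and encryption enabled by default, make S3QL well suited to individuals who want to store files in cloud storage more securely and at lower cost.
**JuiceFS** supports object storage, HDFS, WebDAV, and local disks as data storage engines, and popular databases such as Redis, TiKV, MySQL, MariaDB, PostgreSQL, and SQLite as metadata storage engines. Besides providing a standard POSIX file system interface through FUSE, JuiceFS also provides a Java API that can directly replace HDFS as the storage for Hadoop. It also provides a CSI Driver, so it can serve as the persistent storage layer for Kubernetes. JuiceFS is a file system designed for enterprise-level distributed data storage and is widely used for big data analysis, machine learning, container orchestration, data sharing, and backup.
**JuiceFS** supports object storage, HDFS, WebDAV, and local disks as data storage engines, and popular databases such as Redis, TiKV, MySQL, MariaDB, PostgreSQL, and SQLite as metadata storage engines. Besides providing a standard POSIX file system interface through FUSE, JuiceFS also provides a Java API that can directly replace HDFS as the storage for Hadoop. It also provides a [Kubernetes CSI Driver](https://github.com/juicedata/juicefs-csi-driver), so it can serve as the persistent storage layer for Kubernetes. JuiceFS is a file system designed for enterprise-level distributed data storage and is widely used for big data analysis, machine learning, shared storage for containers, data sharing, and backup.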