Unverified commit 240fe02d authored by Changjian Gao, committed by GitHub

Doc: more details about users and groups file format (#443)

* Doc: more details about users and groups file format

* Fix
Parent 0cfb6159
@@ -4,7 +4,7 @@ JuiceFS provides [Hadoop-compatible FileSystem](https://hadoop.apache.org/docs/c
> **NOTICE**:
>
> JuiceFS uses a local mapping of users and UIDs. So you should [sync all the needed users and their UIDs](sync_accounts_between_multiple_hosts.md) across the whole Hadoop cluster to avoid permission errors. You can also specify a global user list and user group file; please refer to the [relevant configurations](#other-configurations).
## Hadoop Compatibility
@@ -30,7 +30,7 @@ $ make
After compiling you can find the JAR file in the `sdk/java/target` directory, e.g. `juicefs-hadoop-0.10.0.jar`. Note that the file with the `original-` prefix doesn't contain third-party dependencies. It's recommended to use the JAR file with third-party dependencies.
> **Note**: The SDK can only be deployed to the same operating system as the one it was compiled on. For example, if you compile the SDK on Linux then you must deploy it on Linux.
Then put the JAR file and `$JAVA_HOME/lib/tools.jar` into the classpath of each Hadoop ecosystem component. It's recommended to create a symbolic link to the JAR file. The following tables describe where the SDK should be placed.
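As a sketch of the symbolic-link approach, the commands below use made-up paths (`/tmp/demo/...` is a placeholder, not one of the component classpaths from the tables):

```shell
# Illustrative paths only: stage a stand-in for the built SDK JAR and link it
# into a component lib directory, so upgrading only means re-pointing the link.
mkdir -p /tmp/demo/lib
touch /tmp/demo/juicefs-hadoop-0.10.0.jar          # stand-in for the real JAR
ln -sf /tmp/demo/juicefs-hadoop-0.10.0.jar /tmp/demo/lib/juicefs-hadoop.jar
ls -l /tmp/demo/lib/juicefs-hadoop.jar
```

With this layout, upgrading the SDK is just replacing the link target instead of editing each component's classpath again.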
@@ -86,18 +86,20 @@ Then put the JAR file and `$JAVA_HOME/lib/tools.jar` to the classpath of each Ha
| `juicefs.memory-size` | 300 | Total read/write buffering in MiB |
| `juicefs.prefetch` | 3 | Prefetch N blocks in parallel |
### Other Configurations
| Configuration | Default Value | Description |
| ------------- | ------------- | ----------- |
| `juicefs.debug` | `false` | Whether to enable debug log |
| `juicefs.access-log` | | Access log path. Ensure the Hadoop application has write permission, e.g. `/tmp/juicefs.access.log`. The log file will rotate automatically to keep at most 7 files. |
| `juicefs.superuser` | `hdfs` | The superuser |
| `juicefs.users` | `null` | The path of the username and UID list file, e.g. `jfs://name/users`. The file format is `<username>:<UID>`, one user per line. |
| `juicefs.groups` | `null` | The path of the group name, GID and group members list file, e.g. `jfs://name/groups`. The file format is `<group-name>:<GID>:<username1>,<username2>`, one group per line. |
| `juicefs.push-gateway` | | [Prometheus Pushgateway](https://github.com/prometheus/pushgateway) address, format is `<host>:<port>`. |
| `juicefs.push-interval` | 10 | Prometheus push interval in seconds |
| `juicefs.push-auth` | | [Prometheus basic auth](https://prometheus.io/docs/guides/basic-auth) information, format is `<username>:<password>`. |
| `juicefs.fast-resolve` | `true` | Whether to enable faster metadata lookup using Redis Lua scripts |
| `juicefs.no-usage-report` | `false` | Whether to disable usage reporting. JuiceFS only collects anonymous usage data (e.g. version number); no user data or any sensitive data will be collected. |
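As an illustration of the two file formats above (the usernames, UIDs, and GIDs here are made up, and the upload step to `jfs://name/...` is shown only as a comment):

```shell
# Hypothetical users file: one <username>:<UID> mapping per line.
cat > /tmp/users <<'EOF'
alice:1001
bob:1002
EOF

# Hypothetical groups file: one <group-name>:<GID>:<member1>,<member2> per line.
cat > /tmp/groups <<'EOF'
engineers:3001:alice,bob
EOF

# Then upload them to the locations configured above, e.g.:
#   hadoop fs -put /tmp/users  jfs://name/users
#   hadoop fs -put /tmp/groups jfs://name/groups
```

Because every node reads the same files from the file system itself, this avoids having to keep local `/etc/passwd`-style accounts in sync across the cluster.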
When you use multiple JuiceFS file systems, all of these configurations can be set for a specific file system alone. You need to put the file system name in the middle of the configuration key, for example (replace `{JFS_NAME}` with the appropriate value):
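For instance, assuming the name is inserted directly after the `juicefs.` prefix (a hypothetical reading of the sentence above; verify the exact key format against your SDK version), a `core-site.xml` fragment for a file system named `myjfs` might look like:

```xml
<!-- Hypothetical per-file-system overrides for a volume named "myjfs". -->
<property>
  <name>juicefs.myjfs.debug</name>
  <value>true</value>
</property>
<property>
  <name>juicefs.myjfs.access-log</name>
  <value>/tmp/juicefs.access.log</value>
</property>
```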
@@ -167,8 +169,7 @@ Add configurations to `conf/flink-conf.yaml`. You could only setup Flink client
When the following components need to access JuiceFS, they should be restarted.
> **Note**: Before restarting, you need to confirm that the JuiceFS related configuration has been written to the configuration file of each component. Usually you can find it in `core-site.xml` on the machine where the component's service is deployed.
| Components | Services |
| ---------- | -------- |
@@ -190,6 +191,8 @@ When `Class io.juicefs.JuiceFileSystem not found` or `No FilesSystem for scheme:
$ hadoop fs -ls jfs://{JFS_NAME}/
```
> **Note**: The `JFS_NAME` is the volume name specified when you formatted the JuiceFS file system.
### Hive
```sql
@@ -214,12 +217,12 @@ Enable metrics reporting through following configurations:
```
> **Note**: Each process using the JuiceFS Hadoop Java SDK will have a unique metric, and Pushgateway will always remember all the collected metrics. This results in continuous accumulation of metrics, taking up too much memory and also slowing down Prometheus scraping. It is recommended to regularly clean up metrics whose `job` is `juicefs` on the Pushgateway, for example once every hour with the following command. The running Hadoop Java SDK will continue to update after the metrics are cleared, which basically does not affect usage.
>
> ```bash
> $ curl -X DELETE http://host:9091/metrics/job/juicefs
> ```
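One way to run that cleanup hourly is a cron entry like the sketch below (`host:9091` is a placeholder for your actual Pushgateway address):

```
# crontab fragment: delete the `juicefs` job metrics at the top of every hour
0 * * * * curl -X DELETE http://host:9091/metrics/job/juicefs
```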
For a description of all monitoring metrics, please refer to [JuiceFS Metrics](p8s_metrics.md).
## Benchmark
@@ -184,7 +184,7 @@ For more details about JuiceFS CSI driver please refer to [JuiceFS CSI Driver](h
## Monitoring
JuiceFS CSI driver can export [Prometheus](https://prometheus.io) metrics at port `9567`. For a description of all monitoring metrics, please refer to [JuiceFS Metrics](p8s_metrics.md).
### Configure Prometheus server
@@ -4,7 +4,7 @@ JuiceFS provides a Java client compatible with the HDFS interface to support applications in the Hadoop ecosystem
> **Note**:
>
> JuiceFS uses the local user and UID mapping by default. Therefore, when used in a distributed environment, you need to [sync all the needed users and their UIDs](sync_accounts_between_multiple_hosts.md) to all Hadoop nodes to avoid permission issues. You can also specify a global user list and user group file; see the [relevant configurations](#other-configurations) for details.
## Hadoop Compatibility
@@ -88,19 +88,18 @@ $ make
### Other Configurations
| Configuration | Default Value | Description |
| ------------- | ------------- | ----------- |
| `juicefs.debug` | `false` | Whether to enable debug log |
| `juicefs.access-log` | | Access log path. All applications need write permission; it can be set to `/tmp/juicefs.access.log`. The file rotates automatically, keeping the most recent 7 files. |
| `juicefs.superuser` | `hdfs` | The superuser |
| `juicefs.users` | `null` | The path of the username and UID list file, e.g. `jfs://name/users`. The file format is `<username>:<UID>`, one user per line. |
| `juicefs.groups` | `null` | The path of the group name, GID and group members list file, e.g. `jfs://name/groups`. The file format is `<group-name>:<GID>:<username1>,<username2>`, one group per line. |
| `juicefs.push-gateway` | | [Prometheus Pushgateway](https://github.com/prometheus/pushgateway) address, format is `<host>:<port>`. |
| `juicefs.push-interval` | 10 | Interval in seconds for pushing data to Prometheus. |
| `juicefs.push-auth` | | [Prometheus basic auth](https://prometheus.io/docs/guides/basic-auth) information, format is `<username>:<password>`. |
| `juicefs.fast-resolve` | `true` | Whether to enable fast metadata lookup (implemented with Redis Lua scripts) |
| `juicefs.no-usage-report` | `false` | Whether to report usage data. Only usage data such as the version number is reported, without any user information. |
When using multiple JuiceFS file systems, all of the above configurations can be specified for a single file system. You need to put the file system name `{JFS_NAME}` in the middle of the configuration key, for example:
@@ -170,7 +169,7 @@ $HADOOP_COMMON_HOME/lib/juicefs-hadoop.jar
When the following components need to access JuiceFS data, the related services need to be restarted.
> **Note**: Before restarting, make sure the JuiceFS configuration has been written to the configuration files. Usually you can check whether `core-site.xml` of each component on the machine contains the JuiceFS related configuration.
| Components | Services |
| ---------- | -------- |
@@ -192,6 +191,8 @@ Services such as HDFS, Hue and ZooKeeper do not need to be restarted.
$ hadoop fs -ls jfs://{JFS_NAME}/
```
> **Note**: The `JFS_NAME` here is the name specified when the JuiceFS file system was created.
### Hive
```sql
@@ -216,10 +217,10 @@ The JuiceFS Hadoop Java SDK supports reporting runtime metrics in [Prometheus](https://prometheus
```
> **Note**: Each process using the JuiceFS Hadoop Java SDK has its own unique metrics, and Pushgateway remembers all collected metrics. As a result, metrics keep accumulating, occupying too much memory and slowing down Prometheus scraping. It is recommended to regularly clean up metrics whose `job` is `juicefs` on the Pushgateway, for example once an hour with the command below. The running Hadoop Java SDK will continue updating after the metrics are cleared, which basically does not affect usage.
>
```bash > ```bash
$ curl -X DELETE http://host:9091/metrics/job/juicefs > $ curl -X DELETE http://host:9091/metrics/job/juicefs
``` > ```
For a description of all monitoring metrics, please refer to [JuiceFS Metrics](p8s_metrics.md).
@@ -184,7 +184,7 @@ For more details about JuiceFS CSI driver please refer to [JuiceFS CSI Driver](h
## Monitoring
JuiceFS CSI driver can export [Prometheus](https://prometheus.io) metrics at port `9567`. For a description of all monitoring metrics, please refer to [JuiceFS Metrics](p8s_metrics.md).
### Configure Prometheus server
@@ -229,4 +229,4 @@ scrape_configs:
### Configure Grafana dashboard
We provide a [dashboard template](../en/k8s_grafana_template.json) for [Grafana](https://grafana.com), which can be imported to show the collected metrics in Prometheus.
@@ -65,7 +65,7 @@ $ sudo install juicefs /usr/local/bin
> **Tip**: You can also compile the JuiceFS client manually from source. [See details](client_compile_and_upgrade.md)
## 4. Create a JuiceFS File System
To create a JuiceFS file system, use the `format` subcommand; you need to specify both the Redis database used to store metadata and the object storage used to store the actual data.
@@ -207,4 +207,3 @@ $ sudo fusermount -u /mnt/jfs
- [Using JuiceFS on Windows](juicefs_on_windows.md)
- [Using JuiceFS on macOS](juicefs_on_macos.md)