diff --git a/docs/en/hadoop_java_sdk.md b/docs/en/hadoop_java_sdk.md
index 309f7bea5d85ffc48aff60e8ae4b30b594c33b18..08e3e9f289db1bba4f9084424b4669ee18037286 100644
--- a/docs/en/hadoop_java_sdk.md
+++ b/docs/en/hadoop_java_sdk.md
@@ -4,7 +4,7 @@ JuiceFS provides [Hadoop-compatible FileSystem](https://hadoop.apache.org/docs/c
 
 > **NOTICE**:
 >
-> JuiceFS use local mapping of user and UID. So, you should [sync all the needed users and their UIDs](sync_accounts_between_multiple_hosts.md) across the whole Hadoop cluster to avoid permission error.
+> JuiceFS uses local mapping of users and UIDs. So, you should [sync all the needed users and their UIDs](sync_accounts_between_multiple_hosts.md) across the whole Hadoop cluster to avoid permission errors. You can also specify a global user list and user group file; please refer to the [relevant configurations](#other-configurations).
 
 ## Hadoop Compatibility
 
@@ -30,7 +30,7 @@ $ make
 
 After compiling you could find the JAR file in `sdk/java/target` directory, e.g. `juicefs-hadoop-0.10.0.jar`. Beware that file with `original-` prefix, it doesn't contain third-party dependencies. It's recommended to use the JAR file with third-party dependencies.
 
-**Note: The SDK could only be deployed to same operating system as it be compiled. For example, if you compile SDK in Linux then you must deploy it to Linux.**
+> **Note**: The SDK can only be deployed to the same operating system it was compiled on. For example, if you compile the SDK on Linux then you must deploy it on Linux.
 
 Then put the JAR file and `$JAVA_HOME/lib/tools.jar` to the classpath of each Hadoop ecosystem component. It's recommended create a symbolic link to the JAR file. The following tables describe where the SDK be placed.
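The symbolic-link deployment recommended above can be sketched as follows. All paths here are illustrative stand-ins (the real target directories are listed in the tables that follow), using `/tmp` so the sketch is runnable anywhere:

```shell
# Keep the real JAR in one shared location (illustrative path;
# production deployments would use the component directories below)
mkdir -p /tmp/juicefs-sdk /tmp/hadoop-lib
touch /tmp/juicefs-sdk/juicefs-hadoop-0.10.0.jar

# Link a version-independent name onto the component's classpath, so
# upgrading the SDK only means repointing one symlink
ln -sfn /tmp/juicefs-sdk/juicefs-hadoop-0.10.0.jar /tmp/hadoop-lib/juicefs-hadoop.jar

# Show where the link points
readlink /tmp/hadoop-lib/juicefs-hadoop.jar
```

The same pattern applies per component: one real JAR, one symlink per classpath directory.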
@@ -86,18 +86,20 @@ Then put the JAR file and `$JAVA_HOME/lib/tools.jar` to the classpath of each Ha
 | `juicefs.memory-size` | 300 | Total read/write buffering in MiB |
 | `juicefs.prefetch` | 3 | Prefetch N blocks in parallel |
 
-### Others
-
-| Configuration | Default Value | Description |
-| ------------- | ------------- | ----------- |
-| `juicefs.debug` | `false` | Whether enable debug log |
-| `juicefs.access-log` | | Access log path. Ensure Hadoop application has write permission, e.g. `/tmp/juicefs.access.log`. The log file will rotate automatically to keep at most 7 files. |
-| `juicefs.superuser` | `hdfs` | The super user |
-| `juicefs.push-gateway` | | [Prometheus Pushgateway](https://github.com/prometheus/pushgateway) address, format is `<host>:<port>`. |
-| `juicefs.push-interval` | 10 | Prometheus push interval in seconds |
-| `juicefs.push-auth` | | [Prometheus basic auth](https://prometheus.io/docs/guides/basic-auth) information, format is `<username>:<password>`. |
-| `juicefs.fast-resolve` | `true` | Whether enable faster metadata lookup using Redis Lua script |
-| `juicefs.no-usage-report` | `false` | Whether disable usage reporting. JuiceFS only collects anonymous usage data (e.g. version number), no user or any sensitive data will be collected. |
+### Other Configurations
+
+| Configuration | Default Value | Description |
+| ------------- | ------------- | ----------- |
+| `juicefs.debug` | `false` | Whether to enable debug log |
+| `juicefs.access-log` | | Access log path. Ensure the Hadoop application has write permission, e.g. `/tmp/juicefs.access.log`. The log file will rotate automatically to keep at most 7 files. |
+| `juicefs.superuser` | `hdfs` | The super user |
+| `juicefs.users` | `null` | The path of the username and UID list file, e.g. `jfs://name/users`. The file format is `<username>:<uid>`, one user per line. |
+| `juicefs.groups` | `null` | The path of the group name, GID and group members list file, e.g. `jfs://name/groups`. The file format is `<group-name>:<gid>:<user1>,<user2>`, one group per line. |
+| `juicefs.push-gateway` | | [Prometheus Pushgateway](https://github.com/prometheus/pushgateway) address, format is `<host>:<port>`. |
+| `juicefs.push-interval` | 10 | Prometheus push interval in seconds |
+| `juicefs.push-auth` | | [Prometheus basic auth](https://prometheus.io/docs/guides/basic-auth) information, format is `<username>:<password>`. |
+| `juicefs.fast-resolve` | `true` | Whether to enable faster metadata lookup using Redis Lua scripts |
+| `juicefs.no-usage-report` | `false` | Whether to disable usage reporting. JuiceFS only collects anonymous usage data (e.g. version number); no user or other sensitive data will be collected. |
 
 When you use multiple JuiceFS file systems, all these configurations could be set to specific file system alone. You need put file system name in the middle of configuration, for example (replace `{JFS_NAME}` with appropriate value):
@@ -167,8 +169,7 @@ Add configurations to `conf/flink-conf.yaml`. You could only setup Flink client
 
 When the following components need to access JuiceFS, they should be restarted.
 
-**Note: Before restart, you need to confirm JuiceFS related configuration has been written to the configuration file of each component,
-usually you can find them in `core-site.xml` on the machine where the service of the component was deployed.**
+> **Note**: Before restarting, you need to confirm that the JuiceFS related configuration has been written to the configuration file of each component; usually you can find it in `core-site.xml` on the machine where the component's service is deployed.
 
 | Components | Services |
 | ---------- | -------- |
@@ -190,6 +191,8 @@ When `Class io.juicefs.JuiceFileSystem not found` or `No FilesSystem for scheme:
 $ hadoop fs -ls jfs://{JFS_NAME}/
 ```
 
+> **Note**: `JFS_NAME` is the volume name specified when you formatted the JuiceFS file system.
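The `juicefs.users` and `juicefs.groups` file formats from the configuration table above can be sketched like this. All user names, UIDs and GIDs are made-up examples, and `jfs://name/` is the illustrative volume path from the table:

```shell
# Example user list: one <username>:<uid> pair per line
cat > /tmp/jfs-users <<'EOF'
alice:2001
bob:2002
EOF

# Example group list: one <group-name>:<gid>:<user1>,<user2> entry per line
cat > /tmp/jfs-groups <<'EOF'
analytics:3001:alice,bob
EOF

# The files would then be uploaded to the volume and referenced via the
# juicefs.users / juicefs.groups settings, e.g. (not run here):
#   hadoop fs -put /tmp/jfs-users  jfs://name/users
#   hadoop fs -put /tmp/jfs-groups jfs://name/groups
cat /tmp/jfs-users
```

Keeping these lists on the JuiceFS volume itself means every node reads the same mapping without per-host account syncing.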
+
 ### Hive
 
 ```sql
@@ -214,12 +217,12 @@ Enable metrics reporting through following configurations:
 ```
 
 > **Note**: Each process using JuiceFS Hadoop Java SDK will have a unique metric, and Pushgateway will always remember all the collected metrics, resulting in the continuous accumulation of metrics and taking up too much memory, which will also slow down Prometheus crawling metrics. It is recommended to clean up metrics which `job` is `juicefs` on Pushgateway regularly. It is recommended to use the following command to clean up once every hour. The running Hadoop Java SDK will continue to update after the metrics are cleared, which basically does not affect the use.
+>
+> ```bash
+> $ curl -X DELETE http://host:9091/metrics/job/juicefs
+> ```
 
-```bash
-$ curl -X DELETE http://host:9091/metrics/job/juicefs
-```
-
-For a description of all monitoring metrics, please see [JuiceFS Metrics](p8s_metrics.md).
+For a description of all monitoring metrics, please refer to [JuiceFS Metrics](p8s_metrics.md).
 
 ## Benchmark
 
diff --git a/docs/en/how_to_use_on_kubernetes.md b/docs/en/how_to_use_on_kubernetes.md
index 462edb8ddc567dab3064cbb89d5a7d40a8f8d124..c3c854baeb7096c6773c38295556621f723572fa 100644
--- a/docs/en/how_to_use_on_kubernetes.md
+++ b/docs/en/how_to_use_on_kubernetes.md
@@ -184,7 +184,7 @@ For more details about JuiceFS CSI driver please refer to [JuiceFS CSI Driver](h
 
 ## Monitoring
 
-JuiceFS CSI driver can export [Prometheus](https://prometheus.io) metrics at port `9567`.
+JuiceFS CSI driver can export [Prometheus](https://prometheus.io) metrics at port `9567`. For a description of all monitoring metrics, please refer to [JuiceFS Metrics](p8s_metrics.md).
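The hourly Pushgateway cleanup recommended in the note above can be automated with a cron entry. This is a sketch, reusing the illustrative `host:9091` address from the `curl` command; only the crontab file is written here, it is not installed:

```shell
# Write an hourly cron entry that deletes the accumulated metrics of the
# `juicefs` job from the Pushgateway (address is illustrative)
cat > /tmp/juicefs-metrics-cleanup.cron <<'EOF'
0 * * * * curl -X DELETE http://host:9091/metrics/job/juicefs
EOF

# Installing it would be e.g. `crontab /tmp/juicefs-metrics-cleanup.cron`
# (not run here)
cat /tmp/juicefs-metrics-cleanup.cron
```

Because running SDK processes keep pushing after a delete, the hourly wipe loses at most one push interval of data.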
 
 ### Configure Prometheus server
diff --git a/docs/zh_cn/hadoop_java_sdk.md b/docs/zh_cn/hadoop_java_sdk.md
index 99877a9d15931a695901e9a6307e6869c115c380..2cff58d7b20df6b400e018f124d298d82bad2135 100644
--- a/docs/zh_cn/hadoop_java_sdk.md
+++ b/docs/zh_cn/hadoop_java_sdk.md
@@ -4,7 +4,7 @@ JuiceFS 提供兼容 HDFS 接口的 Java 客户端来支持 Hadoop 生态中的
 
 > **注意**:
 >
-> 由于 JuiceFS 默认使用本地的 user 和 UID 映射。因此,在分布式环境下使用,需要[同步所有需要使用的 user 和 UID](sync_accounts_between_multiple_hosts.md) 到所有的 Hadoop 节点上,以避免权限问题,指定全局的用户列表和所属用户组的文件。
+> 由于 JuiceFS 默认使用本地的 user 和 UID 映射。因此,在分布式环境下使用,需要[同步所有需要使用的 user 和 UID](sync_accounts_between_multiple_hosts.md) 到所有的 Hadoop 节点上,以避免权限问题。也可以指定一个全局的用户列表和所属用户组文件,具体请参见[相关配置](#其他配置)。
 
 ## Hadoop 兼容性
 
@@ -88,19 +88,18 @@ $ make
 
 ### 其他配置
 
-| 配置项 | 默认值 | 描述 |
-| ------------------ | ------ | ------------------------------------------------------------ |
-| `juicefs.debug` | `false` | 是否开启 debug 日志 |
-| `juicefs.access-log` | | 访问日志的路径。需要所有应用都有写权限,可以配置为 `/tmp/juicefs.access.log`。该文件会自动轮转,保留最近 7 个文件。 |
-| `juicefs.superuser` | `hdfs` | 超级用户 |
-| `juicefs.users` | `null` | 用户名以及UID列表文件的地址,比如 `jfs://name/users`。 |
-| `juicefs.groups` | `null` | 用户组、GID以及组成员列表文件的地址,比如 `jfs://name/groups` |
-
-| `juicefs.push-gateway` | | [Prometheus Pushgateway](https://github.com/prometheus/pushgateway) 地址,格式为 `<host>:<port>`。 |
-| `juicefs.push-interval` | 10 | 推送数据到 Prometheus 的时间间隔,单位为秒。 |
-| `juicefs.push-auth` | | [Prometheus 基本认证](https://prometheus.io/docs/guides/basic-auth)信息,格式为 `<username>:<password>`。 |
-| `juicefs.fast-resolve` | `true` | 是否开启快速元数据查找(通过 Redis Lua 脚本实现) |
-| `juicefs.no-usage-report` | `false` | 是否上报数据,它只上报诸如版本号等使用量数据,不包含任何用户信息。 |
+| 配置项 | 默认值 | 描述 |
+| ------------------ | ------ | ------------------------------------------------------------ |
+| `juicefs.debug` | `false` | 是否开启 debug 日志 |
+| `juicefs.access-log` | | 访问日志的路径。需要所有应用都有写权限,可以配置为 `/tmp/juicefs.access.log`。该文件会自动轮转,保留最近 7 个文件。 |
+| `juicefs.superuser` | `hdfs` | 超级用户 |
+| `juicefs.users` | `null` | 用户名以及 UID 列表文件的地址,比如 `jfs://name/users`。文件格式为 `<username>:<uid>`,一行一个用户。 |
+| `juicefs.groups` | `null` | 用户组、GID 以及组成员列表文件的地址,比如 `jfs://name/groups`。文件格式为 `<group-name>:<gid>:<user1>,<user2>`,一行一个用户组。 |
+| `juicefs.push-gateway` | | [Prometheus Pushgateway](https://github.com/prometheus/pushgateway) 地址,格式为 `<host>:<port>`。 |
+| `juicefs.push-interval` | 10 | 推送数据到 Prometheus 的时间间隔,单位为秒。 |
+| `juicefs.push-auth` | | [Prometheus 基本认证](https://prometheus.io/docs/guides/basic-auth)信息,格式为 `<username>:<password>`。 |
+| `juicefs.fast-resolve` | `true` | 是否开启快速元数据查找(通过 Redis Lua 脚本实现) |
+| `juicefs.no-usage-report` | `false` | 是否上报数据,它只上报诸如版本号等使用量数据,不包含任何用户信息。 |
 
 当使用多个 JuiceFS 文件系统时,上述所有配置项均可对单个文件系统指定,需要将文件系统名字 `{JFS_NAME}` 放在配置项的中间,比如:
@@ -170,7 +169,7 @@ $HADOOP_COMMON_HOME/lib/juicefs-hadoop.jar
 
 当需要使用以下组件访问 JuiceFS 数据时,需要重启相关服务。
 
-**注意:在重启之前需要保证 JuiceFS 配置已经写入配置文件,通常可以查看机器上各组件配置的 `core-site.xml` 里面是否有 JuiceFS 相关配置。**
+> **注意**:在重启之前需要保证 JuiceFS 配置已经写入配置文件,通常可以查看机器上各组件配置的 `core-site.xml` 里面是否有 JuiceFS 相关配置。
 
 | 组件名 | 服务名 |
 | ------ | -------------------------- |
@@ -192,6 +191,8 @@ HDFS、Hue、ZooKeeper 等服务无需重启。
 $ hadoop fs -ls jfs://{JFS_NAME}/
 ```
 
+> **注**:这里的 `JFS_NAME` 是创建 JuiceFS 文件系统时指定的名称。
+
 ### Hive
 
 ```sql
@@ -216,10 +217,10 @@ JuiceFS Hadoop Java SDK 支持把运行指标以 [Prometheus](https://prometheus
 ```
 
 > **注意**:每一个使用 JuiceFS Hadoop Java SDK 的进程会有唯一的指标,而 Pushgateway 会一直记住所有收集到的指标,导致指标数持续积累占用过多内存,也会使得 Prometheus 抓取指标时变慢,建议定期清理 Pushgateway 上 `job` 为 `juicefs` 的指标。建议每个小时使用下面的命令清理一次,运行中的 Hadoop Java SDK 会在指标清空后继续更新,基本不影响使用。
-
-```bash
-$ curl -X DELETE http://host:9091/metrics/job/juicefs
-```
+>
+> ```bash
+> $ curl -X DELETE http://host:9091/metrics/job/juicefs
+> ```
 
 关于所有监控指标的描述,请查看 [JuiceFS 监控指标](p8s_metrics.md)。
 
diff --git a/docs/zh_cn/how_to_use_on_kubernetes.md b/docs/zh_cn/how_to_use_on_kubernetes.md
index d398b728b745e8d41b5b47e30ef79afc5851c54e..2ac538f6db1651edfa52f75221c86dd74b565f18 100644
--- a/docs/zh_cn/how_to_use_on_kubernetes.md
+++ b/docs/zh_cn/how_to_use_on_kubernetes.md
@@ -184,7 +184,7 @@ For more details about JuiceFS CSI driver please refer to [JuiceFS CSI Driver](h
 
 ## Monitoring
 
-JuiceFS CSI driver can export [Prometheus](https://prometheus.io) metrics at port `9567`.
+JuiceFS CSI driver can export [Prometheus](https://prometheus.io) metrics at port `9567`. For a description of all monitoring metrics, please refer to [JuiceFS Metrics](p8s_metrics.md).
 
 ### Configure Prometheus server
@@ -229,4 +229,4 @@ scrape_configs:
 
 ### Configure Grafana dashboard
 
-We provide a [dashboard template](./k8s_grafana_template.json) for [Grafana](https://grafana.com), which can be imported to show the collected metrics in Prometheus.
+We provide a [dashboard template](../en/k8s_grafana_template.json) for [Grafana](https://grafana.com), which can be imported to show the collected metrics in Prometheus.
diff --git a/docs/zh_cn/quick_start_guide.md b/docs/zh_cn/quick_start_guide.md
index ceddcf3c62651830334ad6df418142fc7a42b84b..df39138569ab83417a6893dbad80b5b4361123a4 100644
--- a/docs/zh_cn/quick_start_guide.md
+++ b/docs/zh_cn/quick_start_guide.md
@@ -65,7 +65,7 @@ $ sudo install juicefs /usr/local/bin
 
 > **提示**: 你也可以从源代码手动编译 JuiceFS 客户端。[查看详情](client_compile_and_upgrade.md)
 
-## 四、创建 JuiceFS 文件系统 
+## 四、创建 JuiceFS 文件系统
 
 创建 JuiceFS 文件系统要使用 `format` 子命令,需要同时指定用来存储元数据的 Redis 数据库和用来存储实际数据的对象存储。
 
@@ -207,4 +207,3 @@ $ sudo fusermount -u /mnt/jfs
 
 - [Windows 系统使用 JuiceFS](juicefs_on_windows.md)
 - [macOS 系统使用 JuiceFS](juicefs_on_macos.md)
-