> JuiceFS uses the local mapping of users to UIDs, so you should [sync all the needed users and their UIDs](sync_accounts_between_multiple_hosts.md) across the whole Hadoop cluster to avoid permission errors.
> JuiceFS uses the local mapping of users to UIDs, so you should [sync all the needed users and their UIDs](sync_accounts_between_multiple_hosts.md) across the whole Hadoop cluster to avoid permission errors. You can also specify a global user list file and user group file; please refer to the [relevant configurations](#other-configurations).
## Hadoop Compatibility
...
@@ -30,7 +30,7 @@ $ make
After compiling, you can find the JAR file in the `sdk/java/target` directory, e.g. `juicefs-hadoop-0.10.0.jar`. Beware of the file with the `original-` prefix: it doesn't contain third-party dependencies. It's recommended to use the JAR file that includes third-party dependencies.
**Note: The SDK can only be deployed to the same operating system it was compiled on. For example, if you compile the SDK on Linux, you must deploy it to Linux.**
> **Note**: The SDK can only be deployed to the same operating system it was compiled on. For example, if you compile the SDK on Linux, you must deploy it to Linux.
Then put the JAR file and `$JAVA_HOME/lib/tools.jar` on the classpath of each Hadoop ecosystem component. It's recommended to create a symbolic link to the JAR file. The following tables describe where the SDK should be placed.
...
@@ -86,18 +86,20 @@ Then put the JAR file and `$JAVA_HOME/lib/tools.jar` to the classpath of each Ha
| `juicefs.memory-size` | 300 | Total read/write buffering in MiB |
| `juicefs.prefetch` | 3 | Prefetch N blocks in parallel |
| `juicefs.access-log` | | Access log path. Ensure the Hadoop application has write permission, e.g. `/tmp/juicefs.access.log`. The log file is rotated automatically, keeping at most 7 files. |
| `juicefs.superuser` | `hdfs` | The superuser |
| `juicefs.users` | `null` | Path of the username and UID list file, e.g. `jfs://name/users`. The file format is `<username>:<UID>`, one user per line. |
| `juicefs.groups` | `null` | Path of the group name, GID and group members list file, e.g. `jfs://name/groups`. The file format is `<group-name>:<GID>:<username1>,<username2>`, one group per line. |
| `juicefs.push-gateway` | | [Prometheus Pushgateway](https://github.com/prometheus/pushgateway) address, format is `<host>:<port>`. |
| `juicefs.push-auth` | | [Prometheus basic auth](https://prometheus.io/docs/guides/basic-auth) information, format is `<username>:<password>`. |
| `juicefs.no-usage-report` | `false` | Whether to disable usage reporting. JuiceFS only collects anonymous usage data (e.g. version number); no user or other sensitive data is collected. |
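
The file formats described for `juicefs.users` and `juicefs.groups` can be checked with a small script before uploading the files. The following is a minimal sketch in Python, not the SDK's own parser: the function names and the sample users and groups are hypothetical, and skipping blank lines is an assumption rather than documented behavior.

```python
from typing import Dict, List, Tuple


def parse_users(text: str) -> Dict[str, int]:
    """Parse lines of the form <username>:<UID> into a name -> UID map."""
    users: Dict[str, int] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        name, uid = line.split(":", 1)
        users[name] = int(uid)  # raises ValueError if the UID is not numeric
    return users


def parse_groups(text: str) -> Dict[str, Tuple[int, List[str]]]:
    """Parse lines of the form <group-name>:<GID>:<username1>,<username2>."""
    groups: Dict[str, Tuple[int, List[str]]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        name, gid, members = line.split(":", 2)
        # Drop empty entries so "<group>:<GID>:" yields an empty member list.
        groups[name] = (int(gid), [m for m in members.split(",") if m])
    return groups


# Hypothetical sample data in the two documented formats.
users = parse_users("alice:1001\nbob:1002\n")
groups = parse_groups("analysts:2001:alice,bob\n")
print(users)
print(groups)
```

A malformed line (for example, one with a missing field or a non-numeric ID) raises `ValueError` in this sketch, which makes it usable as a quick sanity check on the files.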
When you use multiple JuiceFS file systems, each of these configurations can be set for a specific file system. Put the file system name in the middle of the configuration key, for example (replace `{JFS_NAME}` with the appropriate value):
...
@@ -167,8 +169,7 @@ Add configurations to `conf/flink-conf.yaml`. You could only setup Flink client
When the following components need to access JuiceFS, they should be restarted.
**Note: Before restarting, confirm that the JuiceFS-related configuration has been written to the configuration file of each component;
usually you can find it in `core-site.xml` on the machine where the component's service is deployed.**
> **Note**: Before restarting, confirm that the JuiceFS-related configuration has been written to the configuration file of each component; usually you can find it in `core-site.xml` on the machine where the component's service is deployed.
| Components | Services |
| ---------- | -------- |
...
@@ -190,6 +191,8 @@ When `Class io.juicefs.JuiceFileSystem not found` or `No FilesSystem for scheme:
$ hadoop fs -ls jfs://{JFS_NAME}/
```
> **Note**: `JFS_NAME` is the volume name specified when you format the JuiceFS file system.
### Hive
```sql
...
@@ -214,12 +217,12 @@ Enable metrics reporting through following configurations:
```
> **Note**: Each process using the JuiceFS Hadoop Java SDK has its own unique metrics, and Pushgateway retains all collected metrics indefinitely. Metrics therefore accumulate continuously and consume a lot of memory, which also slows down Prometheus scraping. It is recommended to regularly clean up the metrics whose `job` is `juicefs` on Pushgateway, for example by running the following command once every hour. A running Hadoop Java SDK continues to push updates after the metrics are cleared, so clearing has essentially no effect on usage.
@@ -184,7 +184,7 @@ For more details about JuiceFS CSI driver please refer to [JuiceFS CSI Driver](h
## Monitoring
JuiceFS CSI driver can export [Prometheus](https://prometheus.io) metrics at port `9567`.
JuiceFS CSI driver can export [Prometheus](https://prometheus.io) metrics at port `9567`. For a description of all monitoring metrics, please refer to [JuiceFS Metrics](p8s_metrics.md).
### Configure Prometheus server
...
@@ -229,4 +229,4 @@ scrape_configs:
### Configure Grafana dashboard
We provide a [dashboard template](./k8s_grafana_template.json) for [Grafana](https://grafana.com), which can be imported to show the collected metrics in Prometheus.
We provide a [dashboard template](../en/k8s_grafana_template.json) for [Grafana](https://grafana.com), which can be imported to show the collected metrics in Prometheus.