From bded7342e4b0a3f989d0437084954aeb11a80dd9 Mon Sep 17 00:00:00 2001
From: Changjian Gao
Date: Wed, 10 Mar 2021 10:02:00 +0800
Subject: [PATCH] Fix incorrect paragraph structure and typo in Hadoop SDK doc (#244)

* Fix incorrect paragraph structure and typo in Hadoop SDK doc

* Update
---
 docs/en/hadoop_java_sdk.md    | 72 ++++++++++++++++-------------------
 docs/zh_cn/hadoop_java_sdk.md | 36 ++++++++----------
 2 files changed, 49 insertions(+), 59 deletions(-)

diff --git a/docs/en/hadoop_java_sdk.md b/docs/en/hadoop_java_sdk.md
index cda8cf84..dc8055a3 100644
--- a/docs/en/hadoop_java_sdk.md
+++ b/docs/en/hadoop_java_sdk.md
@@ -9,7 +9,7 @@ JuiceFS Hadoop Java SDK is compatible with Hadoop 2.x and Hadoop 3.x. As well as
 In order to make JuiceFS works with other components, it usually takes 2 steps:
 
 1. Put JAR file into the classpath of each Hadoop ecosystem component.
-2. Put JuiceFS conf into the conf file of each Hadoop ecosystem component(usually core-site.xml).
+2. Put JuiceFS configurations into the configuration file of each Hadoop ecosystem component (usually `core-site.xml`).
 
 ## Compiling
 
@@ -40,26 +40,6 @@ Then put the JAR file and `$JAVA_HOME/lib/tools.jar` to the classpath of each Ha
 | UCloud UHadoop | `/home/hadoop/share/hadoop/common/lib`<br>`/home/hadoop/hive/auxlib`<br>`/home/hadoop/spark/jars`<br>`/home/hadoop/presto/plugin/hive-hadoop2` |
 | Baidu Cloud EMR | `/opt/bmr/hadoop/share/hadoop/common/lib`<br>`/opt/bmr/hive/auxlib`<br>`/opt/bmr/spark2/jars` |
 
-### CDH6
-
-Besides `core-site`,you also need to configure `mapreduce.application.classpath` of the YARN component, add:
-
-```shell
-$HADOOP_COMMON_HOME/lib/juicefs-hadoop.jar
-```
-
-### HDP
-
-Besides `core-site` 外,you also need to configure `mapreduce.application.classpath` of the MapReduce2 component, add:
-
-```shell
-/usr/hdp/${hdp.version}/hadoop/lib/juicefs-hadoop.jar
-```
-
-### Flink 配置
-
-Write JuiceFS conf to `conf/flink-conf.yaml` of Flink, you can just do it in Flink Client machine.
-
 ### Community Components
 
 | Name | Installing Paths |
@@ -140,30 +120,44 @@ When you use multiple JuiceFS file systems, all these configurations could be se
 
 Add configurations to `core-site.xml`.
 
+#### CDH 6
+
+Besides `core-site`, you also need to configure `mapreduce.application.classpath` of the YARN component, add:
+
+```shell
+$HADOOP_COMMON_HOME/lib/juicefs-hadoop.jar
+```
+
+#### HDP
+
+Besides `core-site`, you also need to configure `mapreduce.application.classpath` of the MapReduce2 component, add (variables do not need to be replaced):
+
+```shell
+/usr/hdp/${hdp.version}/hadoop/lib/juicefs-hadoop.jar
+```
+
 ### Configuration in Flink
 
 Add configurations to `conf/flink-conf.yaml`. You could only setup Flink client without modify configurations in Hadoop.
 
 ## Restart Services
 
+When the following components need to access JuiceFS, they should be restarted.
 
-When those components below need to access JuiceFS, they should be restarted.
-
-**Note: Before restart, you need to confirm JuiceFS related conf has been writen to the conf file of each component,
-usually you can find them in core-site.xml on the machine where the service of the component was deployed.**
+**Note: Before restart, you need to confirm JuiceFS related configuration has been written to the configuration file of each component,
+usually you can find them in `core-site.xml` on the machine where the service of the component was deployed.**
 
-| Components | Services |
-| ------ | -------------------------- |
-| Hive | HiveServer<br>Metastore |
-| Spark | ThriftServer |
-| Presto | Coordinator<br>Worker |
-| Impala | Catalog Server<br>Daemon |
-| HBase | Master<br>RegionServer |
+| Components | Services |
+| ---------- | -------- |
+| Hive | HiveServer<br>Metastore |
+| Spark | ThriftServer |
+| Presto | Coordinator<br>Worker |
+| Impala | Catalog Server<br>Daemon |
+| HBase | Master<br>RegionServer |
 
-HDFS,HUE,ZooKeeper etc don't need to restart.
+HDFS, Hue, ZooKeeper and other services don't need to be restarted.
 
-When `Class io.juicefs.JuiceFileSystem not found` or `No FilesSystem for scheme: jfs` exceptions was occurred after restart,
-reference [FAQ](#faq)
+When `Class io.juicefs.JuiceFileSystem not found` or `No FilesSystem for scheme: jfs` exceptions was occurred after restart, reference [FAQ](#faq).
 
 ## Verification
 
@@ -187,12 +181,12 @@ CREATE TABLE IF NOT EXISTS person
 
 ### `Class io.juicefs.JuiceFileSystem not found` exception
 
-It means JAR file was not loaded, you can verify it by `lsof -p {pid} | grep juicefs`.
+It means JAR file was not loaded, you can verify it by `lsof -p {pid} | grep juicefs`.
 
-You should check whether the JAR file was located properly, or it has the read permission by other users.
+You should check whether the JAR file was located properly, or other users have the read permission.
 
-Some hadoop distribution also need to modify `mapred-site.xml` and put the JAR file location path to the end of the param `mapreduce.application.classpath`.
+Some Hadoop distribution also need to modify `mapred-site.xml` and put the JAR file location path to the end of the parameter `mapreduce.application.classpath`.
 
 ### `No FilesSystem for scheme: jfs` exception
 
-It means JuiceFS conf was not configured properly, you need check `core-site.xml` on the local machine.
\ No newline at end of file
+It means JuiceFS Hadoop Java SDK was not configured properly, you need to check whether there is JuiceFS related configuration in the `core-site.xml` of the component configuration.
diff --git a/docs/zh_cn/hadoop_java_sdk.md b/docs/zh_cn/hadoop_java_sdk.md
index b1a8d73a..c2b2c8a3 100644
--- a/docs/zh_cn/hadoop_java_sdk.md
+++ b/docs/zh_cn/hadoop_java_sdk.md
@@ -2,15 +2,15 @@
 
 JuiceFS 提供兼容 HDFS 接口的 Java 客户端来支持 Hadoop 生态中的各种应用。
 
-为了使各组件能够识别 JuiceFS ,通常需要两个步骤:
-
-1. 将 jar 文件放置到组件的 `classpath` 内
-2. 将 JuiceFS 相关配置写入配置文件(通常是 core-site.xml)
-
 ## Hadoop 兼容性
 
 JuiceFS Hadoop Java SDK 同时兼容 Hadoop 2.x 以及 Hadoop 3.x 环境,以及 Hadoop 生态中的各种主流组件。
 
+为了使各组件能够识别 JuiceFS,通常需要两个步骤:
+
+1. 将 JAR 文件放置到组件的 classpath 内;
+2. 将 JuiceFS 相关配置写入配置文件(通常是 `core-site.xml`)。
+
 ## 编译
 
 你需要先安装 Go 1.13+、JDK 8+ 以及 Maven 工具,然后运行以下命令:
@@ -48,7 +48,7 @@ $ make
 | ---- | ---- |
 | Spark | `${SPARK_HOME}/jars` |
 | Presto | `${PRESTO_HOME}/plugin/hive-hadoop2` |
-| Flink | `${FLINK_HOME}/lib` |
+| Flink | `${FLINK_HOME}/lib` |
 
 ## 配置参数
 
@@ -122,17 +122,17 @@ $ make
 
 将配置参数加入到 Hadoop 配置文件 `core-site.xml` 中。
 
-### CDH6 环境配置
+#### CDH 6 环境配置
 
-如果使用的是 CDH6 版本,除了修改 `core-site` 外,还需要通过 YARN 服务界面修改 `mapreduce.application.classpath`,增加:
+如果使用的是 CDH 6 版本,除了修改 `core-site` 外,还需要通过 YARN 服务界面修改 `mapreduce.application.classpath`,增加:
 
 ```shell
 $HADOOP_COMMON_HOME/lib/juicefs-hadoop.jar
 ```
 
-### HDP 环境配置
+#### HDP 环境配置
 
-除了修改 `core-site` 外,还需要通过 MapReduce2 服务界面修改配置 `mapreduce.application.classpath`,在末尾增加(变量无需替换):
+除了修改 `core-site` 外,还需要通过 MapReduce2 服务界面修改配置 `mapreduce.application.classpath`,在末尾增加(变量无需替换):
 
 ```shell
 /usr/hdp/${hdp.version}/hadoop/lib/juicefs-hadoop.jar
@@ -144,9 +144,9 @@ $HADOOP_COMMON_HOME/lib/juicefs-hadoop.jar
 
 ## 重启相关服务
 
-当需要使用以下组件访问 JuiceFS 数据时,需要重启相关服务
+当需要使用以下组件访问 JuiceFS 数据时,需要重启相关服务。
 
-**注意:在重启之前需要保证 JuiceFS 配置已经写入配置文件,通常可以查看机器上各组件配置的 core-site.xml 里面是否有 JuiceFS 相关配置**
+**注意:在重启之前需要保证 JuiceFS 配置已经写入配置文件,通常可以查看机器上各组件配置的 `core-site.xml` 里面是否有 JuiceFS 相关配置。**
 
 | 组件名 | 服务名 |
 | ------ | -------------------------- |
@@ -156,13 +156,9 @@ $HADOOP_COMMON_HOME/lib/juicefs-hadoop.jar
 | Impala | Catalog Server<br>Daemon |
 | HBase | Master<br>RegionServer |
 
-HDFS,HUE,ZooKeeper 等服务无需重启
+HDFS、Hue、ZooKeeper 等服务无需重启。
 
-重启后,访问 JuiceFS 如果出现 `Class io.juicefs.JuiceFileSystem not found` 或者 `No FilesSystem for scheme: jfs`,可以参考 [FAQ](#faq)
-
-```bash
-lsof
-```
+重启后,访问 JuiceFS 如果出现 `Class io.juicefs.JuiceFileSystem not found` 或者 `No FilesSystem for scheme: jfs`,可以参考 [FAQ](#faq)。
 
 ## 验证
 
@@ -186,10 +182,10 @@ CREATE TABLE IF NOT EXISTS person
 
 ### 出现 `Class io.juicefs.JuiceFileSystem not found` 异常
 
-出现这个异常的原因是,juicefs-hadoop.jar 没有被加载,可以用过 `lsof -p {pid} | grep juicefs` 查看 jar 文件是否被加载。需要检查 jar 文件是否被正确的放置在各个组件的 classpath 里面,并且保证 jar 文件有可读权限。
+出现这个异常的原因是 juicefs-hadoop.jar 没有被加载,可以用 `lsof -p {pid} | grep juicefs` 查看 JAR 文件是否被加载。需要检查 JAR 文件是否被正确地放置在各个组件的 classpath 里面,并且保证 JAR 文件有可读权限。
 
 另外在某些发行版 Hadoop 环境,需要修改 `mapred-site.xml` 里面的 `mapreduce.application.classpath` 参数,增加 juicefs-hadoop.jar 的路径。
 
 ### 出现 `No FilesSystem for scheme: jfs` 异常
 
-出现这个异常的原因是 core-site.xml 里面的 JuiceFS 配置没有被读取到,需要检查组件配置的 core-site 里面是否有 JuiceFS 相关配置。
+出现这个异常的原因是 `core-site.xml` 里面的 JuiceFS 配置没有被读取到,需要检查组件配置的 `core-site.xml` 里面是否有 JuiceFS 相关配置。
-- 
GitLab