# Use JuiceFS Hadoop Java SDK

* [Hadoop Compatibility](#hadoop-compatibility)
* [Compiling](#compiling)
* [Deploy JuiceFS Hadoop Java SDK](#deploy-juicefs-hadoop-java-sdk)
  + [Hadoop Distribution](#hadoop-distribution)
  + [Community Components](#community-components)
* [Configurations](#configurations)
  + [Core Configurations](#core-configurations)
  + [Cache Configurations](#cache-configurations)
  + [I/O Configurations](#io-configurations)
  + [Other Configurations](#other-configurations)
  + [Configurations Example](#configurations-example)
  + [Configuration in Hadoop](#configuration-in-hadoop)
    - [CDH6](#cdh6)
    - [HDP](#hdp)
  + [Configuration in Flink](#configuration-in-flink)
* [Restart Services](#restart-services)
* [Verification](#verification)
  + [Hadoop](#hadoop)
  + [Hive](#hive)
* [Metrics](#metrics)
* [Benchmark](#benchmark)
  + [Local Benchmark](#local-benchmark)
  + [Distributed Benchmark](#distributed-benchmark)
* [FAQ](#faq)

JuiceFS provides a [Hadoop-compatible FileSystem](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/introduction.html) through the Hadoop Java SDK to support a variety of components in the Hadoop ecosystem.

> **NOTICE**:
>
> JuiceFS uses a local mapping of user names and UIDs. So you should [sync all the needed users and their UIDs](sync_accounts_between_multiple_hosts.md) across the whole Hadoop cluster to avoid permission errors. You can also specify a global user list and user group file; please refer to the [relevant configurations](#other-configurations).
>
> The JuiceFS Hadoop Java SDK needs at most an extra 4 * ``juicefs.memory-size`` of off-heap memory. By default, up to 1.2 GB of additional memory is required (depending on write load).

## Hadoop Compatibility

The JuiceFS Hadoop Java SDK is compatible with Hadoop 2.x and Hadoop 3.x, as well as a variety of components in the Hadoop ecosystem.

It usually takes two steps to make JuiceFS work with other components:

1. Put the JAR file into the classpath of each Hadoop ecosystem component.
2. Put the JuiceFS configurations into the configuration file of each Hadoop ecosystem component (usually `core-site.xml`).

## Compiling

First install Go 1.13+, JDK 8+ and Maven, then run the following commands:

> **Note**: If Ceph RADOS is used to store data, you need to install librados-dev and build `libjfs.so` with `-tag ceph`.

> **Tip**: For users in China, it's recommended to set a local Maven mirror to speed-up compilation, e.g. [Aliyun Maven Mirror](https://maven.aliyun.com).

### Linux or macOS

> **Note**: The built SDK contains native code, so it can only be deployed on the same operating system it was compiled on. For example, if you compile the SDK on Linux, you must deploy it on Linux. For better compatibility, please build with an older glibc version if possible.

  ```shell
  $ cd sdk/java
  $ make
  ```

### Windows

Right now, you can cross-compile the SDK on Linux or macOS; please install [mingw-w64](https://www.mingw-w64.org/) first.

  ```shell
  $ cd sdk/java
  $ make win
  ```


## Deploy JuiceFS Hadoop Java SDK

After compiling, you can find the JAR file in the `sdk/java/target` directory, e.g. `juicefs-hadoop-0.10.0.jar`. Beware of the file with the `original-` prefix: it doesn't contain third-party dependencies. It's recommended to use the JAR file that bundles them.

> **Note**: The SDK can only be deployed on the same operating system it was compiled on. For example, if you compile the SDK on Linux, you must deploy it on Linux.

Then put the JAR file and `$JAVA_HOME/lib/tools.jar` into the classpath of each Hadoop ecosystem component. It's recommended to create a symbolic link to the JAR file. The following tables describe where the SDK should be placed.
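
For example, a symbolic link on CDH might be created like this (the source JAR path and version number are hypothetical; adjust them to your environment):

```shell
$ ln -s /path/to/juicefs-hadoop-0.10.0.jar /opt/cloudera/parcels/CDH/lib/hadoop/lib/juicefs-hadoop.jar
```

A symlink lets you upgrade the SDK by replacing one file instead of copying the JAR into every directory again.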

### Hadoop Distribution

| Name              | Installing Paths                                                                                                                                                                                                                                                                                                           |
| ----              | ----------------                                                                                                                                                                                                                                                                                                           |
| CDH               | `/opt/cloudera/parcels/CDH/lib/hadoop/lib`<br>`/opt/cloudera/parcels/CDH/spark/jars`<br>`/var/lib/impala`                                                                                                                                                                                                                  |
| HDP               | `/usr/hdp/current/hadoop-client/lib`<br>`/usr/hdp/current/hive-client/auxlib`<br>`/usr/hdp/current/spark2-client/jars`                                                                                                                                                                                                     |
| Amazon EMR        | `/usr/lib/hadoop/lib`<br>`/usr/lib/spark/jars`<br>`/usr/lib/hive/auxlib`                                                                                                                                                                                                                                                   |
| Alibaba Cloud EMR | `/opt/apps/ecm/service/hadoop/*/package/hadoop*/share/hadoop/common/lib`<br>`/opt/apps/ecm/service/spark/*/package/spark*/jars`<br>`/opt/apps/ecm/service/presto/*/package/presto*/plugin/hive-hadoop2`<br>`/opt/apps/ecm/service/hive/*/package/apache-hive*/lib`<br>`/opt/apps/ecm/service/impala/*/package/impala*/lib` |
| Tencent Cloud EMR | `/usr/local/service/hadoop/share/hadoop/common/lib`<br>`/usr/local/service/presto/plugin/hive-hadoop2`<br>`/usr/local/service/spark/jars`<br>`/usr/local/service/hive/auxlib`                                                                                                                                              |
| UCloud UHadoop    | `/home/hadoop/share/hadoop/common/lib`<br>`/home/hadoop/hive/auxlib`<br>`/home/hadoop/spark/jars`<br>`/home/hadoop/presto/plugin/hive-hadoop2`                                                                                                                                                                             |
| Baidu Cloud EMR   | `/opt/bmr/hadoop/share/hadoop/common/lib`<br>`/opt/bmr/hive/auxlib`<br>`/opt/bmr/spark2/jars`                                                                                                                                                                                                                              |

### Community Components

| Name   | Installing Paths                     |
| ----   | ----------------                     |
| Spark  | `${SPARK_HOME}/jars`                 |
| Presto | `${PRESTO_HOME}/plugin/hive-hadoop2` |
| Flink  | `${FLINK_HOME}/lib`                  |

## Configurations

### Core Configurations

| Configuration                    | Default Value                | Description                                                                                                                                               |
| -------------                    | -------------                | -----------                                                                                                                                               |
| `fs.jfs.impl`                    | `io.juicefs.JuiceFileSystem` | The FileSystem implementation for `jfs://` URIs. If you want to use a different scheme (e.g. `cfs://`), you can rename this configuration to `fs.cfs.impl`. |
| `fs.AbstractFileSystem.jfs.impl` | `io.juicefs.JuiceFS`         | The AbstractFileSystem implementation for `jfs://` URIs, used by applications that rely on the FileContext API (e.g. YARN and MapReduce).                   |
| `juicefs.meta`                   |                              | Redis URL. Its format is `redis://<user>:<password>@<host>:<port>/<db>`.                                                                                  |
| `juicefs.accesskey`              |                              | Access key of object storage. See [this document](how_to_setup_object_storage.md) to learn how to get access key for different object storage.            |
| `juicefs.secretkey`              |                              | Secret key of object storage. See [this document](how_to_setup_object_storage.md) to learn how to get secret key for different object storage.            |

### Cache Configurations

| Configuration                | Default Value | Description                                                                                                                                                                                                                                                                                           |
| -------------                | ------------- | -----------                                                                                                                                                                                                                                                                                           |
| `juicefs.cache-dir`          |               | Directory paths of local cache. Use colon to separate multiple paths. Wildcards in paths are also supported. **It's recommended to create these directories manually and set `0777` permission so that different applications can share the cache data.** |
| `juicefs.cache-size`         | 0             | Maximum size of local cache in MiB. This is the total size when multiple cache directories are set.                                                                                                                                                      |
| `juicefs.cache-full-block`   | `true`        | Whether to cache every read block; `false` means only cache random/small read blocks.                                                                                                                                                                    |
| `juicefs.free-space`         | 0.1           | Min free space ratio of cache directory                                                                                                                                                                                                                                                               |
| `juicefs.discover-nodes-url` |               | The URL to discover cluster nodes, refresh every 10 minutes.<br /><br />YARN: `yarn`<br />Spark Standalone: `http://spark-master:web-ui-port/json/`<br />Spark ThriftServer: `http://thrift-server:4040/api/v1/applications/`<br />Presto: `http://coordinator:discovery-uri-port/v1/service/presto/` |
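
For instance, if `juicefs.cache-dir` is set to `/data*/jfs`, the cache directories could be prepared like this (the disk layout below is hypothetical):

```shell
$ mkdir -p /data1/jfs /data2/jfs
$ chmod 0777 /data1/jfs /data2/jfs
```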

### I/O Configurations

| Configuration            | Default Value | Description                                     |
| -------------            | ------------- | -----------                                     |
| `juicefs.max-uploads`    | 20            | The max number of connections to upload         |
| `juicefs.get-timeout`    | 5             | The max number of seconds to download an object |
| `juicefs.put-timeout`    | 60            | The max number of seconds to upload an object   |
| `juicefs.memory-size`    | 300           | Total read/write buffering in MiB               |
| `juicefs.prefetch`       | 1             | Prefetch N blocks in parallel                   |
| `juicefs.upload-limit`   | 0             | Bandwidth limit for upload in Mbps              |
| `juicefs.download-limit` | 0             | Bandwidth limit for download in Mbps            |

### Other Configurations

| Configuration             | Default Value | Description                                                                                                                                                                 |
| -------------             | ------------- | -----------                                                                                                                                                                 |
| `juicefs.debug`           | `false`       | Whether to enable the debug log                                                                                                                                             |
| `juicefs.access-log`      |               | Access log path. Ensure the Hadoop application has write permission, e.g. `/tmp/juicefs.access.log`. The log file rotates automatically, keeping at most 7 files.           |
| `juicefs.superuser`       | `hdfs`        | The super user                                                                                                                                                              |
| `juicefs.users`           | `null`        | The path of username and UID list file, e.g. `jfs://name/etc/users`. The file format is `<username>:<UID>`, one user per line.                                              |
| `juicefs.groups`          | `null`        | The path of group name, GID and group members list file, e.g. `jfs://name/etc/groups`. The file format is `<group-name>:<GID>:<username1>,<username2>`, one group per line. |
| `juicefs.umask`           | `null`        | The umask used when creating files and directories (e.g. `0022`), default value is `fs.permissions.umask-mode`.                                                             |
| `juicefs.push-gateway`    |               | [Prometheus Pushgateway](https://github.com/prometheus/pushgateway) address, format is `<host>:<port>`.                                                                     |
| `juicefs.push-interval`   | 10            | Prometheus push interval in seconds                                                                                                                                         |
| `juicefs.push-auth`       |               | [Prometheus basic auth](https://prometheus.io/docs/guides/basic-auth) information, format is `<username>:<password>`.                                                       |
| `juicefs.fast-resolve`    | `true`        | Whether to enable faster metadata lookups using Redis Lua scripts                                                                                           |
| `juicefs.no-usage-report` | `false`       | Whether to disable usage reporting. JuiceFS only collects anonymous usage data (e.g. version number); no user data or other sensitive data is collected.    |
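
As a sketch, the files referenced by `juicefs.users` and `juicefs.groups` could look like this (the user and group names below are hypothetical):

```
alice:1001
bob:1002
```

```
datateam:2001:alice,bob
```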

When you use multiple JuiceFS file systems, each of these configurations can be set for a specific file system alone. Put the file system name in the middle of the configuration key, for example (replace `{JFS_NAME}` with the appropriate value):

```xml
<property>
  <name>juicefs.{JFS_NAME}.meta</name>
  <value>redis://host:port/1</value>
</property>
```

### Configurations Example

> **Note**: Replace `{HOST}`, `{PORT}` and `{DB}` in `juicefs.meta` with appropriate values.

```xml
<property>
  <name>fs.jfs.impl</name>
  <value>io.juicefs.JuiceFileSystem</value>
</property>
<property>
  <name>fs.AbstractFileSystem.jfs.impl</name>
  <value>io.juicefs.JuiceFS</value>
</property>
<property>
  <name>juicefs.meta</name>
  <value>redis://{HOST}:{PORT}/{DB}</value>
</property>
<property>
  <name>juicefs.cache-dir</name>
  <value>/data*/jfs</value>
</property>
<property>
  <name>juicefs.cache-size</name>
  <value>1024</value>
</property>
<property>
  <name>juicefs.access-log</name>
  <value>/tmp/juicefs.access.log</value>
</property>
```

### Configuration in Hadoop

Add configurations to `core-site.xml`.

#### CDH6

Besides `core-site`, you also need to add the following to the `mapreduce.application.classpath` configuration of the YARN component:

```shell
$HADOOP_COMMON_HOME/lib/juicefs-hadoop.jar
```

#### HDP

Besides `core-site`, you also need to add the following to the `mapreduce.application.classpath` configuration of the MapReduce2 component (the variables do not need to be replaced):

```shell
/usr/hdp/${hdp.version}/hadoop/lib/juicefs-hadoop.jar
```

### Configuration in Flink

Add configurations to `conf/flink-conf.yaml`. You can set up only the Flink client, without modifying the configuration in Hadoop.
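
A minimal sketch of the Flink configuration, assuming the same `juicefs.*` keys shown above can be written in `key: value` form (the Redis address and cache settings are placeholders):

```yaml
fs.jfs.impl: io.juicefs.JuiceFileSystem
juicefs.meta: redis://{HOST}:{PORT}/{DB}
juicefs.cache-dir: /data*/jfs
juicefs.cache-size: 1024
```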

## Restart Services

When the following components need to access JuiceFS, they should be restarted.

> **Note**: Before restarting, confirm that the JuiceFS related configuration has been written to the configuration file of each component. You can usually find it in `core-site.xml` on the machine where the component's service is deployed.

| Components | Services                   |
| ---------- | --------                   |
| Hive       | HiveServer<br />Metastore  |
| Spark      | ThriftServer               |
| Presto     | Coordinator<br />Worker    |
| Impala     | Catalog Server<br />Daemon |
| HBase      | Master<br />RegionServer   |

HDFS, Hue, ZooKeeper and other services don't need to be restarted.

If a `Class io.juicefs.JuiceFileSystem not found` or `No FileSystem for scheme: jfs` exception occurs after restarting, refer to the [FAQ](#faq).

## Verification

### Hadoop

```bash
$ hadoop fs -ls jfs://{JFS_NAME}/
```

> **Note**: `JFS_NAME` is the volume name specified when you formatted the JuiceFS file system.
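
You can also try a quick write/read round-trip (the destination path below is just an example):

```bash
$ echo "hello juicefs" | hadoop fs -put - jfs://{JFS_NAME}/tmp/hello.txt
$ hadoop fs -cat jfs://{JFS_NAME}/tmp/hello.txt
```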

### Hive

```sql
CREATE TABLE IF NOT EXISTS person
(
  name STRING,
  age INT
) LOCATION 'jfs://{JFS_NAME}/tmp/person';
```
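
To verify reads and writes through Hive as well, you can insert and query a sample row (the data is arbitrary):

```sql
INSERT INTO person VALUES ('tom', 25);
SELECT name, age FROM person;
```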

## Metrics

254
JuiceFS Hadoop Java SDK supports reporting metrics to [Prometheus Pushgateway](https://github.com/prometheus/pushgateway), then you can use [Grafana](https://grafana.com) and [dashboard template](grafana_template.json) to visualize these metrics.

Enable metrics reporting through following configurations:

```xml
<property>
  <name>juicefs.push-gateway</name>
  <value>host:port</value>
</property>
```

> **Note**: Each process using the JuiceFS Hadoop Java SDK reports its own unique metrics, and Pushgateway keeps all collected metrics indefinitely. This causes metrics to accumulate continuously, taking up too much memory and slowing down Prometheus scraping. It's recommended to regularly clean up metrics whose `job` is `juicefs` on Pushgateway, for example once every hour with the following command. Running SDK processes will resume reporting after the metrics are cleared, so this basically does not affect usage.
>
> ```bash
> $ curl -X DELETE http://host:9091/metrics/job/juicefs
> ```

For a description of all monitoring metrics, please refer to [JuiceFS Metrics](p8s_metrics.md).

## Benchmark

JuiceFS provides some benchmark tools you can use once JuiceFS has been deployed.

### Local Benchmark
#### Meta

- create

  ```shell
  hadoop jar juicefs-hadoop.jar nnbench create -files 10000 -baseDir jfs://{JFS_NAME}/tmp/benchmarks/NNBench -local
  ```

  It creates 10000 empty files without writing data.

- open

  ```shell
  hadoop jar juicefs-hadoop.jar nnbench open -files 10000 -baseDir jfs://{JFS_NAME}/tmp/benchmarks/NNBench -local
  ```

  It opens 10000 files without reading data.

- rename

  ```shell
  hadoop jar juicefs-hadoop.jar nnbench rename -files 10000 -baseDir jfs://{JFS_NAME}/tmp/benchmarks/NNBench -local
  ```

- delete

  ```shell
  hadoop jar juicefs-hadoop.jar nnbench delete -files 10000 -baseDir jfs://{JFS_NAME}/tmp/benchmarks/NNBench -local
  ```

- for reference

| Operation | TPS  | Delay (ms) |
| --------- | ---  | ---------- |
| create | 644  | 1.55       |
| open   | 3467 | 0.29       |
| rename | 483  | 2.07       |
| delete | 506  | 1.97       |

#### I/O Performance

- sequential write

  ```shell
  hadoop jar juicefs-hadoop.jar dfsio -write -size 20000 -baseDir jfs://{JFS_NAME}/tmp/benchmarks/DFSIO -local
  ```

- sequential read

  ```shell
  hadoop jar juicefs-hadoop.jar dfsio -read -size 20000 -baseDir jfs://{JFS_NAME}/tmp/benchmarks/DFSIO -local
  ```

  When you run the command a second time, the result may be much better than the first run. That's because the data has been cached; clean the local disk cache before re-running to measure cold reads.

- for reference

| Operation | Throughput (MB/s) |
| --------- | ----------------- |
| write  | 647          |
| read   | 111          |

### Distributed Benchmark

The distributed benchmark uses a MapReduce program to test metadata and I/O throughput performance.

Enough resources should be provided to make sure all map tasks can be started at the same time.

We used three 4c32g ECS instances (5 Gbit/s) and an Aliyun Redis 5.0 4G instance for the benchmark.

#### Meta

- create

  ```shell
  hadoop jar juicefs-hadoop.jar nnbench create -maps 10 -threads 10 -files 1000 -baseDir jfs://{JFS_NAME}/tmp/benchmarks/NNBench
  ```

  10 map tasks, each with 10 threads; each thread creates 1000 empty files (100000 files in total).

- open

  ```shell
  hadoop jar juicefs-hadoop.jar nnbench open -maps 10 -threads 10 -files 1000 -baseDir jfs://{JFS_NAME}/tmp/benchmarks/NNBench
  ```

  10 map tasks, each with 10 threads; each thread opens 1000 files (100000 files in total).

- rename

  ```shell
  hadoop jar juicefs-hadoop.jar nnbench rename -maps 10 -threads 10 -files 1000 -baseDir jfs://{JFS_NAME}/tmp/benchmarks/NNBench
  ```

  10 map tasks, each with 10 threads; each thread renames 1000 files (100000 files in total).


- delete

  ```shell
  hadoop jar juicefs-hadoop.jar nnbench delete -maps 10 -threads 10 -files 1000 -baseDir jfs://{JFS_NAME}/tmp/benchmarks/NNBench
  ```

  10 map tasks, each with 10 threads; each thread deletes 1000 files (100000 files in total).

- for reference

    - 10 threads

  | Operation | TPS  | Delay (ms) |
  | --------- | ---  | ---------- |
  | create | 4178 | 2.2        |
  | open   | 9407 | 0.8        |
  | rename | 3197 | 2.9       |
  | delete | 3060 | 3.0        |

    - 100 threads

  | Operation | TPS   | Delay (ms) |
  | --------- | ---   | ---------- |
  | create | 11773  | 7.9       |
  | open   | 34083 | 2.4        |
  | rename | 8995  | 10.8       |
  | delete | 7191  | 13.6       |

#### I/O Performance

- sequential write

  ```shell
  hadoop jar juicefs-hadoop.jar dfsio -write -maps 10 -size 10000 -baseDir jfs://{JFS_NAME}/tmp/benchmarks/DFSIO
  ```

  10 map tasks; each task writes 10000 MB of random data sequentially.

- sequential read

  ```shell
  hadoop jar juicefs-hadoop.jar dfsio -read -maps 10 -size 10000 -baseDir jfs://{JFS_NAME}/tmp/benchmarks/DFSIO
  ```

  10 map tasks; each task reads 10000 MB of random data sequentially.


- for reference

| Operation | Total Throughput (MB/s) |
| --------- | ----------------------- |
| write     | 1835                    |
| read      | 1234                    |


## FAQ

### `Class io.juicefs.JuiceFileSystem not found` exception

It means the JAR file was not loaded. You can verify this with `lsof -p {pid} | grep juicefs`.

Check whether the JAR file is placed in the right location and whether other users have read permission on it.

Some Hadoop distributions also require modifying `mapred-site.xml` to append the JAR file path to the end of the `mapreduce.application.classpath` parameter.

### `No FileSystem for scheme: jfs` exception

It means the JuiceFS Hadoop Java SDK was not configured properly. Check whether the JuiceFS related configuration exists in the `core-site.xml` of the component's configuration.