# Configuring Dependencies, Connectors, Libraries
Every Flink application depends on a set of Flink libraries. At the bare minimum, the application depends on the Flink APIs. Many applications depend in addition on certain connector libraries (like Kafka, Cassandra, etc.). When running Flink applications (either in a distributed deployment, or in the IDE for testing), the Flink runtime library must be available as well.
## Flink Core and Application Dependencies
As with most systems that run user-defined applications, there are two broad categories of dependencies and libraries in Flink:
* **Flink Core Dependencies**: Flink itself consists of a set of classes and dependencies that are needed to run the system, for example coordination, networking, checkpoints, failover, APIs, operations (such as windowing), resource management, etc. The set of all these classes and dependencies forms the core of Flink’s runtime and must be present when a Flink application is started.
These core classes and dependencies are packaged in the `flink-dist` jar. They are part of Flink’s `lib` folder and part of the basic Flink container images. Think of these dependencies as similar to Java’s core library (`rt.jar`, `charsets.jar`, etc.), which contains the classes like `String` and `List`.
The Flink Core Dependencies do not contain any connectors or libraries (CEP, SQL, ML, etc.) in order to avoid having an excessive number of dependencies and classes in the classpath by default. In fact, we try to keep the core dependencies as slim as possible to keep the default classpath small and avoid dependency clashes.
* The **User Application Dependencies** are all connectors, formats, or libraries that a specific user application needs.
The user application is typically packaged into an _application jar_, which contains the application code and the required connector and library dependencies.
The user application dependencies explicitly do not include the Flink DataSet / DataStream APIs and runtime dependencies, because those are already part of Flink’s Core Dependencies.
## Setting up a Project: Basic Dependencies
At the bare minimum, every Flink application needs the API dependencies to develop against. For Maven, you can use the [Java Project Template](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/projectsetup/java_api_quickstart.html) or [Scala Project Template](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/projectsetup/scala_api_quickstart.html) to create a program skeleton with these initial dependencies.
When setting up a project manually, you need to add the following dependencies for the Java/Scala API (presented here in Maven syntax, but the same dependencies apply to other build tools such as Gradle and SBT as well):
```
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-java</artifactId>
  <version>1.7.1</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java_2.11</artifactId>
  <version>1.7.1</version>
  <scope>provided</scope>
</dependency>
```
```
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-scala_2.11</artifactId>
  <version>1.7.1</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-scala_2.11</artifactId>
  <version>1.7.1</version>
  <scope>provided</scope>
</dependency>
```
**Important:** Please note that all these dependencies have their scope set to _provided_. That means that they are needed to compile against, but that they should not be packaged into the project’s resulting application jar file - these dependencies are Flink Core Dependencies, which are already available in any setup.
It is highly recommended to keep the dependencies in scope _provided_. If they are not set to _provided_, the best case is that the resulting JAR becomes excessively large, because it also contains all Flink core dependencies. The worst case is that the Flink core dependencies that are added to the application’s jar file clash with some of your own dependency versions (which is normally avoided through inverted classloading).
**Note on IntelliJ:** To make the applications run within IntelliJ IDEA, the Flink dependencies need to be declared in scope _compile_ rather than _provided_. Otherwise IntelliJ will not add them to the classpath and the in-IDE execution will fail with a `NoClassDefFoundError`. To avoid having to declare the dependency scope as _compile_ (which is not recommended, see above), the Java and Scala project templates linked above use a trick: they add a profile that activates only when the application is run in IntelliJ and promotes the dependencies to scope _compile_ there, without affecting the packaging of the JAR files.
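For reference, here is a minimal sketch of such a profile, modeled on the quickstart templates: it activates on the `idea.version` property that IntelliJ IDEA sets when it runs a Maven project, and re-declares the API dependencies in scope _compile_ (artifact and version follow the examples above).
```
<profile>
  <id>add-dependencies-for-IDEA</id>
  <activation>
    <property>
      <!-- IntelliJ IDEA sets this property for Maven runs -->
      <name>idea.version</name>
    </property>
  </activation>
  <dependencies>
    <!-- same API dependencies as above, but promoted to scope compile -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-java</artifactId>
      <version>1.7.1</version>
      <scope>compile</scope>
    </dependency>
  </dependencies>
</profile>
```
Because the profile only activates inside the IDE, a regular `mvn clean package` build still sees these dependencies in scope _provided_.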
## Adding Connector and Library Dependencies
Most applications need specific connectors or libraries to run, for example a connector to Kafka, Cassandra, etc. These connectors are not part of Flink’s core dependencies and must hence be added as dependencies to the application.
Below is an example adding the connector for Kafka 0.10 as a dependency (Maven syntax):
```
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-connector-kafka-0.10_2.11</artifactId>
  <version>1.7.1</version>
</dependency>
```
We recommend packaging the application code and all its required dependencies into one _jar-with-dependencies_, which we refer to as the _application jar_. The application jar can be submitted to an already running Flink cluster, or added to a Flink application container image.
Projects created from the [Java Project Template](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/projectsetup/java_api_quickstart.html) or [Scala Project Template](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/projectsetup/scala_api_quickstart.html) are configured to automatically include the application dependencies into the application jar when running `mvn clean package`. For projects that are not set up from those templates, we recommend adding the Maven Shade Plugin (as listed in the Appendix below) to build the application jar with all required dependencies.
**Important:** For Maven (and other build tools) to correctly package the dependencies into the application jar, these application dependencies must be specified in scope _compile_ (unlike the core dependencies, which must be specified in scope _provided_).
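Putting the two rules together: core API dependencies stay in scope _provided_, while connector and library dependencies use the default scope _compile_. A sketch of such a mixed dependency section (artifacts and version as in the examples above):
```
<!-- core dependency: provided by the Flink setup, not packaged into the jar -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java_2.11</artifactId>
  <version>1.7.1</version>
  <scope>provided</scope>
</dependency>
<!-- application dependency: packaged into the application jar -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-connector-kafka-0.10_2.11</artifactId>
  <version>1.7.1</version>
  <!-- no scope tag: defaults to compile -->
</dependency>
```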
## Scala Versions
Scala versions (2.10, 2.11, 2.12, etc.) are not binary compatible with one another. For that reason, Flink for Scala 2.11 cannot be used with an application that uses Scala 2.12.
All Flink dependencies that (transitively) depend on Scala are suffixed with the Scala version that they are built for, for example `flink-streaming-scala_2.11`.
Developers that only use Java can pick any Scala version; Scala developers need to pick the Scala version that matches their application’s Scala version.
Please refer to the [build guide](//ci.apache.org/projects/flink/flink-docs-release-1.7/flinkDev/building.html#scala-versions) for details on how to build Flink for a specific Scala version.
**Note:** Because of major breaking changes in Scala 2.12, Flink 1.5 currently builds only for Scala 2.11. We aim to add support for Scala 2.12 in the next versions.
## Hadoop Dependencies
**General rule: It should never be necessary to add Hadoop dependencies directly to your application.** _(The only exception being when using existing Hadoop input-/output formats with Flink’s Hadoop compatibility wrappers)_
If you want to use Flink with Hadoop, you need to have a Flink setup that includes the Hadoop dependencies, rather than adding Hadoop as an application dependency. Please refer to the [Hadoop Setup Guide](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/deployment/hadoop.html) for details.
There are two main reasons for that design:
* Some Hadoop interaction happens in Flink’s core, possibly before the user application is started, for example setting up HDFS for checkpoints, authenticating via Hadoop’s Kerberos tokens, or deployment on YARN.
* Flink’s inverted classloading approach hides many transitive dependencies from the core dependencies. That applies not only to Flink’s own core dependencies, but also to Hadoop’s dependencies when present in the setup. That way, applications can use different versions of the same dependencies without running into dependency conflicts (and trust us, that’s a big deal, because Hadoop’s dependency tree is huge).
If you need Hadoop dependencies during testing or development inside the IDE (for example for HDFS access), please declare these dependencies with scope _test_ or _provided_.
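As an illustration, such a test-scoped Hadoop dependency could be sketched as follows; the `hadoop-client` artifact and the `2.8.3` version here are placeholders, so use whatever matches your Hadoop setup.
```
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <!-- placeholder version: match the Hadoop version of your setup -->
  <version>2.8.3</version>
  <scope>test</scope>
</dependency>
```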
## Appendix: Template for building a Jar with Dependencies
To build an application JAR that contains all dependencies required for declared connectors and libraries, you can use the following shade plugin definition:
```
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.0.0</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <artifactSet>
              <excludes>
                <exclude>com.google.code.findbugs:jsr305</exclude>
                <exclude>org.slf4j:*</exclude>
                <exclude>log4j:*</exclude>
              </excludes>
            </artifactSet>
            <filters>
              <filter>
                <!-- Do not copy the signatures in the META-INF folder.
                     Otherwise, this might cause SecurityExceptions when using the JAR. -->
                <artifact>*:*</artifact>
                <excludes>
                  <exclude>META-INF/*.SF</exclude>
                  <exclude>META-INF/*.DSA</exclude>
                  <exclude>META-INF/*.RSA</exclude>
                </excludes>
              </filter>
            </filters>
            <transformers>
              <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                <mainClass>my.programs.main.clazz</mainClass>
              </transformer>
            </transformers>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```