diff --git a/docs/1.7/104.md b/docs/1.7/104.md index dfd9ea48a7525a63e6d427ee68e86a19ad91124b..8e133e98d8a7d942b03e61ec221e40112a7b46b0 100644 --- a/docs/1.7/104.md +++ b/docs/1.7/104.md @@ -4,35 +4,35 @@ ## 从Flink 1.3+ 到 Flink 1.7 -### API changes for serializer snapshots +### TypeSerializer 的变化 -This would be relevant mostly for users implementing custom `TypeSerializer`s for their state. +这部分主要与实现`TypeSerializer`接口来自定义序列化的用户有关。 -The old `TypeSerializerConfigSnapshot` abstraction is now deprecated, and will be fully removed in the future in favor of the new `TypeSerializerSnapshot`. For details and guides on how to migrate, please see [Migrating from deprecated serializer snapshot APIs before Flink 1.7] (//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/custom_serialization.html#migration-from-deprecated-serializer-snapshot-apis-before-Flink-1.7). +原来的 `TypeSerializerConfigSnapshot` 抽象接口被弃用了, 并且将在将来完全删除,取而代之的是新的 `TypeSerializerSnapshot`. 详情请参考 [Migrating from deprecated serializer snapshot APIs before Flink 1.7](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/custom_serialization.html#migration-from-deprecated-serializer-snapshot-apis-before-Flink-1.7)。 -## Migrating from Flink 1.2 to Flink 1.3 +## 从 Flink 1.2 迁移到 Flink 1.3 -There are a few APIs that have been changed since Flink 1.2\. Most of the changes are documented in their specific documentations. The following is a consolidated list of API changes and links to details for migration when upgrading to Flink 1.3. +自Flink 1.2以来,有一些API已被更改。大多数更改都记录在其特定文档中。以下是API更改的综合列表以及升级到Flink 1.3时迁移详细信息的链接。 -### `TypeSerializer` interface changes +### `TypeSerializer` 接口变化 -This would be relevant mostly for users implementing custom `TypeSerializer`s for their state. +这主要适用于自定义 `TypeSerializer`接口的用户 -Since Flink 1.3, two additional methods have been added that are related to serializer compatibility across savepoint restores. Please see [Handling serializer upgrades and compatibility](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/custom_serialization.html#handling-serializer-upgrades-and-compatibility) for further details on how to implement these methods. +从Flink 1.3开始,添加了两个与保存点恢复的串行器兼容性相关的其他方法。 有关如何实现这些方法的更多详细信息,请参阅 [序列化升级和兼容性](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/custom_serialization.html#handling-serializer-upgrades-and-compatibility) -### `ProcessFunction` is always a `RichFunction` +### `ProcessFunction` 是 `RichFunction` -In Flink 1.2, `ProcessFunction` and its rich variant `RichProcessFunction` was introduced. Since Flink 1.3, `RichProcessFunction` was removed and `ProcessFunction` is now always a `RichFunction` with access to the lifecycle methods and runtime context. +从Flink 1.2, `ProcessFunction` 引入,并有了多种实现例如 `RichProcessFunction` 。 从Flink 1.3,开始`RichProcessFunction` 被移除了, 现在 `ProcessFunction` 始终是 `RichFunction` 并且可以访问运行时上下文。 -### Flink CEP library API changes +### Flink CEP 库API更改 -The CEP library in Flink 1.3 ships with a number of new features which have led to some changes in the API. Please visit the [CEP Migration docs](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/libs/cep.html#migrating-from-an-older-flink-version) for details. 
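As a quick illustration of the reworked CEP API, the sketch below uses the condition-based `Pattern` builder that the 1.3+ library exposes; it is an illustrative example only (the `Event` POJO and the matching logic are assumptions, not content from the linked migration guide).

```
import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;

import java.util.List;
import java.util.Map;

public class CepSketch {

    // Assumed minimal event type, only for the sake of the example.
    public static class Event {
        public String name;
        public Event() {}
        public Event(String name) { this.name = name; }
        public String getName() { return name; }
    }

    public static DataStream<String> detect(DataStream<Event> input) {
        // Conditions are expressed with SimpleCondition since Flink 1.3.
        Pattern<Event, ?> pattern = Pattern.<Event>begin("start")
            .where(new SimpleCondition<Event>() {
                @Override
                public boolean filter(Event event) {
                    return "login".equals(event.getName());
                }
            })
            .followedBy("end")
            .where(new SimpleCondition<Event>() {
                @Override
                public boolean filter(Event event) {
                    return "error".equals(event.getName());
                }
            });

        PatternStream<Event> matches = CEP.pattern(input, pattern);

        // Since Flink 1.3 the select callback receives a Map<String, List<Event>>.
        return matches.select(new PatternSelectFunction<Event, String>() {
            @Override
            public String select(Map<String, List<Event>> match) {
                return "login followed by error for " + match.get("start").get(0).getName();
            }
        });
    }
}
```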
+Flink 1.3中的CEP库新增了许多新函数,请参阅 [CEP迁移文档](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/libs/cep.html#migrating-from-an-older-flink-version) 。 -### Logger dependencies removed from Flink core artifacts +### Flink core 中移除了Logger的依赖 -In Flink 1.3, to make sure that users can use their own custom logging framework, core Flink artifacts are now clean of specific logger dependencies. +Flink1.3以后,用户可以选用自己期望的日志框架了,Flink移除了日志记录框架的依赖。 -Example and quickstart archetypes already have loggers specified and should not be affected. For other custom projects, make sure to add logger dependencies. For example, in Maven’s `pom.xml`, you can add: +实例和快速入门的demo已经指定了日志记录器,不会有问题,其他项目,请确保添加日志依赖,比如Maven的 `pom.xml`,中需要增加 @@ -52,21 +52,21 @@ Example and quickstart archetypes already have loggers specified and should not -## Migrating from Flink 1.1 to Flink 1.2 +## 从 Flink 1.1 到 Flink 1.2的迁移 -As mentioned in the [State documentation](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/state.html), Flink has two types of state: **keyed** and **non-keyed** state (also called **operator** state). Both types are available to both operators and user-defined functions. This document will guide you through the process of migrating your Flink 1.1 function code to Flink 1.2 and will present some important internal changes introduced in Flink 1.2 that concern the deprecation of the aligned window operators from Flink 1.1 (see [Aligned Processing Time Window Operators](#aligned-processing-time-window-operators)). +正如 [状态文档](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/state.html)中所说,Flink 有两种状态: **keyed** 和 **non-keyed** 状态 (也被称作 **operator** 状态). 这两种类型都可用于 算子和用户定义的函数。文档将指导您完成从Flink 1.1函数代码迁移到Flink 1.2的过程,并介绍Flink 1.2中引入的一些重要内部更改,这些改变涉及到Flink 1.1中对齐窗口操作的弃用。 (请参阅 时间对齐窗口算子[Aligned Processing Time Window Operators](#aligned-processing-time-window-operators)). -The migration process will serve two goals: +迁移有两个目标: -1. allow your functions to take advantage of the new features introduced in Flink 1.2, such as rescaling, +1. 引入Flink1.2中引入的新函数,比如自适应(rescaling) -2. make sure that your new Flink 1.2 job will be able to resume execution from a savepoint generated by its Flink 1.1 predecessor. +2. 确保新Flink 1.2作业能够从Flink 1.1的保存点恢复执行 -After following the steps in this guide, you will be able to migrate your running job from Flink 1.1 to Flink 1.2 simply by taking a [savepoint](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/state/savepoints.html) with your Flink 1.1 job and giving it to your Flink 1.2 job as a starting point. This will allow the Flink 1.2 job to resume execution from where its Flink 1.1 predecessor left off. +按照本指南操作可以把正在运行的Flink1.1作业迁移到Flink1.2中。前提需要在Flink1.1中使用 [保存点](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/state/savepoints.html) 并把保存点作为Flink1.2作业的起点。这样,Flink1.2就可以从之前中断的位置恢复执行了。 -### Example User Functions +### 用户函数示例 -As running examples for the remainder of this document we will use the `CountMapper` and the `BufferingSink` functions. The first is an example of a function with **keyed** state, while the second has **non-keyed** state. The code for the aforementioned two functions in Flink 1.1 is presented below: +本文档其余部分使用 `CountMapper` 和 `BufferingSink` 函数作为示例。第一个函数是 **keyed** 状态,第二个不是 s **non-keyed** 状态,Flink 1.1中上述两个函数的代码如下: @@ -137,29 +137,29 @@ public class BufferingSink implements SinkFunction>, -The `CountMapper` is a `RichFlatMapFunction` which assumes a grouped-by-key input stream of the form `(word, 1)`. 
The function keeps a counter for each incoming key (`ValueState<Integer> counter`) and if the number of occurrences of a certain word surpasses the user-provided threshold, a tuple is emitted containing the word itself and the number of occurrences. +`CountMapper` 是一个按表格分组输入`(word, 1)`的 `RichFlatMapFunction` , 函数为每个传入的key保存一个计数器 (`ValueState<Integer> counter`) 并且 ,如果某个单词的出现次数超过用户提供的阈值,则会发出一个包含单词本身和出现次数的元组。 -The `BufferingSink` is a `SinkFunction` that receives elements (potentially the output of the `CountMapper`) and buffers them until a certain user-specified threshold is reached, before emitting them to the final sink. This is a common way to avoid many expensive calls to a database or an external storage system. To do the buffering in a fault-tolerant manner, the buffered elements are kept in a list (`bufferedElements`) which is periodically checkpointed. + `BufferingSink` 是一个 `SinkFunction` 接收方 ( `CountMapper`可能的输出) ,直到达到用户定义的最终状态之前会一直缓存数据,这可以避免频繁的对数据库或者是存储系统的操作,通常这些操作都是比较耗时或开销比较大的。为了以容错方式进行缓冲,缓冲数据元保存在列表(`bufferedElements`) 列表会被定期被检查点保存。 -### State API Migration +### 状态 API 迁移 -To leverage the new features of Flink 1.2, the code above should be modified to use the new state abstractions. After doing these changes, you will be able to change the parallelism of your job (scale up or down) and you are guaranteed that the new version of your job will start from where its predecessor left off. +要使用Flink 1.2的新函数,应修改上面的代码来完成新的状态抽象。完成这些更改后,就可以实现作业的并行度的修改(向上或向下扩展),并确保新版本的作业将从之前作业停止的位置开始执行。 -**Keyed State:** Something to note before delving into the details of the migration process is that if your function has **only keyed state**, then the exact same code from Flink 1.1 also works for Flink 1.2 with full support for the new features and full backwards compatibility. Changes could be made just for better code organization, but this is just a matter of style. +**Keyed State:** 需要注意的是,如果代码中只有**keyed state**,那么Flink1.1的代码也适用于1.2版本,并且完全支持新函数和向下兼容。可以仅针代码格式进行更改,但这只是风格/习惯问题。 -With the above said, the rest of this section focuses on the **non-keyed state**. +综上所述,本章我们重点阐述 **non-keyed state**的迁移 -#### Rescaling and new state abstractions +#### 自适应和新状态抽象 -The first modification is the transition from the old `Checkpointed<T extends Serializable>` state interface to the new ones. In Flink 1.2, a stateful function can implement either the more general `CheckpointedFunction` interface, or the `ListCheckpointed<T extends Serializable>` interface, which is semantically closer to the old `Checkpointed` one. +第一个修改是 `Checkpointed<T extends Serializable>` 接口有了新的实现。在 Flink 1.2中,有状态函数可以实现更通用的 `CheckpointedFunction` 接口或 `ListCheckpointed<T extends Serializable>` 接口 ,和之前版本的 `Checkpointed` 类似。 -In both cases, the non-keyed state is expected to be a `List` of _serializable_ objects, independent from each other, thus eligible for redistribution upon rescaling. In other words, these objects are the finest granularity at which non-keyed state can be repartitioned. As an example, if with parallelism 1 the checkpointed state of the `BufferingSink` contains elements `(test1, 2)` and `(test2, 2)`, when increasing the parallelism to 2, `(test1, 2)` may end up in task 0, while `(test2, 2)` will go to task 1. 
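As a minimal sketch of what such a redistributable list looks like in code, the buffered elements can be exposed through the `ListCheckpointed` interface discussed below. Element type, field names and the threshold handling are assumptions; the document's own full rewrite of `BufferingSink` is elided by the surrounding diff hunks.

```
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.checkpoint.ListCheckpointed;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

import java.util.ArrayList;
import java.util.List;

// Non-keyed state exposed as a List, so Flink can redistribute the
// individual elements when the parallelism of the sink changes.
public class BufferingSinkListSketch
        implements SinkFunction<Tuple2<String, Integer>>, ListCheckpointed<Tuple2<String, Integer>> {

    private final int threshold;
    private final List<Tuple2<String, Integer>> bufferedElements = new ArrayList<>();

    public BufferingSinkListSketch(int threshold) {
        this.threshold = threshold;
    }

    @Override
    public void invoke(Tuple2<String, Integer> value) {
        bufferedElements.add(value);
        if (bufferedElements.size() == threshold) {
            // ... emit bufferedElements to the external system here, then clear ...
            bufferedElements.clear();
        }
    }

    @Override
    public List<Tuple2<String, Integer>> snapshotState(long checkpointId, long timestamp) {
        // Each list entry can be redistributed independently on rescaling.
        return new ArrayList<>(bufferedElements);
    }

    @Override
    public void restoreState(List<Tuple2<String, Integer>> state) {
        bufferedElements.addAll(state);
    }
}
```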
+在这两种情况中,非键合状态预期是一个 _可序列化_ 的 `List` ,对象彼此独立,这样可以在自适应的时候重新分配,意味着, 这些对象是可以重新分区非被Keys化状态的最细粒度。例如,如果并行度为1的 `BufferingSink` 有 `(test1, 2)` 和 `(test2, 2)`两个数据,当并行度增加到2时, `(test1, 2)` 可能在task 0中,而 `(test2, 2)` 可能在task 1中。 -More details on the principles behind rescaling of both keyed state and non-keyed state can be found in the [State documentation](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/index.html). +更详细的信息可以参阅[状态文档](//ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/index.html). ##### ListCheckpointed -The `ListCheckpointed` interface requires the implementation of two methods: + `ListCheckpointed` 接口需要实现两个方法: @@ -171,7 +171,7 @@ void restoreState(List state) throws Exception; -Their semantics are the same as their counterparts in the old `Checkpointed` interface. The only difference is that now `snapshotState()` should return a list of objects to checkpoint, as stated earlier, and `restoreState` has to handle this list upon recovery. If the state is not re-partitionable, you can always return a `Collections.singletonList(MY_STATE)` in the `snapshotState()`. The updated code for `BufferingSink` is included below: +它的语义和之前的 `Checkpointed` 接口类似,唯一区别是,现在 `snapshotState()` 返回的是检查点对象列表, 如前所述, `restoreState` 必须在恢复的时候,处理这个列表。如果状态不是重新分区,可以随时返回 `Collections.singletonList(MY_STATE)` 的 `snapshotState()`。 更新的代码 `BufferingSink` 如下: @@ -226,11 +226,11 @@ public class BufferingSinkListCheckpointed implements -As shown in the code, the updated function also implements the `CheckpointedRestoring` interface. This is for backwards compatibility reasons and more details will be explained at the end of this section. +更新后的函数也实现了 `CheckpointedRestoring` 接口。这是出于向后兼容性原因,更多细节将在本节末尾解释。 ##### CheckpointedFunction -The `CheckpointedFunction` interface requires again the implementation of two methods: +`CheckpointedFunction` 接口也需要实现这两个方法。 @@ -242,7 +242,7 @@ void initializeState(FunctionInitializationContext context) throws Exception; -As in Flink 1.1, `snapshotState()` is called whenever a checkpoint is performed, but now `initializeState()` (which is the counterpart of the `restoreState()`) is called every time the user-defined function is initialized, rather than only in the case that we are recovering from a failure. Given this, `initializeState()` is not only the place where different types of state are initialized, but also where state recovery logic is included. An implementation of the `CheckpointedFunction` interface for `BufferingSink` is presented below. +在Flink 1.1中, 检查点执行会调用`snapshotState()` 方法,但是现在当用户每次初始化自定义函数时,会调用 `initializeState()` (对应 `restoreState()`) ,而不是在恢复的情况下调用,鉴于此, `initializeState()` 不仅是初始化不同类型状态的地方,而且还包括状态恢复逻辑。实现了 `CheckpointedFunction` 接口的 `BufferingSink` 代码如下所示 @@ -302,15 +302,15 @@ public class BufferingSink implements SinkFunction>, -The `initializeState` takes as argument a `FunctionInitializationContext`. This is used to initialize the non-keyed state “container”. This is a container of type `ListState` where the non-keyed state objects are going to be stored upon checkpointing: + `initializeState` 方法是需要传入 `FunctionInitializationContext`,用于初始化non-keyed 状态的 “容器”,容器的类型是 `ListState`,供 non-keyed 状态的对象被检查点存储时使用: `this.checkpointedState = context.getOperatorStateStore().getSerializableListState("buffered-elements");` -After initializing the container, we use the `isRestored()` method of the context to check if we are recovering after a failure. If this is `true`, _i.e._ we are recovering, the restore logic is applied. 
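A compact sketch of those two steps, built around the `getSerializableListState(...)` call quoted above (field names and the element type are assumptions; the complete `BufferingSink` listing from the docs is elided by the diff hunks):

```
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

import java.util.ArrayList;
import java.util.List;

public class BufferingSinkFunctionSketch
        implements SinkFunction<Tuple2<String, Integer>>, CheckpointedFunction {

    private transient ListState<Tuple2<String, Integer>> checkpointedState;
    private final List<Tuple2<String, Integer>> bufferedElements = new ArrayList<>();

    @Override
    public void invoke(Tuple2<String, Integer> value) {
        bufferedElements.add(value);
    }

    @Override
    public void initializeState(FunctionInitializationContext context) throws Exception {
        // 1. Create (or re-obtain) the non-keyed state "container".
        this.checkpointedState = context.getOperatorStateStore()
                .getSerializableListState("buffered-elements");

        // 2. Only when recovering from a failure, repopulate the in-memory buffer.
        if (context.isRestored()) {
            for (Tuple2<String, Integer> element : checkpointedState.get()) {
                bufferedElements.add(element);
            }
        }
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        // Drop what the previous checkpoint stored, then write the current buffer.
        checkpointedState.clear();
        for (Tuple2<String, Integer> element : bufferedElements) {
            checkpointedState.add(element);
        }
    }
}
```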
+初始化之后,调用 `isRestored()` 方法可以获取当前是否在恢复。如果是 `true`, 表示正在恢复。 -As shown in the code of the modified `BufferingSink`, this `ListState` recovered during state initialization is kept in a class variable for future use in `snapshotState()`. There the `ListState` is cleared of all objects included by the previous checkpoint, and is then filled with the new ones we want to checkpoint. +正如下面的代码所示,在状态初始化期间恢复 `BufferingSink`, `ListState` 中保存的变量可以被 `snapshotState()`使用, `ListState` 会清除掉之前检查点存储的对象,然后存储当前检查点的对象。 -As a side note, the keyed state can also be initialized in the `initializeState()` method. This can be done using the `FunctionInitializationContext` given as argument, instead of the `RuntimeContext`, which is the case for Flink 1.1\. If the `CheckpointedFunction` interface was to be used in the `CountMapper` example, the old `open()` method could be removed and the new `snapshotState()` and `initializeState()` methods would look like this: +当然, keyed 状态也可以在 `initializeState()` 方法中初始化, 可以使用 `FunctionInitializationContext` 来完成初始化,而不是使用Flink1.1中的 `RuntimeContext`,如果`CheckpointedFunction` 要在 `CountMapper` 中使用该接口,则可以不使用 `open()` 方法, `snapshotState()` 和 `initializeState()` 方法如下所示: @@ -352,25 +352,25 @@ public class CountMapper extends RichFlatMapFunction, Tu -Notice that the `snapshotState()` method is empty as Flink itself takes care of snapshotting managed keyed state upon checkpointing. +请注意, `snapshotState()` 方法为空,因为Flink本身负责在检查点时就会存储Keys化的对象 -#### Backwards compatibility with Flink 1.1 +#### 向后兼容Flink 1.1 -So far we have seen how to modify our functions to take advantage of the new features introduced by Flink 1.2. The question that remains is “Can I make sure that my modified (Flink 1.2) job will start from where my already running job from Flink 1.1 stopped?”. +到目前为止,我们已经了解如何修改函数来引入Flink 1.2的新函数。剩下的问题是“我可以确保我的修改后的(Flink 1.2)作业将从我从Flink 1.1运行的作业停止的位置开始吗?”。 -The answer is yes, and the way to do it is pretty straightforward. For the keyed state, you have to do nothing. Flink will take care of restoring the state from Flink 1.1\. For the non-keyed state, your new function has to implement the `CheckpointedRestoring` interface, as shown in the code above. This has a single method, the familiar `restoreState()` from the old `Checkpointed` interface from Flink 1.1\. As shown in the modified code of the `BufferingSink`, the `restoreState()` method is identical to its predecessor. +答案是肯定的,而这样做的方式非常简单。对于被Keys化状态,什么都不需要做。Flink将负责从Flink 1.1恢复状态。对于非被Keys化状态,新函数必须像上面代码一样实现 `CheckpointedRestoring` 接口,还有个办法,需要熟悉Flink1.1的 `restoreState()` 和`Checkpointed` 接口,然后修改 `BufferingSink`, `restoreState()` 方法完成和之前一样的功能。 -### Aligned Processing Time Window Operators +### 时间对齐窗口算子 -In Flink 1.1, and only when operating on _processing time_ with no specified evictor or trigger, the command `timeWindow()` on a keyed stream would instantiate a special type of `WindowOperator`. This could be either an `AggregatingProcessingTimeWindowOperator` or an `AccumulatingProcessingTimeWindowOperator`. Both of these operators are referred to as _aligned_ window operators as they assume their input elements arrive in order. This is valid when operating in processing time, as elements get as timestamp the wall-clock time at the moment they arrive at the window operator. These operators were restricted to using the memory state backend, and had optimized data structures for storing the per-window elements which leveraged the in-order input element arrival. 
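For reference, the kind of pipeline this paragraph refers to is a plain processing-time `timeWindow()` on a keyed stream with no custom trigger or evictor, for example (a generic sketch; the input data and aggregation are placeholders):

```
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class ProcessingTimeWindowSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime);

        DataStream<Tuple2<String, Integer>> counts =
                env.fromElements(Tuple2.of("word", 1), Tuple2.of("word", 1))
                   .keyBy(0)
                   // No trigger or evictor specified: in Flink 1.1 this is where the
                   // aligned processing-time window operators were instantiated;
                   // since 1.2 the generic WindowOperator is used instead.
                   .timeWindow(Time.seconds(5))
                   .sum(1);

        counts.print();
        env.execute("processing-time window sketch");
    }
}
```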
+在Flink 1.1中,只有在没有指定的触发器的时, `timeWindow()` 才会实例化特殊的数据类型 `WindowOperator`。 它可以是 `AggregatingProcessingTimeWindowOperator` 或 `AccumulatingProcessingTimeWindowOperator`.。这两个算子都可以称之为时间对齐窗口算子,因为它们假定输入数据是按顺序到达,当在处理时间中操作时,这是有效的,元素到达窗口操作时获得时间。算子仅限于使用内存状态,并且优化了用于存储数据的元素结构。 -In Flink 1.2, the aligned window operators are deprecated, and all windowing operations go through the generic `WindowOperator`. This migration requires no change in the code of your Flink 1.1 job, as Flink will transparently read the state stored by the aligned window operators in your Flink 1.1 savepoint, translate it into a format that is compatible with the generic `WindowOperator`, and resume execution using the generic `WindowOperator`. +在Flink 1.2中,不推荐使用对齐窗口算子,并且所有窗口算子操作都通过泛型`WindowOperator`来实现。迁移不需要更改Flink 1.1作业的代码,因为Flink将读取Flink 1.1保存点中对齐的窗口 算子存储的状态,将其转换为与泛型相兼容的格式 `WindowOperator`,并使用通用的 `WindowOperator`。 -Note Although deprecated, you can still use the aligned window operators in Flink 1.2 through special `WindowAssigners` introduced for exactly this purpose. These assigners are the `SlidingAlignedProcessingTimeWindows` and the `TumblingAlignedProcessingTimeWindows` assigners, for sliding and tumbling windows respectively. A Flink 1.2 job that uses aligned windowing has to be a new job, as there is no way to resume execution from a Flink 1.1 savepoint while using these operators. +虽然是已经弃用这个方法,但是在Flink 1.2 中仍然是可以使用的,通过特殊的 `WindowAssigners` 可以实现这个目的。 `SlidingAlignedProcessingTimeWindows` 和 `TumblingAlignedProcessingTimeWindows` assigners,分别是滑动窗口和滚动窗口,使用对齐窗口的Flink 1.2作业必须是一项新作业,因为在使用这些 算子时无法从Flink 1.1保存点恢复执行。 -Attention The aligned window operators provide **no rescaling** capabilities and **no backwards compatibility** with Flink 1.1. +注意时间对齐窗口算子**不提供自适应** 而且 **不向下兼容** Flink 1.1. -The code to use the aligned window operators in Flink 1.2 is presented below: +在Flink 1.2中使用对齐窗口 算子的代码如下所示: diff --git a/docs/1.7/105.md b/docs/1.7/105.md index 0a77b56cc3e3634f993ef2bb5a447e6a1c94266a..d25846f5aa49f950cd9f595f742e5d2ff41cacce 100644 --- a/docs/1.7/105.md +++ b/docs/1.7/105.md @@ -1,35 +1,33 @@ -# Standalone Cluster +# 独立集群 -This page provides instructions on how to run Flink in a _fully distributed fashion_ on a _static_ (but possibly heterogeneous) cluster. +这里给出如何在集群上以完全分布式方式运行Flink程序的说明 . -## Requirements +## 需求 -### Software Requirements +### 软件需求 -Flink runs on all _UNIX-like environments_, e.g. **Linux**, **Mac OS X**, and **Cygwin** (for Windows) and expects the cluster to consist of **one master node** and **one or more worker nodes**. Before you start to setup the system, make sure you have the following software installed **on each node**: +Flink 需要运行在类unix环境下,比如 **Linux**, **Mac OS X**, 和 **Cygwin** (适用于Windows) 希望集群中至少由一个**主节点**和最少一个**工作节点**组成。 在开始运行前,需要保证 **每个节点**上都安装了一下软件: -* **Java 1.8.x** or higher, -* **ssh** (sshd must be running to use the Flink scripts that manage remote components) +* **Java 1.8.x** 及其以上版本, +* **ssh** (必须运行sshd才能使用管理远程组件的Flink脚本) -If your cluster does not fulfill these software requirements you will need to install/upgrade it. +如果您的群集不满足这些软件要求,则需要安装/升级软件来达到要求. -Having **passwordless SSH** and **the same directory structure** on all your cluster nodes will allow you to use our scripts to control everything. +通过 **SSH互信** 和 **相同的目录结构** 这样我们的脚本就能控制所有内容. -### `JAVA_HOME` Configuration +### `JAVA_HOME` 配置 -Flink requires the `JAVA_HOME` environment variable to be set on the master and all worker nodes and point to the directory of your Java installation. 
+Flink程序要求所有节点上必须设置`JAVA_HOME`环境变量,并`JAVA_HOME`环境变量指向JAVA的安装目录. -You can set this variable in `conf/flink-conf.yaml` via the `env.java.home` key. +可以通过 `conf/flink-conf.yaml` 文件中的 `env.java.home` 配置来设置配置. -## Flink Setup - -Go to the [downloads page](http://flink.apache.org/downloads.html) and get the ready-to-run package. Make sure to pick the Flink package **matching your Hadoop version**. If you don’t plan to use Hadoop, pick any version. - -After downloading the latest release, copy the archive to your master node and extract it: +## Flink 安装 +转到 [下载页](http://flink.apache.org/downloads.html) 下载Flink安装程序. 程序中如果需要使用Hadoop,则选择的Flink版本要保证和 **Hadoop 版本**相匹配。否则可以选择任何版本 +下载完毕后,复制到主节点并解压: ``` tar xzf flink-*.tgz @@ -38,17 +36,17 @@ cd flink-* -### Configuring Flink +### 配置 Flink -After having extracted the system files, you need to configure Flink for the cluster by editing _conf/flink-conf.yaml_. +解压缩之后,需要修改Flink的配置文件,配置文件名称为 _conf/flink-conf.yaml_。 -Set the `jobmanager.rpc.address` key to point to your master node. You should also define the maximum amount of main memory the JVM is allowed to allocate on each node by setting the `jobmanager.heap.mb` and `taskmanager.heap.mb` keys. +修改 `jobmanager.rpc.address` 配置值为主节点,设置 `jobmanager.heap.mb` 和 `taskmanager.heap.mb` 用来指定JVM在每个节点上分配的最大堆内存量。 -These values are given in MB. If some worker nodes have more main memory which you want to allocate to the Flink system you can overwrite the default value by setting the environment variable `FLINK_TM_HEAP` on those specific nodes. +这些设置值单位为 MB. 如果某工作节点有更多主内存可以分配给Flink系统,可以修改节点上的 `FLINK_TM_HEAP` 配置来覆盖默认值。 -Finally, you must provide a list of all nodes in your cluster which shall be used as worker nodes. Therefore, similar to the HDFS configuration, edit the file _conf/slaves_ and enter the IP/host name of each worker node. Each worker node will later run a TaskManager. +最后,必须提供集群中所有节点的列表,这些节点将用作工作节点。与HDFS配置类似, 编辑_conf/slaves_文件,并输入每个工作节点的IP /主机名. 每个工作节点将运行 _TaskManager_. -The following example illustrates the setup with three nodes (with IP addresses from _10.0.0.1_ to _10.0.0.3_ and hostnames _master_, _worker1_, _worker2_) and shows the contents of the configuration files (which need to be accessible at the same path on all machines): +下图说明了具有三个节点 (IP 地址 从_10.0.0.1_ 到 _10.0.0.3_ 和对应的主机名 _master_, _worker1_, _worker2_)的集群配置, 并显示了配置文件的内容(需要在所有计算机上的相同路径上访问)): ![](../img/quickstart_cluster.png) @@ -67,27 +65,25 @@ conf/slaves** 10.0.0.3 ``` -The Flink directory must be available on every worker under the same path. You can use a shared NFS directory, or copy the entire Flink directory to every worker node. - -Please see the [configuration page](../config.html) for details and additional configuration options. - -In particular, +Flink目录必须每个worker上同一路径下都可用。可以使用共享NFS目录,也可以将整个Flink目录复制到每个工作节点。 -* the amount of available memory per JobManager (`jobmanager.heap.mb`), -* the amount of available memory per TaskManager (`taskmanager.heap.mb`), -* the number of available CPUs per machine (`taskmanager.numberOfTaskSlots`), -* the total number of CPUs in the cluster (`parallelism.default`) and -* the temporary directories (`taskmanager.tmp.dirs`) +有关详细信息和其他配置选项,请参阅 [configuration page](../config.html) 。 -are very important configuration values. 
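Putting the keys listed above together, a `conf/flink-conf.yaml` excerpt might look like the following sketch; all values are placeholders to adapt to the actual cluster:

```
jobmanager.rpc.address: master
jobmanager.heap.mb: 1024
taskmanager.heap.mb: 2048
taskmanager.numberOfTaskSlots: 4
parallelism.default: 12
taskmanager.tmp.dirs: /tmp/flink
```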
+需要额外注意的一些配置是, -### Starting Flink +* 每个JobManager的可用内存总量 (`jobmanager.heap.mb`), +* 每个TaskManager的可用内容总量 (`taskmanager.heap.mb`), +* 每台机器的可用CPU数量 (`taskmanager.numberOfTaskSlots`), +* 集群中的CPU总数 (`parallelism.default`) +* 临时目录 (`taskmanager.tmp.dirs`) -The following script starts a JobManager on the local node and connects via SSH to all worker nodes listed in the _slaves_ file to start the TaskManager on each node. Now your Flink system is up and running. The JobManager running on the local node will now accept jobs at the configured RPC port. +这些是对Flink非常重要的配置值。 -Assuming that you are on the master node and inside the Flink directory: +### 启动Flink +以下脚本在本地节点上启动JobManager,并通过ssh连接到所有_slaves_文件中定义的所有工作节点,启动TaskManger。 现在Flink系统已启动并正在运行。 在本地运行的JobManager 将通过配置的RPC端口来接受作业提交。 +A假设您在主节点上并在Flink目录中: ``` bin/start-cluster.sh @@ -95,13 +91,13 @@ bin/start-cluster.sh -To stop Flink, there is also a `stop-cluster.sh` script. +停止Flink对应的脚本是 `stop-cluster.sh`。 -### Adding JobManager/TaskManager Instances to a Cluster +### 将JobManager/TaskManager实例添加到群集 -You can add both JobManager and TaskManager instances to your running cluster with the `bin/jobmanager.sh` and `bin/taskmanager.sh` scripts. +可以使用 `bin/jobmanager.sh` 和 `bin/taskmanager.sh` 脚本将JobManager和TaskManager实例添加到正在运行的集群中. -#### Adding a JobManager +#### 添加 JobManager @@ -111,7 +107,7 @@ bin/jobmanager.sh ((start|start-foreground) [host] [webui-port])|stop|stop-all -#### Adding a TaskManager +#### 添加 TaskManager @@ -121,5 +117,5 @@ bin/taskmanager.sh start|start-foreground|stop|stop-all -Make sure to call these scripts on the hosts on which you want to start/stop the respective instance. +需要保证在对应的机器上调用启动/停止脚本。 diff --git a/docs/1.7/106.md b/docs/1.7/106.md index b325ca0b24c585630d73fe4c55d491a4c762c09b..f3be397fa7de8cfa3e7c028e26bb8df93f898990 100644 --- a/docs/1.7/106.md +++ b/docs/1.7/106.md @@ -1,12 +1,12 @@ -# YARN Setup +# YARN 配置 -## Quickstart +## 快速开始 -### Start a long-running Flink cluster on YARN +### 启动YARN -Start a YARN session with 4 Task Managers (each with 4 GB of Heapspace): +启动一个有 4 Task Managers (每个4 GB堆内存)的YARN session命令: @@ -21,11 +21,11 @@ cd flink-1.7.1/ -Specify the `-s` flag for the number of processing slots per Task Manager. We recommend to set the number of slots to the number of processors per machine. + `-s`参数可以指定每个TaskManager的处理槽数。建议将插槽数设置为每台计算机的处理器数。 -Once the session has been started, you can submit jobs to the cluster using the `./bin/flink` tool. +YARN启动后,可以通过 `./bin/flink` 工具来提交Flink作业. -### Run a Flink job on YARN +### 在YARN上运行Flink作业 @@ -42,26 +42,26 @@ cd flink-1.7.1/ ## Flink YARN Session -Apache [Hadoop YARN](http://hadoop.apache.org/) is a cluster resource management framework. It allows to run various distributed applications on top of a cluster. Flink runs on YARN next to other applications. Users do not have to setup or install anything if there is already a YARN setup. +Apache [Hadoop YARN](http://hadoop.apache.org/) 是一个集群资源管理框架。它允许在群集上运行各种分布式应用程序。Flink在YARN上运行。如果已经有YARN,用户不必安装其他任何东西。 -**Requirements** +**系统配置需求** -* at least Apache Hadoop 2.2 -* HDFS (Hadoop Distributed File System) (or another distributed file system supported by Hadoop) +* 版本不低于Apache Hadoop 2.2 +* HDFS(Hadoop分布式文件系统)(或Hadoop支持的其他分布式文件系统) -If you have troubles using the Flink YARN client, have a look in the [FAQ section](http://flink.apache.org/faq.html#yarn-deployment). +如果在使用Flink YARN时遇到问题,请查阅 [FAQ section](http://flink.apache.org/faq.html#yarn-deployment). 
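Before the session walkthrough below, note that the quickstart command elided above would look roughly like the following, assuming example values for the standard `yarn-session.sh` flags (`-n` TaskManagers, `-jm`/`-tm` memory, `-s` slots per TaskManager):

```
# 4 TaskManagers, 1 GB for the JobManager, 4 GB and 4 slots per TaskManager (example values)
./bin/yarn-session.sh -n 4 -jm 1024m -tm 4096m -s 4
```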
-### Start Flink Session +### 启动 Flink 会话 -Follow these instructions to learn how to launch a Flink Session within your YARN cluster. +以下说明了解如何在YARN群集中启动Flink会话。 -A session will start all required Flink services (JobManager and TaskManagers) so that you can submit programs to the cluster. Note that you can run multiple programs per session. +会话将启动Flink程序运行所必须的服务 (JobManager 和 TaskManagers) 这样就可以把程序提交到集群中.需要注意的是,可以在会话中运行多个程序 #### Download Flink -Download a Flink package for Hadoop >= 2 from the [download page](http://flink.apache.org/downloads.html). It contains the required files. +从 [下载页](http://flink.apache.org/downloads.html)下载Hadoop版本>2的安装包. 它包含所需的文件。 -Extract the package using: +通过下面命令解压: @@ -72,9 +72,9 @@ cd flink-1.7.1/ -#### Start a Session +#### 启动一个会话 -Use the following command to start a session +通过以下命令启动一个session @@ -84,7 +84,7 @@ Use the following command to start a session -This command will show you the following overview: +命令将显示以下概述: @@ -106,9 +106,9 @@ Usage: -Please note that the Client requires the `YARN_CONF_DIR` or `HADOOP_CONF_DIR` environment variable to be set to read the YARN and HDFS configuration. +需要注意的是,客户端需要 `YARN_CONF_DIR` 或 `HADOOP_CONF_DIR` 环境变量来保证可以读取 YARN 和 HDFS 配置. -**Example:** Issue the following command to allocate 10 Task Managers, with 8 GB of memory and 32 processing slots each: +**举例:** 以下命令可以分配10个TaskManager,每个TaskManager有8 GB内存和32个处理插槽 @@ -118,31 +118,29 @@ Please note that the Client requires the `YARN_CONF_DIR` or `HADOOP_CONF_DIR` en -The system will use the configuration in `conf/flink-conf.yaml`. Please follow our [configuration guide](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html) if you want to change something. +系统会使用 `conf/flink-conf.yaml`文件中的配置. 需要更改配置可以参考 [configuration guide](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html) 。 -Flink on YARN will overwrite the following configuration parameters `jobmanager.rpc.address` (because the JobManager is always allocated at different machines), `taskmanager.tmp.dirs` (we are using the tmp directories given by YARN) and `parallelism.default` if the number of slots has been specified. +YARN上的Flink将覆盖以下配置参数 `jobmanager.rpc.address` (因为JobManager可能会分配到不同的机器上运行), `taskmanager.tmp.dirs` (我们使用YARN给出的tmp目录) 以及 `parallelism.default` 如果已经指定插槽数 -If you don’t want to change the configuration file to set configuration parameters, there is the option to pass dynamic properties via the `-D` flag. So you can pass parameters this way: `-Dfs.overwrite-files=true -Dtaskmanager.network.memory.min=536346624`. +I如果不希望通过更改配置文件来设置配置参数,则可以选择通过`-D`标志传递动态属性。例如通过如下方式传递参数: `-Dfs.overwrite-files=true -Dtaskmanager.network.memory.min=536346624`. -The example invocation starts 11 containers (even though only 10 containers were requested), since there is one additional container for the ApplicationMaster and Job Manager. +例子中启动了11个容器(即使只请求了10个容器)因为ApplicationMaster和Job Manager还有一个额外的容器。 -Once Flink is deployed in your YARN cluster, it will show you the connection details of the Job Manager. +在YARN群集中部署Flink后,YARN会显示JobManager的详细信息。 -Stop the YARN session by stopping the unix process (using CTRL+C) or by entering ‘stop’ into the client. +按下CTRL + C 或者输入 ‘stop’可以停止YARN会话。 -Flink on YARN will only start all requested containers if enough resources are available on the cluster. Most YARN schedulers account for the requested memory of the containers, some account also for the number of vcores. By default, the number of vcores is equal to the processing slots (`-s`) argument. 
The [`yarn.containers.vcores`](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html#yarn-containers-vcores) allows overwriting the number of vcores with a custom value. In order for this parameter to work you should enable CPU scheduling in your cluster. +如果群集上资源从租,Flink会按照请求来启动容器。大多数YARN调度账户考虑所请求容器的内存量,还有些账户会思考vcores的数量。 默认情况下,vcores的数量等于处理slots (`-s`) 参数。 调整 [`yarn.containers.vcores`](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html#yarn-containers-vcores) 可以自定义vcores的数值。 为了使此参数起作用,需要在群集中启用CPU调度。 -#### Detached YARN Session +#### 分离YARN 会话 -If you do not want to keep the Flink YARN client running all the time, it’s also possible to start a _detached_ YARN session. The parameter for that is called `-d` or `--detached`. +如果不想让Flink程序一直运行,可以启动一个 _分离_ YARN 会话。参数称为 `-d` or `--detached`。 -In that case, the Flink YARN client will only submit Flink to the cluster and then close itself. Note that in this case its not possible to stop the YARN session using Flink. +在这种情况下,Flink YARN客户端将仅向群集提交Flink,然后自行关闭。需要注意的是,在这种情况下,无法使用Flink停止YARN会话,需要使用 YARN 工具 (`yarn application -kill <appId>`) 来停止YARN会话。 -Use the YARN utilities (`yarn application -kill <appId>`) to stop the YARN session. +#### 附加到已有Session -#### Attach to an existing Session - -Use the following command to start a session +使用以下命令启动会话 @@ -152,7 +150,7 @@ Use the following command to start a session -This command will show you the following overview: +命令将会显示一下内容 @@ -164,9 +162,9 @@ Usage: -As already mentioned, `YARN_CONF_DIR` or `HADOOP_CONF_DIR` environment variable must be set to read the YARN and HDFS configuration. +正如前面所述, 必须设置`YARN_CONF_DIR` 和 `HADOOP_CONF_DIR` 环境变量来确保可以读取YARN和HDFS配置。 -**Example:** Issue the following command to attach to running Flink YARN session `application_1463870264508_0029`: +**举例:** 输入以下命令以附加到正在运行的Flink YARN会话 `application_1463870264508_0029`: @@ -176,13 +174,13 @@ As already mentioned, `YARN_CONF_DIR` or `HADOOP_CONF_DIR` environment variable -Attaching to a running session uses YARN ResourceManager to determine Job Manager RPC port. +附加到正在运行的会话使用YARN ResourceManager来指定JobManagerRPC端口。 -Stop the YARN session by stopping the unix process (using CTRL+C) or by entering ‘stop’ into the client. +按下CTRL + C 或者输入 ‘stop’可以停止YARN会话。 -### Submit Job to Flink +### 向Flink提交作业 -Use the following command to submit a Flink program to the YARN cluster: +使用以下命令将Flink程序提交到YARN群集: @@ -192,9 +190,9 @@ Use the following command to submit a Flink program to the YARN cluster: -Please refer to the documentation of the [command-line client](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/cli.html). +参数可以参阅 [command-line client](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/cli.html)文档。 -The command will show you a help menu like this: +该命令将会按照如下显示: @@ -220,9 +218,9 @@ Action "run" compiles and runs a program. -Use the _run_ action to submit a job to YARN. The client is able to determine the address of the JobManager. In the rare event of a problem, you can also pass the JobManager address using the `-m` argument. The JobManager address is visible in the YARN console. +使用 _run_ 命令把作业提交给YARN。客户端能够确定JobManager的地址。 少数情况下可以通过 `-m` 参数来指定JobManager的地址。JobManager可以在YARN的控制台上可见 -**Example** +**示例** @@ -235,7 +233,7 @@ hadoop fs -copyFromLocal LICENSE-2.0.txt hdfs:/// ... 
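The rest of that example is elided by the diff context above; assuming the bundled batch `WordCount.jar` and illustrative HDFS paths, the submission against the running session might look roughly like:

```
# submit the bundled WordCount example against the running YARN session (paths are illustrative)
./bin/flink run ./examples/batch/WordCount.jar \
      --input hdfs:///path/to/LICENSE-2.0.txt --output hdfs:///path/to/wordcount-result.txt
```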
-If there is the following error, make sure that all TaskManagers started: +如果出现以下错误,请检查所有TaskManagers都已启动: @@ -246,17 +244,17 @@ Exception in thread "main" org.apache.flink.compiler.CompilerException: -You can check the number of TaskManagers in the JobManager web interface. The address of this interface is printed in the YARN session console. +可以在JobManager Web界面中检查TaskManagers的数量。接口的地址打印在YARN会话控制台中。 -If the TaskManagers do not show up after a minute, you should investigate the issue using the log files. +如果一分钟后TaskManagers没有出现,则需要使用日志文件确认问题。 -## Run a single Flink job on YARN +## 在 YARN上运行单作业 -The documentation above describes how to start a Flink cluster within a Hadoop YARN environment. It is also possible to launch Flink within YARN only for executing a single job. +上面的文档描述了如何在YARN环境中启动Flink集群。也可以通过YARN中启动Flink来执行单个作业。 -Please note that the client then expects the `-yn` value to be set (number of TaskManagers). +需要注意客户端可以通过 `-yn` 参数来指定 TaskManagers的数量 -**_Example:_** +**_示例:_** @@ -266,41 +264,39 @@ Please note that the client then expects the `-yn` value to be set (number of Ta -The command line options of the YARN session are also available with the `./bin/flink` tool. They are prefixed with a `y` or `yarn` (for the long argument options). - -Note: You can use a different configuration directory per job by setting the environment variable `FLINK_CONF_DIR`. To use this copy the `conf` directory from the Flink distribution and modify, for example, the logging settings on a per-job basis. - -Note: It is possible to combine `-m yarn-cluster` with a detached YARN submission (`-yd`) to “fire and forget” a Flink job to the YARN cluster. In this case, your application will not get any accumulator results or exceptions from the ExecutionEnvironment.execute() call! +YARN会话的命令行选项也可用于 `./bin/flink`.以 `y` 或者 `yarn` 为前缀 -### User jars & Classpath +注意: 可以为每个作业使用不同的配置目录`FLINK_CONF_DIR`,需要这样配置时,请拷贝 `conf`目录,然后分发到不同节点上,并修改对应的配置 -By default Flink will include the user jars into the system classpath when running a single job. This behavior can be controlled with the `yarn.per-job-cluster.include-user-jar` parameter. +注意:当对一个分离的YARN会话(`-yd`)使用`-m yarn-cluster`参数时,应用程序将不会从ExecutionEnvironment.execute()调用获得任何累加器结果或异常! -When setting this to `DISABLED` Flink will include the jar in the user classpath instead. +### jars 和 Classpath -The user-jars position in the class path can be controlled by setting the parameter to one of the following: +默认情况下,Flink将在运行单个作业时将用户jar包含到系统类路径中。可以使用 `yarn.per-job-cluster.include-user-jar` 参数控制. -* `ORDER`: (default) Adds the jar to the system class path based on the lexicographic order. -* `FIRST`: Adds the jar to the beginning of the system class path. -* `LAST`: Adds the jar to the end of the system class path. +当设置为 `DISABLED` 时,Flink会使用用户类路径中的jar包 -## Recovery behavior of Flink on YARN +可以通过以下之一来控制类路径中的user-jar位置: -Flink’s YARN client has the following configuration parameters to control how to behave in case of container failures. These parameters can be set either from the `conf/flink-conf.yaml` or when starting the YARN session, using `-D` parameters. +* `ORDER`: (默认) 根据字典顺序将jar添加到系统类路径。 +* `FIRST`: 将jar添加到系统类路径的开头。 +* `LAST`: 将jar添加到系统类路径的末尾。 -* `yarn.reallocate-failed`: This parameter controls whether Flink should reallocate failed TaskManager containers. Default: true -* `yarn.maximum-failed-containers`: The maximum number of failed containers the ApplicationMaster accepts until it fails the YARN session. 
Default: The number of initially requested TaskManagers (`-n`). -* `yarn.application-attempts`: The number of ApplicationMaster (+ its TaskManager containers) attempts. If this value is set to 1 (default), the entire YARN session will fail when the Application master fails. Higher values specify the number of restarts of the ApplicationMaster by YARN. +## Flink 在 YARN上的故障恢复 -## Debugging a failed YARN session +运行在YARN上的Flink可以通过配置参数来控制发生故障时的行为,相关配置可以在 `conf/flink-conf.yaml` 文件配置或使用 `-D` 参数指定。 -There are many reasons why a Flink YARN session deployment can fail. A misconfigured Hadoop setup (HDFS permissions, YARN configuration), version incompatibilities (running Flink with vanilla Hadoop dependencies on Cloudera Hadoop) or other errors. +* `yarn.reallocate-failed`: Flink是否应重新分配失败的TaskManager容器。默认值:true。 +* `yarn.maximum-failed-containers`: ApplicationMaster在YARN会话失败之前接受的最大失败容器数,默认值:最初请求的TaskManagers (`-n`)的数量。 +* `yarn.application-attempts`: ApplicationMaster (+ TaskManager ) 尝试次数,如果此值设置为1(默认值),则当Application master失败时,整个YARN会话将失败。较高的值指定YARN重新启动ApplicationMaster的次数。 -### Log Files +## 调试失败的YARN会话 -In cases where the Flink YARN session fails during the deployment itself, users have to rely on the logging capabilities of Hadoop YARN. The most useful feature for that is the [YARN log aggregation](http://hortonworks.com/blog/simplifying-user-logs-management-and-access-in-yarn/). To enable it, users have to set the `yarn.log-aggregation-enable` property to `true` in the `yarn-site.xml` file. Once that is enabled, users can use the following command to retrieve all log files of a (failed) YARN session. +Flink YARN会话部署失败的原因有很多。Hadoop配置错误(HDFS权限,YARN配置),版本不兼容(在Cloudera Hadoop上运行Flink与vanilla Hadoop依赖关系)或其他错误。 +### 日志文件 +在部署期间失败,调试时比较有用的一个手段就是YARN日志聚合,参阅 [YARN 日志聚合](http://hortonworks.com/blog/simplifying-user-logs-management-and-access-in-yarn/). 在` yarn-site.xml` 中配置 `yarn.log-aggregation-enable` 为true可以启用配置。 启用后,用户可以使用以下命令检索(失败的)YARN会话的所有日志文件。 ``` yarn logs -applicationId @@ -308,56 +304,56 @@ yarn logs -applicationId -Note that it takes a few seconds after the session has finished until the logs show up. +需要注意的是,会话结束后需要几秒钟才会显示日志。 -### YARN Client console & Web interfaces +### YARN Client 终端 和 Web 界面 -The Flink YARN client also prints error messages in the terminal if errors occur during runtime (for example if a TaskManager stops working after some time). +如果在运行期间发生错误,Flink YARN客户端还会在终端中打印错误消息(例如,如果TaskManager在一段时间后停止工作)。 -In addition to that, there is the YARN Resource Manager web interface (by default on port 8088). The port of the Resource Manager web interface is determined by the `yarn.resourcemanager.webapp.address` configuration value. +除此之外,还有YARN Resource Manager Web界面(默认情况下在端口8088上)可以看到详情。端口由`yarn.resourcemanager.webapp.address`配置指定。 -It allows to access log files for running YARN applications and shows diagnostics for failed apps. +界面可以看到日志文件以运行YARN应用程序,显示失败应用程序和一些错误信息。 -## Build YARN client for a specific Hadoop version +## 构建特定版本的YARN客户端 -Users using Hadoop distributions from companies like Hortonworks, Cloudera or MapR might have to build Flink against their specific versions of Hadoop (HDFS) and YARN. Please read the [build instructions](//ci.apache.org/projects/flink/flink-docs-release-1.7/flinkDev/building.html) for more details. 
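As an illustration only (the authoritative steps are in the linked build instructions, and the profile and Hadoop version below are assumptions about the 1.7-era build), such a build is typically driven by Maven properties:

```
# example: build Flink against a vendor-specific Hadoop version (flags are illustrative)
mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=2.6.5-cdh5.16.1
```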
+使用Hortonworks,Cloudera或MapR等公司的Hadoop发行版的用户可能必须针对其特定版本的Hadoop(HDFS)和YARN构建Flink。相关内容可以查阅 [构建指南](//ci.apache.org/projects/flink/flink-docs-release-1.7/flinkDev/building.html) 。 -## Running Flink on YARN behind Firewalls +## 在防火墙后面的YARN上运行Flink -Some YARN clusters use firewalls for controlling the network traffic between the cluster and the rest of the network. In those setups, Flink jobs can only be submitted to a YARN session from within the cluster’s network (behind the firewall). If this is not feasible for production use, Flink allows to configure a port range for all relevant services. With these ranges configured, users can also submit jobs to Flink crossing the firewall. +某些YARN群集使用防火墙来控制群集与网络其余部分之间的网络流量。在这些设置中,Flink作业只能从群集网络内(防火墙后面)提交到YARN会话。这种情况下需要放开Flink应用的相关端口。 -Currently, two services are needed to submit a job: +目前两个与提交作业相关的服务: -* The JobManager (ApplicationMaster in YARN) -* The BlobServer running within the JobManager. +* JobManager (ApplicationMaster in YARN) +* 在JobManager中运行的BlobServer。 -When submitting a job to Flink, the BlobServer will distribute the jars with the user code to all worker nodes (TaskManagers). The JobManager receives the job itself and triggers the execution. +向Flink提交作业时,BlobServer会将带有用户代码的jar分发给所有工作节点(TaskManagers)。JobManager自己接收作业并触发执行。 -The two configuration parameters for specifying the ports are the following: +用于指定端口的两个配置参数如下: * `yarn.application-master.port` * `blob.server.port` -These two configuration options accept single ports (for example: “50010”), ranges (“50000-50025”), or a combination of both (“50010,50011,50020-50025,50050-50075”). +这两个配置选项可以配置,单个端口(例如:“50010”),范围(“50000-50025”)或两者的组合(“50010,50011,50020-50025,50050-50075”)。 -(Hadoop is using a similar mechanism, there the configuration parameter is called `yarn.app.mapreduce.am.job.client.port-range`.) +(Hadoop使用类似的机制,比如配置`yarn.app.mapreduce.am.job.client.port-range`.) -## Background / Internals +## Flink和YARN的内部交互机制 -This section briefly describes how Flink and YARN interact. +本节简要介绍Flink和YARN如何交互。 ![](../img/FlinkOnYarn.svg) -The YARN client needs to access the Hadoop configuration to connect to the YARN resource manager and to HDFS. It determines the Hadoop configuration using the following strategy: +ARN客户端需要访问Hadoop配置用来连接到YARN资源管理器和HDFS。使用以下策略确定Hadoop配置: -* Test if `YARN_CONF_DIR`, `HADOOP_CONF_DIR` or `HADOOP_CONF_PATH` are set (in that order). If one of these variables are set, they are used to read the configuration. -* If the above strategy fails (this should not be the case in a correct YARN setup), the client is using the `HADOOP_HOME` environment variable. If it is set, the client tries to access `$HADOOP_HOME/etc/hadoop` (Hadoop 2) and `$HADOOP_HOME/conf` (Hadoop 1). +* 按顺序测试 `YARN_CONF_DIR`, `HADOOP_CONF_DIR` 或者 `HADOOP_CONF_PATH` 是否存在,如果其中一个变量存在,则使用配置中的路径来读取配置。 +* 如果上面的都不存在 (YARN设置正确不应该这样),则客户端正在使用 `HADOOP_HOME` 环境变量。 如果设置了则尝试访问 `$HADOOP_HOME/etc/hadoop` (Hadoop 2) 和 `$HADOOP_HOME/conf` (Hadoop 1)路径来读取配置。 -When starting a new Flink YARN session, the client first checks if the requested resources (containers and memory) are available. After that, it uploads a jar that contains Flink and the configuration to HDFS (step 1). +启动新的Flink YARN会话时,客户端首先检查所请求的资源(容器和内存)是否可用。之后,将包含Flink和配置的jar上传到HDFS(步骤1)。 -The next step of the client is to request (step 2) a YARN container to start the _ApplicationMaster_ (step 3). 
Since the client registered the configuration and jar-file as a resource for the container, the NodeManager of YARN running on that particular machine will take care of preparing the container (e.g. downloading the files). Once that has finished, the _ApplicationMaster_ (AM) is started. +客户端的下一步是请求(步骤2)YARN容器以启动 _ApplicationMaster_ (步骤 3)。由于客户端将配置和jar文件注册为容器的资源,因此在该特定机器上运行的YARN的NodeManager将负责准备容器(例如,下载文件)。完成后,将启动 _ApplicationMaster_ (AM) 。 -The _JobManager_ and AM are running in the same container. Once they successfully started, the AM knows the address of the JobManager (its own host). It is generating a new Flink configuration file for the TaskManagers (so that they can connect to the JobManager). The file is also uploaded to HDFS. Additionally, the _AM_ container is also serving Flink’s web interface. All ports the YARN code is allocating are _ephemeral ports_. This allows users to execute multiple Flink YARN sessions in parallel. + _JobManager_ 和 AM 在同一容器内,运行一旦它们成功启动,AM就知道JobManager(它自己的主机)的地址。它正在为TaskManagers生成一个新的Flink配置文件(以便它们可以连接到JobManager)。该文件也自动上传到HDFS中, _AM_ 还提供了Flink的WEB界面。YARN代码分配的所有端口都是 _临时端口_,这样用户可以并行执行多个Flink YARN会话。 -After that, the AM starts allocating the containers for Flink’s TaskManagers, which will download the jar file and the modified configuration from the HDFS. Once these steps are completed, Flink is set up and ready to accept Jobs. +之后,AM开始为Flink的TaskManagers分配容器,这将从HDFS下载jar文件和修改后的配置。完成这些步骤后,即可建立Flink并准备接受作业。 diff --git a/docs/1.7/107.md b/docs/1.7/107.md index 95ddc2577f43fdc8f7133b0629a07630f3c6c572..8a5579c70ce8e7967e4ae9b2c3ba1d778b68356e 100644 --- a/docs/1.7/107.md +++ b/docs/1.7/107.md @@ -1,101 +1,101 @@ -# Mesos Setup +# Mesos 设置 -## Background +## 背景 -The Mesos implementation consists of two components: The Application Master and the Worker. The workers are simple TaskManagers which are parameterized by the environment set up by the application master. The most sophisticated component of the Mesos implementation is the application master. The application master currently hosts the following components: +Mesos实现包含两个组件:Application Master和Worker。Worker是简单的TaskManagers,由应用程序设置的环境进行参数化。Mesos实现中最复杂的组件是应用程序主机。应用程序主机当前托管以下组件: -### Mesos Scheduler +### Mesos 调度程序 -The scheduler is responsible for registering the framework with Mesos, requesting resources, and launching worker nodes. The scheduler continuously needs to report back to Mesos to ensure the framework is in a healthy state. To verify the health of the cluster, the scheduler monitors the spawned workers and marks them as failed and restarts them if necessary. +调度程序负责向Mesos注册,请求资源和启动工作节点。调度程序不断向Mesos报告以确保框架处于正常状态。为了验证群集的运行状况,调度程序监视生成的worker并将其标记为失败,并在必要时重新启动它们。 -Flink’s Mesos scheduler itself is currently not highly available. However, it persists all necessary information about its state (e.g. configuration, list of workers) in Zookeeper. In the presence of a failure, it relies on an external system to bring up a new scheduler. The scheduler will then register with Mesos again and go through the reconciliation phase. In the reconciliation phase, the scheduler receives a list of running workers nodes. It matches these against the recovered information from Zookeeper and makes sure to bring back the cluster in the state before the failure. 
+Flink的Mesos调度程序本身目前不具备高可用性。但是,它会在Zookeeper中保存有关其状态(例如配置,工作者列表)的所有必要信息。在出现故障时,它依赖于外部系统来启动新的调度程序。然后,调度程序将再次向Mesos注册并完成协调阶段。在协调阶段,调度程序接收正在运行的工作程序节点的列表。它将这些与Zookeeper中恢复的信息进行匹配,并确保在故障之前将群集恢复到状态。 -### Artifact Server +### 工作服务器 -The artifact server is responsible for providing resources to the worker nodes. The resources can be anything from the Flink binaries to shared secrets or configuration files. For instance, in non-containerized environments, the artifact server will provide the Flink binaries. What files will be served depends on the configuration overlay used. +工件服务器负责向工作节点提供资源。资源可以是从Flink二进制文件或者是配置文件。例如,在非容器化环境中,工件服务器将提供Flink二进制文件和需要依赖的文件。 -### Flink’s Dispatcher and Web Interface +### Flink’s 的调度程序和web页面 -The Dispatcher and the web interface provide a central point for monitoring, job submission, and other client interaction with the cluster (see [FLIP-6](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)). +Flink提供了Web界面为监视,为作业提交和监控交互提供了便利(参阅 [FLIP-6](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)). -### Startup script and configuration overlays +### 启动脚本和配置覆盖 -The startup script provide a way to configure and start the application master. All further configuration is then inherited by the workers nodes. This is achieved using configuration overlays. Configuration overlays provide a way to infer configuration from environment variables and config files which are shipped to the worker nodes. +启动脚本提供了一种配置和启动应用程序主机的方法。然后,工作节点继承所有的配置。这是使用配置覆盖来实现的。配置覆盖提供了一种从环境变量和配置文件推断配置的方法,这些配置文件将发送到工作节点。 ## DC/OS -This section refers to [DC/OS](https://dcos.io) which is a Mesos distribution with a sophisticated application management layer. It comes pre-installed with Marathon, a service to supervise applications and maintain their state in case of failures. +本文涉及到 [DC/OS](https://dcos.io) 它是具有复杂应用程序管理层的Mesos分发。它预装了Marathon,这是一种监控应用程序并在发生故障时保持其状态的服务。 -If you don’t have a running DC/OS cluster, please follow the [instructions on how to install DC/OS on the official website](https://dcos.io/install/). +如果没有正在运行的DC/OS集群,可以参考 [如何安装DC/OS](https://dcos.io/install/). -Once you have a DC/OS cluster, you may install Flink through the DC/OS Universe. In the search prompt, just search for Flink. Alternatively, you can use the DC/OS CLI: +安装DC / OS群集后,可以通过DC / OS Universe安装Flink。在搜索提示中,只需搜索Flink。或者,可以使用DC / OS CLI: ``` dcos package install flink ``` -Further information can be found in the [DC/OS examples documentation](https://github.com/dcos/examples/tree/master/1.8/flink). +详细信息请参阅 [DC/OS 示例](https://github.com/dcos/examples/tree/master/1.8/flink). -## Mesos without DC/OS +## 无 DC/OS的Mesos -You can also run Mesos without DC/OS. +也没在没有DC/OS的环境中运行 Mesos。 -### Installing Mesos +### 安装 Mesos -Please follow the [instructions on how to setup Mesos on the official website](http://mesos.apache.org/getting-started/). +参阅 [MEsos安装指南](http://mesos.apache.org/getting-started/)来完成安装。 -After installation you have to configure the set of master and agent nodes by creating the files `MESOS_HOME/etc/mesos/masters` and `MESOS_HOME/etc/mesos/slaves`. These files contain in each row a single hostname on which the respective component will be started (assuming SSH access to these nodes). +安装完成后,需要创建 `MESOS_HOME/etc/mesos/masters` 和 `MESOS_HOME/etc/mesos/slaves`文件来配置主机诶单和从节点。文件中每行包含一个主机名(要求可以通过ssh访问到这些节点)。 -Next you have to create `MESOS_HOME/etc/mesos/mesos-master-env.sh` or use the template found in the same directory. 
In this file, you have to define +接下来需要创建 `MESOS_HOME/etc/mesos/mesos-master-env.sh` 文件,或者在同目录下找到模板,修改模板内容。以下内容是必须的 ``` export MESOS_work_dir=WORK_DIRECTORY ``` -and it is recommended to uncommment +并建议取消注释 ``` export MESOS_log_dir=LOGGING_DIRECTORY ``` -In order to configure the Mesos agents, you have to create `MESOS_HOME/etc/mesos/mesos-agent-env.sh` or use the template found in the same directory. You have to configure +配置Mesos代理,需要创建 `MESOS_HOME/etc/mesos/mesos-agent-env.sh` 文件,或在同目录下修改模板。 ``` export MESOS_master=MASTER_HOSTNAME:MASTER_PORT ``` -and uncomment +并取消注释 ``` export MESOS_log_dir=LOGGING_DIRECTORY export MESOS_work_dir=WORK_DIRECTORY ``` -#### Mesos Library +#### Mesos 类库 In order to run Java applications with Mesos you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.so` on Linux. Under Mac OS X you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.dylib`. #### Deploying Mesos -In order to start your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-start-cluster.sh`. In order to stop your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-stop-cluster.sh`. More information about the deployment scripts can be found [here](http://mesos.apache.org/documentation/latest/deploy-scripts/). +要使用Mesos运行Java应用程序, 请使用脚本 `MESOS_HOME/sbin/mesos-start-cluster.sh`. 停止Mesost则执行 `MESOS_HOME/sbin/mesos-stop-cluster.sh`. 更多关于部署脚本可以参阅[部署脚本](http://mesos.apache.org/documentation/latest/deploy-scripts/). -### Installing Marathon +### 安装 Marathon -Optionally, you may also [install Marathon](https://mesosphere.github.io/marathon/docs/) which enables you to run Flink in [high availability (HA) mode](#high-availability). +或者可以安装Marathon参阅 [Marathon安装](https://mesosphere.github.io/marathon/docs/) 即可开启高可用,参阅 [高可用模式](#high-availability). -### Pre-installing Flink vs Docker/Mesos containers +### 预安装Flink 与 Docker/Mesos 容器 -You may install Flink on all of your Mesos Master and Agent nodes. You can also pull the binaries from the Flink web site during deployment and apply your custom configuration before launching the application master. A more convenient and easier to maintain approach is to use Docker containers to manage the Flink binaries and configuration. +可以在所有Mesos主节点和代理节点上安装Flink。或者在部署期间从Flink网站拉取二进制文件,并在启动应用程序主服务器之前应用自定义配置。使用Docker容器来管理Flink二进制文件和配置会是个更方便,更易于维护的方法。 -This is controlled via the following configuration entries: +通过以下配置控制的: ``` mesos.resourcemanager.tasks.container.type: mesos _or_ docker ``` -If set to ‘docker’, specify the image name: +如果设置为 `docker`,需要指定镜像名称 ``` mesos.resourcemanager.tasks.container.image.name: image_name @@ -103,19 +103,19 @@ mesos.resourcemanager.tasks.container.image.name: image_name ### Standalone -In the `/bin` directory of the Flink distribution, you find two startup scripts which manage the Flink processes in a Mesos cluster: +Flink的 `/bin` 目录下可以找到两个脚本用来管理Mesos集群中的Flink进程 -1. `mesos-appmaster.sh` This starts the Mesos application master which will register the Mesos scheduler. It is also responsible for starting up the worker nodes. +1. `mesos-appmaster.sh` 启动Mesos master,负责注册Mesos调度程序,还负责启动工作节点。 -2. `mesos-taskmanager.sh` The entry point for the Mesos worker processes. You don’t need to explicitly execute this script. It is automatically launched by the Mesos worker node to bring up a new TaskManager. +2. 
`mesos-taskmanager.sh` Mesos工作进程的入口点。不需要执行此脚本。它由Mesos工作节点自动启动以启动新的TaskManager。 -In order to run the `mesos-appmaster.sh` script you have to define `mesos.master` in the `flink-conf.yaml` or pass it via `-Dmesos.master=...` to the Java process. +执行 `mesos-appmaster.sh` 脚本需要在`flink-conf.yaml`中定义 `mesos.master`或者通过 `-Dmesos.master=…` 传递给Java进程 -When executing `mesos-appmaster.sh`, it will create a job manager on the machine where you executed the script. In contrast to that, the task managers will be run as Mesos tasks in the Mesos cluster. +执行`mesos-appmaster.sh`时,它将在执行脚本的机器上创建一个JobManager。TaskManager将作为Mesos集群中的Mesos任务运行。 -#### General configuration +#### 基本配置 -It is possible to completely parameterize a Mesos application through Java properties passed to the Mesos application master. This also allows to specify general Flink configuration parameters. For example: +可以通过传递给Mesos应用程序主机的Java属性完全参数化Mesos应用程序。还允许指定常规Flink配置参数。例如 ``` bin/mesos-appmaster.sh \ @@ -129,17 +129,17 @@ bin/mesos-appmaster.sh \ -Dparallelism.default=10 ``` -**Note:** If Flink is in [legacy mode](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html#legacy), you should additionally define the number of task managers that are started by Mesos via [`mesos.initial-tasks`](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html#mesos-initial-tasks). +**Note:** 如果Flink处于 [legacy mode](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html#legacy), 还需要定义TaskManager数量 参阅[`mesos.initial-tasks`](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html#mesos-initial-tasks). -### High Availability +### 高可用 -You will need to run a service like Marathon or Apache Aurora which takes care of restarting the Flink master process in case of node or process failures. In addition, Zookeeper needs to be configured like described in the [High Availability section of the Flink docs](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/jobmanager_high_availability.html). +还需要运行Marathon或Apache Aurora之类的服务,来确保节点或进程出现故障时重新启动Flink主进程。此外还需要一些额外的Zookeeper配置,参阅 [Flink 文档-高可用](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/jobmanager_high_availability.html). #### Marathon -Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script. In particular, it should also adjust any configuration parameters for the Flink cluster. +执行 `bin/mesos-appmaster.sh` 脚本来启动Marathon,文件内还包含了一些Flink集群的配置参数。 -Here is an example configuration for Marathon: +以下是Marathon的示例配置: ``` { @@ -150,9 +150,9 @@ Here is an example configuration for Marathon: } ``` -When running Flink with Marathon, the whole Flink cluster including the job manager will be run as Mesos tasks in the Mesos cluster. +使用Marathon运行Flink时,包含JobManager的整个Flink集群将作为Mesos集群中的Mesos任务运行。 -### Configuration parameters +### 配置参数 -For a list of Mesos specific configuration, refer to the [Mesos section](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html#mesos) of the configuration documentation. +有关Mesos特定配置的列表,请参阅 [Mesos配置](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html#mesos)。 diff --git a/docs/1.7/108.md b/docs/1.7/108.md index 6b4bdfdb01418fc8a51b402c45f634f587087904..6c6b30dadf8e0daa738da6123640352f8fd2a3e1 100644 --- a/docs/1.7/108.md +++ b/docs/1.7/108.md @@ -1,74 +1,74 @@ -# Docker Setup +# Docker安装 -[Docker](https://www.docker.com) is a popular container runtime. There are Docker images for Apache Flink available on Docker Hub which can be used to deploy a session cluster. 
The Flink repository also contains tooling to create container images to deploy a job cluster.
+[Docker](https://www.docker.com) 是一个流行的容器运行时。Docker Hub上提供了Apache Flink的Docker镜像,可用于部署会话集群;Flink代码仓库中还包含用于构建部署作业集群镜像的工具。

-## Flink session cluster
+## Flink会话集群

-A Flink session cluster can be used to run multiple jobs. Each job needs to be submitted to the cluster after it has been deployed.
+Flink会话集群可用于运行多个作业。集群部署后,作业需要逐个提交到集群。

-### Docker images
+### Docker 镜像

-The [Flink Docker repository](https://hub.docker.com/_/flink/) is hosted on Docker Hub and serves images of Flink version 1.2.1 and later.
+[Flink Docker repository](https://hub.docker.com/_/flink/) 托管在Docker Hub上,提供Flink 1.2.1及以后版本的镜像。

-Images for each supported combination of Hadoop and Scala are available, and tag aliases are provided for convenience.
+每一种受支持的Hadoop和Scala组合都有对应的镜像,并且为了方便提供了tag别名。

-Beginning with Flink 1.5, image tags that omit a Hadoop version (e.g. `-hadoop28`) correspond to Hadoop-free releases of Flink that do not include a bundled Hadoop distribution.
+从Flink 1.5开始,省略Hadoop版本的镜像tag(例如不带 `-hadoop28`)对应于不捆绑Hadoop发行版的Hadoop-free Flink版本。

-For example, the following aliases can be used: _(`1.5.y` indicates the latest release of Flink 1.5)_
+例如,可以使用下面的别名: _(`1.5.y` 表示Flink 1.5的最新版本)_

* `flink:latest` → `flink:<latest-flink>-scala_<latest-scala>`
* `flink:1.5` → `flink:1.5.y-scala_2.11`
* `flink:1.5-hadoop27` → `flink:1.5.y-hadoop27-scala_2.11`

-**Note:** The Docker images are provided as a community project by individuals on a best-effort basis. They are not official releases by the Apache Flink PMC.
+**注意:** Docker镜像是由个人以尽力而为的方式维护的社区项目,并非Apache Flink PMC的官方发布版本。

-## Flink job cluster
+## Flink作业集群

-A Flink job cluster is a dedicated cluster which runs a single job. The job is part of the image and, thus, there is no extra job submission needed.
+Flink作业集群是只运行单个作业的专用集群。作业是镜像的一部分,因此不需要额外提交作业。

-### Docker images
+### Docker 镜像

-The Flink job cluster image needs to contain the user code jars of the job for which the cluster is started. Therefore, one needs to build a dedicated container image for every job. The `flink-container` module contains a `build.sh` script which can be used to create such an image. Please see the [instructions](https://github.com/apache/flink/blob/release-1.7/flink-container/docker/README.md) for more details.
+Flink作业集群镜像需要包含该集群所要运行作业的用户代码jar,因此需要为每个作业单独构建镜像。`flink-container` 模块包含一个 `build.sh` 脚本,可用于创建此类镜像,详情请参阅[说明](https://github.com/apache/flink/blob/release-1.7/flink-container/docker/README.md)。

-## Flink with Docker Compose
+## Flink 和 Docker Compose

-[Docker Compose](https://docs.docker.com/compose/) is a convenient way to run a group of Docker containers locally.
+[Docker Compose](https://docs.docker.com/compose/) 是一种在本地运行一组Docker容器的便捷方式。

-Example config files for a [session cluster](https://github.com/docker-flink/examples/blob/master/docker-compose.yml) and a [job cluster](https://github.com/apache/flink/blob/release-1.7/flink-container/docker/docker-compose.yml) are available on GitHub.
+GitHub上提供了[会话集群](https://github.com/docker-flink/examples/blob/master/docker-compose.yml)和[作业集群](https://github.com/apache/flink/blob/release-1.7/flink-container/docker/docker-compose.yml)的配置文件示例。
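下面给出一个极简的 docker-compose.yml 草稿,仅用于示意会话集群的大致编排方式;其中的镜像 tag、端口映射和环境变量都是假设值,实际部署请以上文链接的官方示例文件为准。

```yaml
# 示意:最小化的 Flink 会话集群编排(镜像 tag、端口、环境变量均为假设值)
version: "2.1"
services:
  jobmanager:
    image: flink:1.7-scala_2.11        # 假设使用 1.7 的官方镜像
    command: jobmanager
    ports:
      - "8081:8081"                    # Web UI / REST 端口
    environment:
      - JOB_MANAGER_RPC_ADDRESS=jobmanager
  taskmanager:
    image: flink:1.7-scala_2.11
    command: taskmanager
    depends_on:
      - jobmanager
    environment:
      - JOB_MANAGER_RPC_ADDRESS=jobmanager
```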
-### Usage
+### 用法

-* Launch a cluster in the foreground
+* 前台启动集群

```
docker-compose up
```

-* Launch a cluster in the background
+* 后台启动集群

```
docker-compose up -d
```

-* Scale the cluster up or down to _N_ TaskManagers
+* 将集群扩容或缩容到 _N_ 个TaskManager

```
docker-compose scale taskmanager=<N>
```

-* Kill the cluster
+* 关闭集群

```
docker-compose kill
```

-When the cluster is running, you can visit the web UI at [http://localhost:8081](http://localhost:8081). You can also use the web UI to submit a job to a session cluster.
+集群运行时,可以通过 [http://localhost:8081](http://localhost:8081) 访问Web UI,也可以通过Web UI向会话集群提交作业。

-To submit a job to a session cluster via the command line, you must copy the JAR to the JobManager container and submit the job from there.
+要通过命令行将作业提交到会话集群,必须先将JAR复制到JobManager容器中,再从容器内提交作业。

-For example:
+例如:

```
$ JOBMANAGER_CONTAINER=$(docker ps --filter name=jobmanager --format={{.ID}})
diff --git a/docs/1.7/109.md b/docs/1.7/109.md
index 683636dfb1e500a74db262a9586f5f95c60aaed5..d9d8e7c4a7fd6a219fa07f5ffa03667584513bbe 100644
--- a/docs/1.7/109.md
+++ b/docs/1.7/109.md
@@ -1,28 +1,28 @@
-# Kubernetes Setup
+# Kubernetes 设置

-This page describes how to deploy a Flink job and session cluster on [Kubernetes](https://kubernetes.io).
+本页介绍如何在 [Kubernetes](https://kubernetes.io) 上部署Flink作业集群和会话集群。

-## Setup Kubernetes
+## 安装 Kubernetes

-Please follow [Kubernetes’ setup guide](https://kubernetes.io/docs/setup/) in order to deploy a Kubernetes cluster. If you want to run Kubernetes locally, we recommend using [MiniKube](https://kubernetes.io/docs/setup/minikube/).
+请按照 [Kubernetes’ setup guide](https://kubernetes.io/docs/setup/) 部署Kubernetes集群。如果需要在本地运行Kubernetes,建议使用 [MiniKube](https://kubernetes.io/docs/setup/minikube/)。

-**Note:** If using MiniKube please make sure to execute `minikube ssh 'sudo ip link set docker0 promisc on'` before deploying a Flink cluster. Otherwise Flink components are not able to self reference themselves through a Kubernetes service.
+**注意:** 如果使用MiniKube,请确保在部署Flink集群前执行 `minikube ssh 'sudo ip link set docker0 promisc on'` 命令,否则Flink组件无法通过Kubernetes服务引用自身。

-## Flink session cluster on Kubernetes
+## Kubernetes上的Flink会话集群

-A Flink session cluster is executed as a long-running Kubernetes Deployment. Note that you can run multiple Flink jobs on a session cluster. Each job needs to be submitted to the cluster after the cluster has been deployed.
+Flink会话集群以长期运行的Kubernetes Deployment形式执行。可以在一个会话集群上运行多个Flink作业;集群部署完成后,作业需要逐个提交到集群。

-A basic Flink session cluster deployment in Kubernetes has three components:
+在Kubernetes上,一个基本的Flink会话集群部署包含三个组件:

-* a Deployment/Job which runs the JobManager
-* a Deployment for a pool of TaskManagers
-* a Service exposing the JobManager’s REST and UI ports
+* 一个运行JobManager的Deployment/Job
+* 一个TaskManager池的Deployment
+* 一个暴露JobManager REST和UI端口的Service

-### Deploy Flink session cluster on Kubernetes
+### 在Kubernetes上部署Flink会话集群

-Using the resource definitions for a [session cluster](#session-cluster-resource-definitions), launch the cluster with the `kubectl` command:
+使用[会话集群](#session-cluster-resource-definitions)的资源定义文件,通过 `kubectl` 命令启动集群:

```
kubectl create -f jobmanager-service.yaml
@@ -30,12 +30,12 @@ kubectl create -f jobmanager-deployment.yaml
kubectl create -f taskmanager-deployment.yaml
```
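资源创建完成后,可以先确认各组件是否已经正常启动。下面是一个示意性的检查方式,其中的 Pod 标签和 Service 名称取决于所使用的 yaml 文件,这里均为假设值:

```
# 示意:查看 JobManager / TaskManager 是否已就绪(标签与名称为假设值)
kubectl get deployments
kubectl get pods -l app=flink
kubectl get service flink-jobmanager
```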
-You can then access the Flink UI via `kubectl proxy`:
+之后可以通过 `kubectl proxy` 访问Flink UI:

-1. Run `kubectl proxy` in a terminal
-2. Navigate to [http://localhost:8001/api/v1/namespaces/default/services/flink-jobmanager:ui/proxy](http://localhost:8001/api/v1/namespaces/default/services/flink-jobmanager:ui/proxy) in your browser
+1. 在终端执行 `kubectl proxy` 命令
+2. 在浏览器中打开 [http://localhost:8001/api/v1/namespaces/default/services/flink-jobmanager:ui/proxy](http://localhost:8001/api/v1/namespaces/default/services/flink-jobmanager:ui/proxy) 即可访问

-In order to terminate the Flink session cluster, use `kubectl`:
+要停止Flink会话集群,使用 `kubectl`:

```
kubectl delete -f jobmanager-deployment.yaml
@@ -43,27 +43,27 @@ kubectl delete -f taskmanager-deployment.yaml
kubectl delete -f jobmanager-service.yaml
```

-## Flink job cluster on Kubernetes
+## Kubernetes上的Flink作业集群

-A Flink job cluster is a dedicated cluster which runs a single job. The job is part of the image and, thus, there is no extra job submission needed.
+Flink作业集群是只运行单个作业的专用集群。作业是镜像的一部分,因此不需要额外提交作业。

-### Creating the job-specific image
+### 创建作业专用镜像

-The Flink job cluster image needs to contain the user code jars of the job for which the cluster is started. Therefore, one needs to build a dedicated container image for every job. Please follow these [instructions](https://github.com/apache/flink/blob/release-1.7/flink-container/docker/README.md) to build the Docker image.
+Flink作业集群镜像需要包含该集群所要运行作业的用户代码jar,因此需要为每个作业构建专用镜像。请按照[说明](https://github.com/apache/flink/blob/release-1.7/flink-container/docker/README.md)构建Docker镜像。

-### Deploy Flink job cluster on Kubernetes
+### 在Kubernetes上部署Flink作业集群

-In order to deploy the a job cluster on Kubernetes please follow these [instructions](https://github.com/apache/flink/blob/release-1.7/flink-container/kubernetes/README.md#deploy-flink-job-cluster).
+要在Kubernetes上部署作业集群,请参阅[说明](https://github.com/apache/flink/blob/release-1.7/flink-container/kubernetes/README.md#deploy-flink-job-cluster)。

-## Advanced Cluster Deployment
+## 集群的高级部署

-An early version of a [Flink Helm chart](https://github.com/docker-flink/examples) is available on GitHub.
+GitHub上提供了早期版本的 [Flink Helm chart](https://github.com/docker-flink/examples)。

-## Appendix
+## 附录

-### Session cluster resource definitions
+### 会话集群的资源定义

-The Deployment definitions use the pre-built image `flink:latest` which can be found [on Docker Hub](https://hub.docker.com/r/_/flink/). The image is built from this [Github repository](https://github.com/docker-flink/docker-flink).
+Deployment定义使用预先构建好的镜像 `flink:latest`,可以在 [Docker Hub](https://hub.docker.com/r/_/flink/) 上找到;该镜像基于这个 [Github 仓库](https://github.com/docker-flink/docker-flink) 构建。

`jobmanager-deployment.yaml`

diff --git a/docs/1.7/110.md b/docs/1.7/110.md
index bba5720cc3e0884ac621d5d8377eb961be33fb79..7d27904e205d7a3decfd51d7b8035fb9b6dd5709 100644
--- a/docs/1.7/110.md
+++ b/docs/1.7/110.md
@@ -1,33 +1,31 @@
+# Amazon Web Services (AWS)
-
-# Amazon Web Services (AWS)
-
-Amazon Web Services offers cloud computing services on which you can run Flink.
+Amazon Web Services 提供可以运行Flink的云计算服务。

## EMR: Elastic MapReduce

-[Amazon Elastic MapReduce](https://aws.amazon.com/elasticmapreduce/) (Amazon EMR) is a web service that makes it easy to quickly setup a Hadoop cluster. This is the **recommended way** to run Flink on AWS as it takes care of setting up everything.
+[Amazon Elastic MapReduce](https://aws.amazon.com/elasticmapreduce/) (Amazon EMR) 是一种可以轻松快速地搭建Hadoop集群的Web服务。由于它会负责设置好所有内容,因此这是在AWS上运行Flink的**推荐方式**。

-### Standard EMR Installation
+### 标准 EMR 安装

-Flink is a supported application on Amazon EMR.
[Amazon’s documentation](http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-flink.html) describes configuring Flink, creating and monitoring a cluster, and working with jobs. +Flink是Amazon EMR上受支持的应用程序。 [亚马逊的文档](http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-flink.html) 描述了如何配置Flink、创建和监控集群以及处理作业。 -### Custom EMR Installation +### 自定义 EMR 安装 -Amazon EMR services are regularly updated to new releases but a version of Flink which is not available can be manually installed in a stock EMR cluster. +Amazon EMR 会定期更新到最新版本, 也可以在EMR集群中安装不同版本的Flink -**Create EMR Cluster** +**创建 EMR 集群** -The EMR documentation contains [examples showing how to start an EMR cluster](http://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/emr-gs-launch-sample-cluster.html). You can follow that guide and install any EMR release. You don’t need to install the _All Applications_ part of the EMR release, but can stick to _Core Hadoop_. +EMR文档包含了 [创建 EMR集群](http://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/emr-gs-launch-sample-cluster.html)。通过该文档可以安装任何版本的EMR,不需要安装EMR中的_All Applications_部分,但是_Core Hadoop_是必须的。 -Note Access to S3 buckets requires [configuration of IAM roles](http://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/emr-iam-roles.html) when creating an EMR cluster. +注意 访问S3存储需要配置IAM角色,文档请参阅 [配置 IAM角色](http://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/emr-iam-roles.html) 。 -**Install Flink on EMR Cluster** +**在EMR集群中安装Flink** -After creating your cluster, you can [connect to the master node](http://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/emr-connect-master-node.html) and install Flink: +在创建集群之后,连接到主节点并安装Flink,文档请参阅 [连接到主节点](http://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/emr-connect-master-node.html) 。 -1. Go the [Downloads Page](http://flink.apache.org/downloads.html) and **download a binary version of Flink matching the Hadoop version** of your EMR cluster, e.g. Hadoop 2.7 for EMR releases 4.3.0, 4.4.0, or 4.5.0. -2. Extract the Flink distribution and you are ready to deploy [Flink jobs via YARN](yarn_setup.html) after **setting the Hadoop config directory**: +1. 转到 [下载页](http://flink.apache.org/downloads.html) 并 **下载与EMR集群中Hadoop版本匹配的二进制版本** , 例如, Hadoop 2.7 for EMR releases 4.3.0, 4.4.0, or 4.5.0. +2. 解压Flink,**设置Hadoop配置目录**后,可以运行Flink作业,文档请参阅 [Flink jobs via YARN](yarn_setup.html) : @@ -37,11 +35,11 @@ HADOOP_CONF_DIR=/etc/hadoop/conf ./bin/flink run -m yarn-cluster -yn 1 examples/ -## S3: Simple Storage Service +## S3: 简单存储 -[Amazon Simple Storage Service](http://aws.amazon.com/s3/) (Amazon S3) provides cloud object storage for a variety of use cases. You can use S3 with Flink for **reading** and **writing data** as well in conjunction with the [streaming **state backends**](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/state/state_backends.html) or even as a YARN object storage. 
+[Amazon Simple Storage Service](http://aws.amazon.com/s3/) (Amazon S3)为各种实例提供云对象存储,S3可以作为Flink的数据**输入** 或 **输出** ,或者是为 [streaming **state backends**](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/state/state_backends.html) 、以及YARN提供对象存储。 -You can use S3 objects like regular files by specifying paths in the following format: +通过以下格式指定路径来使用常规文件等S3对象: @@ -51,32 +49,32 @@ s3:/// -The endpoint can either be a single file or a directory, for example: +endpoint 是单个文件或目录,例如: ``` -// Read from S3 bucket +// 从S3中读取 env.readTextFile("s3:///"); -// Write to S3 bucket +// 写入到S3 stream.writeAsText("s3:///"); -// Use S3 as FsStatebackend +// 使用S3作为 FsStatebackend env.setStateBackend(new FsStateBackend("s3:///")); ``` -Note that these examples are _not_ exhaustive and you can use S3 in other places as well, including your [high availability setup](../jobmanager_high_availability.html) or the [RocksDBStateBackend](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/state/state_backends.html#the-rocksdbstatebackend); everywhere that Flink expects a FileSystem URI. +上文的示例并没有列出全部情况,也可以在其他地方使用S3存储,包括 [高可用安装](../jobmanager_high_availability.html) 或 [RocksDBStateBackend](//ci.apache.org/projects/flink/flink-docs-release-1.7/ops/state/state_backends.html#the-rocksdbstatebackend)等 任何Flink期望的文件地址。 -For most use cases, you may use one of our shaded `flink-s3-fs-hadoop` and `flink-s3-fs-presto` S3 filesystem wrappers which are fairly easy to set up. For some cases, however, e.g. for using S3 as YARN’s resource storage dir, it may be necessary to set up a specific Hadoop S3 FileSystem implementation. Both ways are described below. +多数情况下我们使用 `flink-s3-fs-hadoop` 和 `flink-s3-fs-presto` 作为S3在Hadoop 上的实现 ,例如,使用S3作为YARN的资源存储目录,可能需要设置特定的Hadoop S3 FileSystem实现。两种方式如下所述。 -### Shaded Hadoop/Presto S3 file systems (recommended) +### Shaded Hadoop/Presto S3 文件系统 (推荐) -**Note:** You don’t have to configure this manually if you are running [Flink on EMR](#emr-elastic-mapreduce). +**Note:** 如果在EMR上运行Flink,则无需手动配置,文档可以参阅 [Flink on EMR](#emr-elastic-mapreduce). -To use either `flink-s3-fs-hadoop` or `flink-s3-fs-presto`, copy the respective JAR file from the `opt` directory to the `lib` directory of your Flink distribution before starting Flink, e.g. +要使用 `flink-s3-fs-hadoop` 或 `flink-s3-fs-presto`,co在启动Flink之前将相应的JAR文件从 `opt` 目录拷贝到Flink的 `lib` 目录下,例如 @@ -86,23 +84,23 @@ cp ./opt/flink-s3-fs-presto-1.7.1.jar ./lib/ -Both `flink-s3-fs-hadoop` and `flink-s3-fs-presto` register default FileSystem wrappers for URIs with the `s3://` scheme, `flink-s3-fs-hadoop` also registers for `s3a://` and `flink-s3-fs-presto` also registers for `s3p://`, so you can use this to use both at the same time. + `flink-s3-fs-hadoop` 和 `flink-s3-fs-presto` 注册了不同的文件URI `s3://` 实现, `flink-s3-fs-hadoop` 还注册了 `s3a://`, `flink-s3-fs-presto` 注册了 `s3p://`,可以同时使用这些不同的URI。 -#### Configure Access Credentials +#### 配置访问凭据 -After setting up the S3 FileSystem wrapper, you need to make sure that Flink is allowed to access your S3 buckets. +设置S3 文件系统实现之后,需要保证Flink程序可以访问S3存储。 -##### Identity and Access Management (IAM) (Recommended) +##### 身份和访问管理 (IAM) (推荐) -The recommended way of setting up credentials on AWS is via [Identity and Access Management (IAM)](http://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html). You can use IAM features to securely give Flink instances the credentials that they need in order to access S3 buckets. Details about how to do this are beyond the scope of this documentation. Please refer to the AWS user guide. 
What you are looking for are [IAM Roles](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html). +比较推荐使用身份和访问管理 (IAM)来设置Access key,可以通过IAM提供的功能来让Flink程序安全的访问S3存储。 详细可以参阅[文档身份和访问管理 (IAM)](http://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html)。具体用户角色权限控制可以参考 [IAM Roles](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html). -If you set this up correctly, you can manage access to S3 within AWS and don’t need to distribute any access keys to Flink. +如果正确设置此选项,则可以在AWS中管理对S3的访问,并且Flink访问S3不需要任何keys -##### Access Keys (Discouraged) +##### Access Keys (不推荐) -Access to S3 can be granted via your **access and secret key pair**. Please note that this is discouraged since the [introduction of IAM roles](https://blogs.aws.amazon.com/security/post/Tx1XG3FX6VMU6O5/A-safer-way-to-distribute-AWS-credentials-to-EC2). +还可以通过 **access key**来对S3进行访问。相关内容可以参阅 [ IAM的角色](https://blogs.aws.amazon.com/security/post/Tx1XG3FX6VMU6O5/A-safer-way-to-distribute-AWS-credentials-to-EC2). -You need to configure both `s3.access-key` and `s3.secret-key` in Flink’s `flink-conf.yaml`: +这种情况下,需要在Flink的 `flink-conf.yaml`文件中同时配置`s3.access-key` 和 `s3.secret-key` @@ -113,24 +111,24 @@ s3.secret-key: your-secret-key -### Hadoop-provided S3 file systems - manual setup +### Hadoop提供的S3文件系统-手动设置 -**Note:** You don’t have to configure this manually if you are running [Flink on EMR](#emr-elastic-mapreduce). +**Note:** 在EMR上运行Flink不需要配置,请参阅 [Flink on EMR](#emr-elastic-mapreduce). -This setup is a bit more complex and we recommend using our shaded Hadoop/Presto file systems instead (see above) unless required otherwise, e.g. for using S3 as YARN’s resource storage dir via the `fs.defaultFS` configuration property in Hadoop’s `core-site.xml`. +因为这个设置比较复杂,所以除非是有其他需求,否则,还是建议才用上面的方式来实现。例如,修改Hadoop的 `core-site.xml`文件中的 `fs.defaultFS`配置来把S3作为存储目录. -#### Set S3 FileSystem +#### 设置 S3 文件系统 -Interaction with S3 happens via one of [Hadoop’s S3 FileSystem clients](https://wiki.apache.org/hadoop/AmazonS3): + S3相关内容可以参阅 [Hadoop’s S3 FileSystem clients](https://wiki.apache.org/hadoop/AmazonS3): -1. `S3AFileSystem` (**recommended** for Hadoop 2.7 and later): file system for reading and writing regular files using Amazon’s SDK internally. No maximum file size and works with IAM roles. -2. `NativeS3FileSystem` (for Hadoop 2.6 and earlier): file system for reading and writing regular files. Maximum object size is 5GB and does not work with IAM roles. +1. `S3AFileSystem` (**推荐** Hadoop 2.7 及其以上版本使用): 用于在内部使用Amazon SDK读取和写入常规文件的文件系统。没有最大文件大小并且与IAM角色一起使用。 +2. `NativeS3FileSystem` (Hadoop 2.6 及一起版本): 用于读写常规文件的文件系统。最大对象大小为5GB,不适用于IAM角色。 -##### `S3AFileSystem` (Recommended) +##### `S3AFileSystem` (推荐) -This is the recommended S3 FileSystem implementation to use. It uses Amazon’s SDK internally and works with IAM roles (see [Configure Access Credentials](#configure-access-credentials-1)). +推荐使用的S3 FileSystem实现。它在内部使用Amazon的SDK并与IAM角色一起使用 (参阅 [配置访问凭据](#configure-access-credentials-1)). -You need to point Flink to a valid Hadoop configuration, which contains the following properties in `core-site.xml`: +需要修改Flink中Hadoop的配置,配置文件 `core-site.xml`: @@ -154,13 +152,13 @@ You need to point Flink to a valid Hadoop configuration, which contains the foll -This registers `S3AFileSystem` as the default FileSystem for URIs with the `s3a://` scheme. +这里注册 `S3AFileSystem` 作为 `s3a://` URI开头的文件实现. 
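在按上述方式将 `S3AFileSystem` 注册为 `s3a://` scheme 的默认文件系统之后,就可以在作业中直接使用 `s3a://` 路径。下面是一个示意性的 Java 片段,其中的 bucket 名称和路径均为假设值:

```java
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class S3AExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // 将 S3 用作状态后端的 checkpoint 存储位置(bucket 名称仅为示意)
        env.setStateBackend(new FsStateBackend("s3a://my-bucket/checkpoints"));

        // 从 S3 读取文本文件,并把结果写回 S3
        env.readTextFile("s3a://my-bucket/input")
           .writeAsText("s3a://my-bucket/output");

        env.execute("S3A example");
    }
}
```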
##### `NativeS3FileSystem` -This file system is limited to files up to 5GB in size and it does not work with IAM roles (see [Configure Access Credentials](#configure-access-credentials-1)), meaning that you have to manually configure your AWS credentials in the Hadoop config file. +此文件系统仅限于最大5GB的文件,并且不适用于IAM角色(参阅 [配置访问凭据](#configure-access-credentials-1)),所以需要在配置文件中配置AWS的access key。 -You need to point Flink to a valid Hadoop configuration, which contains the following property in `core-site.xml`: +需要修改Flink中Hadoop的配置,配置文件`core-site.xml`: @@ -173,14 +171,14 @@ You need to point Flink to a valid Hadoop configuration, which contains the foll -This registers `NativeS3FileSystem` as the default FileSystem for URIs with the `s3://` scheme. +这里注册 `NativeS3FileSystem` 作为 `s3://` URI开头文件的实现. -#### Hadoop Configuration +#### Hadoop 配置 -You can specify the [Hadoop configuration](../config.html#hdfs) in various ways pointing Flink to the path of the Hadoop configuration directory, for example +可以采用如下两种方式指定 [Hadoop 配置](../config.html#hdfs) 把Flink指向Hadoop的配置目录 -* by setting the environment variable `HADOOP_CONF_DIR`, or -* by setting the `fs.hdfs.hadoopconf` configuration option in `flink-conf.yaml`: +* 设置环境变量`HADOOP_CONF_DIR`, +* 设置 `flink-conf.yaml`文件中的 `fs.hdfs.hadoopconf`: @@ -190,27 +188,31 @@ fs.hdfs.hadoopconf: /path/to/etc/hadoop -This registers `/path/to/etc/hadoop` as Hadoop’s configuration directory with Flink. Flink will look for the `core-site.xml` and `hdfs-site.xml` files in the specified directory. + `/path/to/etc/hadoop` 被注册为Hadoop的配置目录. Flink 会在目录下查找 `core-site.xml` 和 `hdfs-site.xml` 文件。 + +#### 配置访问凭据 + +**Note:** 如果在EMR上直接运行Flink,则无需额外配置,请参阅 [Flink on EMR](#emr-elastic-mapreduce). -#### Configure Access Credentials +Flink中使用 S3 存储时,需要保证Flink可以访问到S3存储。 -**Note:** You don’t have to configure this manually if you are running [Flink on EMR](#emr-elastic-mapreduce). +##### 身份和访问管理 (IAM) (推荐) -After setting up the S3 FileSystem, you need to make sure that Flink is allowed to access your S3 buckets. +比较推荐使用身份和访问管理 (IAM)来设置Access key,可以通过IAM提供的功能来让Flink程序安全的访问S3存储。 详细可以参阅[文档身份和访问管理 (IAM)](http://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html)。具体用户角色权限控制可以参考 [IAM 角色介绍](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html). -##### Identity and Access Management (IAM) (Recommended) +如果正确设置此选项,则可以在AWS中管理对S3的访问,并且Flink访问S3不需要任何keys -When using `S3AFileSystem`, the recommended way of setting up credentials on AWS is via [Identity and Access Management (IAM)](http://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html). You can use IAM features to securely give Flink instances the credentials that they need in order to access S3 buckets. Details about how to do this are beyond the scope of this documentation. Please refer to the AWS user guide. What you are looking for are [IAM Roles](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html). +##### Access Keys (不推荐) -If you set this up correctly, you can manage access to S3 within AWS and don’t need to distribute any access keys to Flink. +还可以通过 **access key**来对S3进行访问。相关内容可以参阅 [ IAM的角色](https://blogs.aws.amazon.com/security/post/Tx1XG3FX6VMU6O5/A-safer-way-to-distribute-AWS-credentials-to-EC2). -Note that this only works with `S3AFileSystem` and not `NativeS3FileSystem`. +请注意,这只适用于 `S3AFileSystem` 而不是 `NativeS3FileSystem`. 
-##### Access Keys with `S3AFileSystem` (Discouraged) +##### 通过Access Keys 访问 `S3AFileSystem` (不推荐) -Access to S3 can be granted via your **access and secret key pair**. Please note that this is discouraged since the [introduction of IAM roles](https://blogs.aws.amazon.com/security/post/Tx1XG3FX6VMU6O5/A-safer-way-to-distribute-AWS-credentials-to-EC2). +可以用过 **Access Keys**来授权对S3存储的访问。 但是这种操作并不推荐,请参阅 [IAM 角色介绍](https://blogs.aws.amazon.com/security/post/Tx1XG3FX6VMU6O5/A-safer-way-to-distribute-AWS-credentials-to-EC2). -For `S3AFileSystem` you need to configure both `fs.s3a.access.key` and `fs.s3a.secret.key` in Hadoop’s `core-site.xml`: +对于 `S3AFileSystem` 需要在Hadoop的`core-site.xml`文件中同时配置 `fs.s3a.access.key` 和 `fs.s3a.secret.key` : @@ -228,11 +230,11 @@ For `S3AFileSystem` you need to configure both `fs.s3a.access.key` and `fs.s3a.s -##### Access Keys with `NativeS3FileSystem` (Discouraged) +##### 通过Access Keys访问`NativeS3FileSystem` (不推荐) -Access to S3 can be granted via your **access and secret key pair**. But this is discouraged and you should use `S3AFileSystem` [with the required IAM roles](https://blogs.aws.amazon.com/security/post/Tx1XG3FX6VMU6O5/A-safer-way-to-distribute-AWS-credentials-to-EC2). +可以用过 **Access Keys**来授权对S3存储的访问。 此文件系统已经过时,最好使用 `S3AFileSystem`来代替,详情参阅 [IAM 角色要求](https://blogs.aws.amazon.com/security/post/Tx1XG3FX6VMU6O5/A-safer-way-to-distribute-AWS-credentials-to-EC2). -For `NativeS3FileSystem` you need to configure both `fs.s3.awsAccessKeyId` and `fs.s3.awsSecretAccessKey` in Hadoop’s `core-site.xml`: + 对于`NativeS3FileSystem` 需要在Hadoop的 `core-site.xml`文件中同事配置 `fs.s3.awsAccessKeyId` 和 `fs.s3.awsSecretAccessKey` : @@ -250,21 +252,21 @@ For `NativeS3FileSystem` you need to configure both `fs.s3.awsAccessKeyId` and ` -#### Provide S3 FileSystem Dependency +#### 提供 S3 FileSystem 依赖 -**Note:** You don’t have to configure this manually if you are running [Flink on EMR](#emr-elastic-mapreduce). +**Note:** 在EMR上运行Flink则无需配置,请参阅 [Flink on EMR](#emr-elastic-mapreduce). -Hadoop’s S3 FileSystem clients are packaged in the `hadoop-aws` artifact (Hadoop version 2.6 and later). This JAR and all its dependencies need to be added to Flink’s classpath, i.e. the class path of both Job and TaskManagers. Depending on which FileSystem implementation and which Flink and Hadoop version you use, you need to provide different dependencies (see below). +Hadoop的S3 FileSystem客户端打包在 `hadoop-aws` jar中(Hadoop 2.6及其以后)。需要将JAR及其所有依赖项添加到Flink的类路径中,即Job和TaskManagers的类路径。根据使用的FileSystem实现以及使用的Flink和Hadoop版本,需要添加不同的依赖项(请参阅下文)。 -There are multiple ways of adding JARs to Flink’s class path, the easiest being simply to drop the JARs in Flink’s `lib` folder. You need to copy the `hadoop-aws` JAR with all its dependencies. You can also export the directory containing these JARs as part of the `HADOOP_CLASSPATH` environment variable on all machines. +有多种方法可以将JAR添加到Flink的类路径中,最简单的方法就是将JAR放在Flink的 `lib` 目录下。 拷贝 `hadoop-aws` JAR 和他的所有依赖复制到lib下,还可以把一个目录指定为 `HADOOP_CLASSPATH` 环境变量。 ##### Flink for Hadoop 2.7 -Depending on which file system you use, please add the following dependencies. 
You can find these as part of the Hadoop binaries in `hadoop-2.7/share/hadoop/tools/lib`:
+根据您使用的文件系统,请添加以下依赖项。这些JAR包含在Hadoop发行版的 `hadoop-2.7/share/hadoop/tools/lib` 目录中:

* `S3AFileSystem`:
  * `hadoop-aws-2.7.3.jar`
-  * `aws-java-sdk-s3-1.11.183.jar` and its dependencies:
+  * `aws-java-sdk-s3-1.11.183.jar` 及其依赖:
    * `aws-java-sdk-core-1.11.183.jar`
    * `aws-java-sdk-kms-1.11.183.jar`
    * `jackson-annotations-2.6.7.jar`
@@ -277,11 +279,11 @@ Depending on which file system you use, please add the following dependencies. Y
  * `hadoop-aws-2.7.3.jar`
  * `guava-11.0.2.jar`

-Note that `hadoop-common` is available as part of Flink, but Guava is shaded by Flink.
+注意 `hadoop-common` 已经包含在Flink中,而Guava则被Flink shade(重定位)了。

##### Flink for Hadoop 2.6

-Depending on which file system you use, please add the following dependencies. You can find these as part of the Hadoop binaries in `hadoop-2.6/share/hadoop/tools/lib`:
+根据您使用的文件系统,请添加以下依赖项。这些JAR包含在Hadoop发行版的 `hadoop-2.6/share/hadoop/tools/lib` 目录中:

* `S3AFileSystem`:
  * `hadoop-aws-2.6.4.jar`
@@ -296,19 +298,19 @@ Depending on which file system you use, please add the following dependencies. Y
  * `hadoop-aws-2.6.4.jar`
  * `guava-11.0.2.jar`

-Note that `hadoop-common` is available as part of Flink, but Guava is shaded by Flink.
+注意 `hadoop-common` 已经包含在Flink中,而Guava则被Flink shade(重定位)了。

-##### Flink for Hadoop 2.4 and earlier
+##### Flink for Hadoop 2.4及更早版本

-These Hadoop versions only have support for `NativeS3FileSystem`. This comes pre-packaged with Flink for Hadoop 2 as part of `hadoop-common`. You don’t need to add anything to the classpath.
+这些Hadoop版本只支持 `NativeS3FileSystem`,它作为 `hadoop-common` 的一部分随Hadoop 2版的Flink一起预先打包,不需要向类路径额外添加任何内容。

-## Common Issues
+## 常见问题

-The following sections lists common issues when working with Flink on AWS.
+下面列出了在AWS上使用Flink时的部分常见问题。

### Missing S3 FileSystem Configuration

-If your job submission fails with an Exception message noting that `No file system found with scheme s3` this means that no FileSystem has been configured for S3\. Please check out the configuration sections for our [shaded Hadoop/Presto](#shaded-hadooppresto-s3-file-systems-recommended) or [generic Hadoop](#set-s3-filesystem) file systems for details on how to configure this properly.
+如果作业提交失败并显示 `No file system found with scheme s3`,这说明没有为S3配置文件系统。请查看 [Shaded Hadoop/Presto S3 文件系统](#shaded-hadooppresto-s3-file-systems-recommended) 或 [Hadoop基本配置](#set-s3-filesystem) 的配置章节,了解如何正确配置。

@@ -330,7 +332,7 @@ Caused by: java.io.IOException: No file system found with scheme s3,

### AWS Access Key ID and Secret Access Key Not Specified

-If you see your job failing with an Exception noting that the `AWS Access Key ID and Secret Access Key must be specified as the username or password`, your access credentials have not been set up properly. Please refer to the access credential section for our [shaded Hadoop/Presto](#configure-access-credentials) or [generic Hadoop](#configure-access-credentials-1) file systems for details on how to configure this.
+如果作业失败并显示异常 `AWS Access Key ID and Secret Access Key must be specified as the username or password`,说明未正确设置访问凭据。有关如何配置,请参阅 [shaded Hadoop/Presto](#configure-access-credentials) 或 [Hadoop基本配置](#configure-access-credentials-1) 。

@@ -362,7 +364,7 @@ Caused by: java.lang.IllegalArgumentException: AWS Access Key ID and Secret Acce

### ClassNotFoundException: NativeS3FileSystem/S3AFileSystem Not Found

-If you see this Exception, the S3 FileSystem is not part of the class path of Flink. Please refer to [S3 FileSystem dependency section](#provide-s3-filesystem-dependency) for details on how to configure this properly.
+看到此异常,表示S3 FileSystem不在Flink的类路径中。请参阅 [S3 文件系统依赖](#provide-s3-filesystem-dependency) 了解如何正确配置。
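作为补充示意,下面给出上文“提供 S3 FileSystem 依赖”一节所描述的两种做法对应的命令草稿,其中的 Hadoop 安装路径与 JAR 版本号均为假设值,请按实际环境调整:

```
# 方式一:把 hadoop-aws 及其依赖拷贝到 Flink 的 lib 目录(路径与版本仅为示意)
cp /path/to/hadoop-2.7/share/hadoop/tools/lib/hadoop-aws-2.7.3.jar ./lib/
cp /path/to/hadoop-2.7/share/hadoop/tools/lib/aws-java-sdk-*.jar ./lib/

# 方式二:在所有机器上通过 HADOOP_CLASSPATH 环境变量暴露这些 JAR
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/path/to/hadoop-2.7/share/hadoop/tools/lib/*
```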
@@ -391,7 +393,7 @@ Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3native

### IOException: `400: Bad Request`

-If you have configured everything properly, but get a `Bad Request` Exception **and** your S3 bucket is located in region `eu-central-1`, you might be running an S3 client, which does not support [Amazon’s signature version 4](http://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html).
+如果您已正确配置所有内容,但请求仍然返回 `Bad Request` 异常,**并且** S3 bucket位于 `eu-central-1` 区域,那么可能是您使用的S3客户端不支持 [Amazon 签名版本 4](http://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html)。

@@ -403,7 +405,7 @@ Caused by: org.jets3t.service.impl.rest.HttpException [...]

-or
+或

@@ -413,9 +415,9 @@ com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS Service

-This should not apply to our shaded Hadoop/Presto S3 file systems but can occur for Hadoop-provided S3 file systems. In particular, all Hadoop versions up to 2.7.2 running `NativeS3FileSystem` (which depend on `JetS3t 0.9.0` instead of a version [>= 0.9.4](http://www.jets3t.org/RELEASE_NOTES.html)) are affected but users also reported this happening with the `S3AFileSystem`.
+Shaded Hadoop/Presto S3文件系统不应出现此问题,但Hadoop提供的S3文件系统可能会遇到。特别是,所有不高于2.7.2的Hadoop版本在运行 `NativeS3FileSystem`(其依赖 `JetS3t 0.9.0`,而不是 [>= 0.9.4](http://www.jets3t.org/RELEASE_NOTES.html) 的版本)时都会受影响;也有用户报告 `S3AFileSystem` 出现过该异常。

-Except for changing the bucket region, you may also be able to solve this by [requesting signature version 4 for request authentication](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingAWSSDK.html#specify-signature-version), e.g. by adding this to Flink’s JVM options in `flink-conf.yaml` (see [configuration](../config.html#common-options)):
+除了更改bucket所在区域之外,也可以通过[为请求认证启用签名版本4](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingAWSSDK.html#specify-signature-version)来解决,例如在 `flink-conf.yaml` 中为Flink的JVM选项添加如下配置(参阅 [配置](../config.html#common-options)):

@@ -427,7 +429,7 @@ env.java.opts: -Dcom.amazonaws.services.s3.enableV4

### NullPointerException at org.apache.hadoop.fs.LocalDirAllocator

-This Exception is usually caused by skipping the local buffer directory configuration `fs.s3a.buffer.dir` for the `S3AFileSystem`. Please refer to the [S3AFileSystem configuration](#s3afilesystem-recommended) section to see how to configure the `S3AFileSystem` properly.
+此异常通常是由于没有为 `S3AFileSystem` 配置本地缓冲目录 `fs.s3a.buffer.dir` 引起的。请参阅 [S3AFileSystem 配置](#s3afilesystem-recommended) 章节,了解如何正确配置 `S3AFileSystem`。
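作为参考,下面是一个示意性的 `core-site.xml` 片段,演示如何为 `S3AFileSystem` 指定本地缓冲目录,其中的目录路径仅为假设值:

```xml
<property>
  <name>fs.s3a.buffer.dir</name>
  <!-- S3A 上传前使用的本地临时缓冲目录,路径仅为示意 -->
  <value>/tmp/s3a</value>
</property>
```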