# 谷歌计算引擎设置 # Google Compute Engine Setup 本文提供如何在[谷歌计算引擎](https://cloud.google.com/compute/)集群上全面自动的基于Hadoop 1 或者Hadoop 2设置Flink,谷歌的 [bdutil](https://cloud.google.com/hadoop/bdutil) 使启动和部署基于Hadoop的flink集群成为可能。请按照以下步骤操作。 This documentation provides instructions on how to setup Flink fully automatically with Hadoop 1 or Hadoop 2 on top of a [Google Compute Engine](https://cloud.google.com/compute/) cluster. This is made possible by Google’s [bdutil](https://cloud.google.com/hadoop/bdutil) which starts a cluster and deploys Flink with Hadoop. To get started, just follow the steps below. # 先决条件 # Prerequisites ## 安装谷歌云套件 ## Install Google Cloud SDK 请按照指示如何安装[谷歌云套件](https://cloud.google.com/sdk/)。特别是,请务必使用以下命令通过谷歌云进行身份验证: ``` gcloud auth login ``` Please follow the instructions on how to setup the [Google Cloud SDK](https://cloud.google.com/sdk/). In particular, make sure to authenticate with Google Cloud using the following command: ``` gcloud auth login ``` ## 安装bdutil ## Install bdutil 目前还没有包含Flink扩展的bdutil发布。 但是,您可以从[GitHub](https://github.com/GoogleCloudPlatform/bdutil)获得Flink支持的最新版本的bdutil: ``` git clone https://github.com/GoogleCloudPlatform/bdutil.git ``` 下载源代码后,切换到新创建的`bdutil`目录并继续执行后续步骤。 At the moment, there is no bdutil release yet which includes the Flink extension. However, you can get the latest version of bdutil with Flink support from [GitHub](https://github.com/GoogleCloudPlatform/bdutil): ``` git clone https://github.com/GoogleCloudPlatform/bdutil.git ``` After you have downloaded the source, change into the newly created `bdutil` directory and continue with the next steps. # 在Google Compute Engine上部署Flink # Deploying Flink on Google Compute Engine ## 设置一个桶 ## Set up a bucket 如果尚未执行此操作,请为bdutil配置和暂存文件创建存储桶。 可以使用gsutil创建一个新存储桶: ``` gsutil mb gs:// ``` If you have not done so, create a bucket for the bdutil config and staging files. A new bucket can be created with gsutil: ``` gsutil mb gs:// ``` ## 调整bdutil配置 ## Adapt the bdutil config 要使用bdutil部署Flink,请至少调整bdutil_env.sh中的以下变量: ``` CONFIGBUCKET="" PROJECT="" NUM_WORKERS= # set this to 'n1-standard-2' if you're using the free trial GCE_MACHINE_TYPE="" # for example: "europe-west1-d" GCE_ZONE="" ``` To deploy Flink with bdutil, adapt at least the following variables in bdutil_env.sh. ``` CONFIGBUCKET="" PROJECT="" NUM_WORKERS= # set this to 'n1-standard-2' if you're using the free trial GCE_MACHINE_TYPE="" # for example: "europe-west1-d" GCE_ZONE="" ``` ## 调整Flink配置 ## Adapt the Flink config bdutil的Flink扩展为您处理配置。 您还可以在`extensions / flink / flink_env.sh`中调整配置变量。 如果您想进一步配置,请查看[配置Flink](../ config.html)。 使用`bin / stop-cluster`和`bin / start-cluster`更改配置后,必须重新启动Flink。 bdutil’s Flink extension handles the configuration for you. You may additionally adjust configuration variables in `extensions/flink/flink_env.sh`. If you want to make further configuration, please take a look at [configuring Flink](../config.html). You will have to restart Flink after changing its configuration using `bin/stop-cluster` and `bin/start-cluster`. ## 使用Flink创建一个集群 ## Bring up a cluster with Flink 要在Google Compute Engine上调出Flink群集,请执行: ``` ./bdutil -e extensions/flink/flink_env.sh deploy ``` To bring up the Flink cluster on Google Compute Engine, execute: ``` ./bdutil -e extensions/flink/flink_env.sh deploy ``` ## 运行一个Flink示例作业: ``` ./bdutil shell cd /home/hadoop/flink-install/bin ./flink run ../examples/batch/WordCount.jar gs://dataflow-samples/shakespeare/othello.txt gs:///output ``` ## Run a Flink example job: ``` ./bdutil shell cd /home/hadoop/flink-install/bin ./flink run ../examples/batch/WordCount.jar gs://dataflow-samples/shakespeare/othello.txt gs:///output ``` ## 关闭你的集群 ## Shut down your cluster 关闭集群就像执行下面一样简单 ``` ./bdutil -e extensions/flink/flink_env.sh delete ``` Shutting down a cluster is as simple as executing ``` ./bdutil -e extensions/flink/flink_env.sh delete ```