Please be aware that this tutorial needs the following privileges for the user in AMI:
Please be aware that this tutorial needs the following privileges for the user in IAM:
- AmazonEC2FullAccess
- AmazonS3FullAccess
...
...
@@ -27,14 +27,6 @@ Please be aware that this tutorial needs the following privileges for the user i
- AWSKeyManagementServicePowerUser
By the time we write this tutorial, we noticed that Chinese AWS users
might suffer from authentication problems when running this tutorial.
Our solution is that we create a VM instance with the default Amazon
AMI and in the same zone as our cluster runs, so we can SSH to this VM
instance as a tunneling server and control our cluster and jobs from
it.
## PaddlePaddle on AWS
Here we will show you step by step on how to run PaddlePaddle training on AWS cluster.
...
...
@@ -59,7 +51,7 @@ gpg2 --fingerprint FC8A365E
```
The correct key fingerprint is `18AD 5014 C99E F7E3 BA5F 6CE9 50BD D3E0 FC8A 365E`
Go to the [releases](https://github.com/coreos/kube-aws/releases) and download the latest release tarball and detached signature (.sig) for your architecture.
Go to the [releases](https://github.com/coreos/kube-aws/releases) and download release tarball (this tutorial is using v0.9.1) and detached signature (.sig) for your architecture.
Make the kubectl binary executable and move it to your PATH (e.g. `/usr/local/bin`):
```
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl
```
### Configure AWS Credentials
...
...
@@ -109,17 +109,18 @@ aws configure
```
Fill in the required fields (You can get your AWS aceess key id and AWS secrete access key by following [this](http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html) instruction):
Fill in the required fields:
```
AWS Access Key ID: YOUR_ACCESS_KEY_ID
AWS Secrete Access Key: YOUR_SECRETE_ACCESS_KEY
Default region name: us-west-2
Default region name: us-west-1
Default output format: json
```
`YOUR_ACCESS_KEY_ID`, and `YOUR_SECRETE_ACCESS_KEY` is the IAM key and secret from [Create AWS Account and IAM Account](#create-aws-account-and-iam-account)
Verify that your credentials work by describing any instances you may already have running on your account:
```
...
...
@@ -134,7 +135,9 @@ The keypair that will authenticate SSH access to your EC2 instances. The public
Follow [EC2 Keypair docs](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html) to create a EC2 key pair
After creating a key pair, you will use the name you gave the keys to configure the cluster. Key pairs are only available to EC2 instances in the same region.
After creating a key pair, you will use the key pair name to configure the cluster.
Key pairs are only available to EC2 instances in the same region. We are using us-west-1 in our tutorial, so make sure to creat key pairs in that region (N. California).
#### KMS key
...
...
@@ -143,12 +146,12 @@ Amazon KMS keys are used to encrypt and decrypt cluster TLS assets. If you alrea
You can create a KMS key in the AWS console, or with the aws command line tool:
`AWS_ACCOUNT_ID`: You can get it from following command line:
```
aws sts get-caller-identity --output text --query Account
```
`MY_CLUSTER_NAME`: Pick a MY_CLUSTER_NAME that you like, you will use it later as well.
#### External DNS name
When the cluster is created, the controller will expose the TLS-secured API on a public IP address. You will need to create an A record for the external DNS hostname you want to point to this IP address. You can find the API external IP address after the cluster is created by invoking kube-aws status.
When the cluster is created, the controller will expose the TLS-secured API on a DNS name.
The A record of that DNS name needs to be point to the cluster ip address.
We will need to use DNS name later in tutorial. If you don't already own one, you can choose any DNS name (e.g., `paddle`) and modify `/etc/hosts` to associate cluster ip with that DNS name.
#### S3 bucket
You need to create an S3 bucket before startup the Kubernetes cluster.
command (need to have a global unique name):
There are some bug in aws cli in creating S3 bucket, so let's use [web console](https://console.aws.amazon.com/s3/home?region=us-west-1).
Here `us-west-1c` is used for parameter `--availability-zone`, but supported availability zone varies among AWS accounts.
`MY_CLUSTER_NAME`: the one you picked in [KMS key](#kms-key)
`MY_EXTERNAL_DNS_NAME`: see [External DNS name](#external-dns-name)
Please check if `us-west-1c` is supported by `aws ec2 --region us-west-1 describe-availability-zones`, if not switch to other supported availability zone. (e.g., `us-west-1a`, or `us-west-1b`)
`KEY_PAIR_NAME`: see [EC2 key pair](#ec2-key-pair)
`--kms-key-arn`: the "Arn" in [KMS key](#kms-key)
Here `us-west-1a` is used for parameter `--availability-zone`, but supported availability zone varies among AWS accounts.
Please check if `us-west-1a` is supported by `aws ec2 --region us-west-1 describe-availability-zones`, if not switch to other supported availability zone. (e.g., `us-west-1a`, or `us-west-1b`)
Note: please don't use `us-west-1c`. Subnets can currently only be created in the following availability zones: us-west-1b, us-west-1a.
There will now be a cluster.yaml file in the asset directory. This is the main configuration file for your cluster.
#### Render contents of the asset directory
In the simplest case, you can have kube-aws generate both your TLS identities and certificate authority for you.
```
$ kube-aws render credentials --generate-ca
kube-aws render credentials --generate-ca
```
The next command generates the default set of cluster assets in your asset directory.
```
sh $ kube-aws render stack
kube-aws render stack
```
Here's what the directory structure looks like:
...
...
@@ -292,15 +314,41 @@ These assets (templates and credentials) are used to create, update and interact
#### Create the instances defined in the CloudFormation template
Now for the exciting part, creating your cluster (choose any `<prefix>`):
Now let's create your cluster (choose any PREFIX for the command below):
```
$ kube-aws up --s3-uri s3://<your-bucket-name>/<prefix>
kube-aws up --s3-uri s3://BUCKET_NAME/PREFIX
```
`BUCKET_NAME`: the bucket name that you used in [S3 bucket](#s3-bucket)
#### Configure DNS
You can invoke `kube-aws status` to get the cluster API endpoint after cluster creation, if necessary. This command can take a while. And use command `dig` to check the load balancer hostname to get the ip address, use this ip to setup an A record for your external dns name.
You can invoke `kube-aws status` to get the cluster API endpoint after cluster creation.
```
$ kube-aws status
Cluster Name: paddle-cluster
Controller DNS Name: paddle-cl-ElbAPISe-EEOI3EZPR86C-531251350.us-west-1.elb.amazonaws.com
```
Use command `dig` to check the load balancer hostname to get the ip address.