Commit ddb205c9, authored by Sarah Maddox, committed by Kubernetes Prow Robot

kfctl updates for v0.7 - GCP docs only (#1234)

* WIP v0.7 updates to kfctl init

* More updates to deployment; updated customization; added interim URIs for config files.

* Updated section on accessing UI in CLI deployment.

* Updated kfctl commands in remaining GCP deployment docs.

* Updated the guide to securing clusters for kfctl v0.7.

* Updated the troubleshooting guide for kfctl v0.7.

* Fixed list formatting throughout the guide to securing clusters.

* Addressed review comments and set KF latest version.

* WIP updating for change in config file name.

* Further clarifications to the deployment and deletion guides.

* Updated the customization guide for change in config name and delete.

* Updated for new config file name in the guide to monitoring IAP.

* Changed KFAPP variable in auth guide.

* Updated Filestore guide.

* A few tweaks.

* Updated cluster security guide.

* Updated the troubleshooting guide.

* Standardised CONFIG_FILE to contain full path.

* Updated guide to custom domains.

* Added GPU info and clarified KF-NAME/KF_DIR.

* Updated to latest RC and config URLs.
Parent 485945a8
......@@ -9,7 +9,7 @@ weight = 4
When you [set up Kubeflow for GCP](/docs/gke/deploy), it will automatically
[provision three service accounts](https://www.kubeflow.org/docs/gke/deploy/deploy-cli/#gcp-service-accounts)
with different privileges in the `kubeflow` namespace. In particular, the `${KFAPP}-user` service account is
with different privileges in the `kubeflow` namespace. In particular, the `${KF_NAME}-user` service account is
meant to grant your user services access to GCP. The credentials to this service account can be accessed within
the cluster as a [Kubernetes secret](https://kubernetes.io/docs/concepts/configuration/secret/) called `user-gcp-sa`.
......@@ -104,7 +104,7 @@ so be careful which Pods you grant access to.
1. **Set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable** to point to the service account.
GCP libraries will use this environment variable to find the service account and authenticate with GCP.
The following YAML describes a Pod that has access to the `${KFAPP}-user` service account:
The following YAML describes a Pod that has access to the `${KF_NAME}-user` service account:
```
apiVersion: v1
kind: Pod
......
......@@ -17,11 +17,54 @@ Cloud Filestore is very useful for creating a shared filesystem that can be moun
This guide assumes you have already set up Kubeflow on GCP. If you haven't done
so, follow the guide to [deploying Kubeflow on GCP](/docs/gke/deploy/).
The instructions below assume that the `${KFAPP}` environment variable contains
the *name* (not the full path) of the directory containing your Kubeflow
configurations. See the
[Kubeflow deployment guide](/docs/gke/deploy/deploy-cli/) for details of this
directory.
This guide assumes the following settings:
* The `${KF_DIR}` environment variable contains the path to
your Kubeflow application directory, which holds your Kubeflow configuration
files. For example, `/opt/my-kubeflow/`.
```
export KF_DIR=<path to your Kubeflow application directory>
```
* The `${CONFIG_FILE}` environment variable contains the path to your
Kubeflow configuration file.
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_iap.yaml
```
Or:
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_basic_auth.yaml
```
* The `${KF_NAME}` environment variable contains the name of your Kubeflow
deployment. You can find the name in your `${CONFIG_FILE}`
configuration file, as the value for the `metadata.name` key.
```
export KF_NAME=<the name of your Kubeflow deployment>
```
* The `${PROJECT}` environment variable contains the ID of your GCP project.
You can find the project ID in
your `${CONFIG_FILE}` configuration file, as the value for the `project` key.
```
export PROJECT=<your GCP project ID>
```
* The `${ZONE}` environment variable contains the GCP zone where your
Kubeflow resources are deployed.
```
export ZONE=<your GCP zone>
```
* For further background about the above settings, see the guide to
[deploying Kubeflow with the CLI](/docs/gke/deploy/deploy-cli).
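
Because the commands in this guide rely on all of the settings above, it can help to confirm that each variable is set before you continue. The following is a minimal sketch, not part of kfctl: the function name `check_kf_env` is a hypothetical helper introduced here for illustration, and the list of variable names is taken from the settings described above.

```bash
# Hypothetical helper (not part of kfctl): report every required
# variable that is unset or empty. Returns non-zero if any is missing.
check_kf_env() {
  missing=0
  for name in KF_DIR CONFIG_FILE KF_NAME PROJECT ZONE; do
    # Indirect variable lookup that works in POSIX sh as well as bash.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

You could then run `check_kf_env && cd ${KF_DIR}` so that later commands only run when every setting is present.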
## Create a Cloud Filestore instance
......@@ -31,7 +74,7 @@ use you can skip this section.
Copy the Cloud Filestore deployment manager configs to the `gcp_config` directory:
```
cd /<path-to-kubeflow-deployment>/${KFAPP}
cd ${KF_DIR}
cp .cache/${VERSION}/deployment/gke/deployment_manager_configs/gcfs.yaml \
./gcp_config/
```
......@@ -49,9 +92,9 @@ Edit `gcfs.yaml` to match your desired configuration:
Using [yq](https://github.com/kislyuk/yq):
```
cd /<path-to-kubeflow-deployment>/${KFAPP}
cd ${KF_DIR}
. env.sh
yq -r ".resources[0].properties.instanceId=\"${DEPLOYMENT_NAME}\"" gcp_config/gcfs.yaml > gcp_config/gcfs.yaml.new
yq -r ".resources[0].properties.instanceId=\"${KF_NAME}\"" gcp_config/gcfs.yaml > gcp_config/gcfs.yaml.new
mv gcp_config/gcfs.yaml.new gcp_config/gcfs.yaml
```
......@@ -63,8 +106,8 @@ Apply the changes:
-->
```
cd /<path-to-kubeflow-deployment>/${KFAPP}/gcp_config
gcloud --project=${PROJECT} deployment-manager deployments create ${KFAPP-NAME}-nfs --config=gcfs.yaml
cd ${KF_DIR}/gcp_config
gcloud --project=${PROJECT} deployment-manager deployments create ${KF_NAME}-nfs --config=gcfs.yaml
```
If you get an error **legacy networks are not supported** follow the instructions
......
......@@ -15,19 +15,19 @@ so, follow the guide to
## Using your own domain
If you want to use your own domain instead of **${name}.endpoints.${project}.cloud.goog**, follow these instructions:
If you want to use your own domain instead of **${KF_NAME}.endpoints.${PROJECT}.cloud.goog**, follow these instructions:
1. Remove the `cloud-endpoints` component:
```
cd ${KFAPP}/kustomize
cd ${KF_DIR}/kustomize
kubectl delete -f cloud-endpoints.yaml
```
1. Set the domain for your ingress to be the fully qualified domain name:
```
cd ${KFAPP}/kustomize
cd ${KF_DIR}/kustomize
gvim iap-ingress.yaml # Or basic-auth-ingress.yaml
```
......@@ -47,7 +47,7 @@ If you want to use your own domain instead of **${name}.endpoints.${project}.clo
1. Get the address of the static IP address created:
```
IPNAME=${DEPLOYMENT_NAME}-ip
IPNAME=${KF_NAME}-ip
gcloud --project=${PROJECT} compute addresses describe --global ${IPNAME}
```
......
......@@ -7,56 +7,109 @@ weight = 2
This guide describes how to customize your deployment of Kubeflow on Google
Kubernetes Engine (GKE) in Google Cloud Platform (GCP).
## Customizing Kubeflow before deployment
The Kubeflow deployment process is divided into two steps, **build** and
**apply**, so that you can modify your configuration before deploying your
Kubeflow cluster.
Follow the guide to [deploying Kubeflow on GCP](/docs/gke/deploy/deploy-cli/).
When you reach the
[setup and deploy step](/docs/gke/deploy/deploy-cli/#set-up-and-deploy),
**skip the `kfctl apply` command** and run the **`kfctl build`** command
instead, as described in that step. Now you can edit the configuration files
before deploying Kubeflow.
## Customizing an existing deployment
You can also customize an existing Kubeflow deployment. In that case, this
guide assumes that you have already followed the guide to
[deploying Kubeflow on GCP](/docs/gke/deploy/deploy-cli/) and have deployed
Kubeflow to a GKE cluster.
## Before you start
This guide assumes you have already set up Kubeflow with GKE. If you haven't done
so, follow the guide to [deploying Kubeflow on GCP](/docs/gke/deploy/).
This guide assumes the following settings:
## Customizing Kubeflow
* The `${KF_DIR}` environment variable contains the path to
your Kubeflow application directory, which holds your Kubeflow configuration
files. For example, `/opt/my-kubeflow/`.
You can use [kustomize](https://kustomize.io/) to customize Kubeflow.
```
export KF_DIR=<path to your Kubeflow application directory>
```
* The `${CONFIG_FILE}` environment variable contains the path to your
Kubeflow configuration file.
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_iap.yaml
```
Or:
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_basic_auth.yaml
```
* The `${KF_NAME}` environment variable contains the name of your Kubeflow
deployment. You can find the name in your
`${CONFIG_FILE}` configuration file, as the value for the `metadata.name` key.
```
export KF_NAME=<the name of your Kubeflow deployment>
```
* The `${PROJECT}` environment variable contains the ID of your GCP project.
You can find the project ID in your
`${CONFIG_FILE}` configuration file, as the value for the `project` key.
```
export PROJECT=<your GCP project ID>
```
The deployment process is divided into two steps, **generate** and **apply**, so that you can
modify your deployment before actually deploying.
* For further background about the above settings, see the guide to
[deploying Kubeflow with the CLI](/docs/gke/deploy/deploy-cli).
To customize GCP resources (such as your Kubernetes Engine cluster), you can modify the deployment manager configs in **${KFAPP}/gcp_config**.
## Customizing GCP resources
Many changes can be applied to an existing configuration in which case you can run:
To customize GCP resources, such as your Kubernetes Engine cluster, you can
modify the Deployment Manager configuration settings in `${KF_DIR}/gcp_config`.
After modifying your existing configuration, run the following command to apply
the changes:
```
cd ${KFAPP}
kfctl apply platform
cd ${KF_DIR}
kfctl apply -V -f ${CONFIG_FILE}
```
or using Deployment Manager directly:
Alternatively, you can use Deployment Manager directly:
```
cd ${KFAPP}/gcp_config
gcloud deployment-manager --project=${PROJECT} deployments update ${DEPLOYMENT_NAME} --config=cluster-kubeflow.yaml
cd ${KF_DIR}/gcp_config
gcloud deployment-manager --project=${PROJECT} deployments update ${KF_NAME} --config=cluster-kubeflow.yaml
```
* **PROJECT** Name of your GCP project. You could find it in `${KFAPP}/app.yaml`.
* **DEPLOYMENT_NAME** Name of your Kubeflow app. You could also find it in `${KFAPP}/app.yaml`.
In specific, `.metadata.name`
Some changes (such as the VM service account for Kubernetes Engine) can only be set at creation time; in this case you need
to tear down your deployment before recreating it:
```
cd ${KFAPP}
kfctl delete all
kfctl apply all
cd ${KF_DIR}
kfctl delete -f ${CONFIG_FILE}
kfctl apply -V -f ${CONFIG_FILE}
```
To customize the Kubeflow resources running within the cluster you can modify the kustomize manifests in **${KFAPP}/kustomize**.
For example, to modify settings for the Jupyter web app:
## Customizing Kubernetes resources
```
cd ${KFAPP}/kustomize
gvim jupyter-web-app.yaml
```
You can use [kustomize](https://kustomize.io/) to customize Kubeflow.
To customize the Kubernetes resources running within the cluster, you can modify
the kustomize manifests in `${KF_DIR}/kustomize`.
Find and replace the parameter values:
For example, to modify settings for the Jupyter web app:
1. Open `${KF_DIR}/kustomize/jupyter-web-app.yaml` in a text editor.
1. Find and replace the parameter values:
```
apiVersion: v1
data:
......@@ -74,30 +127,61 @@ metadata:
namespace: kubeflow
```
You can then redeploy using `kfctl`:
1. Redeploy Kubeflow using kfctl:
```
cd ${KFAPP}
kfctl apply k8s
```
```
cd ${KF_DIR}
kfctl apply -V -f ${CONFIG_FILE}
```
Or use kubectl directly:
```
cd ${KF_DIR}/kustomize
kubectl apply -f jupyter-web-app.yaml
```
## Common customizations
<a id="gpu-config"></a>
### Add GPU nodes to your cluster
To add GPU accelerators to your Kubeflow cluster, you have the following
options:
* Pick a GCP zone that provides NVIDIA Tesla K80 Accelerators
(`nvidia-tesla-k80`).
* Or disable node-autoprovisioning in your Kubeflow cluster.
* Or change your node-autoprovisioning configuration.
To see which accelerators are available in each zone, run the following
command:
or using kubectl directly:
```
cd ${KFAPP}/kustomize
kubectl apply -f jupyter-web-app.yaml
gcloud compute accelerator-types list
```
## Common customizations
To disable node-autoprovisioning, run `kfctl build` as described above.
Then edit `${KF_DIR}/gcp_config/cluster-kubeflow.yaml` and set
[`enabled`](https://github.com/kubeflow/manifests/blob/4d2939d6c1a5fd862610382fde130cad33bfef75/gcp/deployment_manager_configs/cluster-kubeflow.yaml#L73)
to `false`:
Add GPU nodes to your cluster:
```
...
gpu-type: nvidia-tesla-k80
autoprovisioning-config:
enabled: false
...
```
* Set gpu-pool-initialNodeCount [here](https://github.com/kubeflow/kubeflow/blob/{{< params "githubbranch" >}}/deployment/gke/deployment_manager_configs/cluster-kubeflow.yaml#L56).
You must also set
[`gpu-pool-initialNodeCount`](https://github.com/kubeflow/manifests/blob/4d2939d6c1a5fd862610382fde130cad33bfef75/gcp/deployment_manager_configs/cluster-kubeflow.yaml#L58).
Add Cloud TPUs to your cluster:
### Add Cloud TPUs to your cluster
* Set `enable_tpu:true` [here](https://github.com/kubeflow/kubeflow/blob/{{< params "githubbranch" >}}/deployment/gke/deployment_manager_configs/cluster-kubeflow.yaml#L78).
Set [`enable_tpu:true`](https://github.com/kubeflow/manifests/blob/4d2939d6c1a5fd862610382fde130cad33bfef75/gcp/deployment_manager_configs/cluster-kubeflow.yaml#L80)
in `${KF_DIR}/gcp_config/cluster-kubeflow.yaml`.
Add VMs with more CPUs or RAM:
### Add VMs with more CPUs or RAM
* Change the machineType.
* There are two node pools:
......@@ -105,7 +189,7 @@ Add VMs with more CPUs or RAM:
* one for GPU machines [here](https://github.com/kubeflow/kubeflow/blob/{{< params "githubbranch" >}}/scripts/gke/deployment_manager_configs/cluster.jinja#L149).
* When making changes to the node pools you also need to bump the pool-version [here](https://github.com/kubeflow/kubeflow/blob/{{< params "githubbranch" >}}/scripts/gke/deployment_manager_configs/cluster-kubeflow.yaml#L37) before you update the deployment.
Add users to Kubeflow:
### Add users to Kubeflow
* To grant users access to Kubeflow, add the “IAP-secured Web App User” role on the [IAM page in the GCP console](https://console.cloud.google.com/iam-admin/iam). Make sure you are in the same project as your Kubeflow deployment.
......
......@@ -7,26 +7,47 @@ weight = 6
This page shows you how to use the CLI to delete a Kubeflow deployment on
Google Cloud Platform (GCP).
## Before you start
This guide assumes the following settings:
* The `${KF_DIR}` environment variable contains the path to
your Kubeflow application directory, which holds your Kubeflow configuration
files. For example, `/opt/my-kubeflow/`.
```
export KF_DIR=<path to your Kubeflow application directory>
```
* The `${CONFIG_FILE}` environment variable contains the path to your
Kubeflow configuration file.
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_iap.yaml
```
Or:
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_basic_auth.yaml
```
For further background about the above settings, see the guide to
[deploying Kubeflow with the CLI](/docs/gke/deploy/deploy-cli).
## Deleting your deployment
Run the following commands to delete your deployment and reclaim all GCP
resources:
```
cd ${KFAPP}
# If you want to delete all the resources, including storage:
kfctl delete all --delete_storage
kfctl delete -f ${CONFIG_FILE} --delete_storage
# If you want to preserve storage, which contains metadata and information
# from Kubeflow Pipelines:
kfctl delete all
kfctl delete -f ${CONFIG_FILE}
```
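
Since the only difference between the two delete commands is the `--delete_storage` flag, one way to avoid deleting storage by accident is to print the command for review before running it. The sketch below assumes a hypothetical helper named `kf_delete_cmd`, which is not part of kfctl. It only echoes the command and never runs it.

```bash
# Hypothetical helper: print (do not run) the kfctl delete command.
#   $1 = path to your configuration file (the value of CONFIG_FILE)
#   $2 = "yes" to also delete storage; anything else preserves storage
kf_delete_cmd() {
  config_file=$1
  delete_storage=${2:-no}
  if [ "$delete_storage" = "yes" ]; then
    echo "kfctl delete -f ${config_file} --delete_storage"
  else
    echo "kfctl delete -f ${config_file}"
  fi
}
```

For example, `kf_delete_cmd "${CONFIG_FILE}"` prints the storage-preserving form, and `kf_delete_cmd "${CONFIG_FILE}" yes` prints the form that also removes storage.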
The environment variable `${KFAPP}` must contain the _name_ of the directory
that contains your Kubeflow configurations. This directory was created when you
deployed Kubeflow.
* The name of the directory is the same as the name of your Kubeflow deployment.
If you deployed Kubeflow [using the UI](/docs/gke/deploy/deploy-ui/), the
value of `${KFAPP}` is the value of the **Deployment name** field on the UI.
* If you deployed Kubeflow [using the CLI](/docs/gke/deploy/deploy-cli/), use
the same value as you used when you ran `kfctl init`.
You should consider preserving storage if you may want to relaunch
Kubeflow in the future and restore the data from your
......
......@@ -32,9 +32,21 @@ Before installing Kubeflow on the command line:
access to sensitive data. Alternatively, you can use basic authentication
with a username and password.
## Deploy Kubeflow
<a id="prepare-environment"></a>
## Prepare your environment
Follow these steps to deploy Kubeflow:
Follow these steps to download the kfctl binary for the Kubeflow CLI and set
some handy environment variables:
1. Download the kfctl {{% kf-latest-version %}} release from the
[Kubeflow releases
page](https://github.com/kubeflow/kubeflow/releases/tag/{{% kf-latest-version %}}).
1. Unpack the tar ball:
```
tar -xvf kfctl_{{% kf-latest-version %}}_<platform>.tar.gz
```
1. Create user credentials. You only need to run this command once:
......@@ -42,94 +54,152 @@ Follow these steps to deploy Kubeflow:
gcloud auth application-default login
```
1. Create environment variables for your access control services:
1. Create environment variables to make the deployment process easier:
```
# Set your GCP project ID and the zone where you want to create
# the Kubeflow deployment:
export PROJECT=<your GCP project ID>
gcloud config set project ${PROJECT}
export ZONE=<your GCP zone>
gcloud config set compute/zone ${ZONE}
```bash
# If using Cloud IAP, create environment variables from the
# OAuth client ID and secret that you obtained earlier:
# Use the following kfctl configuration file for authentication with
# Cloud IAP (recommended):
export CONFIG_URI="{{% config-uri-gcp-iap %}}"
# If using Cloud IAP for authentication, create environment variables
# from the OAuth client ID and secret that you obtained earlier:
export CLIENT_ID=<CLIENT_ID from OAuth page>
export CLIENT_SECRET=<CLIENT_SECRET from OAuth page>
# If using basic authentication, create environment variables for
# username and password:
# Alternatively, use the following kfctl configuration if you want to use
# basic authentication:
export CONFIG_URI="{{% config-uri-gcp-basic-auth %}}"
# If using basic authentication, create environment variables
# for username and password:
export KUBEFLOW_USERNAME=<your username>
export KUBEFLOW_PASSWORD=<your password>
# Set KF_NAME to the name of your Kubeflow deployment. You also use this
# value as directory name when creating your configuration directory.
# See the detailed description in the text below this code snippet.
# For example, your deployment name can be 'my-kubeflow' or 'kf-test'.
export KF_NAME=<your choice of name for the Kubeflow deployment>
# Set the path to the base directory where you want to store one or more
# Kubeflow deployments. For example, /opt/.
# Then set the Kubeflow application directory for this deployment.
export BASE_DIR=<path to a base directory>
export KF_DIR=${BASE_DIR}/${KF_NAME}
# The following command is optional. It adds the kfctl binary to your path.
# If you don't add kfctl to your path, you must use the full path
# each time you run kfctl.
export PATH=$PATH:<path to your kfctl file>
```
1. Download a `kfctl` release from the
[Kubeflow releases page](https://github.com/kubeflow/kubeflow/releases/).
Notes:
1. Unpack the tar ball:
* **${PROJECT}** - The project ID of the GCP project where you want Kubeflow
deployed.
* **${ZONE}** - The GCP zone where you want to create the Kubeflow deployment.
You can see a list of zones in the
[Compute Engine documentation](https://cloud.google.com/compute/docs/regions-zones/#available).
If you plan to use accelerators, you must choose a zone that supports the
type you want. See the guide to
[customizing your Kubeflow deployment](/docs/gke/customizing-gke/#gpu-config).
* **${CONFIG_URI}** - The GitHub address of the configuration YAML file that
you want to use to deploy Kubeflow. For GCP deployments, the following
configurations are available:
* `{{% config-uri-gcp-iap %}}`
* `{{% config-uri-gcp-basic-auth %}}`
When you run `kfctl apply` or `kfctl build` (see the next step), kfctl creates
a local version of the configuration YAML file which you can further
customize if necessary.
* **${KF_NAME}** - The name of your Kubeflow deployment.
If you want a custom deployment name, specify that name here.
For example, `my-kubeflow` or `kf-test`.
The value of KF_NAME must consist of lower case alphanumeric characters or
'-', and must start and end with an alphanumeric character.
The value of this variable cannot be greater than 25 characters. It must
contain just a name, not a directory path.
You also use this value as directory name when creating the directory where
your Kubeflow configurations are stored, that is, the Kubeflow application
directory.
* **${KF_DIR}** - The full path to your Kubeflow application directory.
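
The naming rules for `KF_NAME` listed above (lower case alphanumeric characters or '-', alphanumeric at both ends, no more than 25 characters, no directory path) can be checked before you deploy. The following is a sketch under those stated rules only. The function name `is_valid_kf_name` is hypothetical and is not something kfctl provides.

```bash
# Hypothetical helper: check a candidate KF_NAME against the rules above.
# Returns 0 if the name is acceptable, 1 otherwise.
is_valid_kf_name() {
  name=$1
  # Must be non-empty and no longer than 25 characters.
  [ -n "$name" ] && [ "${#name}" -le 25 ] || return 1
  case $name in
    # Reject anything other than lower case letters, digits, or '-'.
    # The characters are listed explicitly so that the check does not
    # depend on the locale's handling of ranges such as a-z.
    *[!abcdefghijklmnopqrstuvwxyz0123456789-]*) return 1 ;;
    # Must start and end with an alphanumeric character.
    -*|*-) return 1 ;;
  esac
  return 0
}
```

A possible use is `is_valid_kf_name "${KF_NAME}" || echo "Invalid KF_NAME"`. A value such as `/opt/my-kubeflow` is rejected because `/` is not an allowed character, which matches the requirement that the variable holds a name and not a path.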
<a id="set-up-and-deploy"></a>
## Set up and deploy Kubeflow
To set up and deploy Kubeflow using the **default settings**,
run the `kfctl apply` command:
```
mkdir -p ${KF_DIR}
cd ${KF_DIR}
kfctl apply -V -f ${CONFIG_URI}
```
## Alternatively, set up your configuration for later deployment
If you want to customize your configuration before deploying Kubeflow, you can
set up your configuration files first, then edit the configuration, then
deploy Kubeflow:
1. Run the `kfctl build` command to set up your configuration:
```
tar -xvf kfctl_<release tag>_<platform>.tar.gz
mkdir -p ${KF_DIR}
cd ${KF_DIR}
kfctl build -V -f ${CONFIG_URI}
```
1. Run the following commands to set up and deploy Kubeflow. The code below
includes an optional command to add the binary `kfctl` to your path. If you
don't add the binary to your path, you must use the full path to the `kfctl`
binary each time you run it.
1. Edit the configuration files, as described in the guide to
[customizing your Kubeflow deployment](/docs/gke/customizing-gke/).
```bash
# The following command is optional, to make kfctl binary easier to use.
export PATH=$PATH:<path to your kfctl file>
1. Set an environment variable for your local configuration file:
# Set KFAPP to the name of your Kubeflow application. See detailed
# description in the text below this code snippet.
# For example, 'kubeflow-test' or 'kfw-test'.
export KFAPP=<your choice of application directory name>
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_iap.yaml
```
export ZONE=<your target GCP zone> # where the deployment will be created
export PROJECT=<your GCP project ID>
Or:
# Run the following commands for the default installation which uses Cloud IAP:
export CONFIG="{{% config-uri-gcp-iap %}}"
kfctl init ${KFAPP} --project=${PROJECT} --config=${CONFIG} -V
# Alternatively, run these commands if you want to use basic authentication:
export CONFIG="{{% config-uri-gcp-basic-auth %}}"
kfctl init ${KFAPP} --project=${PROJECT} --config=${CONFIG} -V --use_basic_auth
cd ${KFAPP}
kfctl generate all -V --zone ${ZONE}
kfctl apply all -V
```
* **${KFAPP}** - the _name_ of a directory where you want Kubeflow
configurations to be stored. This directory is created when you run
`kfctl init`. If you want a custom deployment name, specify that name here.
The value of this variable becomes the name of your deployment.
The value of KFAPP must consist of lower case alphanumeric characters or
'-', and must start and end with an alphanumeric character.
For example, 'kubeflow-test' or 'kfw-test'.
The value of this variable cannot be greater than 25 characters. It must
contain just the directory name, not the full path to the directory.
The content of this directory is described in the next section.
* **${PROJECT}** - the project ID of the GCP project where you want Kubeflow
deployed.
* **${ZONE}** - You can see a list of zones [here](https://cloud.google.com/compute/docs/regions-zones/#available).
If you plan to use accelerators, make sure to pick a zone that supports the type you want.
* When you run `kfctl init` you need to choose to use either IAP or basic
authentication, as described above.
* `kfctl generate all` attempts to fetch your email address from your
credential. If it can't find a valid email address, you need to pass a
valid email address with flag `--email <your email address>`. This email
address becomes an administrator in the configuration of your Kubeflow
deployment.
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_basic_auth.yaml
```
1. Run the `kfctl apply` command to deploy Kubeflow:
```
kfctl apply -V -f ${CONFIG_FILE}
```
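
The choice between the two local configuration files in the steps above can be sketched as a small shell helper. This is an illustration only: the function name `config_file_for` and the mode labels `iap` and `basic_auth` are assumptions made here, while the two file names come from the steps above.

```bash
# Hypothetical helper: print the path of the local configuration file
# for a given authentication mode.
#   $1 = Kubeflow application directory (the value of KF_DIR)
#   $2 = "iap" or "basic_auth" (labels chosen for this sketch)
config_file_for() {
  kf_dir=${1%/}   # drop one trailing slash, if present
  auth=$2
  case $auth in
    iap)        echo "${kf_dir}/kfctl_gcp_iap.yaml" ;;
    basic_auth) echo "${kf_dir}/kfctl_gcp_basic_auth.yaml" ;;
    *)          echo "Unknown authentication mode: ${auth}" >&2
                return 1 ;;
  esac
}
```

You might then set the variable with `export CONFIG_FILE=$(config_file_for "${KF_DIR}" iap)` before running `kfctl apply -V -f ${CONFIG_FILE}`.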
## Check your deployment
Follow these steps to verify the deployment:
1. The deployment process creates a separate deployment for your data storage.
After running `kfctl apply` you should notice two new [deployments](https://console.cloud.google.com/dm/deployments):
* **{KFAPP}-storage**: This deployment has persistent volumes for your
After running `kfctl apply` you should notice two new
[deployments](https://console.cloud.google.com/dm/deployments):
* **{KF_NAME}-storage**: This deployment has persistent volumes for your
pipelines.
* **{KFAPP}**: This deployment has all the components of Kubeflow, including
* **{KF_NAME}**: This deployment has all the components of Kubeflow, including
a [GKE cluster](https://console.cloud.google.com/kubernetes/list)
named **${KFAPP}** with Kubeflow installed.
named **${KF_NAME}** with Kubeflow installed.
1. When the deployment finishes, check the resources installed in the namespace
`kubeflow` in your new cluster. To do this from the command line, first set
your `kubectl` credentials to point to the new cluster:
```
gcloud container clusters get-credentials ${KFAPP} --zone ${ZONE} --project ${PROJECT}
gcloud container clusters get-credentials ${KF_NAME} --zone ${ZONE} --project ${PROJECT}
```
Then see what's installed in the `kubeflow` namespace of your GKE cluster:
......@@ -138,80 +208,149 @@ Follow these steps to deploy Kubeflow:
kubectl -n kubeflow get all
```
1. Access the Kubeflow central dashboard at the following URI when it becomes
available:
## Access the Kubeflow user interface (UI)
Follow these steps to access the Kubeflow central dashboard:
1. Enter the following URI into your browser address bar. It can take 20
minutes for the URI to become available:
```
https://<KFAPP>.endpoints.<project-id>.cloud.goog/
https://<KF_NAME>.endpoints.<project-id>.cloud.goog/
```
* It can take 20 minutes for the URI to become available.
You can run the following command to get the URI for your deployment:
```
kubectl -n istio-system get ingress
NAME HOSTS ADDRESS PORTS AGE
envoy-ingress your-kubeflow-name.endpoints.your-gcp-project.cloud.goog 34.102.232.34 80 5d13h
```
The following command sets an environment variable named `HOST` to the URI:
```
export HOST=$(kubectl -n istio-system get ingress envoy-ingress -o=jsonpath={.spec.rules[0].host})
```
1. Follow the instructions on the UI to create a namespace. See the guide to
[creation of profiles](/docs/other-guides/multi-user-overview/#automatic-creation-of-profiles).
Notes:
* It can take 20 minutes for the URI to become available.
Kubeflow needs to provision a signed SSL certificate and register a DNS
name.
* If you own/manage the domain or a subdomain with
* If you own or manage the domain or a subdomain with
[Cloud DNS](https://cloud.google.com/dns/docs/)
then you can configure this process to be much faster.
See [kubeflow/kubeflow#731](https://github.com/kubeflow/kubeflow/issues/731).
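
As an illustration of the URI pattern described above, the expected dashboard address can be assembled from the deployment name and project ID. This is a sketch only: `kubeflow_url` is a hypothetical helper name, and the result is the address you would expect from the pattern, not a value read from the cluster.

```bash
# Hypothetical helper: build the expected dashboard URI from the
# deployment name and the GCP project ID, following the pattern
# https://<KF_NAME>.endpoints.<project-id>.cloud.goog/
kubeflow_url() {
  kf_name=$1
  project=$2
  echo "https://${kf_name}.endpoints.${project}.cloud.goog/"
}
```

For example, `kubeflow_url "${KF_NAME}" "${PROJECT}"` prints the address to enter in your browser. Comparing its host part with the `HOSTS` column of `kubectl -n istio-system get ingress` is one way to confirm that the ingress matches the name you expect.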
1. We recommend that you check in the contents of your **${KFAPP}** directory
into source control.
## Understanding the deployment process
The `kfctl` deployment process includes by the following commands:
This section gives you more details about the kfctl configuration and
deployment process, so that you can customize your Kubeflow deployment if
necessary.
### kfctl process and configuration
The kfctl deployment process includes the following commands:
* `kfctl build` - (Optional) Creates configuration files defining the various
resources in your deployment. You only need to run `kfctl build` if you want
to edit the resources before running `kfctl apply`. See the guide to
[customizing your Kubeflow deployment](/docs/gke/customizing-gke/).
* `kfctl apply` - Creates or updates the resources.
* `kfctl delete` - Deletes the resources.
The kfctl deployment process applies default values to certain properties
as follows:
* **Email address:** kfctl attempts to fetch your email address from your
Cloud SDK configuration. You can run `gcloud config list` to see the default
email address, which the command output lists as the **account**.
If kfctl can't find a valid email address, you must use the
flag `--email <your email address>` to pass a valid email address. This email
address becomes an administrator in the configuration of your Kubeflow
deployment.
* **GCP project ID:** kfctl attempts to fetch your project ID from your
Cloud SDK configuration. You can run `gcloud config list` to see your
active project ID.
* **GCP zone:** kfctl attempts to fetch the zone from your Cloud SDK
configuration. You can run `gcloud config list` to see your active zone.
* **init** - performs a one-time setup.
* **generate** - creates configuration files defining the various resources.
* **apply** - creates or updates the resources.
* **delete** - deletes the resources.
* **Kubeflow deployment name:** kfctl defaults to the name of the directory
where you run the `kfctl build` or `kfctl apply` command.
With the exception of `init`, all commands take an argument which describes the
set of resources to apply the command to. This argument can be one of the
following:
You can also explicitly set the following values in your `${CONFIG_FILE}`
configuration file:
* **platform** - all GCP resources; that is, anything that doesn't run on
Kubernetes.
* **k8s** - all resources that run on Kubernetes.
* **all** - all GCP and Kubernetes resources.
* Kubeflow deployment name
* GCP project
* GCP zone
* Email address
### App layout
Your Kubeflow app directory **${KFAPP}** contains the following files and directories:
The following snippet shows you how to set values in the configuration file
using [yq](https://github.com/mikefarah/yq/releases):
* **app.yaml** defines configurations related to your Kubeflow deployment.
```
yq w -i ${CONFIG_FILE} spec.plugins[0].spec.project ${PROJECT}
yq w -i ${CONFIG_FILE} spec.plugins[0].spec.zone ${ZONE}
yq w -i ${CONFIG_FILE} metadata.name ${KF_NAME}
```
* The values are set when you run `kfctl init`.
* The values are snapshotted inside **app.yaml** to make your app
self contained.
### Application layout
Your Kubeflow application directory **${KF_DIR}** contains the following files and
directories:
* **${CONFIG_FILE}** is a YAML file that defines configurations related to your
Kubeflow deployment.
* This file is a copy of the GitHub-based configuration YAML file that
you used when deploying Kubeflow:
* either `{{% config-uri-gcp-iap %}}`
* or `{{% config-uri-gcp-basic-auth %}}`.
* When you run `kfctl apply` or `kfctl build`, kfctl creates
a local version of the configuration file, **${CONFIG_FILE}**,
which you can further customize if necessary.
* **gcp_config** is a directory that contains
[Deployment Manager configuration files](https://cloud.google.com/deployment-manager/docs/configuration/)
defining your GCP infrastructure.
* The directory is created when you run `kfctl generate platform`.
* The directory is created when you run `kfctl build` or `kfctl apply`.
* You can modify these configurations to customize your GCP infrastructure.
After modifying a configuration, run `kfctl apply` again.
* **kustomize** is a directory that contains the kustomize packages for Kubeflow
applications. See
[how Kubeflow uses kustomize](/docs/other-guides/kustomize/).
* The directory is created when you run `kfctl generate`.
* The directory is created when you run `kfctl build` or `kfctl apply`.
* You can customize the Kubernetes resources by modifying the manifests and
running `kfctl apply` again.
We recommend that you check in the contents of your **${KF_DIR}** directory
into source control.
### GCP service accounts
Creating a deployment using `kfctl` creates three service accounts in your
GCP project. These service accounts are created using the [principle of least
The kfctl deployment process creates three service accounts in your
GCP project. These service accounts follow the [principle of least
privilege](https://en.wikipedia.org/wiki/Principle_of_least_privilege).
The three service accounts are:
The service accounts are:
* `${KFAPP}-admin` is used for some admin tasks like configuring the load
* `${KF_NAME}-admin` is used for some admin tasks like configuring the load
balancers. The principle is that this account is needed to deploy Kubeflow but
not needed to actually run jobs.
* `${KFAPP}-user` is intended to be used by training jobs and models to access
* `${KF_NAME}-user` is intended to be used by training jobs and models to access
GCP resources (Cloud Storage, BigQuery, etc.). This account has a much smaller
set of privileges compared to `admin`.
* `${KFAPP}-vm` is used only for the virtual machine (VM) service account. This
* `${KF_NAME}-vm` is used only for the virtual machine (VM) service account. This
account has the minimal permissions needed to send metrics and logs to
[Stackdriver](https://cloud.google.com/stackdriver/).
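As a quick sanity check, the sketch below prints the email addresses you would expect for the three accounts. It assumes the standard GCP service account address format `NAME@PROJECT.iam.gserviceaccount.com` and a project that is not domain-scoped. The function name `kf_service_accounts` is hypothetical and is not part of kfctl.

```shell
# Hypothetical helper (not part of kfctl): print the expected service account
# emails for a deployment name ($1) and a GCP project ID ($2).
# Assumes the standard NAME@PROJECT.iam.gserviceaccount.com address format.
kf_service_accounts() {
  for role in admin user vm; do
    printf '%s-%s@%s.iam.gserviceaccount.com\n' "$1" "${role}" "$2"
  done
}

# Example usage:
#   kf_service_accounts "${KF_NAME}" "${PROJECT}"
# To compare against the accounts that exist in the project:
#   gcloud iam service-accounts list --project="${PROJECT}"
```

If an expected address is missing from the `gcloud` listing, the deployment may not have completed.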
......@@ -221,12 +360,12 @@ The three service accounts are:
[end-to-end MNIST tutorial](/docs/gke/gcp-e2e/) or the
[GitHub issue summarization
example](https://github.com/kubeflow/examples/tree/master/github_issue_summarization).
* See how to [delete](/docs/gke/deploy/delete-cli) your Kubeflow deployment
* See how to [delete](/docs/gke/deploy/delete-cli/) your Kubeflow deployment
using the CLI.
* See how to [customize](/docs/gke/customizing-gke) your Kubeflow
* See how to [customize](/docs/gke/customizing-gke/) your Kubeflow
deployment.
* See how to [upgrade Kubeflow](/docs/upgrading/upgrade/) and how to
[upgrade or reinstall a Kubeflow Pipelines
deployment](/docs/pipelines/upgrade/).
* [Troubleshoot](/docs/gke/troubleshooting-gke) any issues you may
* [Troubleshoot](/docs/gke/troubleshooting-gke/) any issues you may
find.
......@@ -133,17 +133,27 @@ problems:
recreating and redeploying Kubeflow with a different
name.
For example if you originally ran the following `kfctl init` command:
For example, if you originally ran the following commands to deploy Kubeflow:
```
kfctl init myapp --project=myproject --config=myconfig -V
export KF_NAME=my-app
export BASE_DIR=<path to a base directory>
export KF_DIR=${BASE_DIR}/${KF_NAME}
mkdir -p ${KF_DIR}
cd ${KF_DIR}
kfctl apply -V -f ${CONFIG_FILE}
```
Then rerun `kfctl init` with a different name that you haven't used
Then rerun the commands with a different name that you haven't used
before:
```
kfctl init myapp-unique --project=myproject --config=myconfig -V
export KF_NAME=my-app-unique
export BASE_DIR=<path to a base directory>
export KF_DIR=${BASE_DIR}/${KF_NAME}
mkdir -p ${KF_DIR}
cd ${KF_DIR}
kfctl apply -V -f ${CONFIG_FILE}
```
1. Wait for the load balancer to report the back ends as healthy:
......
......@@ -231,60 +231,15 @@ export PROJECT_NUMBER=$(gcloud projects describe ${PROJECT} --format='value(proj
* You can use the Cloud console to monitor your GCB job.
1. Follow the guide to [deploying Kubeflow on GCP](/docs/gke/deploy/deploy-cli/).
When you reach the
[setup and deploy step](/docs/gke/deploy/deploy-cli/#set-up-and-deploy),
**skip the `kfctl apply` command** and run the **`kfctl build`** command
instead, as described in that step. Now you can edit the configuration files
before deploying Kubeflow. Retain the environment variables that you set
during the setup, including `${KF_NAME}`, `${KF_DIR}`, and `${CONFIG_FILE}`.
1. Follow the [instructions](https://www.kubeflow.org/docs/gke/deploy/oauth-setup/) for creating an OAuth client
1. Create environment variables for IAP OAuth access
```bash
export CLIENT_ID=<CLIENT_ID from OAuth page>
export CLIENT_SECRET=<CLIENT_SECRET from OAuth page>
```
1. Download a `kfctl` release from the
[Kubeflow releases page](https://github.com/kubeflow/kubeflow/releases/).
1. Unpack the tar ball:
```
tar -xvf kfctl_<release tag>_<platform>.tar.gz
```
* **Optional** Add the kfctl binary to your path.
* If you don't add the kfctl binary to your path then in all subsequent
steps you will need to replace `kfctl` with the full path to the binary.
1. Initialize the directory containing your Kubeflow deployment config files
```bash
export PROJECT=<your GCP project ID>
export KFAPP=<your choice of application directory name>
# Run the following commands for the default installation which uses Cloud IAP:
export CONFIG="{{% config-uri-gcp-iap %}}"
kfctl init ${KFAPP} --project=${PROJECT} --config=${CONFIG} -V
# Alternatively, run these commands if you want to use basic authentication:
export CONFIG="{{% config-uri-gcp-basic-auth %}}"
kfctl init ${KFAPP} --project=${PROJECT} --config=${CONFIG} -V --use_basic_auth
cd ${KFAPP}
kfctl generate all -V
```
* **${KFAPP}** - the _name_ of a directory where you want Kubeflow
configurations to be stored. This directory is created when you run
`kfctl init`. If you want a custom deployment name, specify that name here.
The value of this variable becomes the name of your deployment.
The value of this variable cannot be greater than 25 characters. It must
contain just the directory name, not the full path to the directory.
The content of this directory is described in the next section.
* **${PROJECT}** - the ID of the GCP project where you want Kubeflow
deployed.
* `kfctl generate all` attempts to fetch your email address from your
credential. If it can't find a valid email address, you need to pass a
valid email address with flag `--email <your email address>`. This email
address becomes an administrator in the configuration of your Kubeflow
deployment.
1. Enable private clusters by editing `${KFAPP}/gcp_configs/cluster-kubeflow.yaml` and updating the following two parameters:
1. Enable private clusters by editing `${KF_DIR}/gcp_config/cluster-kubeflow.yaml` and updating the following two parameters:
```
privatecluster: true
......@@ -293,14 +248,14 @@ export PROJECT_NUMBER=$(gcloud projects describe ${PROJECT} --format='value(proj
1. Remove components which are not useful in private clusters:
```
cd ${KFAPP}/kustomize
cd ${KF_DIR}/kustomize
kubectl delete -f cert-manager.yaml
```
1. Create the deployment:
```
cd ${KFAPP}
kfctl apply platform
cd ${KF_DIR}
kfctl apply -V -f ${CONFIG_FILE}
```
* If you get an error **legacy networks not supported**, follow the
......@@ -309,11 +264,11 @@ export PROJECT_NUMBER=$(gcloud projects describe ${PROJECT} --format='value(proj
* You will need to manually create the network as a workaround for [kubeflow/kubeflow#3071](https://github.com/kubeflow/kubeflow/issues/3071):
```
cd ${KFAPP}/gcp_configs
gcloud --project=${PROJECT} deployment-manager deployments create ${KFAPP}-network --config=network.yaml
cd ${KF_DIR}/gcp_config
gcloud --project=${PROJECT} deployment-manager deployments create ${KF_NAME}-network --config=network.yaml
```
* Then edit **gcp_config/cluster.jinja** to add a field **network** in your cluster
* Then edit `${KF_DIR}/gcp_config/cluster.jinja` to add a field **network** in your cluster
```
cluster:
......@@ -327,12 +282,12 @@ export PROJECT_NUMBER=$(gcloud projects describe ${PROJECT} --format='value(proj
gcloud --project=${PROJECT} compute networks list
```
* The name will contain the value ${KFAPP}
* The name will contain the value `${KF_NAME}`.
1. Update iap-ingress component parameters:
```
cd ${KFAPP}/kustomize
cd ${KF_DIR}/kustomize
gvim basic-auth-ingress.yaml # Or iap-ingress.yaml if you are using IAP
```
......@@ -360,7 +315,7 @@ export PROJECT_NUMBER=$(gcloud projects describe ${PROJECT} --format='value(proj
you can get it by running
```
cd ${KFAPP}/kustomize
cd ${KF_DIR}/kustomize
grep hostname: basic-auth-ingress.yaml
```
......@@ -380,8 +335,8 @@ export PROJECT_NUMBER=$(gcloud projects describe ${PROJECT} --format='value(proj
1. Apply all the Kubernetes resources:
```
cd ${KFAPP}
kfctl apply k8s
cd ${KF_DIR}
kfctl apply -V -f ${CONFIG_FILE}
```
1. Wait for Kubeflow to become accessible and then access it at
......
......@@ -15,6 +15,55 @@ This guide covers troubleshooting specifically for
For more help, try the
[general Kubeflow troubleshooting guide](/docs/other-guides/troubleshooting).
This guide assumes the following settings:
* The `${KF_DIR}` environment variable contains the path to
your Kubeflow application directory, which holds your Kubeflow configuration
files. For example, `/opt/my-kubeflow/`.
```
export KF_DIR=<path to your Kubeflow application directory>
```
* The `${CONFIG_FILE}` environment variable contains the path to your
Kubeflow configuration file.
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_iap.yaml
```
Or:
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_basic_auth.yaml
```
* The `${KF_NAME}` environment variable contains the name of your Kubeflow
deployment. You can find the name in your `${CONFIG_FILE}`
configuration file, as the value for the `metadata.name` key.
```
export KF_NAME=<the name of your Kubeflow deployment>
```
* The `${PROJECT}` environment variable contains the ID of your GCP project.
You can find the project ID in
your `${CONFIG_FILE}` configuration file, as the value for the `project` key.
```
export PROJECT=<your GCP project ID>
```
* The `${ZONE}` environment variable contains the GCP zone where your
Kubeflow resources are deployed.
```
export ZONE=<your GCP zone>
```
* For further background about the above settings, see the guide to
[deploying Kubeflow with the CLI](/docs/gke/deploy/deploy-cli).
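Before running the commands in this guide, it may help to confirm that the variables listed above are set in your current shell. The following is a minimal sketch; the function name `check_kf_env` is hypothetical and is not part of kfctl or gcloud.

```shell
# Hypothetical helper: report which of the named environment variables are
# unset or empty. Returns non-zero if any are missing.
check_kf_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in ${name}.
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "missing: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# Example usage:
#   check_kf_env KF_DIR CONFIG_FILE KF_NAME PROJECT ZONE
```

The function only checks that each variable is non-empty. It does not verify that the paths exist or that the values match your deployment.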
## Troubleshooting Kubeflow deployment on GCP
Here are some tips for troubleshooting Cloud IAP.
......@@ -43,7 +92,7 @@ the cluster. This section assumes
you are using [Cloud Endpoints](https://cloud.google.com/endpoints/) and a DNS name of the following pattern
```
https://${DEPLOYMENT_NAME}.endpoints.${PROJECT}.cloud.goog
https://${KF_NAME}.endpoints.${PROJECT}.cloud.goog
```
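To avoid retyping the hostname in the checks that follow, you could build it once from the same two variables. This is only a sketch of the pattern shown above; `kf_endpoint_host` is a hypothetical helper name.

```shell
# Hypothetical helper: build the Cloud Endpoints hostname from the
# deployment name ($1) and the GCP project ID ($2), following the
# pattern <name>.endpoints.<project>.cloud.goog.
kf_endpoint_host() {
  printf '%s.endpoints.%s.cloud.goog\n' "$1" "$2"
}

# Example usage:
#   nslookup "$(kf_endpoint_host "${KF_NAME}" "${PROJECT}")"
```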
Symptoms:
......@@ -52,11 +101,11 @@ Symptoms:
* nslookup for the domain name doesn't return the IP address associated with the ingress
```
nslookup ${DEPLOYMENT_NAME}.endpoints.${PROJECT}.cloud.goog
nslookup ${KF_NAME}.endpoints.${PROJECT}.cloud.goog
Server: 127.0.0.1
Address: 127.0.0.1#53
** server can't find ${DEPLOYMENT_NAME}.endpoints.${PROJECT}.cloud.goog: NXDOMAIN
** server can't find ${KF_NAME}.endpoints.${PROJECT}.cloud.goog: NXDOMAIN
```
Troubleshooting
......@@ -64,8 +113,8 @@ Troubleshooting
1. Check the `cloudendpoints` resource
```
kubectl get cloudendpoints -o yaml ${DEPLOYMENT_NAME}
kubectl describe cloudendpoints ${DEPLOYMENT_NAME}
kubectl get cloudendpoints -o yaml ${KF_NAME}
kubectl describe cloudendpoints ${KF_NAME}
```
* Check if there are errors indicating problems creating the endpoint
......@@ -329,7 +378,7 @@ ERROR: (gcloud.deployment-manager.deployments.update) Error in Operation [operat
To fix this we can create a new network:
```
cd ${KFAPP}
cd ${KF_DIR}
cp .cache/master/deployment/gke/deployment_manager_configs/network.* \
./gcp_config/
```
......@@ -341,8 +390,8 @@ Edit `gcfs.yaml` to use the name of the newly created network.
Apply the changes.
```
cd ${KFAPP}
kfctl apply platform
cd ${KF_DIR}
kfctl apply -V -f ${CONFIG_FILE}
```
## CPU platform unavailable in requested zone
......
https://raw.githubusercontent.com/kubeflow/kubeflow/cfb336b7/bootstrap/config/kfctl_gcp_basic_auth.0.6.2.yaml
\ No newline at end of file
https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_basic_auth.0.7.0.yaml
\ No newline at end of file
https://raw.githubusercontent.com/kubeflow/kubeflow/cfb336b7/bootstrap/config/kfctl_gcp_iap.0.6.2.yaml
\ No newline at end of file
https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_iap.0.7.0.yaml
\ No newline at end of file
v0.6.2
\ No newline at end of file
v0.7.0-rc.5
\ No newline at end of file