Commit ddb205c9 authored by Sarah Maddox, committed by Kubernetes Prow Robot

kfctl updates for v0.7 - GCP docs only (#1234)

* WIP v0.7 updates to kfctl init

* More updates to deployment; updated customization; added interim URIs for config files.

* Updated section on accessing UI in CLI deployment.

* Updated kfctl commands in remaining GCP deployment docs.

* Updated the guide to securing clusters for kfctl v0.7.

* Updated the troubleshooting guide for kfctl v0.7.

* Fixed list formatting throughout the guide to securing clusters.

* Addressed review comments and set KF latest version.

* WIP updating for change in config file name.

* Further clarifications to the deployment and deletion guides.

* Updated the customization guide for change in config name and delete.

* Updated for new config file name in the guide to monitoring IAP.

* Changed KFAPP variable in auth guide.

* Updated Filestore guide.

* A few tweaks.

* Updated cluster security guide.

* Updated the troubleshooting guide.

* Standardised CONFIG_FILE to contain full path.

* Updated guide to custom domains.

* Added GPU info and clarified KF-NAME/KF_DIR.

* Updated to latest RC and config URLs.
Parent 485945a8
@@ -9,7 +9,7 @@ weight = 4
When you [set up Kubeflow for GCP](/docs/gke/deploy), it will automatically When you [set up Kubeflow for GCP](/docs/gke/deploy), it will automatically
[provision three service accounts](https://www.kubeflow.org/docs/gke/deploy/deploy-cli/#gcp-service-accounts) [provision three service accounts](https://www.kubeflow.org/docs/gke/deploy/deploy-cli/#gcp-service-accounts)
with different privileges in the `kubeflow` namespace. In particular, the `${KFAPP}-user` service account is with different privileges in the `kubeflow` namespace. In particular, the `${KF_NAME}-user` service account is
meant to grant your user services access to GCP. The credentials to this service account can be accessed within meant to grant your user services access to GCP. The credentials to this service account can be accessed within
the cluster as a [Kubernetes secret](https://kubernetes.io/docs/concepts/configuration/secret/) called `user-gcp-sa`. the cluster as a [Kubernetes secret](https://kubernetes.io/docs/concepts/configuration/secret/) called `user-gcp-sa`.
@@ -104,7 +104,7 @@ so be careful which Pods you grant access to.
1. **Set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable** to point to the service account. 1. **Set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable** to point to the service account.
GCP libraries will use this environment variable to find the service account and authenticate with GCP. GCP libraries will use this environment variable to find the service account and authenticate with GCP.
The following YAML describes a Pod that has access to the `${KFAPP}-user` service account: The following YAML describes a Pod that has access to the `${KF_NAME}-user` service account:
``` ```
apiVersion: v1 apiVersion: v1
kind: Pod kind: Pod
......
@@ -17,11 +17,54 @@ Cloud Filestore is very useful for creating a shared filesystem that can be moun
This guide assumes you have already set up Kubeflow on GCP. If you haven't done This guide assumes you have already set up Kubeflow on GCP. If you haven't done
so, follow the guide to [deploying Kubeflow on GCP](/docs/gke/deploy/). so, follow the guide to [deploying Kubeflow on GCP](/docs/gke/deploy/).
The instructions below assume that the `${KFAPP}` environment variable contains This guide assumes the following settings:
the *name* (not the full path) of the directory containing your Kubeflow
configurations. See the * The `${KF_DIR}` environment variable contains the path to
[Kubeflow deployment guide](/docs/gke/deploy/deploy-cli/) for details of this your Kubeflow application directory, which holds your Kubeflow configuration
directory. files. For example, `/opt/my-kubeflow/`.
```
export KF_DIR=<path to your Kubeflow application directory>
```
* The `${CONFIG_FILE}` environment variable contains the path to your
Kubeflow configuration file.
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_iap.yaml
```
Or:
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_basic_auth.yaml
```
* The `${KF_NAME}` environment variable contains the name of your Kubeflow
deployment. You can find the name in your `${CONFIG_FILE}`
configuration file, as the value for the `metadata.name` key.
```
export KF_NAME=<the name of your Kubeflow deployment>
```
* The `${PROJECT}` environment variable contains the ID of your GCP project.
You can find the project ID in
your `${CONFIG_FILE}` configuration file, as the value for the `project` key.
```
export PROJECT=<your GCP project ID>
```
* The `${ZONE}` environment variable contains the GCP zone where your
Kubeflow resources are deployed.
```
export ZONE=<your GCP zone>
```
* For further background about the above settings, see the guide to
[deploying Kubeflow with the CLI](/docs/gke/deploy/deploy-cli).
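Before running the commands in this guide, it can help to confirm that the settings listed above are actually present in your shell. The following is a minimal sketch of such a check. The function name `require_vars` is hypothetical and is not part of kfctl or the Cloud SDK.

```shell
# Hypothetical helper (not part of kfctl): report which of the named
# environment variables are unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing setting: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: check the settings that this guide assumes.
require_vars KF_DIR CONFIG_FILE KF_NAME PROJECT ZONE ||
  echo "Set the variables listed above before continuing." >&2
```

The check only tests that each variable is non-empty. It does not verify that the directory, file, project, or zone exists.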
## Create a Cloud Filestore instance ## Create a Cloud Filestore instance
@@ -31,7 +74,7 @@ use you can skip this section.
Copy the Cloud Filestore deployment manager configs to the `gcp_config` directory: Copy the Cloud Filestore deployment manager configs to the `gcp_config` directory:
``` ```
cd /<path-to-kubeflow-deployment>/${KFAPP} cd ${KF_DIR}
cp .cache/${VERSION}/deployment/gke/deployment_manager_configs/gcfs.yaml \ cp .cache/${VERSION}/deployment/gke/deployment_manager_configs/gcfs.yaml \
./gcp_config/ ./gcp_config/
``` ```
@@ -49,9 +92,9 @@ Edit `gcfs.yaml` to match your desired configuration:
Using [yq](https://github.com/kislyuk/yq): Using [yq](https://github.com/kislyuk/yq):
``` ```
cd /<path-to-kubeflow-deployment>/${KFAPP} cd ${KF_DIR}
. env.sh . env.sh
yq -r ".resources[0].properties.instanceId=\"${DEPLOYMENT_NAME}\"" gcp_config/gcfs.yaml > gcp_config/gcfs.yaml.new yq -r ".resources[0].properties.instanceId=\"${KF_NAME}\"" gcp_config/gcfs.yaml > gcp_config/gcfs.yaml.new
mv gcp_config/gcfs.yaml.new gcp_config/gcfs.yaml mv gcp_config/gcfs.yaml.new gcp_config/gcfs.yaml
``` ```
@@ -63,8 +106,8 @@ Apply the changes:
--> -->
``` ```
cd /<path-to-kubeflow-deployment>/${KFAPP}/gcp_config cd ${KF_DIR}/gcp_config
gcloud --project=${PROJECT} deployment-manager deployments create ${KFAPP-NAME}-nfs --config=gcfs.yaml gcloud --project=${PROJECT} deployment-manager deployments create ${KF_NAME}-nfs --config=gcfs.yaml
``` ```
If you get an error **legacy networks are not supported** follow the instructions If you get an error **legacy networks are not supported** follow the instructions
......
@@ -15,19 +15,19 @@ so, follow the guide to
## Using your own domain ## Using your own domain
If you want to use your own domain instead of **${name}.endpoints.${project}.cloud.goog**, follow these instructions: If you want to use your own domain instead of **${KF_NAME}.endpoints.${PROJECT}.cloud.goog**, follow these instructions:
1. Remove the `cloud-endpoints` component: 1. Remove the `cloud-endpoints` component:
``` ```
cd ${KFAPP}/kustomize cd ${KF_DIR}/kustomize
kubectl delete -f cloud-endpoints.yaml kubectl delete -f cloud-endpoints.yaml
``` ```
1. Set the domain for your ingress to be the fully qualified domain name: 1. Set the domain for your ingress to be the fully qualified domain name:
``` ```
cd ${KFAPP}/kustomize cd ${KF_DIR}/kustomize
gvim iap-ingress.yaml # Or basic-auth-ingress.yaml gvim iap-ingress.yaml # Or basic-auth-ingress.yaml
``` ```
@@ -47,7 +47,7 @@ If you want to use your own domain instead of **${name}.endpoints.${project}.clo
1. Get the address of the static IP address created: 1. Get the address of the static IP address created:
``` ```
IPNAME=${DEPLOYMENT_NAME}-ip IPNAME=${KF_NAME}-ip
gcloud --project=${PROJECT} compute addresses describe --global ${IPNAME} gcloud --project=${PROJECT} compute addresses describe --global ${IPNAME}
``` ```
......
@@ -7,56 +7,109 @@ weight = 2
This guide describes how to customize your deployment of Kubeflow on Google This guide describes how to customize your deployment of Kubeflow on Google
Kubernetes Engine (GKE) in Google Cloud Platform (GCP). Kubernetes Engine (GKE) in Google Cloud Platform (GCP).
## Customizing Kubeflow before deployment
The Kubeflow deployment process is divided into two steps, **build** and
**apply**, so that you can modify your configuration before deploying your
Kubeflow cluster.
Follow the guide to [deploying Kubeflow on GCP](/docs/gke/deploy/deploy-cli/).
When you reach the
[setup and deploy step](/docs/gke/deploy/deploy-cli/#set-up-and-deploy),
**skip the `kfctl apply` command** and run the **`kfctl build`** command
instead, as described in that step. Now you can edit the configuration files
before deploying Kubeflow.
## Customizing an existing deployment
You can also customize an existing Kubeflow deployment. In that case, this
guide assumes that you have already followed the guide to
[deploying Kubeflow on GCP](/docs/gke/deploy/deploy-cli/) and have deployed
Kubeflow to a GKE cluster.
## Before you start ## Before you start
This guide assumes you have already set up Kubeflow with GKE. If you haven't done This guide assumes the following settings:
so, follow the guide to [deploying Kubeflow on GCP](/docs/gke/deploy/).
## Customizing Kubeflow * The `${KF_DIR}` environment variable contains the path to
your Kubeflow application directory, which holds your Kubeflow configuration
files. For example, `/opt/my-kubeflow/`.
You can use [kustomize](https://kustomize.io/) to customize Kubeflow. ```
export KF_DIR=<path to your Kubeflow application directory>
```
* The `${CONFIG_FILE}` environment variable contains the path to your
Kubeflow configuration file.
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_iap.yaml
```
Or:
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_basic_auth.yaml
```
* The `${KF_NAME}` environment variable contains the name of your Kubeflow
deployment. You can find the name in your
`${CONFIG_FILE}` configuration file, as the value for the `metadata.name` key.
```
export KF_NAME=<the name of your Kubeflow deployment>
```
* The `${PROJECT}` environment variable contains the ID of your GCP project.
You can find the project ID in your
`${CONFIG_FILE}` configuration file, as the value for the `project` key.
```
export PROJECT=<your GCP project ID>
```
The deployment process is divided into two steps, **generate** and **apply**, so that you can * For further background about the above settings, see the guide to
modify your deployment before actually deploying. [deploying Kubeflow with the CLI](/docs/gke/deploy/deploy-cli).
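The settings above say that the deployment name is the value of the `metadata.name` key in your configuration file. As a rough sketch, the value can be read with `awk` instead of being copied by hand. The function name `kf_name_from_config` is hypothetical, and the sketch assumes a top-level `metadata:` mapping that contains an indented `name:` line with no trailing comment. A real YAML parser is more robust.

```shell
# Hypothetical helper: print the value of metadata.name from a YAML file.
# Assumes a top-level "metadata:" mapping with an indented "name:" line.
kf_name_from_config() {
  awk '
    /^metadata:/  { in_meta = 1; next }   # entered the metadata block
    /^[^ \t#]/    { in_meta = 0 }         # any other top-level key ends it
    in_meta && /^[ \t]+name:/ {
      sub(/^[ \t]+name:[ \t]*/, "")       # drop the key, keep the value
      gsub(/"/, "")                       # drop optional double quotes
      print
      exit
    }
  ' "$1"
}
```

With `${CONFIG_FILE}` set as described above, you could then run `export KF_NAME="$(kf_name_from_config "${CONFIG_FILE}")"` and check the result with `echo ${KF_NAME}`.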
To customize GCP resources (such as your Kubernetes Engine cluster), you can modify the deployment manager configs in **${KFAPP}/gcp_config**. ## Customizing GCP resources
Many changes can be applied to an existing configuration in which case you can run: To customize GCP resources, such as your Kubernetes Engine cluster, you can
modify the Deployment Manager configuration settings in `${KF_DIR}/gcp_config`.
After modifying your existing configuration, run the following command to apply
the changes:
``` ```
cd ${KFAPP} cd ${KF_DIR}
kfctl apply platform kfctl apply -V -f ${CONFIG_FILE}
``` ```
or using Deployment Manager directly: Alternatively, you can use Deployment Manager directly:
``` ```
cd ${KFAPP}/gcp_config cd ${KF_DIR}/gcp_config
gcloud deployment-manager --project=${PROJECT} deployments update ${DEPLOYMENT_NAME} --config=cluster-kubeflow.yaml gcloud deployment-manager --project=${PROJECT} deployments update ${KF_NAME} --config=cluster-kubeflow.yaml
``` ```
* **PROJECT** Name of your GCP project. You could find it in `${KFAPP}/app.yaml`.
* **DEPLOYMENT_NAME** Name of your Kubeflow app. You could also find it in `${KFAPP}/app.yaml`.
In specific, `.metadata.name`
Some changes (such as the VM service account for Kubernetes Engine) can only be set at creation time; in this case you need Some changes (such as the VM service account for Kubernetes Engine) can only be set at creation time; in this case you need
to tear down your deployment before recreating it: to tear down your deployment before recreating it:
``` ```
cd ${KFAPP} cd ${KF_DIR}
kfctl delete all kfctl delete -f ${CONFIG_FILE}
kfctl apply all kfctl apply -V -f ${CONFIG_FILE}
``` ```
To customize the Kubeflow resources running within the cluster you can modify the kustomize manifests in **${KFAPP}/kustomize**. ## Customizing Kubernetes resources
For example, to modify settings for the Jupyter web app:
``` You can use [kustomize](https://kustomize.io/) to customize Kubeflow.
cd ${KFAPP}/kustomize To customize the Kubernetes resources running within the cluster, you can modify
gvim jupyter-web-app.yaml the kustomize manifests in `${KF_DIR}/kustomize`.
```
Find and replace the parameter values: For example, to modify settings for the Jupyter web app:
1. Open `${KF_DIR}/kustomize/jupyter-web-app.yaml` in a text editor.
1. Find and replace the parameter values:
``` ```
apiVersion: v1 apiVersion: v1
data: data:
@@ -74,30 +127,61 @@ metadata:
namespace: kubeflow namespace: kubeflow
``` ```
You can then redeploy using `kfctl`: 1. Redeploy Kubeflow using kfctl:
```
cd ${KF_DIR}
kfctl apply -V -f ${CONFIG_FILE}
```
Or use kubectl directly:
```
cd ${KF_DIR}/kustomize
kubectl apply -f jupyter-web-app.yaml
```
## Common customizations
<a id="gpu-config"></a>
### Add GPU nodes to your cluster
To add GPU accelerators to your Kubeflow cluster, you have the following
options:
* Pick a GCP zone that provides NVIDIA Tesla K80 Accelerators
(`nvidia-tesla-k80`).
* Or disable node-autoprovisioning in your Kubeflow cluster.
* Or change your node-autoprovisioning configuration.
To see which accelerators are available in each zone, run the following
command:
``` ```
cd ${KFAPP} gcloud compute accelerator-types list
kfctl apply k8s
``` ```
To disable node-autoprovisioning, run `kfctl build` as described above.
Then edit `${KF_DIR}/gcp_config/cluster-kubeflow.yaml` and set
[`enabled`](https://github.com/kubeflow/manifests/blob/4d2939d6c1a5fd862610382fde130cad33bfef75/gcp/deployment_manager_configs/cluster-kubeflow.yaml#L73)
to `false`:
or using kubectl directly:
``` ```
cd ${KFAPP}/kustomize ...
kubectl apply -f jupyter-web-app.yaml gpu-type: nvidia-tesla-k80
autoprovisioning-config:
enabled: false
...
``` ```
## Common customizations You must also set
[`gpu-pool-initialNodeCount`](https://github.com/kubeflow/manifests/blob/4d2939d6c1a5fd862610382fde130cad33bfef75/gcp/deployment_manager_configs/cluster-kubeflow.yaml#L58).
Add GPU nodes to your cluster:
* Set gpu-pool-initialNodeCount [here](https://github.com/kubeflow/kubeflow/blob/{{< params "githubbranch" >}}/deployment/gke/deployment_manager_configs/cluster-kubeflow.yaml#L56).
Add Cloud TPUs to your cluster: ### Add Cloud TPUs to your cluster
* Set `enable_tpu:true` [here](https://github.com/kubeflow/kubeflow/blob/{{< params "githubbranch" >}}/deployment/gke/deployment_manager_configs/cluster-kubeflow.yaml#L78). Set [`enable_tpu:true`](https://github.com/kubeflow/manifests/blob/4d2939d6c1a5fd862610382fde130cad33bfef75/gcp/deployment_manager_configs/cluster-kubeflow.yaml#L80)
in `${KF_DIR}/gcp_config/cluster-kubeflow.yaml`.
Add VMs with more CPUs or RAM: ### Add VMs with more CPUs or RAM
* Change the machineType. * Change the machineType.
* There are two node pools: * There are two node pools:
@@ -105,7 +189,7 @@ Add VMs with more CPUs or RAM:
* one for GPU machines [here](https://github.com/kubeflow/kubeflow/blob/{{< params "githubbranch" >}}/scripts/gke/deployment_manager_configs/cluster.jinja#L149). * one for GPU machines [here](https://github.com/kubeflow/kubeflow/blob/{{< params "githubbranch" >}}/scripts/gke/deployment_manager_configs/cluster.jinja#L149).
* When making changes to the node pools you also need to bump the pool-version [here](https://github.com/kubeflow/kubeflow/blob/{{< params "githubbranch" >}}/scripts/gke/deployment_manager_configs/cluster-kubeflow.yaml#L37) before you update the deployment. * When making changes to the node pools you also need to bump the pool-version [here](https://github.com/kubeflow/kubeflow/blob/{{< params "githubbranch" >}}/scripts/gke/deployment_manager_configs/cluster-kubeflow.yaml#L37) before you update the deployment.
Add users to Kubeflow: ### Add users to Kubeflow
* To grant users access to Kubeflow, add the “IAP-secured Web App User” role on the [IAM page in the GCP console](https://console.cloud.google.com/iam-admin/iam). Make sure you are in the same project as your Kubeflow deployment. * To grant users access to Kubeflow, add the “IAP-secured Web App User” role on the [IAM page in the GCP console](https://console.cloud.google.com/iam-admin/iam). Make sure you are in the same project as your Kubeflow deployment.
......
@@ -7,26 +7,47 @@ weight = 6
This page shows you how to use the CLI to delete a Kubeflow deployment on This page shows you how to use the CLI to delete a Kubeflow deployment on
Google Cloud Platform (GCP). Google Cloud Platform (GCP).
## Before you start
This guide assumes the following settings:
* The `${KF_DIR}` environment variable contains the path to
your Kubeflow application directory, which holds your Kubeflow configuration
files. For example, `/opt/my-kubeflow/`.
```
export KF_DIR=<path to your Kubeflow application directory>
```
* The `${CONFIG_FILE}` environment variable contains the path to your
Kubeflow configuration file.
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_iap.yaml
```
Or:
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_basic_auth.yaml
```
For further background about the above settings, see the guide to
[deploying Kubeflow with the CLI](/docs/gke/deploy/deploy-cli).
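The two `CONFIG_FILE` values above differ only in the authentication mode. A small helper can make that choice explicit. This is a sketch under stated assumptions: the function name `config_file_for` and the mode names `iap` and `basic_auth` are hypothetical, while the file names are the ones listed above.

```shell
# Hypothetical helper: print the local config file path for an auth mode.
# The two file names are the ones listed in this guide.
config_file_for() {
  case "$1" in
    iap)        echo "${KF_DIR}/kfctl_gcp_iap.yaml" ;;
    basic_auth) echo "${KF_DIR}/kfctl_gcp_basic_auth.yaml" ;;
    *)          echo "Unknown auth mode: $1" >&2; return 1 ;;
  esac
}

# Example usage. The fallback path is only an example value for KF_DIR.
KF_DIR="${KF_DIR:-/opt/my-kubeflow}"
CONFIG_FILE="$(config_file_for iap)"
export CONFIG_FILE
echo "${CONFIG_FILE}"
```

The helper only builds the path. It does not check that the file exists in `${KF_DIR}`.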
## Deleting your deployment
Run the following commands to delete your deployment and reclaim all GCP Run the following commands to delete your deployment and reclaim all GCP
resources: resources:
``` ```
cd ${KFAPP}
# If you want to delete all the resources, including storage: # If you want to delete all the resources, including storage:
kfctl delete all --delete_storage kfctl delete -f ${CONFIG_FILE} --delete_storage
# If you want to preserve storage, which contains metadata and information # If you want to preserve storage, which contains metadata and information
# from Kubeflow Pipelines: # from Kubeflow Pipelines:
kfctl delete all kfctl delete -f ${CONFIG_FILE}
``` ```
The environment variable `${KFAPP}` must contain the _name_ of the directory
that contains your Kubeflow configurations. This directory was created when you
deployed Kubeflow.
* The name of the directory is the same as the name of your Kubeflow deployment.
If you deployed Kubeflow [using the UI](/docs/gke/deploy/deploy-ui/), the
value of `${KFAPP}` is the value of the **Deployment name** field on the UI.
* If you deployed Kubeflow [using the CLI](/docs/gke/deploy/deploy-cli/), use
the same value as you used when you ran `kfctl init`.
You should consider preserving storage if you may want to relaunch You should consider preserving storage if you may want to relaunch
Kubeflow in the future and restore the data from your Kubeflow in the future and restore the data from your
......
@@ -32,9 +32,21 @@ Before installing Kubeflow on the command line:
access to sensitive data. Alternatively, you can use basic authentication access to sensitive data. Alternatively, you can use basic authentication
with a username and password. with a username and password.
## Deploy Kubeflow <a id="prepare-environment"></a>
## Prepare your environment
Follow these steps to deploy Kubeflow: Follow these steps to download the kfctl binary for the Kubeflow CLI and set
some handy environment variables:
1. Download the kfctl {{% kf-latest-version %}} release from the
[Kubeflow releases
page](https://github.com/kubeflow/kubeflow/releases/tag/{{% kf-latest-version %}}).
1. Unpack the tar ball:
```
tar -xvf kfctl_{{% kf-latest-version %}}_<platform>.tar.gz
```
1. Create user credentials. You only need to run this command once: 1. Create user credentials. You only need to run this command once:
@@ -42,176 +54,303 @@ Follow these steps to deploy Kubeflow:
gcloud auth application-default login gcloud auth application-default login
``` ```
1. Create environment variables for your access control services: 1. Create environment variables to make the deployment process easier:
```bash ```
# If using Cloud IAP, create environment variables from the # Set your GCP project ID and the zone where you want to create
# OAuth client ID and secret that you obtained earlier: # the Kubeflow deployment:
export PROJECT=<your GCP project ID>
gcloud config set project ${PROJECT}
export ZONE=<your GCP zone>
gcloud config set compute/zone ${ZONE}
# Use the following kfctl configuration file for authentication with
# Cloud IAP (recommended):
export CONFIG_URI="{{% config-uri-gcp-iap %}}"
# If using Cloud IAP for authentication, create environment variables
# from the OAuth client ID and secret that you obtained earlier:
export CLIENT_ID=<CLIENT_ID from OAuth page> export CLIENT_ID=<CLIENT_ID from OAuth page>
export CLIENT_SECRET=<CLIENT_SECRET from OAuth page> export CLIENT_SECRET=<CLIENT_SECRET from OAuth page>
# If using basic authentication, create environment variables for # Alternatively, use the following kfctl configuration if you want to use
# username and password: # basic authentication:
export CONFIG_URI="{{% config-uri-gcp-basic-auth %}}"
# If using basic authentication, create environment variables
# for username and password:
export KUBEFLOW_USERNAME=<your username> export KUBEFLOW_USERNAME=<your username>
export KUBEFLOW_PASSWORD=<your password> export KUBEFLOW_PASSWORD=<your password>
# Set KF_NAME to the name of your Kubeflow deployment. You also use this
# value as directory name when creating your configuration directory.
# See the detailed description in the text below this code snippet.
# For example, your deployment name can be 'my-kubeflow' or 'kf-test'.
export KF_NAME=<your choice of name for the Kubeflow deployment>
# Set the path to the base directory where you want to store one or more
# Kubeflow deployments. For example, /opt/.
# Then set the Kubeflow application directory for this deployment.
export BASE_DIR=<path to a base directory>
export KF_DIR=${BASE_DIR}/${KF_NAME}
# The following command is optional. It adds the kfctl binary to your path.
# If you don't add kfctl to your path, you must use the full path
# each time you run kfctl.
export PATH=$PATH:<path to your kfctl file>
``` ```
1. Download a `kfctl` release from the Notes:
[Kubeflow releases page](https://github.com/kubeflow/kubeflow/releases/).
* **${PROJECT}** - The project ID of the GCP project where you want Kubeflow
deployed.
* **${ZONE}** - The GCP zone where you want to create the Kubeflow deployment.
You can see a list of zones in the
[Compute Engine documentation](https://cloud.google.com/compute/docs/regions-zones/#available).
If you plan to use accelerators, you must choose a zone that supports the
type you want. See the guide to
[customizing your Kubeflow deployment](/docs/gke/customizing-gke/#gpu-config).
* **${CONFIG_URI}** - The GitHub address of the configuration YAML file that
you want to use to deploy Kubeflow. For GCP deployments, the following
configurations are available:
* `{{% config-uri-gcp-iap %}}`
* `{{% config-uri-gcp-basic-auth %}}`
When you run `kfctl apply` or `kfctl build` (see the next step), kfctl creates
a local version of the configuration YAML file which you can further
customize if necessary.
* **${KF_NAME}** - The name of your Kubeflow deployment.
If you want a custom deployment name, specify that name here.
For example, `my-kubeflow` or `kf-test`.
The value of KF_NAME must consist of lower case alphanumeric characters or
'-', and must start and end with an alphanumeric character.
The value of this variable cannot be greater than 25 characters. It must
contain just a name, not a directory path.
You also use this value as directory name when creating the directory where
your Kubeflow configurations are stored, that is, the Kubeflow application
directory.
* **${KF_DIR}** - The full path to your Kubeflow application directory.
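The naming rules for `KF_NAME` in the notes above can be checked before you run kfctl. The following is a minimal sketch of those rules as stated in this guide: lower case alphanumeric characters or '-', an alphanumeric first and last character, and at most 25 characters. The function name `is_valid_kf_name` is hypothetical, and kfctl itself may apply further checks.

```shell
# Hypothetical helper: check a candidate KF_NAME against the rules above.
is_valid_kf_name() {
  name="$1"
  # Must not be empty and must not exceed 25 characters.
  [ -n "$name" ] || return 1
  [ "${#name}" -le 25 ] || return 1
  case "$name" in
    # Only lower case alphanumeric characters or '-' are allowed.
    *[!abcdefghijklmnopqrstuvwxyz0123456789-]*) return 1 ;;
    # The name must start and end with an alphanumeric character.
    -*|*-) return 1 ;;
  esac
  return 0
}

# Example usage with the sample names from this guide:
for candidate in my-kubeflow kf-test; do
  if is_valid_kf_name "$candidate"; then
    echo "$candidate: ok"
  fi
done
```

Because '/' is not in the allowed character set, the check also rejects directory paths, which matches the rule that `KF_NAME` must contain just a name.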
1. Unpack the tar ball: <a id="set-up-and-deploy"></a>
## Set up and deploy Kubeflow
``` To set up and deploy Kubeflow using the **default settings**,
tar -xvf kfctl_<release tag>_<platform>.tar.gz run the `kfctl apply` command:
```
1. Run the following commands to set up and deploy Kubeflow. The code below ```
includes an optional command to add the binary `kfctl` to your path. If you mkdir -p ${KF_DIR}
don't add the binary to your path, you must use the full path to the `kfctl` cd ${KF_DIR}
binary each time you run it. kfctl apply -V -f ${CONFIG_URI}
```
```bash ## Alternatively, set up your configuration for later deployment
# The following command is optional, to make kfctl binary easier to use.
export PATH=$PATH:<path to your kfctl file>
# Set KFAPP to the name of your Kubeflow application. See detailed If you want to customize your configuration before deploying Kubeflow, you can
# description in the text below this code snippet. set up your configuration files first, then edit the configuration, then
# For example, 'kubeflow-test' or 'kfw-test'. deploy Kubeflow:
export KFAPP=<your choice of application directory name>
export ZONE=<your target GCP zone> # where the deployment will be created 1. Run the `kfctl build` command to set up your configuration:
export PROJECT=<your GCP project ID>
# Run the following commands for the default installation which uses Cloud IAP: ```
export CONFIG="{{% config-uri-gcp-iap %}}" mkdir -p ${KF_DIR}
kfctl init ${KFAPP} --project=${PROJECT} --config=${CONFIG} -V cd ${KF_DIR}
# Alternatively, run these commands if you want to use basic authentication: kfctl build -V -f ${CONFIG_URI}
export CONFIG="{{% config-uri-gcp-basic-auth %}}" ```
kfctl init ${KFAPP} --project=${PROJECT} --config=${CONFIG} -V --use_basic_auth
cd ${KFAPP} 1. Edit the configuration files, as described in the guide to
kfctl generate all -V --zone ${ZONE} [customizing your Kubeflow deployment](/docs/gke/customizing-gke/).
kfctl apply all -V
``` 1. Set an environment variable for your local configuration file:
* **${KFAPP}** - the _name_ of a directory where you want Kubeflow
configurations to be stored. This directory is created when you run ```
`kfctl init`. If you want a custom deployment name, specify that name here. export CONFIG_FILE=${KF_DIR}/kfctl_gcp_iap.yaml
The value of this variable becomes the name of your deployment. ```
The value of KFAPP must consist of lower case alphanumeric characters or
'-', and must start and end with an alphanumeric character. Or:
For example, 'kubeflow-test' or 'kfw-test'.
The value of this variable cannot be greater than 25 characters. It must ```
contain just the directory name, not the full path to the directory. export CONFIG_FILE=${KF_DIR}/kfctl_gcp_basic_auth.yaml
The content of this directory is described in the next section. ```
* **${PROJECT}** - the project ID of the GCP project where you want Kubeflow
deployed. 1. Run the `kfctl apply` command to deploy Kubeflow:
* **${ZONE}** - You can see a list of zones [here](https://cloud.google.com/compute/docs/regions-zones/#available).
If you plan to use accelerators, make sure to pick a zone that supports the type you want.
* When you run `kfctl init` you need to choose to use either IAP or basic
authentication, as described above.
* `kfctl generate all` attempts to fetch your email address from your
credential. If it can't find a valid email address, you need to pass a
valid email address with flag `--email <your email address>`. This email
address becomes an administrator in the configuration of your Kubeflow
deployment.
```
kfctl apply -V -f ${CONFIG_FILE}
```
## Check your deployment
Follow these steps to verify the deployment:
1. The deployment process creates a separate deployment for your data storage. 1. The deployment process creates a separate deployment for your data storage.
After running `kfctl apply` you should notice two new [deployments](https://console.cloud.google.com/dm/deployments): After running `kfctl apply` you should notice two new
* **{KFAPP}-storage**: This deployment has persistent volumes for your [deployments](https://console.cloud.google.com/dm/deployments):
* **{KF_NAME}-storage**: This deployment has persistent volumes for your
pipelines. pipelines.
* **{KFAPP}**: This deployment has all the components of Kubeflow, including * **{KF_NAME}**: This deployment has all the components of Kubeflow, including
a [GKE cluster](https://console.cloud.google.com/kubernetes/list) a [GKE cluster](https://console.cloud.google.com/kubernetes/list)
named **${KFAPP}** with Kubeflow installed. named **${KF_NAME}** with Kubeflow installed.
1. When the deployment finishes, check the resources installed in the namespace 1. When the deployment finishes, check the resources installed in the namespace
`kubeflow` in your new cluster. To do this from the command line, first set `kubeflow` in your new cluster. To do this from the command line, first set
your `kubectl` credentials to point to the new cluster: your `kubectl` credentials to point to the new cluster:
``` ```
gcloud container clusters get-credentials ${KFAPP} --zone ${ZONE} --project ${PROJECT} gcloud container clusters get-credentials ${KF_NAME} --zone ${ZONE} --project ${PROJECT}
``` ```
Then see what's installed in the `kubeflow` namespace of your GKE cluster: Then see what's installed in the `kubeflow` namespace of your GKE cluster:
``` ```
kubectl -n kubeflow get all kubectl -n kubeflow get all
``` ```
1. Access the Kubeflow central dashboard at the following URI when it becomes ## Access the Kubeflow user interface (UI)
available:
``` Follow these steps to access the Kubeflow central dashboard:
https://<KFAPP>.endpoints.<project-id>.cloud.goog/
``` 1. Enter the following URI into your browser address bar. It can take 20
* It can take 20 minutes for the URI to become available. minutes for the URI to become available:
Kubeflow needs to provision a signed SSL certificate and register a DNS
name. ```
* If you own/manage the domain or a subdomain with https://<KF_NAME>.endpoints.<project-id>.cloud.goog/
[Cloud DNS](https://cloud.google.com/dns/docs/) ```
then you can configure this process to be much faster.
See [kubeflow/kubeflow#731](https://github.com/kubeflow/kubeflow/issues/731). You can run the following command to get the URI for your deployment:
```
kubectl -n istio-system get ingress
NAME HOSTS ADDRESS PORTS AGE
envoy-ingress your-kubeflow-name.endpoints.your-gcp-project.cloud.goog 34.102.232.34 80 5d13h
```
The following command sets an environment variable named `HOST` to the URI:
```
export HOST=$(kubectl -n istio-system get ingress envoy-ingress -o=jsonpath={.spec.rules[0].host})
```
1. Follow the instructions on the UI to create a namespace. See the guide to
[creation of profiles](/docs/other-guides/multi-user-overview/#automatic-creation-of-profiles).
Notes:
1. We recommend that you check in the contents of your **${KFAPP}** directory * It can take 20 minutes for the URI to become available.
into source control. Kubeflow needs to provision a signed SSL certificate and register a DNS
name.
* If you own or manage the domain or a subdomain with
[Cloud DNS](https://cloud.google.com/dns/docs/)
then you can configure this process to be much faster.
See [kubeflow/kubeflow#731](https://github.com/kubeflow/kubeflow/issues/731).
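As a rough illustration of the URI pattern above, the following sketch assembles the dashboard URI from a deployment name and a project ID. The helper name `kubeflow_uri` and the sample values are hypothetical and not part of kfctl; only the host pattern comes from this guide.

```shell
# Hypothetical helper: print the dashboard URI for a deployment.
# $1 = Kubeflow deployment name (KF_NAME), $2 = GCP project ID.
kubeflow_uri() {
  printf 'https://%s.endpoints.%s.cloud.goog/\n' "$1" "$2"
}

# Example with placeholder values; substitute your own KF_NAME and PROJECT.
kubeflow_uri "my-kubeflow" "my-gcp-project"
```

If the printed host differs from the `HOST` value reported by the ingress, it may be worth checking that `KF_NAME` and `PROJECT` match the values in your configuration file.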
## Understanding the deployment process ## Understanding the deployment process
The `kfctl` deployment process includes the following commands: This section gives you more details about the kfctl configuration and
deployment process, so that you can customize your Kubeflow deployment if
necessary.
* **init** - performs a one-time setup. ### kfctl process and configuration
* **generate** - creates configuration files defining the various resources.
* **apply** - creates or updates the resources.
* **delete** - deletes the resources.
With the exception of `init`, all commands take an argument which describes the The kfctl deployment process includes the following commands:
set of resources to apply the command to. This argument can be one of the
following:
* **platform** - all GCP resources; that is, anything that doesn't run on * `kfctl build` - (Optional) Creates configuration files defining the various
Kubernetes. resources in your deployment. You only need to run `kfctl build` if you want
* **k8s** - all resources that run on Kubernetes. to edit the resources before running `kfctl apply`. See the guide to
* **all** - all GCP and Kubernetes resources. [customizing your Kubeflow deployment](/docs/gke/customizing-gke/).
* `kfctl apply` - Creates or updates the resources.
* `kfctl delete` - Deletes the resources.
### App layout The kfctl deployment process applies default values to certain properties
as follows:
Your Kubeflow app directory **${KFAPP}** contains the following files and directories: * **Email address:** kfctl attempts to fetch your email address from your
Cloud SDK configuration. You can run `gcloud config list` to see the default
email address, which the command output lists as the **account**.
If kfctl can't find a valid email address, you must use the
flag `--email <your email address>` to pass a valid email address. This email
address becomes an administrator in the configuration of your Kubeflow
deployment.
* **GCP project ID:** kfctl attempts to fetch your project ID from your
Cloud SDK configuration. You can run `gcloud config list` to see your
active project ID.
* **GCP zone:** kfctl attempts to fetch the zone from your Cloud SDK
configuration. You can run `gcloud config list` to see your active zone.
* **Kubeflow deployment name:** kfctl defaults to the name of the directory
where you run the `kfctl build` or `kfctl apply` command.
You can also explicitly set the following values in your `${CONFIG_FILE}`
configuration file:
* Kubeflow deployment name
* GCP project
* GCP zone
* Email address
* **app.yaml** defines configurations related to your Kubeflow deployment.
* The values are set when you run `kfctl init`. The following snippet shows you how to set values in the configuration file
* The values are snapshotted inside **app.yaml** to make your app using [yq](https://github.com/mikefarah/yq/releases):
self contained.
```
yq w -i ${CONFIG_FILE} spec.plugins[0].spec.project ${PROJECT}
yq w -i ${CONFIG_FILE} spec.plugins[0].spec.zone ${ZONE}
yq w -i ${CONFIG_FILE} metadata.name ${KF_NAME}
```
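To preview the defaults that kfctl is described as using, a minimal sketch along the following lines may help. The helper names `default_kf_name` and `show_gcloud_defaults` are hypothetical; the sketch only mirrors the behavior described above and is not kfctl's own implementation. `show_gcloud_defaults` assumes that the `gcloud` CLI is installed.

```shell
# Hypothetical helper: the deployment name that kfctl is described as
# defaulting to, that is, the name of the directory where you run kfctl.
# $1 = directory path (defaults to the current directory).
default_kf_name() {
  basename "${1:-$PWD}"
}

# Hypothetical helper: print the Cloud SDK values that kfctl falls back on.
# Requires the gcloud CLI; call it manually when gcloud is available.
show_gcloud_defaults() {
  echo "account: $(gcloud config get-value account 2>/dev/null)"
  echo "project: $(gcloud config get-value project 2>/dev/null)"
  echo "zone:    $(gcloud config get-value compute/zone 2>/dev/null)"
}

# Example with a placeholder directory path.
default_kf_name "/opt/my-kubeflow"
```

Running `show_gcloud_defaults` before `kfctl build` or `kfctl apply` gives a quick view of the values you may want to override in `${CONFIG_FILE}`.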
### Application layout
Your Kubeflow application directory **${KF_DIR}** contains the following files and
directories:
* **${CONFIG_FILE}** is a YAML file that defines configurations related to your
Kubeflow deployment.
* This file is a copy of the GitHub-based configuration YAML file that
you used when deploying Kubeflow:
* either `{{% config-uri-gcp-iap %}}`
* or `{{% config-uri-gcp-basic-auth %}}`.
* When you run `kfctl apply` or `kfctl build`, kfctl creates
a local version of the configuration file, **${CONFIG_FILE}**,
which you can further customize if necessary.
* **gcp_config** is a directory that contains * **gcp_config** is a directory that contains
[Deployment Manager configuration files](https://cloud.google.com/deployment-manager/docs/configuration/) [Deployment Manager configuration files](https://cloud.google.com/deployment-manager/docs/configuration/)
defining your GCP infrastructure. defining your GCP infrastructure.
* The directory is created when you run `kfctl generate platform`. * The directory is created when you run `kfctl build` or `kfctl apply`.
* You can modify these configurations to customize your GCP infrastructure. * You can modify these configurations to customize your GCP infrastructure.
After modifying a configuration, run `kfctl apply` again.
* **kustomize** is a directory that contains the kustomize packages for Kubeflow * **kustomize** is a directory that contains the kustomize packages for Kubeflow
applications. See applications. See
[how Kubeflow uses kustomize](/docs/other-guides/kustomize/). [how Kubeflow uses kustomize](/docs/other-guides/kustomize/).
* The directory is created when you run `kfctl generate`. * The directory is created when you run `kfctl build` or `kfctl apply`.
* You can customize the Kubernetes resources by modifying the manifests and * You can customize the Kubernetes resources by modifying the manifests and
running `kfctl apply` again. running `kfctl apply` again.
We recommend that you check in the contents of your **${KF_DIR}** directory
into source control.
### GCP service accounts ### GCP service accounts
Creating a deployment using `kfctl` creates three service accounts in your The kfctl deployment process creates three service accounts in your
GCP project. These service accounts are created using the [principle of least GCP project. These service accounts follow the [principle of least
privilege](https://en.wikipedia.org/wiki/Principle_of_least_privilege). privilege](https://en.wikipedia.org/wiki/Principle_of_least_privilege).
The three service accounts are: The service accounts are:
* `${KFAPP}-admin` is used for some admin tasks like configuring the load * `${KF_NAME}-admin` is used for some admin tasks like configuring the load
balancers. The principle is that this account is needed to deploy Kubeflow but balancers. The principle is that this account is needed to deploy Kubeflow but
not needed to actually run jobs. not needed to actually run jobs.
* `${KFAPP}-user` is intended to be used by training jobs and models to access * `${KF_NAME}-user` is intended to be used by training jobs and models to access
GCP resources (Cloud Storage, BigQuery, etc.). This account has a much smaller GCP resources (Cloud Storage, BigQuery, etc.). This account has a much smaller
set of privileges compared to `admin`. set of privileges compared to `admin`.
* `${KFAPP}-vm` is used only for the virtual machine (VM) service account. This * `${KF_NAME}-vm` is used only for the virtual machine (VM) service account. This
account has the minimal permissions needed to send metrics and logs to account has the minimal permissions needed to send metrics and logs to
[Stackdriver](https://cloud.google.com/stackdriver/). [Stackdriver](https://cloud.google.com/stackdriver/).
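To check which accounts were created, you could print the expected email addresses and compare them with the output of `gcloud iam service-accounts list --project ${PROJECT}`. The sketch below assumes the standard `<account>@<project-id>.iam.gserviceaccount.com` email form; the helper name `kf_service_accounts` and the sample values are hypothetical.

```shell
# Hypothetical helper: print the expected email of each service account.
# $1 = Kubeflow deployment name (KF_NAME), $2 = GCP project ID.
kf_service_accounts() {
  for role in admin user vm; do
    printf '%s-%s@%s.iam.gserviceaccount.com\n' "$1" "$role" "$2"
  done
}

# Example with placeholder values; substitute your own KF_NAME and PROJECT.
kf_service_accounts "my-kubeflow" "my-gcp-project"
```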
...@@ -221,12 +360,12 @@ The three service accounts are: ...@@ -221,12 +360,12 @@ The three service accounts are:
[end-to-end MNIST tutorial](/docs/gke/gcp-e2e/) or the [end-to-end MNIST tutorial](/docs/gke/gcp-e2e/) or the
[GitHub issue summarization [GitHub issue summarization
example](https://github.com/kubeflow/examples/tree/master/github_issue_summarization). example](https://github.com/kubeflow/examples/tree/master/github_issue_summarization).
* See how to [delete](/docs/gke/deploy/delete-cli) your Kubeflow deployment * See how to [delete](/docs/gke/deploy/delete-cli/) your Kubeflow deployment
using the CLI. using the CLI.
* See how to [customize](/docs/gke/customizing-gke) your Kubeflow * See how to [customize](/docs/gke/customizing-gke/) your Kubeflow
deployment. deployment.
* See how to [upgrade Kubeflow](/docs/upgrading/upgrade/) and how to * See how to [upgrade Kubeflow](/docs/upgrading/upgrade/) and how to
[upgrade or reinstall a Kubeflow Pipelines [upgrade or reinstall a Kubeflow Pipelines
deployment](/docs/pipelines/upgrade/). deployment](/docs/pipelines/upgrade/).
* [Troubleshoot](/docs/gke/troubleshooting-gke) any issues you may * [Troubleshoot](/docs/gke/troubleshooting-gke/) any issues you may
find. find.
...@@ -133,17 +133,27 @@ problems: ...@@ -133,17 +133,27 @@ problems:
recreating and redeploying Kubeflow with a different recreating and redeploying Kubeflow with a different
name. name.
For example, if you originally ran the following `kfctl init` command: For example, if you originally ran the following commands to deploy Kubeflow:
``` ```
kfctl init myapp --project=myproject --config=myconfig -V export KF_NAME=my-app
export BASE_DIR=<path to a base directory>
export KF_DIR=${BASE_DIR}/${KF_NAME}
mkdir -p ${KF_DIR}
cd ${KF_DIR}
kfctl apply -V -f ${CONFIG_FILE}
``` ```
Then rerun `kfctl init` with a different name that you haven't used Then rerun the commands with a different name that you haven't used
before: before:
``` ```
kfctl init myapp-unique --project=myproject --config=myconfig -V export KF_NAME=my-app-unique
export BASE_DIR=<path to a base directory>
export KF_DIR=${BASE_DIR}/${KF_NAME}
mkdir -p ${KF_DIR}
cd ${KF_DIR}
kfctl apply -V -f ${CONFIG_FILE}
``` ```
1. Wait for the load balancer to report the back ends as healthy: 1. Wait for the load balancer to report the back ends as healthy:
......
...@@ -221,70 +221,25 @@ export PROJECT_NUMBER=$(gcloud projects describe ${PROJECT} --format='value(proj ...@@ -221,70 +221,25 @@ export PROJECT_NUMBER=$(gcloud projects describe ${PROJECT} --format='value(proj
PROJECT=<PROJECT> make copy-gcb PROJECT=<PROJECT> make copy-gcb
``` ```
* This is needed because your GKE nodes don't have public internet addresses, * This is needed because your GKE nodes don't have public internet addresses,
so they can't pull images from non-GCR registries. so they can't pull images from non-GCR registries.
* gcloud may return an error even though the job is * gcloud may return an error even though the job is
submitted successfully and will run successfully; submitted successfully and will run successfully;
see [kubeflow/kubeflow#3105](https://github.com/kubeflow/kubeflow/issues/3105) see [kubeflow/kubeflow#3105](https://github.com/kubeflow/kubeflow/issues/3105)
* You can use the Cloud console to monitor your GCB job. * You can use the Cloud console to monitor your GCB job.
1. Follow the guide to [deploying Kubeflow on GCP](/docs/gke/deploy/deploy-cli/).
When you reach the
[setup and deploy step](/docs/gke/deploy/deploy-cli/#set-up-and-deploy),
**skip the `kfctl apply` command** and run the **`kfctl build`** command
instead, as described in that step. Now you can edit the configuration files
before deploying Kubeflow. Retain the environment variables that you set
during the setup, including `${KF_NAME}`, `${KF_DIR}`, and `${CONFIG_FILE}`.
1. Follow the [instructions](https://www.kubeflow.org/docs/gke/deploy/oauth-setup/) for creating an OAuth client 1. Enable private clusters by editing `${KF_DIR}/gcp_config/cluster-kubeflow.yaml` and updating the following two parameters:
1. Create environment variables for IAP OAuth access
```bash
export CLIENT_ID=<CLIENT_ID from OAuth page>
export CLIENT_SECRET=<CLIENT_SECRET from OAuth page>
```
1. Download a `kfctl` release from the
[Kubeflow releases page](https://github.com/kubeflow/kubeflow/releases/).
1. Unpack the tar ball:
```
tar -xvf kfctl_<release tag>_<platform>.tar.gz
```
* **Optional** Add the kfctl binary to your path.
* If you don't add the kfctl binary to your path then in all subsequent
steps you will need to replace `kfctl` with the full path to the binary.
1. Initialize the directory containing your Kubeflow deployment config files
```bash
export PROJECT=<your GCP project ID>
export KFAPP=<your choice of application directory name>
# Run the following commands for the default installation which uses Cloud IAP:
export CONFIG="{{% config-uri-gcp-iap %}}"
kfctl init ${KFAPP} --project=${PROJECT} --config=${CONFIG} -V
# Alternatively, run these commands if you want to use basic authentication:
export CONFIG="{{% config-uri-gcp-basic-auth %}}"
kfctl init ${KFAPP} --project=${PROJECT} --config=${CONFIG} -V --use_basic_auth
cd ${KFAPP}
kfctl generate all -V
```
* **${KFAPP}** - the _name_ of a directory where you want Kubeflow
configurations to be stored. This directory is created when you run
`kfctl init`. If you want a custom deployment name, specify that name here.
The value of this variable becomes the name of your deployment.
The value of this variable cannot be greater than 25 characters. It must
contain just the directory name, not the full path to the directory.
The content of this directory is described in the next section.
* **${PROJECT}** - the ID of the GCP project where you want Kubeflow
deployed.
* `kfctl generate all` attempts to fetch your email address from your
credential. If it can't find a valid email address, you need to pass a
valid email address with flag `--email <your email address>`. This email
address becomes an administrator in the configuration of your Kubeflow
deployment.
1. Enable private clusters by editing `${KFAPP}/gcp_configs/cluster-kubeflow.yaml` and updating the following two parameters:
``` ```
privatecluster: true privatecluster: true
...@@ -293,109 +248,109 @@ export PROJECT_NUMBER=$(gcloud projects describe ${PROJECT} --format='value(proj ...@@ -293,109 +248,109 @@ export PROJECT_NUMBER=$(gcloud projects describe ${PROJECT} --format='value(proj
1. Remove components which are not useful in private clusters: 1. Remove components which are not useful in private clusters:
``` ```
cd ${KFAPP}/kustomize cd ${KF_DIR}/kustomize
kubectl delete -f cert-manager.yaml kubectl delete -f cert-manager.yaml
``` ```
1. Create the deployment: 1. Create the deployment:
``` ```
cd ${KFAPP} cd ${KF_DIR}
kfctl apply platform kfctl apply -V -f ${CONFIG_FILE}
``` ```
* If you get an error **legacy networks not supported**, follow the * If you get an error **legacy networks not supported**, follow the
[troubleshooting guide]( /docs/gke/troubleshooting-gke/#legacy-networks-are-not-supported) to create a new network. [troubleshooting guide]( /docs/gke/troubleshooting-gke/#legacy-networks-are-not-supported) to create a new network.
* You will need to manually create the network as a workaround for [kubeflow/kubeflow#3071](https://github.com/kubeflow/kubeflow/issues/3071) * You will need to manually create the network as a workaround for [kubeflow/kubeflow#3071](https://github.com/kubeflow/kubeflow/issues/3071)
``` ```
cd ${KFAPP}/gcp_configs cd ${KF_DIR}/gcp_config
gcloud --project=${PROJECT} deployment-manager deployments create ${KFAPP}-network --config=network.yaml gcloud --project=${PROJECT} deployment-manager deployments create ${KF_NAME}-network --config=network.yaml
``` ```
* Then edit **gcp_config/cluster.jinja** to add a field **network** in your cluster * Then edit `${KF_DIR}/gcp_config/cluster.jinja` to add a field **network** in your cluster
``` ```
cluster: cluster:
name: {{ CLUSTER_NAME }} name: {{ CLUSTER_NAME }}
network: <name of the new network> network: <name of the new network>
``` ```
* To get the name of the new network run
``` * To get the name of the new network run
gcloud --project=${PROJECT} compute networks list
``` ```
gcloud --project=${PROJECT} compute networks list
```
* The name will contain the value ${KFAPP} * The name will contain the value ${KF_NAME}
1. Update iap-ingress component parameters: 1. Update iap-ingress component parameters:
``` ```
cd ${KFAPP}/kustomize cd ${KF_DIR}/kustomize
gvim basic-auth-ingress.yaml # Or iap-ingress.yaml if you are using IAP gvim basic-auth-ingress.yaml # Or iap-ingress.yaml if you are using IAP
``` ```
* Find and set the `privateGKECluster` parameter to true: * Find and set the `privateGKECluster` parameter to true:
``` ```
privateGKECluster: "true" privateGKECluster: "true"
``` ```
* Then apply your changes: * Then apply your changes:
``` ```
kubectl apply -f basic-auth-ingress.yaml kubectl apply -f basic-auth-ingress.yaml
``` ```
1. Obtain an HTTPS certificate for your ${FQDN} and create a Kubernetes secret with it. 1. Obtain an HTTPS certificate for your ${FQDN} and create a Kubernetes secret with it.
* You can create a self-signed certificate using [kube-rsa](https://github.com/kelseyhightower/kube-rsa) * You can create a self-signed certificate using [kube-rsa](https://github.com/kelseyhightower/kube-rsa)
``` ```
go get github.com/kelseyhightower/kube-rsa go get github.com/kelseyhightower/kube-rsa
kube-rsa ${FQDN} kube-rsa ${FQDN}
``` ```
* The fully qualified domain is the host field specified for your ingress; * The fully qualified domain is the host field specified for your ingress;
you can get it by running you can get it by running
``` ```
cd ${KFAPP}/kustomize cd ${KF_DIR}/kustomize
grep hostname: basic-auth-ingress.yaml grep hostname: basic-auth-ingress.yaml
``` ```
* Then create your Kubernetes secret * Then create your Kubernetes secret
``` ```
kubectl create secret tls --namespace=kubeflow envoy-ingress-tls --cert=ca.pem --key=ca-key.pem kubectl create secret tls --namespace=kubeflow envoy-ingress-tls --cert=ca.pem --key=ca-key.pem
``` ```
* An alternative option is to upgrade to GKE 1.12 or later and use * An alternative option is to upgrade to GKE 1.12 or later and use
[managed certificates](https://cloud.google.com/kubernetes-engine/docs/how-to/managed-certs#migrating_to_google-managed_certificates_from_self-managed_certificates) [managed certificates](https://cloud.google.com/kubernetes-engine/docs/how-to/managed-certs#migrating_to_google-managed_certificates_from_self-managed_certificates)
* See [kubeflow/kubeflow#3079](https://github.com/kubeflow/kubeflow/issues/3079) * See [kubeflow/kubeflow#3079](https://github.com/kubeflow/kubeflow/issues/3079)
1. Update the various kustomize manifests to use `gcr.io` images instead of Docker Hub images. 1. Update the various kustomize manifests to use `gcr.io` images instead of Docker Hub images.
1. Apply all the Kubernetes resources: 1. Apply all the Kubernetes resources:
``` ```
cd ${KFAPP} cd ${KF_DIR}
kfctl apply k8s kfctl apply -V -f ${CONFIG_FILE}
``` ```
1. Wait for Kubeflow to become accessible and then access it at 1. Wait for Kubeflow to become accessible and then access it at
``` ```
https://${FQDN}/ https://${FQDN}/
``` ```
* ${FQDN} is the host associated with your ingress * ${FQDN} is the host associated with your ingress
* You can get it by running `kubectl get ingress` * You can get it by running `kubectl get ingress`
* Follow the [instructions](/docs/gke/deploy/monitor-iap-setup/) to monitor the * Follow the [instructions](/docs/gke/deploy/monitor-iap-setup/) to monitor the
deployment deployment
* It can take 10-20 minutes for the endpoint to become fully available * It can take 10-20 minutes for the endpoint to become fully available
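Because the endpoint can take a while to come up, a small retry loop may be more convenient than checking by hand. The following is a minimal sketch, not part of Kubeflow: `wait_until` is a hypothetical helper that reruns any command until it succeeds or the attempts run out.

```shell
# Hypothetical helper: retry a command until it succeeds.
# $1 = maximum attempts, $2 = seconds between attempts, rest = the command.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only if another attempt follows.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (commented out): probe the endpoint once a minute, up to 20 times.
# -f makes curl fail on HTTP errors (400 and above); -k skips certificate
# verification, which a self-signed certificate would otherwise fail.
# wait_until 20 60 curl -ksSf -o /dev/null "https://${FQDN}/"
```

The probe command is passed as arguments, so the same helper can wrap other checks, such as an `nslookup` of `${FQDN}`.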
## Next steps ## Next steps
......
...@@ -15,6 +15,55 @@ This guide covers troubleshooting specifically for ...@@ -15,6 +15,55 @@ This guide covers troubleshooting specifically for
For more help, try the For more help, try the
[general Kubeflow troubleshooting guide](/docs/other-guides/troubleshooting). [general Kubeflow troubleshooting guide](/docs/other-guides/troubleshooting).
This guide assumes the following settings:
* The `${KF_DIR}` environment variable contains the path to
your Kubeflow application directory, which holds your Kubeflow configuration
files. For example, `/opt/my-kubeflow/`.
```
export KF_DIR=<path to your Kubeflow application directory>
```
* The `${CONFIG_FILE}` environment variable contains the path to your
Kubeflow configuration file.
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_iap.yaml
```
Or:
```
export CONFIG_FILE=${KF_DIR}/kfctl_gcp_basic_auth.yaml
```
* The `${KF_NAME}` environment variable contains the name of your Kubeflow
deployment. You can find the name in your `${CONFIG_FILE}`
configuration file, as the value for the `metadata.name` key.
```
export KF_NAME=<the name of your Kubeflow deployment>
```
* The `${PROJECT}` environment variable contains the ID of your GCP project.
You can find the project ID in
your `${CONFIG_FILE}` configuration file, as the value for the `project` key.
```
export PROJECT=<your GCP project ID>
```
* The `${ZONE}` environment variable contains the GCP zone where your
Kubeflow resources are deployed.
```
export ZONE=<your GCP zone>
```
* For further background about the above settings, see the guide to
[deploying Kubeflow with the CLI](/docs/gke/deploy/deploy-cli).
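Before running the commands in this guide, it may help to confirm that the settings above are in place. The following sketch uses a hypothetical helper, `check_kf_env`, which reports any of the five variables that are unset or empty.

```shell
# Hypothetical helper: report which of the required variables are unset.
# Prints one "missing: NAME" line per problem to stderr and
# returns non-zero if any variable is missing or empty.
check_kf_env() {
  missing=0
  for name in KF_DIR CONFIG_FILE KF_NAME PROJECT ZONE; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

check_kf_env || echo "Set the variables listed above before continuing."
```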
## Troubleshooting Kubeflow deployment on GCP ## Troubleshooting Kubeflow deployment on GCP
Here are some tips for troubleshooting Cloud IAP. Here are some tips for troubleshooting Cloud IAP.
...@@ -43,7 +92,7 @@ the cluster. This section assumes ...@@ -43,7 +92,7 @@ the cluster. This section assumes
you are using [Cloud Endpoints](https://cloud.google.com/endpoints/) and a DNS name of the following pattern you are using [Cloud Endpoints](https://cloud.google.com/endpoints/) and a DNS name of the following pattern
``` ```
https://${DEPLOYMENT_NAME}.endpoints.${PROJECT}.cloud.goog https://${KF_NAME}.endpoints.${PROJECT}.cloud.goog
``` ```
Symptoms: Symptoms:
...@@ -52,11 +101,11 @@ Symptoms: ...@@ -52,11 +101,11 @@ Symptoms:
* nslookup for the domain name doesn't return the IP address associated with the ingress * nslookup for the domain name doesn't return the IP address associated with the ingress
``` ```
nslookup ${DEPLOYMENT_NAME}.endpoints.${PROJECT}.cloud.goog nslookup ${KF_NAME}.endpoints.${PROJECT}.cloud.goog
Server: 127.0.0.1 Server: 127.0.0.1
Address: 127.0.0.1#53 Address: 127.0.0.1#53
** server can't find ${DEPLOYMENT_NAME}.endpoints.${PROJECT}.cloud.goog: NXDOMAIN ** server can't find ${KF_NAME}.endpoints.${PROJECT}.cloud.goog: NXDOMAIN
``` ```
Troubleshooting Troubleshooting
...@@ -64,8 +113,8 @@ Troubleshooting ...@@ -64,8 +113,8 @@ Troubleshooting
1. Check the `cloudendpoints` resource 1. Check the `cloudendpoints` resource
``` ```
kubectl get cloudendpoints -o yaml ${DEPLOYMENT_NAME} kubectl get cloudendpoints -o yaml ${KF_NAME}
kubectl describe cloudendpoints ${DEPLOYMENT_NAME} kubectl describe cloudendpoints ${KF_NAME}
``` ```
* Check if there are errors indicating problems creating the endpoint * Check if there are errors indicating problems creating the endpoint
...@@ -329,7 +378,7 @@ ERROR: (gcloud.deployment-manager.deployments.update) Error in Operation [operat ...@@ -329,7 +378,7 @@ ERROR: (gcloud.deployment-manager.deployments.update) Error in Operation [operat
To fix this we can create a new network: To fix this we can create a new network:
``` ```
cd ${KFAPP} cd ${KF_DIR}
cp .cache/master/deployment/gke/deployment_manager_configs/network.* \ cp .cache/master/deployment/gke/deployment_manager_configs/network.* \
./gcp_config/ ./gcp_config/
``` ```
...@@ -341,8 +390,8 @@ Edit `gcfs.yaml` to use the name of the newly created network. ...@@ -341,8 +390,8 @@ Edit `gcfs.yaml` to use the name of the newly created network.
Apply the changes. Apply the changes.
``` ```
cd ${KFAPP} cd ${KF_DIR}
kfctl apply platform kfctl apply -V -f ${CONFIG_FILE}
``` ```
## CPU platform unavailable in requested zone ## CPU platform unavailable in requested zone
......
https://raw.githubusercontent.com/kubeflow/kubeflow/cfb336b7/bootstrap/config/kfctl_gcp_basic_auth.0.6.2.yaml https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_basic_auth.0.7.0.yaml
\ No newline at end of file \ No newline at end of file
https://raw.githubusercontent.com/kubeflow/kubeflow/cfb336b7/bootstrap/config/kfctl_gcp_iap.0.6.2.yaml https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_iap.0.7.0.yaml
\ No newline at end of file \ No newline at end of file
v0.6.2 v0.7.0-rc.5
\ No newline at end of file \ No newline at end of file