03-k8s.md 15.5 KB
Newer Older
D
danielclow 已提交
1 2
---
title: Deploying a TDengine Cluster in Kubernetes
D
danielclow 已提交
3 4
sidebar_label: Kubernetes
description: This document describes how to deploy TDengine on Kubernetes.
D
danielclow 已提交
5 6 7 8 9 10 11 12
---

TDengine is a cloud-native time-series database that can be deployed on Kubernetes. This document gives a step-by-step description of how you can use YAML files to create a TDengine cluster and introduces common operations for TDengine in a Kubernetes environment. 

## Prerequisites

Before deploying TDengine on Kubernetes, perform the following:

13
* Current steps are compatible with Kubernetes v1.5 and later version.
D
danielclow 已提交
14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104
* Install and configure minikube, kubectl, and helm.
* Install and deploy Kubernetes and ensure that it can be accessed and used normally. Update any container registries or other services as necessary.

You can download the configuration files in this document from [GitHub](https://github.com/taosdata/TDengine-Operator/tree/3.0/src/tdengine).

## Configure the service

Create a service configuration file named `taosd-service.yaml`. Record the value of `metadata.name` (in this example, `taos`) for use in the next step. Add the ports required by TDengine:

```yaml
---
apiVersion: v1
kind: Service
metadata:
  name: "taosd"
  labels:
    app: "tdengine"
spec:
  ports:
    - name: tcp6030
      - protocol: "TCP"
      port: 6030
    - name: tcp6041
      - protocol: "TCP"
      port: 6041
  selector:
    app: "tdengine"
```

## Configure the service as StatefulSet

Configure the TDengine service as a StatefulSet.
Create the `tdengine.yaml` file and set `replicas` to 3. In this example, the region is set to Asia/Shanghai and 10 GB of standard storage are allocated per node. You can change the configuration based on your environment and business requirements. 

```yaml
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: "tdengine"
  labels:
    app: "tdengine"
spec:
  serviceName: "taosd"
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: "tdengine"
  template:
    metadata:
      name: "tdengine"
      labels:
        app: "tdengine"
    spec:
      containers:
        - name: "tdengine"
          image: "tdengine/tdengine:3.0.0.0"
          imagePullPolicy: "IfNotPresent"
          ports:
            - name: tcp6030
              - protocol: "TCP"
              containerPort: 6030
            - name: tcp6041
              - protocol: "TCP"
              containerPort: 6041
          env:
            # POD_NAME for FQDN config
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            # SERVICE_NAME and NAMESPACE for fqdn resolve
            - name: SERVICE_NAME
              value: "taosd"
            - name: STS_NAME
              value: "tdengine"
            - name: STS_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            # TZ for timezone settings, we recommend to always set it.
            - name: TZ
              value: "Asia/Shanghai"
            # TAOS_ prefix will configured in taos.cfg, strip prefix and camelCase.
            - name: TAOS_SERVER_PORT
              value: "6030"
            # Must set if you want a cluster.
            - name: TAOS_FIRST_EP
              value: "$(STS_NAME)-0.$(SERVICE_NAME).$(STS_NAMESPACE).svc.cluster.local:$(TAOS_SERVER_PORT)"
105
            # TAOS_FQDN should always be set in k8s env.
D
danielclow 已提交
106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394
            - name: TAOS_FQDN
              value: "$(POD_NAME).$(SERVICE_NAME).$(STS_NAMESPACE).svc.cluster.local"
          volumeMounts:
            - name: taosdata
              mountPath: /var/lib/taos
          readinessProbe:
            exec:
              command:
                - taos-check
            initialDelaySeconds: 5
            timeoutSeconds: 5000
          livenessProbe:
            exec:
              command:
                - taos-check
            initialDelaySeconds: 15
            periodSeconds: 20
  volumeClaimTemplates:
    - metadata:
        name: taosdata
      spec:
        accessModes:
          - "ReadWriteOnce"
        storageClassName: "standard"
        resources:
          requests:
            storage: "10Gi"
```

## Use kubectl to deploy TDengine

Run the following commands:

```bash
kubectl apply -f taosd-service.yaml
kubectl apply -f tdengine.yaml
```

The preceding configuration generates a TDengine cluster with three nodes in which dnodes are automatically configured. You can run the `show dnodes` command to query the nodes in the cluster:

```bash
kubectl exec -i -t tdengine-0 -- taos -s "show dnodes"
kubectl exec -i -t tdengine-1 -- taos -s "show dnodes"
kubectl exec -i -t tdengine-2 -- taos -s "show dnodes"
```

The output is as follows:

```
taos> show dnodes
   id   |            endpoint            | vnodes | support_vnodes |   status   |       create_time       |              note              |
============================================================================================================================================
      1 | tdengine-0.taosd.default.sv... |      0 |            256 | ready      | 2022-08-10 13:14:57.285 |                                |
      2 | tdengine-1.taosd.default.sv... |      0 |            256 | ready      | 2022-08-10 13:15:11.302 |                                |
      3 | tdengine-2.taosd.default.sv... |      0 |            256 | ready      | 2022-08-10 13:15:23.290 |                                |
Query OK, 3 rows in database (0.003655s)
```

## Enable port forwarding

The kubectl port forwarding feature allows applications to access the TDengine cluster running on Kubernetes.

```
kubectl port-forward tdengine-0 6041:6041 &
```

Use curl to verify that the TDengine REST API is working on port 6041:

```
$ curl -u root:taosdata -d "show databases" 127.0.0.1:6041/rest/sql
Handling connection for 6041
{"code":0,"column_meta":[["name","VARCHAR",64],["create_time","TIMESTAMP",8],["vgroups","SMALLINT",2],["ntables","BIGINT",8],["replica","TINYINT",1],["strict","VARCHAR",4],["duration","VARCHAR",10],["keep","VARCHAR",32],["buffer","INT",4],["pagesize","INT",4],["pages","INT",4],["minrows","INT",4],["maxrows","INT",4],["comp","TINYINT",1],["precision","VARCHAR",2],["status","VARCHAR",10],["retention","VARCHAR",60],["single_stable","BOOL",1],["cachemodel","VARCHAR",11],["cachesize","INT",4],["wal_level","TINYINT",1],["wal_fsync_period","INT",4],["wal_retention_period","INT",4],["wal_retention_size","BIGINT",8],["wal_roll_period","INT",4],["wal_segment_size","BIGINT",8]],"data":[["information_schema",null,null,16,null,null,null,null,null,null,null,null,null,null,null,"ready",null,null,null,null,null,null,null,null,null,null],["performance_schema",null,null,10,null,null,null,null,null,null,null,null,null,null,null,"ready",null,null,null,null,null,null,null,null,null,null]],"rows":2} 
```

## Enable the dashboard for visualization

 The minikube dashboard command enables visualized cluster management.

```
$ minikube dashboard
* Verifying dashboard health ...
* Launching proxy ...
* Verifying proxy health ...
* Opening http://127.0.0.1:46617/api/v1/namespaces/kubernetes-dashboard/services/http:kubernetes-dashboard:/proxy/ in your default browser...
http://127.0.0.1:46617/api/v1/namespaces/kubernetes-dashboard/services/http:kubernetes-dashboard:/proxy/
```

In some public clouds, minikube cannot be remotely accessed if it is bound to 127.0.0.1. In this case, use the kubectl proxy command to map the port to 0.0.0.0. Then, you can access the dashboard by using a web browser to open the dashboard URL above on the public IP address and port of the virtual machine.

```
$ kubectl proxy --accept-hosts='^.*$' --address='0.0.0.0'
```

## Scaling Out Your Cluster

TDengine clusters can scale automatically:

```bash
kubectl scale statefulsets tdengine --replicas=4
```

The preceding command increases the number of replicas to 4. After running this command, query the pod status:

```bash
kubectl get pods -l app=tdengine
```

The output is as follows:

```
NAME         READY   STATUS    RESTARTS   AGE
tdengine-0   1/1     Running   0          161m
tdengine-1   1/1     Running   0          161m
tdengine-2   1/1     Running   0          32m
tdengine-3   1/1     Running   0          32m
```

The status of all pods is Running. Once the pod status changes to Ready, you can check the dnode status:

```bash
kubectl exec -i -t tdengine-3 -- taos -s "show dnodes"
```

The following output shows that the TDengine cluster has been expanded to 4 replicas:

```
taos> show dnodes
   id   |            endpoint            | vnodes | support_vnodes |   status   |       create_time       |              note              |
============================================================================================================================================
      1 | tdengine-0.taosd.default.sv... |      0 |            256 | ready      | 2022-08-10 13:14:57.285 |                                |
      2 | tdengine-1.taosd.default.sv... |      0 |            256 | ready      | 2022-08-10 13:15:11.302 |                                |
      3 | tdengine-2.taosd.default.sv... |      0 |            256 | ready      | 2022-08-10 13:15:23.290 |                                |
      4 | tdengine-3.taosd.default.sv... |      0 |            256 | ready      | 2022-08-10 13:33:16.039 |                                |
Query OK, 4 rows in database (0.008377s)
```

## Scaling In Your Cluster

When you scale in a TDengine cluster, your data is migrated to different nodes. You must run the drop dnodes command in TDengine to remove dnodes before scaling in your Kubernetes environment.

Note: In a Kubernetes StatefulSet service, the newest pods are always removed first. For this reason, when you scale in your TDengine cluster, ensure that you drop the newest dnodes.

```
$ kubectl exec -i -t tdengine-0 -- taos -s "drop dnode 4"
```

```bash
$ kubectl exec -it tdengine-0 -- taos -s "show dnodes"

taos> show dnodes
   id   |            endpoint            | vnodes | support_vnodes |   status   |       create_time       |              note              |
============================================================================================================================================
      1 | tdengine-0.taosd.default.sv... |      0 |            256 | ready      | 2022-08-10 13:14:57.285 |                                |
      2 | tdengine-1.taosd.default.sv... |      0 |            256 | ready      | 2022-08-10 13:15:11.302 |                                |
      3 | tdengine-2.taosd.default.sv... |      0 |            256 | ready      | 2022-08-10 13:15:23.290 |                                |
Query OK, 3 rows in database (0.004861s)
```

Verify that the dnode have been successfully removed by running the `kubectl exec -i -t tdengine-0 -- taos -s "show dnodes"` command. Then run the following command to remove the pod:

```
kubectl scale statefulsets tdengine --replicas=3
```

The newest pod in the deployment is removed. Run the `kubectl get pods -l app=tdengine` command to query the pod status:

```
$ kubectl get pods -l app=tdengine
NAME READY STATUS RESTARTS AGE
tdengine-0 1/1 Running 0 4m7s
tdengine-1 1/1 Running 0 3m55s
tdengine-2 1/1 Running 0 2m28s
```

After the pod has been removed, manually delete the PersistentVolumeClaim (PVC). Otherwise, future scale-outs will attempt to use existing data.

```bash
$ kubectl delete pvc taosdata-tdengine-3
```

Your cluster has now been safely scaled in, and you can scale it out again as necessary.

```bash
$ kubectl scale statefulsets tdengine --replicas=4
statefulset.apps/tdengine scaled
it@k8s-2:~/TDengine-Operator/src/tdengine$ kubectl get pods -l app=tdengine
NAME READY STATUS RESTARTS AGE
tdengine-0 1/1 Running 0 35m
tdengine-1 1/1 Running 0 34m
tdengine-2 1/1 Running 0 12m
tdengine-3 0/1 ContainerCreating 0 4s
it@k8s-2:~/TDengine-Operator/src/tdengine$ kubectl get pods -l app=tdengine
NAME READY STATUS RESTARTS AGE
tdengine-0 1/1 Running 0 35m
tdengine-1 1/1 Running 0 34m
tdengine-2 1/1 Running 0 12m
tdengine-3 0/1 Running 0 7s
it@k8s-2:~/TDengine-Operator/src/tdengine$ kubectl exec -it tdengine-0 -- taos -s "show dnodes"

taos> show dnodes
id | endpoint | vnodes | support_vnodes | status | create_time | offline reason |
======================================================================================================================================
1 | tdengine-0.taosd.default.sv... | 0 | 4 | ready | 2022-07-25 17:38:49.012 | |
2 | tdengine-1.taosd.default.sv... | 1 | 4 | ready | 2022-07-25 17:39:01.517 | |
5 | tdengine-2.taosd.default.sv... | 0 | 4 | ready | 2022-07-25 18:01:36.479 | |
6 | tdengine-3.taosd.default.sv... | 0 | 4 | ready | 2022-07-25 18:13:54.411 | |
Query OK, 4 row(s) in set (0.001348s)
```

## Remove a TDengine Cluster

To fully remove a TDengine cluster, you must delete its statefulset, svc, configmap, and pvc entries:

```bash
kubectl delete statefulset -l app=tdengine
kubectl delete svc -l app=tdengine
kubectl delete pvc -l app=tdengine
kubectl delete configmap taoscfg

```

## Troubleshooting

### Error 1

If you remove a pod without first running `drop dnode`, some TDengine nodes will go offline.

```
$ kubectl exec -it tdengine-0 -- taos -s "show dnodes"

taos> show dnodes
id | endpoint | vnodes | support_vnodes | status | create_time | offline reason |
======================================================================================================================================
1 | tdengine-0.taosd.default.sv... | 0 | 4 | ready | 2022-07-25 17:38:49.012 | |
2 | tdengine-1.taosd.default.sv... | 1 | 4 | ready | 2022-07-25 17:39:01.517 | |
5 | tdengine-2.taosd.default.sv... | 0 | 4 | offline | 2022-07-25 18:01:36.479 | status msg timeout |
6 | tdengine-3.taosd.default.sv... | 0 | 4 | offline | 2022-07-25 18:13:54.411 | status msg timeout |
Query OK, 4 row(s) in set (0.001323s)
```

### Error 2

If the number of nodes after a scale-in is less than the value of the replica parameter, the cluster will go down:

Create a database with replica set to 2 and add data.

```bash
kubectl exec -i -t tdengine-0 -- \
  taos -s \
  "create database if not exists test replica 2;
   use test;
   create table if not exists t1(ts timestamp, n int);
   insert into t1 values(now, 1)(now+1s, 2);"


```

Scale in to one node:

```bash
kubectl scale statefulsets tdengine --replicas=1

```

In the TDengine CLI, you can see that no database operations succeed:

```
taos> show dnodes;
   id   |           end_point            | vnodes | cores  |   status   | role  |       create_time       |      offline reason      |
======================================================================================================================================
      1 | tdengine-0.taosd.default.sv... |      2 |     40 | ready      | any   | 2021-06-01 15:55:52.562 |                          |
      2 | tdengine-1.taosd.default.sv... |      1 |     40 | offline    | any   | 2021-06-01 15:56:07.212 | status msg timeout       |
Query OK, 2 row(s) in set (0.000845s)

taos> show dnodes;
   id   |           end_point            | vnodes | cores  |   status   | role  |       create_time       |      offline reason      |
======================================================================================================================================
      1 | tdengine-0.taosd.default.sv... |      2 |     40 | ready      | any   | 2021-06-01 15:55:52.562 |                          |
      2 | tdengine-1.taosd.default.sv... |      1 |     40 | offline    | any   | 2021-06-01 15:56:07.212 | status msg timeout       |
Query OK, 2 row(s) in set (0.000837s)

taos> use test;
Database changed.

taos> insert into t1 values(now, 3);

DB error: Unable to resolve FQDN (0.013874s)

```