---
description: This document describes how to deploy TDengine on Kubernetes.
---
TDengine is a cloud-native time-series database that can be deployed on Kubernetes. This document gives a step-by-step description of how you can use YAML files to create a TDengine cluster and introduces common operations for TDengine in a Kubernetes environment.
## Overview
To deploy a highly available TDengine cluster for production use, the cluster must meet the following [high availability](https://docs.taosdata.com/tdinternal/high-availability/) requirements:
- 3 or more dnodes: The vnodes of a vgroup in TDengine cannot be placed on the same dnode, so if you create a database with 3 replicas, the cluster must contain at least 3 dnodes.
- 3 mnodes: The mnode is responsible for managing the entire cluster. TDengine creates only one mnode by default; if the dnode hosting that single mnode goes offline, the entire cluster becomes unavailable.
- A database with 3 replicas: The replica setting in TDengine applies at the database level, so 3 replicas satisfy the requirement. In a 3-node cluster, any single dnode can go offline without affecting normal cluster operation. **If 2 dnodes go offline, the cluster becomes unavailable because RAFT cannot complete the leader election.** (Enterprise Edition: in disaster-recovery scenarios, if the data files on any node are damaged, the node can be recovered by restarting its dnode.) The replica count is specified when the database is created, as shown in the example after this list.
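A minimal example of creating such a database once the cluster is running (the database name is illustrative):

```sql
-- Each vgroup of this database keeps 3 replicas spread across 3 dnodes
CREATE DATABASE test REPLICA 3;
```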
## Prerequisites
Before deploying TDengine on Kubernetes, perform the following:
- These steps are compatible with Kubernetes v1.5 and later versions.
- Install and configure minikube, kubectl, and helm.
- Install and deploy Kubernetes and ensure that it can be accessed and used normally. Update any container registries or other services as necessary.
You can download the configuration files in this document from [GitHub](https://github.com/taosdata/TDengine-Operator/tree/3.0/src/tdengine).
## Configure the service
Create a service configuration file named `taosd-service.yaml`. Record the value of `metadata.name` (in this example, `taos`) for use in the next step. Add the ports required by TDengine:
```yaml
---
apiVersion: v1
kind: Service
metadata:
  name: "taos"
spec:
  ports:
    - name: tcp6030
      protocol: "TCP"
      port: 6030
    - name: tcp6041
      protocol: "TCP"
      port: 6041
  selector:
    app: "tdengine"
```
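Apply the manifest to create the service (this assumes the default namespace; add `-n <namespace>` if needed):

```bash
kubectl apply -f taosd-service.yaml
```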
## Configure the service as StatefulSet
Configure the TDengine service as a StatefulSet, the deployment type that Kubernetes recommends for stateful services. Create the `tdengine.yaml` file and set `replicas` to 3. In this example, the time zone is set to Asia/Shanghai and 10 GB of standard storage is allocated per node (see [Storage Classes](https://kubernetes.io/docs/concepts/storage/storage-classes/) for details on configuring a storage class). You can change the configuration based on your environment and business requirements.
Pay special attention to the `startupProbe` configuration. When a dnode has been offline for some time and then restarts, the newly started dnode is temporarily unavailable. If the `startupProbe` window is too short, Kubernetes concludes that the pod is unhealthy and keeps restarting it; the dnode then restarts repeatedly and never recovers. See [Configure Liveness, Readiness and Startup Probes](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/).
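The full manifest is not reproduced here; the following is a minimal sketch of what `tdengine.yaml` can look like. The image tag, probe settings, and storage class name are assumptions to adapt to your environment:

```yaml
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: "tdengine"
  labels:
    app: "tdengine"
spec:
  serviceName: "taos"            # must match the service name created above
  replicas: 3
  selector:
    matchLabels:
      app: "tdengine"
  template:
    metadata:
      labels:
        app: "tdengine"
    spec:
      containers:
        - name: "tdengine"
          image: "tdengine/tdengine:3.0.5.0" # illustrative tag; pin your own version
          ports:
            - name: tcp6030
              protocol: "TCP"
              containerPort: 6030
            - name: tcp6041
              protocol: "TCP"
              containerPort: 6041
          env:
            - name: TZ
              value: "Asia/Shanghai"
          volumeMounts:
            - name: taosdata
              mountPath: /var/lib/taos
          # A generous startup window so that a recovering dnode is not
          # killed and restarted in a loop (see the note above).
          startupProbe:
            tcpSocket:
              port: 6030
            failureThreshold: 360
            periodSeconds: 10
  volumeClaimTemplates:
    - metadata:
        name: taosdata
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: "standard"
        resources:
          requests:
            storage: "10Gi"
```

Apply it with `kubectl apply -f tdengine.yaml`.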
The preceding configuration generates a TDengine cluster with three nodes in which dnodes are automatically configured. You can run the `show dnodes` command to query the nodes in the cluster:
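For example:

```bash
kubectl exec -it tdengine-0 -- taos -s "show dnodes"
```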
In some public clouds, minikube cannot be remotely accessed if it is bound to 127.0.0.1. In this case, use the `kubectl proxy` command to map the port to 0.0.0.0. You can then access the dashboard through a web browser on the public IP address and port of the virtual machine.
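For example (the port and accept-hosts pattern are illustrative):

```bash
kubectl proxy --address='0.0.0.0' --port=8001 --accept-hosts='^.*$'
```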
Create a database with 3 replicas using taosBenchmark, write 100 million rows of data into it, and then query the data:
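A sketch of the benchmark invocation (`-t` and `-n` control the table and per-table row counts, `-a` sets the replica number; the database name is illustrative):

```bash
kubectl exec -it tdengine-0 -- taosBenchmark -I stmt -d test -t 10000 -n 10000 -a 3
```

This writes 10,000 rows into each of 10,000 tables (100 million rows in total) in a database named `test` with 3 replicas, which you can then query from the TDengine CLI.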
## Scaling Out Your Cluster

TDengine clusters support automatic scale-out:
```bash
kubectl scale statefulsets tdengine --replicas=4
```

The `--replicas=4` parameter in the preceding command expands the TDengine cluster to 4 nodes. After running the command, check the pod status with `kubectl get pods -l app=tdengine`, then run `show dnodes` in the TDengine CLI to confirm that the new dnode has joined the cluster.

## Scaling In Your Cluster

When you scale in a TDengine cluster, your data is migrated to different nodes. You must therefore run the `drop dnode` command in the TDengine CLI to remove dnodes before scaling in your Kubernetes environment. If the databases in the cluster have 3 replicas, the number of dnodes remaining after scale-in must still be greater than or equal to 3; otherwise, the `drop dnode` operation is aborted.

Note: In a Kubernetes StatefulSet, the newest pods are always removed first. For this reason, when you scale in your TDengine cluster, ensure that you drop the newest dnodes first; otherwise, the pods will enter an error state.
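For example, to drop the dnode added in the scale-out step (the dnode ID of 4 is illustrative; use the ID reported by `show dnodes`):

```bash
kubectl exec -it tdengine-0 -- taos -s "drop dnode 4"
```

Verify that the dnode has been successfully removed by running the `kubectl exec -i -t tdengine-0 -- taos -s "show dnodes"` command. Then run the following command to remove the pod:

```bash
kubectl scale statefulsets tdengine --replicas=3
```

The newest pod in the deployment is removed. Run the `kubectl get pods -l app=tdengine` command to query the pod status: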
```bash
$ kubectl get pods -l app=tdengine
NAME         READY   STATUS    RESTARTS   AGE
tdengine-0   1/1     Running   0          4m7s
tdengine-1   1/1     Running   0          3m55s
tdengine-2   1/1     Running   0          2m28s
```
After the pod has been removed, manually delete the PersistentVolumeClaim (PVC). Otherwise, future scale-outs will attempt to use existing data.

```bash
kubectl delete pvc taosdata-tdengine-3
```
## Remove a TDengine Cluster

To fully remove a TDengine cluster, you must delete its statefulset, svc, configmap, and pvc entries:
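For example (the label selectors and the configmap name are assumptions based on the manifests above; adjust them to match your deployment):

```bash
kubectl delete statefulset -l app=tdengine
kubectl delete svc -l app=tdengine
kubectl delete pvc -l app=tdengine
kubectl delete configmap taoscfg
```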
> **When deleting PVCs, pay attention to the PV `persistentVolumeReclaimPolicy`. It is recommended to set it to `Delete`, so that when a PVC is deleted, its PV is automatically cleaned up together with the underlying CSI storage resources. If the policy does not delete the PV automatically when the PVC is deleted, then after you delete the PVC and manually clean up the PV, the CSI storage resources corresponding to the PV may not be released.**
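For example, to change the reclaim policy of an existing PV (the PV name is a placeholder):

```bash
kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'
```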
## Troubleshooting

### Error 1

The StatefulSet was scaled in directly, without running `drop dnode` first:

```bash
kubectl scale statefulsets tdengine --replicas=1
```

Because TDengine has not deleted the nodes, the removed pods leave some dnodes in the TDengine cluster offline. In the TDengine CLI, you can see that no database operations succeed:

```
taos> show dnodes;
   id   |            endpoint            | vnodes | support_vnodes |   status   |       create_time       |      offline reason      |
======================================================================================================================================
      2 | tdengine-1.taosd.default.sv... |      1 |             40 | offline    | 2021-06-01 15:56:07.212 | status msg timeout       |
Query OK, 2 row(s) in set (0.000837s)

taos> use test;
Database changed.

taos> insert into t1 values(now, 3);

DB error: Unable to resolve FQDN (0.013874s)
```

For high availability and reliability of TDengine in a Kubernetes environment, with respect to hardware damage and disaster recovery, protection is provided at two levels:

1. The disaster recovery capability of the underlying distributed block storage: popular distributed block storage systems such as Ceph support multiple replicas and can extend storage replicas across racks, cabinets, server rooms, and data centers (or you can directly use the block storage services provided by public cloud vendors).
2. Disaster recovery in TDengine itself: in TDengine Enterprise, when a dnode goes permanently offline (for example, a bare-metal node's disk is damaged and the data is lost), a blank dnode can be started to take over the work of the original dnode.
Finally, you are welcome to try [TDengine Cloud](https://cloud.tdengine.com/), the one-stop, fully managed TDengine cloud service.

> TDengine Cloud is a minimalist, fully managed time-series data processing cloud platform developed from the open-source time-series database TDengine. In addition to a high-performance time-series database, it provides system functions such as caching, subscription, and stream computing, as well as convenient and secure data sharing and numerous other enterprise-grade capabilities. It allows enterprises in IoT, Industrial Internet, finance, and IT operations monitoring to significantly reduce labor and operating costs for managing time-series data.