Merge branch '3.0' of https://github.com/taosdata/TDengine into fix/long_query

2b584d16 · Hongze Cheng · f845a014 · a1cfa351 · 2b584d16 · 2b584d16
12 changed files
--- a/README.md
+++ b/README.md
@@ -15,11 +15,11 @@
 [![Coverage Status](https://coveralls.io/repos/github/taosdata/TDengine/badge.svg?branch=develop)](https://coveralls.io/github/taosdata/TDengine?branch=develop)
 [![CII Best Practices](https://bestpractices.coreinfrastructure.org/projects/4201/badge)](https://bestpractices.coreinfrastructure.org/projects/4201)

-English | [简体中文](README-CN.md) | We are hiring, check [here](https://tdengine.com/careers)
+English | [简体中文](README-CN.md) | [Learn more about TSDB](https://tdengine.com/tsdb)

 # What is TDengine？

-TDengine is an open source, high-performance, cloud native [time-series database](https://tdengine.com/tsdb/what-is-a-time-series-database/) optimized for Internet of Things (IoT), Connected Cars, and Industrial IoT. It enables efficient, real-time data ingestion, processing, and monitoring of TB and even PB scale data per day, generated by billions of sensors and data collectors. TDengine differentiates itself from other time-seires databases with the following advantages:
+TDengine is an open source, high-performance, cloud native [time-series database](https://tdengine.com/tsdb/) optimized for Internet of Things (IoT), Connected Cars, and Industrial IoT. It enables efficient, real-time data ingestion, processing, and monitoring of TB and even PB scale data per day, generated by billions of sensors and data collectors. TDengine differentiates itself from other time-series databases with the following advantages:

 - **[High-Performance](https://tdengine.com/tdengine/high-performance-time-series-database/)**: TDengine is the only time-series database to solve the high cardinality issue to support billions of data collection points while out performing other time-series databases for data ingestion, querying and data compression.

@@ -33,6 +33,8 @@ TDengine is an open source, high-performance, cloud native [time-series database

 - **[Open Source](https://tdengine.com/tdengine/open-source-time-series-database/)**: TDengine’s core modules, including cluster feature, are all available under open source licenses. It has gathered 18.8k stars on GitHub. There is an active developer community, and over 139k running instances worldwide.

+For a full list of TDengine's competitive advantages, please [check here](https://tdengine.com/tdengine/).
+
 # Documentation

 For user manual, system design and architecture, please refer to [TDengine Documentation](https://docs.tdengine.com) ([TDengine 文档](https://docs.taosdata.com))
@@ -319,6 +321,7 @@ TDengine provides abundant developing tools for users to develop on TDengine. Fo

 Please follow the [contribution guidelines](CONTRIBUTING.md) to contribute to the project.

-# Join TDengine WeChat Group
+# Join TDengine User Community

-Add WeChat “tdengine” to join the group，you can communicate with other users.
+- Join the [TDengine Discord Channel](https://discord.com/invite/VZdSuUg4pS?utm_id=discord)
+- Join the WeChat group by adding WeChat “tdengine”
--- a/cmake/cmake.version
+++ b/cmake/cmake.version
@@ -2,7 +2,7 @@
 IF (DEFINED VERNUMBER)
  SET(TD_VER_NUMBER ${VERNUMBER})
 ELSE ()
-  SET(TD_VER_NUMBER "3.0.1.0")
+  SET(TD_VER_NUMBER "3.0.1.1")
 ENDIF ()

 IF (DEFINED VERCOMPATIBLE)

--- a/docs/en/01-index.md
+++ b/docs/en/01-index.md
@@ -4,7 +4,7 @@ sidebar_label: Documentation Home
 slug: /
 ---

-TDengine is an [open-source](https://tdengine.com/tdengine/open-source-time-series-database/), [cloud-native](https://tdengine.com/tdengine/cloud-native-time-series-database/) time-series database optimized for the Internet of Things (IoT), Connected Cars, and Industrial IoT. It enables efficient, real-time data ingestion, processing, and monitoring of TB and even PB scale data per day, generated by billions of sensors and data collectors. This document is the TDengine user manual. It introduces the basic, as well as novel concepts, in TDengine, and also talks in detail about installation, features, SQL, APIs, operation, maintenance, kernel design, and other topics. It’s written mainly for architects, developers, and system administrators.
+TDengine is an [open-source](https://tdengine.com/tdengine/open-source-time-series-database/), [cloud-native](https://tdengine.com/tdengine/cloud-native-time-series-database/) [time-series database](https://tdengine.com/tsdb/) optimized for the Internet of Things (IoT), Connected Cars, and Industrial IoT. It enables efficient, real-time data ingestion, processing, and monitoring of TB and even PB scale data per day, generated by billions of sensors and data collectors. This document is the TDengine user manual. It introduces the basic, as well as novel concepts, in TDengine, and also talks in detail about installation, features, SQL, APIs, operation, maintenance, kernel design, and other topics. It’s written mainly for architects, developers, and system administrators.

 To get an overview of TDengine, such as a feature list, benchmarks, and competitive advantages, please browse through the [Introduction](./intro) section.

@@ -22,6 +22,8 @@ If you want to know more about TDengine tools, the REST API, and connectors for

 If you are very interested in the internal design of TDengine, please read the chapter [Inside TDengine](./tdinternal), which introduces the cluster design, data partitioning, sharding, writing, and reading processes in detail. If you want to study TDengine code or even contribute code, please read this chapter carefully.

+For a more general introduction to time-series databases, please read [this series of articles](https://tdengine.com/tsdb/). To learn more about TDengine's competitive advantages, please read [this series of blogs](https://tdengine.com/tdengine/).
+
 TDengine is an open-source database, and we would love for you to be a part of TDengine. If you find any errors in the documentation or see parts where more clarity or elaboration is needed, please click "Edit this page" at the bottom of each page to edit it directly.

 Together, we make a difference!
--- a/docs/en/02-intro/index.md
+++ b/docs/en/02-intro/index.md
@@ -3,7 +3,7 @@ title: Introduction
 toc_max_heading_level: 2
 ---

-TDengine is an open source, high-performance, cloud native [time-series database](https://tdengine.com/tsdb/) optimized for Internet of Things (IoT), Connected Cars, and Industrial IoT. Its code, including its cluster feature is open source under GNU AGPL v3.0. Besides the database engine, it provides [caching](../develop/cache), [stream processing](../develop/stream), [data subscription](../develop/tmq) and other functionalities to reduce the system complexity and cost of development and operation.
+TDengine is an [open source](https://tdengine.com/tdengine/open-source-time-series-database/), [high-performance](https://tdengine.com/tdengine/high-performance-time-series-database/), [cloud native](https://tdengine.com/tdengine/cloud-native-time-series-database/) [time-series database](https://tdengine.com/tsdb/) optimized for Internet of Things (IoT), Connected Cars, and Industrial IoT. Its code, including its cluster feature is open source under GNU AGPL v3.0. Besides the database engine, it provides [caching](../develop/cache), [stream processing](../develop/stream), [data subscription](../develop/tmq) and other functionalities to reduce the system complexity and cost of development and operation.

 This section introduces the major features, competitive advantages, typical use-cases and benchmarks to help you get a high level overview of TDengine.

@@ -43,7 +43,7 @@ For more details on features, please read through the entire documentation.

 ## Competitive Advantages

-By making full use of [characteristics of time series data](https://tdengine.com/tsdb/characteristics-of-time-series-data/), TDengine differentiates itself from other time series databases, with the following advantages.
+By making full use of [characteristics of time series data](https://tdengine.com/tsdb/characteristics-of-time-series-data/), TDengine differentiates itself from other [time series databases](https://tdengine.com/tsdb), with the following advantages.

 - **[High-Performance](https://tdengine.com/tdengine/high-performance-time-series-database/)**: TDengine is the only time-series database to solve the high cardinality issue to support billions of data collection points while out performing other time-series databases for data ingestion, querying and data compression.

@@ -127,3 +127,8 @@ As a high-performance, scalable and SQL supported time-series database, TDengine
 - [TDengine vs OpenTSDB](https://tdengine.com/2019/09/12/710.html)
 - [TDengine vs Cassandra](https://tdengine.com/2019/09/12/708.html)
 - [TDengine vs InfluxDB](https://tdengine.com/2019/09/12/706.html)
+
+## Further Reading
+- [Introduction to Time-Series Database](https://tdengine.com/tsdb/)
+- [Introduction to TDengine competitive advantages](https://tdengine.com/tdengine/)
+ 
--- a/docs/en/12-taos-sql/06-select.md
+++ b/docs/en/12-taos-sql/06-select.md
@@ -355,7 +355,7 @@ SELECT ... FROM (SELECT ... FROM ...) ...;
 - Compared to the non-nested query, the functionality that can be used in the outer query has the following restrictions:
  - Functions
    - If the result set returned by the inner query doesn't contain timestamp column, then functions relying on timestamp can't be used in the outer query, like INTERP,DERIVATIVE, IRATE, LAST_ROW, FIRST, LAST, TWA, STATEDURATION, TAIL, UNIQUE.
-    - If the result set returned by the inner query are not valid time series, then functions relying on time series can't be used in the outer query, like LEASTSQUARES, ELAPSED, INTERP, DERIVATIVE, IRATE, TWA, DIFF, STATECOUNT, STATEDURATION, CSUM, MAVG, TAIL, UNIQUE. 
+    - If the result set returned by the inner query is not sorted by timestamp, then functions that rely on timestamp-ordered data can't be used in the outer query, such as LEASTSQUARES, ELAPSED, INTERP, DERIVATIVE, IRATE, TWA, DIFF, STATECOUNT, STATEDURATION, CSUM, MAVG, TAIL, UNIQUE.
    - Functions that need to scan the data twice can't be used in the outer query, like PERCENTILE.

 :::
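
Review note: the reworded restriction may be easier to see with a small example. The Python sketch below is not TDengine code. `diff` is a hypothetical stand-in for a DIFF-style function that subtracts each value from the one after it in arrival order, and the sample rows are made up. It shows that the same rows give different results depending on whether the inner result is ordered by timestamp.

```python
# Hypothetical stand-in for a DIFF-style function: it subtracts each value
# from the one that follows it, in the order the rows arrive.
def diff(rows):
    values = [value for _, value in rows]
    return [b - a for a, b in zip(values, values[1:])]

# Made-up (timestamp, value) rows, as an inner query might return them.
unsorted_rows = [(3, 30.0), (1, 10.0), (2, 25.0)]
sorted_rows = sorted(unsorted_rows, key=lambda row: row[0])

diff_unsorted = diff(unsorted_rows)  # differences follow arrival order
diff_sorted = diff(sorted_rows)      # differences follow time order

print(diff_unsorted)
print(diff_sorted)
```

Only the second result describes how the value changed over time, which is presumably why such functions are rejected when the inner result is not ordered by timestamp.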

--- a/docs/en/20-third-party/13-Jupyter.md
+++ b/docs/en/20-third-party/13-Jupyter.md
+---
+sidebar_label: JupyterLab
+title: Connect JupyterLab to TDengine
+---
+
+JupyterLab is the next generation of the ubiquitous Jupyter Notebook. In this note we show you how to install the TDengine Python connector to connect to TDengine in JupyterLab. You can then insert data and perform queries against the TDengine instance within JupyterLab.
+
+## Install JupyterLab
+Installing JupyterLab is straightforward. Installation instructions can be found at:  
+
+https://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html  
+
+If you prefer not to follow the link, the instructions are reproduced here.  
+Jupyter's preferred Python package manager is pip, so we show the instructions for pip first.  
+You can also use **conda** or **pipenv** if you are managing Python environments.
+````
+pip install jupyterlab
+````
+
+For **conda** you can run:
+````
+conda install -c conda-forge jupyterlab
+````
+
+For **pipenv** you can run:
+````
+pipenv install jupyterlab
+pipenv shell
+````
+
+## Run JupyterLab
+You can start JupyterLab from the command line by running:
+````
+jupyter lab
+````
+This will automatically launch your default browser and connect to your JupyterLab instance, usually on port 8888.
+
+## Install the TDengine Python connector
+You can now install the TDengine Python connector as follows.  
+
+Start a new Python kernel in JupyterLab.  
+
+If using **conda** run the following:
+````
+# Install a conda package in the current Jupyter kernel
+import sys
+!conda install --yes --prefix {sys.prefix} taospy
+````
+If using **pip** run the following:
+````
+# Install a pip package in the current Jupyter kernel
+import sys
+!{sys.executable} -m pip install taospy
+````
+
+## Connect to TDengine
+You can find detailed examples of using the Python connector in the TDengine documentation.
+Once you have installed the TDengine Python connector in your JupyterLab kernel, connecting to TDengine works the same way as it does outside JupyterLab.
+Each TDengine instance has a database called "log", which holds monitoring information about the instance.
+In the "log" database there is a [supertable](https://docs.tdengine.com/taos-sql/stable/) called "disks_info".  
+
+The structure of this table is as follows:
+````
+taos> desc disks_info;
+             Field              |         Type         |   Length    |   Note   |
+=================================================================================
+ ts                             | TIMESTAMP            |           8 |          |
+ datadir_l0_used                | FLOAT                |           4 |          |
+ datadir_l0_total               | FLOAT                |           4 |          |
+ datadir_l1_used                | FLOAT                |           4 |          |
+ datadir_l1_total               | FLOAT                |           4 |          |
+ datadir_l2_used                | FLOAT                |           4 |          |
+ datadir_l2_total               | FLOAT                |           4 |          |
+ dnode_id                       | INT                  |           4 | TAG      |
+ dnode_ep                       | BINARY               |         134 | TAG      |
+Query OK, 9 row(s) in set (0.000238s)
+````
+
+The code below is used to fetch data from this table into a pandas DataFrame.
+
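+````
+import taos
+import pandas
+
+def sqlQuery(conn):
+    # Read up to 500 rows of disk monitoring data into a DataFrame.
+    df: pandas.DataFrame = pandas.read_sql("select * from log.disks_info limit 500", conn)
+    return df
+
+# Called without arguments, taos.connect() uses the client's default settings.
+conn = taos.connect()
+
+result = sqlQuery(conn)
+
+print(result)
+
+# Release the connection once the data has been fetched.
+conn.close()
+````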
+
+TDengine has connectors for various languages, including Node.js, Go, and PHP. Jupyter kernels for these languages are listed [here](https://github.com/jupyter/jupyter/wiki/Jupyter-kernels).
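
Review note: the new page installs `taospy` with `{sys.executable} -m pip` rather than a bare `pip`. The sketch below illustrates the likely reason, assuming the goal is to install into the same interpreter that runs the kernel. `pip_install_command` is a hypothetical helper introduced only for this note.

```python
import sys

def pip_install_command(package):
    """Build the argument list for installing `package` into the running interpreter."""
    # sys.executable is the path of the Python interpreter running this kernel,
    # so "-m pip" targets that interpreter's environment rather than whichever
    # pip happens to be first on PATH.
    return [sys.executable, "-m", "pip", "install", package]

command = pip_install_command("taospy")
print(command)

# Outside a notebook cell, one way to run it would be:
#   import subprocess
#   subprocess.check_call(command)
```

Inside a notebook, the `!{sys.executable} -m pip install taospy` cell shown in the page does the same thing through the shell.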
--- a/docs/zh/12-taos-sql/06-select.md
+++ b/docs/zh/12-taos-sql/06-select.md
@@ -356,7 +356,7 @@ SELECT ... FROM (SELECT ... FROM ...) ...;
 - 与非嵌套的查询语句相比，外层查询所能支持的功能特性存在如下限制：
  - 计算函数部分：
    - 如果内层查询的结果数据未提供时间戳，那么计算过程隐式依赖时间戳的函数在外层会无法正常工作。例如：INTERP, DERIVATIVE, IRATE, LAST_ROW, FIRST, LAST, TWA, STATEDURATION, TAIL, UNIQUE。
-    - 如果内层查询的结果数据不是有效的时间序列，那么计算过程依赖数据为时间序列的函数在外层会无法正常工作。例如：LEASTSQUARES, ELAPSED, INTERP, DERIVATIVE, IRATE, TWA, DIFF, STATECOUNT, STATEDURATION, CSUM, MAVG, TAIL, UNIQUE。
+    - 如果内层查询的结果数据不是按时间戳有序，那么计算过程依赖数据按时间有序的函数在外层会无法正常工作。例如：LEASTSQUARES, ELAPSED, INTERP, DERIVATIVE, IRATE, TWA, DIFF, STATECOUNT, STATEDURATION, CSUM, MAVG, TAIL, UNIQUE。
    - 计算过程需要两遍扫描的函数，在外层查询中无法正常工作。例如：此类函数包括：PERCENTILE。

 :::

--- a/docs/zh/28-releases/01-tdengine.md
+++ b/docs/zh/28-releases/01-tdengine.md
@@ -6,6 +6,11 @@ description: TDengine 发布历史、Release Notes 及下载链接

 import Release from "/components/ReleaseV3";

+
+## 3.0.1.1
+
+<Release type="tdengine" version="3.0.1.1" />
+
 ## 3.0.1.0

 <Release type="tdengine" version="3.0.1.0" />

--- a/docs/zh/28-releases/02-tools.md
+++ b/docs/zh/28-releases/02-tools.md
@@ -6,6 +6,10 @@ description: taosTools 的发布历史、Release Notes 和下载链接

 import Release from "/components/ReleaseV3";

+## 2.2.0
+
+<Release type="tools" version="2.2.0" />
+
 ## 2.1.3

 <Release type="tools" version="2.1.3" />
--- a/source/client/src/clientHb.c
+++ b/source/client/src/clientHb.c
@@ -173,7 +173,8 @@ static int32_t hbQueryHbRspHandle(SAppHbMgr *pAppHbMgr, SClientHbRsp *pRsp) {
      pTscObj->pAppInfo->totalDnodes = pRsp->query->totalDnodes;
      pTscObj->pAppInfo->onlineDnodes = pRsp->query->onlineDnodes;
      pTscObj->connId = pRsp->query->connId;
-      tscTrace("conn %p hb rsp, dnodes %d/%d", pTscObj->connId, pTscObj->pAppInfo->onlineDnodes, pTscObj->pAppInfo->totalDnodes);
+      tscTrace("conn %p hb rsp, dnodes %d/%d", pTscObj->connId, pTscObj->pAppInfo->onlineDnodes,
+               pTscObj->pAppInfo->totalDnodes);

      if (pRsp->query->killRid) {
        tscDebug("request rid %" PRIx64 " need to be killed now", pRsp->query->killRid);
@@ -297,7 +298,8 @@ static int32_t hbAsyncCallBack(void *param, SDataBuf *pMsg, int32_t code) {

  if (code != 0) {
    (*pInst)->onlineDnodes = ((*pInst)->totalDnodes ? 0 : -1);
-    tscDebug("hb rsp error %s, update server status %d/%d", tstrerror(code), (*pInst)->onlineDnodes, (*pInst)->totalDnodes);
+    tscDebug("hb rsp error %s, update server status %d/%d", tstrerror(code), (*pInst)->onlineDnodes,
+             (*pInst)->totalDnodes);
  }

  if (rspNum) {
@@ -657,6 +659,8 @@ int32_t hbGatherAppInfo(void) {

  for (int32_t i = 0; i < sz; ++i) {
    SAppHbMgr *pAppHbMgr = taosArrayGetP(clientHbMgr.appHbMgrs, i);
+    if (pAppHbMgr == NULL) continue;
+
    uint64_t   clusterId = pAppHbMgr->pAppInstInfo->clusterId;
    SAppHbReq *pApp = taosHashGet(clientHbMgr.appSummary, &clusterId, sizeof(clusterId));
    if (NULL == pApp) {
@@ -694,15 +698,21 @@ static void *hbThreadFunc(void *param) {
      hbGatherAppInfo();
    }

+    SArray *mgr = taosArrayInit(sz, sizeof(void *));
    for (int i = 0; i < sz; i++) {
      SAppHbMgr *pAppHbMgr = taosArrayGetP(clientHbMgr.appHbMgrs, i);
+      if (pAppHbMgr == NULL) {
+        continue;
+      }

      int32_t connCnt = atomic_load_32(&pAppHbMgr->connKeyCnt);
      if (connCnt == 0) {
+        taosArrayPush(mgr, &pAppHbMgr);
        continue;
      }
      SClientHbBatchReq *pReq = hbGatherAllInfo(pAppHbMgr);
-      if (pReq == NULL) {
+      if (pReq == NULL || taosArrayGetP(clientHbMgr.appHbMgrs, i) == NULL) {
+        tFreeClientHbBatchReq(pReq);
        continue;
      }
      int   tlen = tSerializeSClientHbBatchReq(NULL, 0, pReq);
@@ -711,6 +721,7 @@ static void *hbThreadFunc(void *param) {
        terrno = TSDB_CODE_TSC_OUT_OF_MEMORY;
        tFreeClientHbBatchReq(pReq);
        // hbClearReqInfo(pAppHbMgr);
+        taosArrayPush(mgr, &pAppHbMgr);
        break;
      }

@@ -722,6 +733,7 @@ static void *hbThreadFunc(void *param) {
        tFreeClientHbBatchReq(pReq);
        // hbClearReqInfo(pAppHbMgr);
        taosMemoryFree(buf);
+        taosArrayPush(mgr, &pAppHbMgr);
        break;
      }
      pInfo->fp = hbAsyncCallBack;
@@ -729,7 +741,7 @@ static void *hbThreadFunc(void *param) {
      pInfo->msgInfo.len = tlen;
      pInfo->msgType = TDMT_MND_HEARTBEAT;
      pInfo->param = strdup(pAppHbMgr->key);
-      pInfo->paramFreeFp = taosMemoryFree;      
+      pInfo->paramFreeFp = taosMemoryFree;
      pInfo->requestId = generateRequestId();
      pInfo->requestObjRefId = 0;

@@ -741,8 +753,12 @@ static void *hbThreadFunc(void *param) {
      // hbClearReqInfo(pAppHbMgr);

      atomic_add_fetch_32(&pAppHbMgr->reportCnt, 1);
+      taosArrayPush(mgr, &pAppHbMgr);
    }

+    taosArrayDestroy(clientHbMgr.appHbMgrs);
+    clientHbMgr.appHbMgrs = mgr;
+
    taosThreadMutexUnlock(&clientHbMgr.lock);

    taosMsleep(HEARTBEAT_INTERVAL);
@@ -834,7 +850,7 @@ void hbRemoveAppHbMrg(SAppHbMgr **pAppHbMgr) {
    if (pItem == *pAppHbMgr) {
      hbFreeAppHbMgr(*pAppHbMgr);
      *pAppHbMgr = NULL;
-      taosArrayRemove(clientHbMgr.appHbMgrs, i);
+      taosArraySet(clientHbMgr.appHbMgrs, i, pAppHbMgr);
      break;
    }
  }
@@ -845,6 +861,7 @@ void appHbMgrCleanup(void) {
  int sz = taosArrayGetSize(clientHbMgr.appHbMgrs);
  for (int i = 0; i < sz; i++) {
    SAppHbMgr *pTarget = taosArrayGetP(clientHbMgr.appHbMgrs, i);
+    if (pTarget == NULL) continue;
    hbFreeAppHbMgr(pTarget);
  }
 }
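
Review note: the hunks above replace `taosArrayRemove` with setting the slot to NULL, add NULL checks to the loops, and have `hbThreadFunc` build a fresh array of live managers. The commit does not state the motivation. One plausible reading is that clearing a slot keeps the other indices stable while a loop is still walking the array. The Python sketch below models that pattern with made-up names: strings stand in for `SAppHbMgr` pointers and `None` stands in for NULL.

```python
def remove_manager(slots, target):
    """Clear the matching slot instead of deleting it, so indices do not shift."""
    for i, item in enumerate(slots):
        if item == target:
            slots[i] = None
            return

def compact(slots):
    """Return a new list holding only the live (non-cleared) entries."""
    return [item for item in slots if item is not None]

slots = ["mgr-a", "mgr-b", "mgr-c"]
visited = []

for i in range(len(slots)):  # length captured once, like `sz` in the C loop
    item = slots[i]
    if item is None:         # cleared slot: skip it
        continue
    visited.append(item)
    if item == "mgr-a":
        # A removal that happens while the loop is still running.
        remove_manager(slots, "mgr-b")

slots = compact(slots)       # swap in the compacted list after the pass
print(visited, slots)
```

Had the removal deleted the element outright, the list would shrink under the loop and `slots[2]` would raise `IndexError` in this sketch. The C code would read past the end of the array in the analogous situation.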
@@ -859,7 +876,14 @@ int hbMgrInit() {

  clientHbMgr.appSummary = taosHashInit(10, taosGetDefaultHashFunction(TSDB_DATA_TYPE_BIGINT), false, HASH_NO_LOCK);
  clientHbMgr.appHbMgrs = taosArrayInit(0, sizeof(void *));
-  taosThreadMutexInit(&clientHbMgr.lock, NULL);
+
+  TdThreadMutexAttr attr = {0};
+  // Initialize the attribute object before setting its type.
+  int ret = taosThreadMutexAttrInit(&attr);
+  assert(ret == 0);
+  taosThreadMutexAttrSetType(&attr, PTHREAD_MUTEX_RECURSIVE);
+
+  taosThreadMutexInit(&clientHbMgr.lock, &attr);
+  taosThreadMutexAttrDestroy(&attr);

  // init handle funcs
  hbMgrInitHandle();
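
Review note: this hunk makes `clientHbMgr.lock` a recursive mutex. The commit does not say why. A reasonable guess is that some call path re-acquires the lock on a thread that already holds it. The sketch below uses Python's `threading.Lock` and `threading.RLock` as stand-ins for a default and a recursive pthread mutex to show the difference. `try_reenter` is a hypothetical helper.

```python
import threading

def try_reenter(lock):
    """Acquire `lock`, then try to acquire it again from the same thread."""
    with lock:
        # blocking=False returns immediately instead of deadlocking.
        reentered = lock.acquire(blocking=False)
        if reentered:
            lock.release()
    return reentered

plain_result = try_reenter(threading.Lock())       # non-recursive lock
recursive_result = try_reenter(threading.RLock())  # recursive lock

print(plain_result, recursive_result)
```

With the non-recursive lock the second acquire fails (a blocking acquire would deadlock), while the recursive lock lets the owning thread enter again, as long as each acquire is matched by a release.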

--- a/source/dnode/vnode/src/tsdb/tsdbRetention.c
+++ b/source/dnode/vnode/src/tsdb/tsdbRetention.c
@@ -16,19 +16,9 @@
 #include "tsdb.h"

 static bool tsdbShouldDoRetention(STsdb *pTsdb, int64_t now) {
-  STsdbKeepCfg *keepCfg = &pTsdb->keepCfg;
-
-  if ((keepCfg->keep0 == keepCfg->keep1) && (keepCfg->keep1 == keepCfg->keep2)) {
-    return false;
-  }
-
-  if (tfsGetLevel(pTsdb->pVnode->pTfs) <= 1) {
-    return false;
-  }
-
  for (int32_t iSet = 0; iSet < taosArrayGetSize(pTsdb->fs.aDFileSet); iSet++) {
    SDFileSet *pSet = (SDFileSet *)taosArrayGet(pTsdb->fs.aDFileSet, iSet);
-    int32_t    expLevel = tsdbFidLevel(pSet->fid, keepCfg, now);
+    int32_t    expLevel = tsdbFidLevel(pSet->fid, &pTsdb->keepCfg, now);
    SDiskID    did;

    if (expLevel == pSet->diskId.level) continue;

--- a/source/util/test/trefTest.c
+++ b/source/util/test/trefTest.c
@@ -94,7 +94,7 @@ void *openRefSpace(void *param) {
  pSpace->rsetId = taosOpenRef(50, myfree);

  if (pSpace->rsetId < 0) {
-    printf("failed to open ref, reson:%s\n", tstrerror(pSpace->rsetId));
+    printf("failed to open ref, reason:%s\n", tstrerror(pSpace->rsetId));
    return NULL;
  }