From eff225e572d6050831be65ccf37793bfb7838265 Mon Sep 17 00:00:00 2001
From: wangmm0220
Date: Thu, 17 Mar 2022 18:50:17 +0800
Subject: [PATCH] modify doc for hyperloglog

---
 documentation20/cn/12.taos-sql/docs.md | 4 +++-
 documentation20/en/12.taos-sql/docs.md | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/documentation20/cn/12.taos-sql/docs.md b/documentation20/cn/12.taos-sql/docs.md
index 1785796222..ce9136f002 100755
--- a/documentation20/cn/12.taos-sql/docs.md
+++ b/documentation20/cn/12.taos-sql/docs.md
@@ -1118,7 +1118,9 @@ TDengine supports aggregate queries over data. The supported aggregate and selection functions
     ```mysql
     SELECT HYPERLOGLOG(field_name) FROM { tb_name | stb_name } [WHERE clause];
     ```
-    Function: Returns the cardinality of a column using the hyperloglog algorithm. With very large amounts of data the algorithm significantly reduces memory usage, but the resulting cardinality is an estimate, with a standard error of 0.81%.
+    Function:
+      - Returns the cardinality of a column using the HyperLogLog algorithm. With very large amounts of data the algorithm significantly reduces memory usage, but the resulting cardinality is an estimate, with a standard error of 0.81% (the standard error is the standard deviation of the per-experiment means over repeated experiments, not the error relative to the true result).
+      - With small amounts of data the algorithm is not very accurate; in that case use `select count(data) from (select unique(col) as data from table)` instead.

     Return type: Integer.

diff --git a/documentation20/en/12.taos-sql/docs.md b/documentation20/en/12.taos-sql/docs.md
index 4071198d3d..653ae588c5 100755
--- a/documentation20/en/12.taos-sql/docs.md
+++ b/documentation20/en/12.taos-sql/docs.md
@@ -888,7 +888,9 @@ TDengine supports aggregations over data, they are listed below:
     ```mysql
     SELECT HYPERLOGLOG(field_name) FROM { tb_name | stb_name } [WHERE clause];
     ```
-    Function: The hyperloglog algorithm is used to return the cardinality of a column. In the case of large amount of data, the algorithm can significantly reduce the occupation of memory, but the cardinality is an estimated value, and the standard error is 0.81%.
+    Function:
+      - Returns the cardinality of a column, estimated with the HyperLogLog algorithm. On large data sets the algorithm significantly reduces memory usage, but the result is an estimate with a standard error of 0.81% (the standard error is the standard deviation of the means obtained over repeated experiments, not the error relative to the true result).
+      - On small data sets the algorithm is not very accurate. In that case, use an exact count instead: `select count(data) from (select unique(col) as data from table)`.

     Return Data Type:Integer.
-- 
GitLab
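
Review note: the 0.81% figure quoted in the new text matches the usual HyperLogLog relative standard error of roughly 1.04 / sqrt(m) for m registers, evaluated at m = 2^14 = 16384. The patch does not state the register count, so the precision of 14 bits below is an assumption, and the function names are hypothetical helpers written only for this illustration. A minimal Python sketch of the arithmetic:

```python
import math
from typing import Tuple


def hll_standard_error(precision_bits: int) -> float:
    """Approximate relative standard error of HyperLogLog with 2**precision_bits registers."""
    registers = 2 ** precision_bits
    return 1.04 / math.sqrt(registers)


def one_sigma_range(true_cardinality: int, precision_bits: int = 14) -> Tuple[float, float]:
    """Band of one standard error around a given true cardinality.

    precision_bits=14 is an assumption chosen because it reproduces the
    0.81% figure; it is not taken from the patch.
    """
    se = hll_standard_error(precision_bits)
    return true_cardinality * (1 - se), true_cardinality * (1 + se)


if __name__ == "__main__":
    se = hll_standard_error(14)
    low, high = one_sigma_range(1_000_000)
    print(f"standard error: {se:.2%}")
    print(f"one-sigma band for 1,000,000 distinct values: {low:.0f} to {high:.0f}")
```

The band is a statement about the spread of estimates over repeated runs, not a guaranteed bound on any single result, which is the distinction the added parenthetical in the docs is making.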