diff --git a/docs-cn/12-taos-sql/07-function.md b/docs-cn/12-taos-sql/07-function.md index 68dfd579badc1814852fc7a529831667d4383278..300060fb06f50bdc9401346e9e7002ebd705310f 100644 --- a/docs-cn/12-taos-sql/07-function.md +++ b/docs-cn/12-taos-sql/07-function.md @@ -261,6 +261,66 @@ taos> select hyperloglog(dbig) from shll; Query OK, 1 row(s) in set (0.008388s) ``` +### HISTOGRAM + +``` +SELECT HISTOGRAM(field_name,bin_type, bin_description, normalized) FROM tb_name [WHERE clause]; +``` + +**功能说明**:统计数据按照用户指定区间的分布。 + +**返回结果类型**:如归一化参数 normalized 设置为 1,返回结果为双精度浮点类型 DOUBLE,否则为长整形 INT64。 + +**应用字段**:数值型字段。 + +**支持的版本**:2.6.0.0 及以后的版本。 + +**适用于**: 表和超级表。 + +**说明**: + 1. bin_type 用户指定的分桶类型, 有效输入类型为"user_input“, ”linear_bin", "log_bin"。 + 2. bin_description 描述如何生成分桶区间,针对三种桶类型,分别为以下描述格式(均为 JSON 格式字符串): + - "user_input": "[1, 3, 5, 7]" + 用户指定 bin 的具体数值。 + + - "linear_bin": "{"start": 0.0, "width": 5.0, "count": 5, "infinity": true}" + "start" 表示数据起始点,"width" 表示每次 bin 偏移量, "count" 为 bin 的总数,"infinity" 表示是否添加(-inf, inf)作为区间起点跟终点, + 生成区间为[-inf, 0.0, 5.0, 10.0, 15.0, 20.0, +inf]。 + + - "log_bin": "{"start":1.0, "factor": 2.0, "count": 5, "infinity": true}" + "start" 表示数据起始点,"factor" 表示按指数递增的因子,"count" 为 bin 的总数,"infinity" 表示是否添加(-inf, inf)作为区间起点跟终点, + 生成区间为[-inf, 1.0, 2.0, 4.0, 8.0, 16.0, +inf]。 + 3. normalized 是否将返回结果归一化到 0~1 之间 。有效输入为 0 和 1。 + +**示例**: + +```sql + taos> SELECT HISTOGRAM(voltage, "user_input", "[1,3,5,7]", 1) FROM meters; + histogram(voltage, "user_input", "[1,3,5,7]", 1) | + ======================================================= + {"lower_bin":1, "upper_bin":3, "count":0.333333} | + {"lower_bin":3, "upper_bin":5, "count":0.333333} | + {"lower_bin":5, "upper_bin":7, "count":0.333333} | + Query OK, 3 row(s) in set (0.004273s) + + taos> SELECT HISTOGRAM(voltage, 'linear_bin', '{"start": 1, "width": 3, "count": 3, "infinity": false}', 0) FROM meters; + histogram(voltage, 'linear_bin', '{"start": 1, "width": 3, " | + =================================================================== + {"lower_bin":1, "upper_bin":4, "count":3} | + {"lower_bin":4, "upper_bin":7, "count":3} | + {"lower_bin":7, "upper_bin":10, "count":3} | + Query OK, 3 row(s) in set (0.004887s) + + taos> SELECT HISTOGRAM(voltage, 'log_bin', '{"start": 1, "factor": 3, "count": 3, "infinity": true}', 0) FROM meters; + histogram(voltage, 'log_bin', '{"start": 1, "factor": 3, "count" | + =================================================================== + {"lower_bin":-inf, "upper_bin":1, "count":3} | + {"lower_bin":1, "upper_bin":3, "count":2} | + {"lower_bin":3, "upper_bin":9, "count":6} | + {"lower_bin":9, "upper_bin":27, "count":3} | + {"lower_bin":27, "upper_bin":inf, "count":1} | +``` + ## 选择函数 在使用所有的选择函数的时候,可以同时指定输出 ts 列或标签列(包括 tbname),这样就可以方便地知道被选出的值是源于哪个数据行的。 diff --git a/docs-en/12-taos-sql/07-function.md b/docs-en/12-taos-sql/07-function.md index 825aeea354fb684e47ceed7afb2bc66d97b23c09..f06106ff9fc4c891aae50588b8ffa35636f32a5f 100644 --- a/docs-en/12-taos-sql/07-function.md +++ b/docs-en/12-taos-sql/07-function.md @@ -259,6 +259,72 @@ taos> select hyperloglog(dbig) from shll; Query OK, 1 row(s) in set (0.008388s) ``` +### HISTOGRAM + +``` +SELECT HISTOGRAM(field_name,bin_type, bin_description, normalized) FROM tb_name [WHERE clause]; +``` + +**Description**:Returns count of data points in user-specified ranges. + +**Return value type**:Double or INT64, depends on normalized parameter settings. + +**Applicable column type**:Numerical types. + +**Applicable versions**:Since version 2.6.0.0. + +**Applicable table types**: table, STable + +**说明**: + 1. bin_type: parameter to indicate the bucket type, valid inputs are: "user_input", "linear_bin", "log_bin"。 + 2. bin_description: parameter to describe how to generate buckets,can be in the following JSON formats for each bin_type respectively: + - "user_input": "[1, 3, 5, 7]" + User defined specified bin values. + + - "linear_bin": "{"start": 0.0, "width": 5.0, "count": 5, "infinity": true}" + "start" - bin starting point. + "width" - bin offset. + "count" - number of bins generated. + "infinity" - whether to add(-inf, inf)as start/end point in generated set of bins. + The above "linear_bin" descriptor generates a set of bins: [-inf, 0.0, 5.0, 10.0, 15.0, 20.0, +inf]. + + - "log_bin": "{"start":1.0, "factor": 2.0, "count": 5, "infinity": true}" + "start" - bin starting point. + "factor" - exponential factor of bin offset. + "count" - number of bins generated. + "infinity" - whether to add(-inf, inf)as start/end point in generated range of bins. + The above "log_bin" descriptor generates a set of bins:[-inf, 1.0, 2.0, 4.0, 8.0, 16.0, +inf]. + 3. normalized: setting to 1/0 to turn on/off result normalization. + +**示例**: + +```sql + taos> SELECT HISTOGRAM(voltage, "user_input", "[1,3,5,7]", 1) FROM meters; + histogram(voltage, "user_input", "[1,3,5,7]", 1) | + ======================================================= + {"lower_bin":1, "upper_bin":3, "count":0.333333} | + {"lower_bin":3, "upper_bin":5, "count":0.333333} | + {"lower_bin":5, "upper_bin":7, "count":0.333333} | + Query OK, 3 row(s) in set (0.004273s) + + taos> SELECT HISTOGRAM(voltage, 'linear_bin', '{"start": 1, "width": 3, "count": 3, "infinity": false}', 0) FROM meters; + histogram(voltage, 'linear_bin', '{"start": 1, "width": 3, " | + =================================================================== + {"lower_bin":1, "upper_bin":4, "count":3} | + {"lower_bin":4, "upper_bin":7, "count":3} | + {"lower_bin":7, "upper_bin":10, "count":3} | + Query OK, 3 row(s) in set (0.004887s) + + taos> SELECT HISTOGRAM(voltage, 'log_bin', '{"start": 1, "factor": 3, "count": 3, "infinity": true}', 0) FROM meters; + histogram(voltage, 'log_bin', '{"start": 1, "factor": 3, "count" | + =================================================================== + {"lower_bin":-inf, "upper_bin":1, "count":3} | + {"lower_bin":1, "upper_bin":3, "count":2} | + {"lower_bin":3, "upper_bin":9, "count":6} | + {"lower_bin":9, "upper_bin":27, "count":3} | + {"lower_bin":27, "upper_bin":inf, "count":1} | +``` + ## Selection Functions When any select function is used, timestamp column or tag columns including `tbname` can be specified to show that the selected value are from which rows.