未验证 提交 cbcaf4b3 编写于 作者: W wade zhang 提交者: GitHub

Merge pull request #14674 from taosdata/docs/fenghuazzm-patch-1

docs: Update 03-immigrate.md in 2.4
......@@ -379,11 +379,11 @@ We still use the hypothetical environment from Chapter 4. There are three measur
### Storage resource estimation
Assuming that the number of sensor devices that generate data and need to be stored is `n`, the frequency of data generation is `t` per second, and the length of each record is `L` bytes, the scale of data generated per day is `n * t * L` bytes. Assuming the compression ratio is `C`, the daily data size is `(n * t * L)/C` bytes. The storage resources are estimated to accommodate the data scale for 1.5 years. In the production environment, the compression ratio C of TDengine is generally between 5 and 7.
Assuming that the number of sensor devices that generate data and need to be stored is `n`, the frequency of data generation is `t` per second, and the length of each record is `L` bytes, the scale of data generated per day is `86400 * n * t * L` bytes. Assuming the compression ratio is `C`, the daily data size is `(86400 * n * t * L)/C` bytes. The storage resources are estimated to accommodate the data scale for 1.5 years. In the production environment, the compression ratio C of TDengine is generally between 5 and 7.
With additional 20% redundancy, you can calculate the required storage resources:
```matlab
(n * t * L) * (365 * 1.5) * (1+20%)/C
(86400 * n * t * L) * (365 * 1.5) * (1+20%)/C
````
Substituting in the above formula, the raw data generated every year is 11.8TB without considering the label information. Note that tag information is associated with each timeline in TDengine, not every record. The amount of data to be recorded is somewhat reduced relative to the generated data, and label data can be ignored as a whole. Assuming a compression ratio of 5, the size of the retained data ends up being 2.56 TB.
......
......@@ -367,10 +367,10 @@ WHERE ts>=1510560000 AND ts<=1515000009
### 存储资源估算
假设产生数据并需要存储的传感器设备数量为 `n`,数据生成的频率为`t`条/秒,每条记录的长度为 `L` bytes,则每天产生的数据规模为 `n×t×L` bytes。假设压缩比为 C,则每日产生数据规模为 `(n×t×L)/C` bytes。存储资源预估为能够容纳 1.5 年的数据规模,生产环境下 TDengine 的压缩比 C 一般在 5 ~ 7 之间,同时为最后结果增加 20% 的冗余,可计算得到需要存储资源:
假设产生数据并需要存储的传感器设备数量为 `n`,数据生成的频率为`t`条/秒,每条记录的长度为 `L` bytes,则每天产生的数据规模为 `86400×n×t×L` bytes。假设压缩比为 C,则每日产生数据规模为 `(86400×n×t×L)/C` bytes。存储资源预估为能够容纳 1.5 年的数据规模,生产环境下 TDengine 的压缩比 C 一般在 5 ~ 7 之间,同时为最后结果增加 20% 的冗余,可计算得到需要存储资源:
```matlab
(n×t×L)×(365×1.5)×(1+20%)/C
(86400×n×t×L)×(365×1.5)×(1+20%)/C
```
结合以上的计算公式,将参数带入计算公式,在不考虑标签信息的情况下,每年产生的原始数据规模是 11.8TB。需要注意的是,由于标签信息在 TDengine 中关联到每个时间线,并不是每条记录。所以需要记录的数据量规模相对于产生的数据有一定的降低,而这部分标签数据整体上可以忽略不记。假设压缩比为 5,则保留的数据规模最终为 2.56 TB。
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册