diff --git a/docs/en/25-application/03-immigrate.md b/docs/en/25-application/03-immigrate.md index 4d47aec1d76014ba63f6be91004abcc3934769f7..fe67f973894d460fb017de0e1a2099b8441a4abe 100644 --- a/docs/en/25-application/03-immigrate.md +++ b/docs/en/25-application/03-immigrate.md @@ -379,11 +379,11 @@ We still use the hypothetical environment from Chapter 4. There are three measur ### Storage resource estimation -Assuming that the number of sensor devices that generate data and need to be stored is `n`, the frequency of data generation is `t` per second, and the length of each record is `L` bytes, the scale of data generated per day is `n * t * L` bytes. Assuming the compression ratio is `C`, the daily data size is `(n * t * L)/C` bytes. The storage resources are estimated to accommodate the data scale for 1.5 years. In the production environment, the compression ratio C of TDengine is generally between 5 and 7. +Assuming that the number of sensor devices that generate data and need to be stored is `n`, the frequency of data generation is `t` per second, and the length of each record is `L` bytes, the scale of data generated per day is `86400 * n * t * L` bytes. Assuming the compression ratio is `C`, the daily data size is `(86400 * n * t * L)/C` bytes. The storage resources are estimated to accommodate the data scale for 1.5 years. In the production environment, the compression ratio C of TDengine is generally between 5 and 7. With additional 20% redundancy, you can calculate the required storage resources: ```matlab -(n * t * L) * (365 * 1.5) * (1+20%)/C +(86400 * n * t * L) * (365 * 1.5) * (1+20%)/C ```` Substituting in the above formula, the raw data generated every year is 11.8TB without considering the label information. Note that tag information is associated with each timeline in TDengine, not every record. The amount of data to be recorded is somewhat reduced relative to the generated data, and label data can be ignored as a whole. Assuming a compression ratio of 5, the size of the retained data ends up being 2.56 TB. diff --git a/docs/zh/25-application/03-immigrate.md b/docs/zh/25-application/03-immigrate.md index 9d8946bc4a69639c5327ac1ffb6c0539ddbd0e63..d1c9caea099b79494784aa1122e89d7b4d412464 100644 --- a/docs/zh/25-application/03-immigrate.md +++ b/docs/zh/25-application/03-immigrate.md @@ -367,10 +367,10 @@ WHERE ts>=1510560000 AND ts<=1515000009 ### 存储资源估算 -假设产生数据并需要存储的传感器设备数量为 `n`,数据生成的频率为`t`条/秒,每条记录的长度为 `L` bytes,则每天产生的数据规模为 `n×t×L` bytes。假设压缩比为 C,则每日产生数据规模为 `(n×t×L)/C` bytes。存储资源预估为能够容纳 1.5 年的数据规模,生产环境下 TDengine 的压缩比 C 一般在 5 ~ 7 之间,同时为最后结果增加 20% 的冗余,可计算得到需要存储资源: +假设产生数据并需要存储的传感器设备数量为 `n`,数据生成的频率为`t`条/秒,每条记录的长度为 `L` bytes,则每天产生的数据规模为 `86400×n×t×L` bytes。假设压缩比为 C,则每日产生数据规模为 `(86400×n×t×L)/C` bytes。存储资源预估为能够容纳 1.5 年的数据规模,生产环境下 TDengine 的压缩比 C 一般在 5 ~ 7 之间,同时为最后结果增加 20% 的冗余,可计算得到需要存储资源: ```matlab -(n×t×L)×(365×1.5)×(1+20%)/C +(86400×n×t×L)×(365×1.5)×(1+20%)/C ``` 结合以上的计算公式,将参数带入计算公式,在不考虑标签信息的情况下,每年产生的原始数据规模是 11.8TB。需要注意的是,由于标签信息在 TDengine 中关联到每个时间线,并不是每条记录。所以需要记录的数据量规模相对于产生的数据有一定的降低,而这部分标签数据整体上可以忽略不记。假设压缩比为 5,则保留的数据规模最终为 2.56 TB。