提交 457ba27c 编写于 作者: M mindspore-ci-bot 提交者: Gitee

!476 add more tips when using TensorSummary.

Merge pull request !476 from wangshuide/wsd_tensor_visual
......@@ -333,6 +333,8 @@ In the saved files, `ms_output_after_hwopt.pb` is the computational graph after
The size of a `TensorSummary` data = the number of values in the tensor * 4 bytes. Assuming that the size of the tensor recorded by `TensorSummary` is 32 * 1 * 256 * 256, then a `TensorSummary` data needs about 32 * 1 * 256 * 256 * 4 bytes = 8,388,608 bytes = 8MiB. Also suppose that the collect_freq of `SummaryCollector` is set to 1, and 50 iterations are trained. Then the required space when recording these 50 sets of data is about 50 * 8 MiB = 400MiB. It should be noted that due to the overhead of data structure and other factors, the actual storage space used will be slightly larger than 400MiB.
5. The training log file is large when using `TensorSummary` because the complete tensor data is recorded. MindInsight needs more time to parse the training log file, please be patient.
## Visualization Components
### Training Dashboard
......@@ -564,12 +566,17 @@ Figure 21 shows the scalars comparision function area, which allows you to view
To limit time of listing summaries, MindInsight lists at most 999 summary items.
To limit memory usage, MindInsight limits the number of tags and steps:
- There are 300 tags at most in each training dashboard. Total number of scalar tags, image tags, computation graph tags, parameter distribution(histogram) tags can not exceed 300. Specially, there are 10 computation graph tags at most. When tags exceed limit, MindInsight preserves the most recently processed tags.
- There are 300 tags at most in each training dashboard. Total number of scalar tags, image tags, computation graph tags, parameter distribution(histogram) tags, tensor tags can not exceed 300. Specially, there are 10 computation graph tags at most. When tags exceed limit, MindInsight preserves the most recently processed tags.
- There are 1000 steps at most for each scalar tag in each training dashboard. When steps exceed limit, MindInsight will sample steps randomly to meet this limit.
- There are 10 steps at most for each image tag in each training dashboard. When steps exceed limit, MindInsight will sample steps randomly to meet this limit.
- There are 50 steps at most for each parameter distribution(histogram) tag in each training dashboard. When steps exceed limit, MindInsight will sample steps randomly to meet this limit.
- There are 20 steps at most for each tensor tag in each training dashboard. When steps exceed limit, MindInsight will sample steps randomly to meet this limit.
To ensure performance, MindInsight implements scalars comparision with the cache mechanism and the following restrictions:
- The scalars comparision supports only for trainings in cache.
- The maximum of 15 latest trainings (sorted by modification time) can be retained in the cache.
- The maximum of 5 trainings can be selected for scalars comparision at the same time.
Since `TensorSummary` will record complete tensor data, the amount of data is usually relatively large. In order to limit memory usage and ensure performance, MindInsight make the following restrictions with the size of tensor and the number of value responsed and displayed on the front end:
- Support tensor containing up to 10 million values.
- In the tensor-visible table view, the maximum number of value responsed to the front end for each step of each label is 100,000.
\ No newline at end of file
......@@ -334,6 +334,7 @@ model.train(cnn_network, callbacks=[confusion_martrix])
备注:估算`TensorSummary`空间使用量的方法如下:
一个`TensorSummary`数据的大小 = Tensor中的数值个数 * 4 bytes。假设使用`TensorSummary`记录的Tensor大小为32 * 1 * 256 * 256,则一个`TensorSummary`数据大约需要32 * 1 * 256 * 256 * 4 bytes = 8,388,608 bytes = 8MiB。又假设`SummaryCollector`的collect_freq设置为1,且训练了50个迭代。则记录这50组数据需要的空间约为50 * 8 MiB = 400MiB。需要注意的是,由于数据结构等因素的开销,实际使用的存储空间会略大于400MiB。
5. 当使用`TensorSummary`时,由于记录完整Tensor数据,训练日志文件较大,MindInsight需要更多时间解析训练日志文件,请耐心等待。
## 可视化组件
......@@ -566,12 +567,17 @@ model.train(cnn_network, callbacks=[confusion_martrix])
为了控制列出summary列表的用时,MindInsight最多支持发现999个summary列表条目。
为了控制内存占用,MindInsight对标签(tag)数目和步骤(step)数目进行了限制:
- 每个训练看板的最大标签数量为300个标签。标量标签、图片标签、计算图标签、参数分布图(直方图)标签的数量总和不得超过300个。特别地,每个训练看板最多有10个计算图标签。当实际标签数量超过这一限制时,将依照MindInsight的处理顺序,保留最近处理的300个标签。
- 每个训练看板的最大标签数量为300个标签。标量标签、图片标签、计算图标签、参数分布图(直方图)标签、张量标签的数量总和不得超过300个。特别地,每个训练看板最多有10个计算图标签。当实际标签数量超过这一限制时,将依照MindInsight的处理顺序,保留最近处理的300个标签。
- 每个训练看板的每个标量标签最多有1000个步骤的数据。当实际步骤的数目超过这一限制时,将对数据进行随机采样,以满足这一限制。
- 每个训练看板的每个图片标签最多有10个步骤的数据。当实际步骤的数目超过这一限制时,将对数据进行随机采样,以满足这一限制。
- 每个训练看板的每个参数分布图(直方图)标签最多有50个步骤的数据。当实际步骤的数目超过这一限制时,将对数据进行随机采样,以满足这一限制。
- 每个训练看板的每个张量标签最多有20个步骤的数据。当实际步骤的数目超过这一限制时,将对数据进行随机采样,以满足这一限制。
出于性能上的考虑,MindInsight对比看板使用缓存机制加载训练的标量曲线数据,并进行以下限制:
- 对比看板只支持在缓存中的训练进行比较标量曲线对比。
- 缓存最多保留最新(按修改时间排列)的15个训练。
- 用户最多同时对比5个训练的标量曲线。
由于`TensorSummary`会记录完整Tensor数据,数据量通常会比较大,为了控制内存占用和出于性能上的考虑,MindInsight对Tensor的大小以及返回前端展示的数值个数进行以下限制:
- 最大支持含有1千万个数值的Tensor。
- 在张量可视的表格视图下,每个标签的每个步骤返回前端的数值个数最大为10万。
\ No newline at end of file
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册