Commit b672697d authored by: lvmingfu

Splitting dashboard and lineage documents in tutorials

Parent 559a6952
# Dashboard
<!-- TOC -->
- [Dashboard](#dashboard)
- [Overview](#overview)
- [Scalar Visualization](#scalar-visualization)
- [Parameter Distribution Visualization](#parameter-distribution-visualization)
- [Computational Graph Visualization](#computational-graph-visualization)
- [Dataset Graph Visualization](#dataset-graph-visualization)
- [Image Visualization](#image-visualization)
- [Tensor Visualization](#tensor-visualization)
- [Notices](#Notices)
<!-- /TOC -->
<a href="https://gitee.com/mindspore/docs/blob/master/tutorials/source_en/advanced_use/dashboard.md" target="_blank"><img src="../_static/logo_source.png"></a>
## Overview
The training dashboard is an important part of MindInsight's visualization component. Its tags include scalar visualization, parameter distribution visualization, computational graph visualization, dataset graph visualization, and image visualization.
Access the Training Dashboard by selecting a specific training from the training list.
## Scalar Visualization
Scalar visualization is used to display the change trend of scalars during training.
![scalar.png](./images/scalar.png)
Figure 1: Scalar trend chart
Figure 1 shows a change process of loss values during the neural network training. The horizontal coordinate indicates the training step, and the vertical coordinate indicates the loss value.
Buttons from left to right in the upper right corner of the figure are used to display the chart in full screen, switch the Y-axis scale, enable or disable the rectangle selection, roll back the chart step by step, and restore the chart.
- Full-screen Display: Display the scalar curve in full screen. Click the button again to restore it.
- Switch Y-axis Scale: Perform logarithmic conversion on the Y-axis coordinate.
- Enable/Disable Rectangle Selection: Draw a rectangle to select and zoom in on a part of the chart. You can perform rectangle selection again on the zoomed-in chart.
- Step-by-step Rollback: Undo zoom operations one at a time after repeatedly drawing rectangles to zoom in on the same area.
- Restore Chart: Restore a chart to the original state.
You can set a threshold to highlight values, or delete the threshold, in the lower right corner of the figure. As shown in the figure, the threshold is set to less than 0.3, and values below the threshold are highlighted in red, which makes expected values and unusual values easy to spot.
![scalar_select.png](./images/scalar_select.png)
Figure 2: Scalar visualization function area
Figure 2 shows the scalar visualization function area, which allows you to view scalar information by selecting different tags, different dimensions of the horizontal axis, and smoothness.
- Tag: Select the required tags to view the corresponding scalar information.
- Horizontal Axis: Select any of Step, Relative Time, and Absolute Time as the horizontal axis of the scalar curve.
- Smoothness: Adjust the smoothness to smooth the scalar curve.
- Scalar Synthesis: Synthesize two scalar curves and display them in a chart to facilitate comparison between the two curves or view the synthesized chart.
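The Smoothness control above can be illustrated with a small sketch. This document does not say which algorithm MindInsight uses, so the exponential moving average below is an assumption, and the function name `smooth` and its `weight` parameter are hypothetical.

```python
def smooth(values, weight):
    """Exponentially smooth a scalar series.

    Assumption: smoothing is an exponential moving average, where
    weight=0 leaves the curve unchanged and larger weights smooth more.
    """
    if not 0.0 <= weight < 1.0:
        raise ValueError("weight must be in [0, 1)")
    smoothed = []
    last = None
    for value in values:
        # The first point is kept as-is; later points blend with the running value.
        last = value if last is None else weight * last + (1.0 - weight) * value
        smoothed.append(last)
    return smoothed


loss = [1.0, 0.5, 0.75, 0.25]
print(smooth(loss, 0.5))  # [1.0, 0.75, 0.75, 0.5]
```

A higher weight suppresses step-to-step noise but makes the curve lag behind sudden changes.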
![scalar_compound.png](./images/scalar_compound.png)
Figure 3: Scalar synthesis of Accuracy and Loss curves
Figure 3 shows the scalar synthesis of the Accuracy and Loss curves. The function area of scalar synthesis is similar to that of scalar visualization. Different from the scalar visualization function area, the scalar synthesis function allows you to select a maximum of two tags at a time to synthesize and display their curves.
## Parameter Distribution Visualization
Parameter distribution visualization displays tensors specified by the user in the form of a histogram.
![histogram.png](./images/histogram.png)
Figure 4: Histogram
Figure 4 shows tensors recorded by a user in the form of a histogram. Click the upper right corner to zoom in on the histogram.
![histogram_func.png](./images/histogram_func.png)
Figure 5: Function area of the parameter distribution histogram
Figure 5 shows the function area of the parameter distribution histogram, including:
- Tag selection: Select the required tags to view the corresponding histogram.
- Vertical axis: Select any of `Step`, `Relative time`, and `Absolute time` as the data displayed on the vertical axis of the histogram.
- Angle of view: Select either `Front` or `Top`. `Front` views the histogram head-on, so data from different steps overlaps. `Top` views the histogram at an angle of 45 degrees, so data from different steps is shown separately.
## Computational Graph Visualization
Computational graph visualization is used to display the graph structure, data flow direction, and control flow direction of a computational graph. It supports visualization of summary log files and pb files generated by `save_graphs` configuration in `context`.
![graph.png](./images/graph.png)
Figure 6: Computational graph display area
Figure 6 shows the network structure of a computational graph. As shown in the figure, an operator is selected in the display area. The operator has two inputs and one output (the solid line indicates the data flow direction of the operator).
![graph_sidebar.png](./images/graph_sidebar.png)
Figure 7: Computational graph function area
Figure 7 shows the function area of the computational graph, including:
- File selection box: View the computational graphs of different files.
- Search box: Enter a node name and press Enter to view the node.
- Thumbnail: Display the thumbnail of the entire network structure. When viewing an extra-large graph structure, you can see which area is currently being browsed.
- Node information: Display the basic information of the selected node, including the node name, properties, input node, and output node.
- Legend: Display the meaning of each icon in the computational graph.
## Dataset Graph Visualization
Dataset graph visualization is used to display data processing and augmentation information of a single model training.
![data_function.png](./images/data_function.png)
Figure 8: Dataset graph function area
Figure 8 shows the dataset graph function area which includes the following content:
- Legend: Display the meaning of each icon in the data lineage graph.
- Data Processing Pipeline: Display the data processing pipeline used for training. Select a single node in the graph to view details.
- Node Information: Display basic information about the selected node, including names and parameters of the data processing and augmentation operators.
## Image Visualization
Image visualization is used to display images specified by users.
![image.png](./images/image_vi.png)
Figure 9: Image visualization
Figure 9 shows how to view images of different steps by sliding the Step slider.
![image_function.png](./images/image_function.png)
Figure 10: Image visualization function area
Figure 10 shows the function area of image visualization. You can view image information by selecting different tags, brightness, and contrast.
- Tag: Select the required tags to view the corresponding image information.
- Brightness Adjustment: Adjust the brightness of all displayed images.
- Contrast Adjustment: Adjust the contrast of all displayed images.
## Tensor Visualization
Tensor visualization is used to display tensors in the form of a table or a histogram.
![tensor_function.png](./images/tensor_function.png)
Figure 11: Tensor visualization function area
Figure 11 shows the function area of tensor visualization.
- Tag selection: Select the required tags to view the corresponding table data or histogram.
- View: Select `Table` or `Histogram` to display tensor data. In the `Histogram` view, there are the options of `Vertical axis` and `Angle of view`.
- Vertical axis: Select any of `Step`, `Relative time`, and `Absolute time` as the data displayed on the vertical axis of the histogram.
- Angle of view: Select either `Front` or `Top`. `Front` views the histogram head-on, so data from different steps overlaps. `Top` views the histogram at an angle of 45 degrees, so data from different steps is shown separately.
![tensor_table.png](./images/tensor_table.png)
Figure 12: Table display
Figure 12 shows tensors recorded by a user in the form of a table, which provides the following functions:
- Click the small square button on the right side of the table to zoom in on the table.
- The white box in the table shows which dimensions of the tensor data are currently displayed, where a colon `:` represents all values of that dimension. Enter an index or `:` in the box and press `Enter`, or click the tick button next to it, to query tensor data for specific dimensions.
Assuming a certain dimension has size 32, the index range is -32 to 31. Note: tensor data of 0 to 2 dimensions can be queried. Querying tensor data of more than two dimensions is not supported; in other words, a query cannot contain more than two colons `:`.
- Query the tensor data of a specific step by dragging the hollow circle below the table.
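The index rules above (a dimension of size 32 accepts indices from -32 to 31, and a query may contain at most two colons) can be sketched as follows. The helper names `normalize_index` and `check_query`, and the list-based query format, are hypothetical and only illustrate the rules; they are not MindInsight APIs.

```python
def normalize_index(index, dim_size):
    """Map an index in [-dim_size, dim_size - 1] to its non-negative form."""
    if not -dim_size <= index < dim_size:
        raise IndexError(
            f"index {index} is out of range for a dimension of size {dim_size}"
        )
    return index % dim_size


def check_query(query, shape, max_colons=2):
    """Validate a query such as [0, ":", ":"] against a tensor shape.

    Each entry is either an integer index or ":" (all values of that dimension).
    """
    if len(query) != len(shape):
        raise ValueError("the query must have one entry per tensor dimension")
    if query.count(":") > max_colons:
        raise ValueError(f"at most {max_colons} colons are allowed in a query")
    return [
        item if item == ":" else normalize_index(item, size)
        for item, size in zip(query, shape)
    ]


print(normalize_index(-32, 32))                  # 0
print(check_query([-1, ":", ":"], (32, 3, 3)))   # [31, ':', ':']
```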
![tensor_histogram.png](./images/tensor_histogram.png)
Figure 13: Histogram display
Figure 13 shows tensors recorded by a user in the form of a histogram. Click the upper right corner to zoom in on the histogram.
## Notices
1. Currently, MindSpore supports recording the computational graph after operator fusion only for the Ascend 910 AI processor.
2. When Summary operators are used to collect data during training, the `HistogramSummary` operator affects performance, so use it as sparingly as possible.
3. To limit memory usage, MindInsight limits the number of tags and steps:
- There are at most 300 tags in each training dashboard. The total number of scalar tags, image tags, computational graph tags, parameter distribution (histogram) tags, and tensor tags cannot exceed 300. In particular, there are at most 10 computational graph tags and 6 tensor tags. When the number of tags exceeds the limit, MindInsight preserves the most recently processed tags.
- There are at most 1000 steps for each scalar tag in each training dashboard. When the number of steps exceeds the limit, MindInsight samples steps randomly to meet it.
- There are at most 10 steps for each image tag in each training dashboard. When the number of steps exceeds the limit, MindInsight samples steps randomly to meet it.
- There are at most 50 steps for each parameter distribution (histogram) tag in each training dashboard. When the number of steps exceeds the limit, MindInsight samples steps randomly to meet it.
- There are at most 20 steps for each tensor tag in each training dashboard. When the number of steps exceeds the limit, MindInsight samples steps randomly to meet it.
4. Since `TensorSummary` records complete tensor data, the amount of data is usually large. To limit memory usage and ensure performance, MindInsight restricts the size of tensors and the number of values returned to and displayed on the front end:
- MindInsight supports loading tensors that contain up to 10 million values.
- After a tensor is loaded, you can view a maximum of 100,000 values in the tensor table view. If the query for the selected dimensions returns more values than this limit, the result cannot be displayed.
5. Since tensor visualization (`TensorSummary`) records raw tensor data, it requires a large amount of storage space. Before using `TensorSummary` and during training, please check that the system storage space is sufficient.
The storage space occupied by the tensor visualization function can be reduced by the following methods:
1) Avoid using `TensorSummary` to record large tensors.
2) Reduce the number of `TensorSummary` operators in the network.
After using the function, please promptly clean up training logs that are no longer needed to free up disk space.
Remarks: The space usage of `TensorSummary` can be estimated as follows:
The size of one `TensorSummary` record = the number of values in the tensor * 4 bytes. Assuming that the shape of the tensor recorded by `TensorSummary` is 32 * 1 * 256 * 256, one `TensorSummary` record needs about 32 * 1 * 256 * 256 * 4 bytes = 8,388,608 bytes = 8 MiB. Also suppose that the `collect_freq` of `SummaryCollector` is set to 1 and 50 iterations are trained. Then the space required to record these 50 sets of data is about 50 * 8 MiB = 400 MiB. Note that due to the overhead of data structures and other factors, the actual storage space used will be slightly larger than 400 MiB.
6. The training log file is large when using `TensorSummary` because the complete tensor data is recorded, so MindInsight needs more time to parse it. Please be patient.
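The storage estimate in notice 5 can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, assuming 4 bytes per value as stated in the remarks; the function name `tensor_summary_bytes` is hypothetical, and real usage will be somewhat higher because of data structure overhead.

```python
from math import prod

BYTES_PER_VALUE = 4   # per the estimate in the remarks
MIB = 1024 * 1024


def tensor_summary_bytes(shape, records=1):
    """Estimate the raw bytes needed to record a tensor of `shape` `records` times."""
    return prod(shape) * BYTES_PER_VALUE * records


shape = (32, 1, 256, 256)
one_record = tensor_summary_bytes(shape)
fifty_records = tensor_summary_bytes(shape, records=50)

print(one_record)            # 8388608
print(one_record / MIB)      # 8.0
print(fifty_records / MIB)   # 400.0
```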
\ No newline at end of file
# Lineage and Scalars Comparision
<!-- TOC -->
- [Lineage and Scalars Comparision](#lineage-and-scalars-comparision)
- [Overview](#overview)
- [Model Lineage](#model-lineage)
- [Dataset Lineage](#dataset-lineage)
- [Scalars Comparision](#scalars-comparision)
- [Specifications](#specifications)
- [Notices](#notices)
<!-- /TOC -->
<a href="https://gitee.com/mindspore/docs/blob/master/tutorials/source_en/advanced_use/lineage_and_scalars_comparision.md" target="_blank"><img src="../_static/logo_source.png"></a>
## Overview
Like the training dashboard, model lineage, data lineage, and the comparison dashboard are visualization components of MindInsight. When visualizing training data, you can use the comparison dashboard to observe different scalar trend charts and find problems, and then use the lineage function to locate their causes. This helps you tune data augmentation and deep neural networks efficiently.
## Model Lineage
Model lineage visualization is used to display the parameter information of all training models.
![image.png](./images/lineage_label.png)
Figure 1: Model parameter selection area
Figure 1 shows the model parameter selection area, which lists the model parameter tags that can be viewed. You can select required tags to view the corresponding model parameters.
![image.png](./images/lineage_model_chart.png)
Figure 2: Model lineage function area
Figure 2 shows the model lineage function area, which visualizes the model parameter information. You can select a specific area in the column to display the model information within the area.
![image.png](./images/lineage_model_table.png)
Figure 3: Model list
Figure 3 shows all model information in groups. You can sort the model information in ascending or descending order by a specified column.
## Dataset Lineage
Dataset lineage visualization is used to display data processing and augmentation information of all model trainings.
![data_label.png](./images/data_label.png)
Figure 4: Data processing and augmentation operator selection area
Figure 4 shows the data processing and augmentation operator selection area, which lists names of data processing and augmentation operators that can be viewed. You can select required tags to view related parameters.
![data_chart.png](./images/data_chart.png)
Figure 5: Dataset lineage function area
Figure 5 shows the dataset lineage function area, which visualizes the parameter information used for data processing and augmentation. You can select a specific area in the column to display the parameter information within the area.
![data_table.png](./images/data_table.png)
Figure 6: Dataset lineage list
Figure 6 shows the data processing and augmentation information of all model trainings.
> If the user filters the model lineage and then switches to the data lineage page, the line chart shows the column most recently filtered in model lineage.
## Scalars Comparision
Scalars Comparision can be used to compare scalar curves between multiple trainings.
![multi_scalars.png](./images/multi_scalars.png)
Figure 7: Scalars comparision curve area
Figure 7 shows the scalar curve comparison between multiple trainings. The horizontal coordinate indicates the training step, and the vertical coordinate indicates the scalar value.
Buttons from left to right in the upper right corner of the figure are used to display the chart in full screen, switch the Y-axis scale, enable or disable the rectangle selection, roll back the chart step by step, and restore the chart.
- Full-screen Display: Display the scalar curve in full screen. Click the button again to restore it.
- Switch Y-axis Scale: Perform logarithmic conversion on the Y-axis coordinate.
- Enable/Disable Rectangle Selection: Draw a rectangle to select and zoom in on a part of the chart. You can perform rectangle selection again on the zoomed-in chart.
- Step-by-step Rollback: Undo zoom operations one at a time after repeatedly drawing rectangles to zoom in on the same area.
- Restore Chart: Restore a chart to the original state.
![multi_scalars_select.png](./images/multi_scalars_select.png)
Figure 8: Scalars comparision function area
Figure 8 shows the scalars comparison function area, which allows you to view scalar information by selecting different trainings or tags, different dimensions of the horizontal axis, and smoothness.
- Training: Select or filter the required trainings to view the corresponding scalar information.
- Tag: Select the required tags to view the corresponding scalar information.
- Horizontal Axis: Select any of Step, Relative Time, and Absolute Time as the horizontal axis of the scalar curve.
- Smoothness: Adjust the smoothness to smooth the scalar curve.
## Notices
To ensure performance, MindInsight implements scalars comparison with a cache mechanism and the following restrictions:
- Scalars comparison supports only trainings that are in the cache.
- A maximum of 15 latest trainings (sorted by modification time) can be retained in the cache.
- A maximum of 5 trainings can be selected for scalars comparison at the same time.
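The restrictions above can be sketched in Python. This is not MindInsight's implementation; the function names and the dictionary of modification times are hypothetical and only model the stated limits (15 cached trainings, 5 selected at once).

```python
MAX_CACHED = 15    # latest trainings retained in the cache
MAX_SELECTED = 5   # trainings that can be compared at the same time


def cached_trainings(trainings):
    """Return the names of the latest trainings, newest first.

    `trainings` maps a training name to its modification time.
    """
    ordered = sorted(trainings, key=trainings.get, reverse=True)
    return ordered[:MAX_CACHED]


def check_selection(selected, trainings):
    """Validate a selection of trainings for scalars comparison."""
    if len(selected) > MAX_SELECTED:
        raise ValueError(f"at most {MAX_SELECTED} trainings can be selected")
    cache = set(cached_trainings(trainings))
    missing = [name for name in selected if name not in cache]
    if missing:
        raise ValueError(f"not in cache: {missing}")
    return list(selected)


# Twenty hypothetical trainings; a larger number means a more recent modification.
trainings = {f"run{i}": i for i in range(20)}
print(cached_trainings(trainings)[:3])               # ['run19', 'run18', 'run17']
print(check_selection(["run19", "run5"], trainings))  # ['run19', 'run5']
```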
\ No newline at end of file
# Dashboard and Lineage
# Summary_Record
<!-- TOC -->
- [Dashboard and Lineage](#dashboard-and-lineage)
- [Summary_Record](#summary_record)
- [Overview](#overview)
- [Operation Process](#operation-process)
- [Preparing the Training Script](#preparing-the-training-script)
- [Collect Summary Data](#collect-summary-data)
- [Notices](#Notices)
- [Visualization Components](#visualization-components)
- [Training Dashboard](#training-dashboard)
- [Scalar Visualization](#scalar-visualization)
- [Parameter Distribution Visualization](#parameter-distribution-visualization)
- [Computational Graph Visualization](#computational-graph-visualization)
- [Dataset Graph Visualization](#dataset-graph-visualization)
- [Image Visualization](#image-visualization)
- [Tensor Visualization](#tensor-visualization)
- [Model Lineage](#model-lineage)
- [Dataset Lineage](#dataset-lineage)
- [Scalars Comparision](#scalars-comparision)
- [Specifications](#specifications)
- [Method one: Automatically collected through SummaryCollector](#method-one:-automatically-collected-through-summarycollector)
- [Method two: Custom collection of network data with summary operators and SummaryCollector](#method-two:-custom-collection-of-network-data-with-summary-operators-and-summarycollector)
- [Method three: Custom callback recording data](#method-three:-custom-callback-recording-data)
- [Notices](#Notices)
<!-- /TOC -->
<a href="https://gitee.com/mindspore/docs/blob/master/tutorials/source_en/advanced_use/dashboard_and_lineage.md" target="_blank"><img src="../_static/logo_source.png"></a>
<a href="https://gitee.com/mindspore/docs/blob/master/tutorials/source_en/advanced_use/summary_record.md" target="_blank"><img src="../_static/logo_source.png"></a>
## Overview
Scalars, images, computational graphs, and model hyperparameters during training are recorded in files and can be viewed on the web page.
......@@ -36,13 +26,11 @@ Scalars, images, computational graphs, and model hyperparameters during training
## Preparing the Training Script
### Collect Summary Data
Currently, MindSpore supports saving scalars, images, computational graphs, and model hyperparameters to summary log files and displaying them on the web page.
MindSpore currently supports three ways to record data into summary log files.
**Method one: Automatically collected through `SummaryCollector`**
### Method one: Automatically collected through SummaryCollector
The `Callback` mechanism in MindSpore provides a quick and easy way to collect common information, including the computational graph, loss value, learning rate, parameter weights, etc. It is named `SummaryCollector`.
......@@ -130,7 +118,7 @@ ds_eval = create_dataset('./dataset_path')
model.eval(ds_eval, callbacks=[summary_collector])
```
**Method two: Custom collection of network data with summary operators and SummaryCollector**
### Method two: Custom collection of network data with summary operators and SummaryCollector
In addition to providing the `SummaryCollector` that automatically collects some summary data, MindSpore provides summary operators that enable custom collection of other data on the network, such as the input of each convolutional layer or the loss value in the loss function. The recording method is shown in the following steps.
......@@ -235,7 +223,7 @@ summary_collector = SummaryCollector(summary_dir='./summary_dir', collect_freq=1
model.train(epoch=2, train_dataset=train_ds, callbacks=[summary_collector])
```
**Method three: Custom callback recording data**
### Method three: Custom callback recording data
MindSpore supports custom callbacks, which can record data into summary log files
and have the data displayed on the web page.
......@@ -284,13 +272,11 @@ the `save_graphs` option of `context.set_context` in the training script is set
In the saved files, `ms_output_after_hwopt.pb` is the computational graph after operator fusion, which can be viewed on the web page.
### Notices
1. Currently, MindSpore supports recording the computational graph after operator fusion only for the Ascend 910 AI processor.
## Notices
2. When Summary operators are used to collect data during training, the `HistogramSummary` operator affects performance, so use it as sparingly as possible.
1. To limit the time spent listing summaries, MindInsight lists at most 999 summary items.
3. Multiple `SummaryRecord` instances cannot be used at the same time. (`SummaryRecord` is used in `SummaryCollector`)
2. Multiple `SummaryRecord` instances cannot be used at the same time. (`SummaryRecord` is used in `SummaryCollector`)
If you use two or more instances of `SummaryCollector` in the callback list of `model.train` or `model.eval`, it is treated as using multiple `SummaryRecord` instances at the same time, and data recording will fail.
......@@ -321,262 +307,3 @@ In the saved files, `ms_output_after_hwopt.pb` is the computational graph after
summary_collector = SummaryCollector('./summary_dir2')
model.train(epoch=2, train_dataset=train_dataset, callbacks=[confusion_callback, summary_collector])
```
4. Since tensor visualization (`TensorSummary`) records raw tensor data, it requires a large amount of storage space. Before using `TensorSummary` and during training, please check that the system storage space is sufficient.
The storage space occupied by the tensor visualization function can be reduced by the following methods:
1) Avoid using `TensorSummary` to record large tensors.
2) Reduce the number of `TensorSummary` operators in the network.
After using the function, please promptly clean up training logs that are no longer needed to free up disk space.
Remarks: The space usage of `TensorSummary` can be estimated as follows:
The size of one `TensorSummary` record = the number of values in the tensor * 4 bytes. Assuming that the shape of the tensor recorded by `TensorSummary` is 32 * 1 * 256 * 256, one `TensorSummary` record needs about 32 * 1 * 256 * 256 * 4 bytes = 8,388,608 bytes = 8 MiB. Also suppose that the `collect_freq` of `SummaryCollector` is set to 1 and 50 iterations are trained. Then the space required to record these 50 sets of data is about 50 * 8 MiB = 400 MiB. Note that due to the overhead of data structures and other factors, the actual storage space used will be slightly larger than 400 MiB.
5. The training log file is large when using `TensorSummary` because the complete tensor data is recorded, so MindInsight needs more time to parse it. Please be patient.
## Visualization Components
### Training Dashboard
Access the Training Dashboard by selecting a specific training from the training list.
#### Scalar Visualization
Scalar visualization is used to display the change trend of scalars during training.
![scalar.png](./images/scalar.png)
Figure 1: Scalar trend chart
Figure 1 shows a change process of loss values during the neural network training. The horizontal coordinate indicates the training step, and the vertical coordinate indicates the loss value.
Buttons from left to right in the upper right corner of the figure are used to display the chart in full screen, switch the Y-axis scale, enable or disable the rectangle selection, roll back the chart step by step, and restore the chart.
- Full-screen Display: Display the scalar curve in full screen. Click the button again to restore it.
- Switch Y-axis Scale: Perform logarithmic conversion on the Y-axis coordinate.
- Enable/Disable Rectangle Selection: Draw a rectangle to select and zoom in on a part of the chart. You can perform rectangle selection again on the zoomed-in chart.
- Step-by-step Rollback: Undo zoom operations one at a time after repeatedly drawing rectangles to zoom in on the same area.
- Restore Chart: Restore a chart to the original state.
You can set a threshold to highlight values, or delete the threshold, in the lower right corner of the figure. As shown in the figure, the threshold is set to less than 0.3, and values below the threshold are highlighted in red, which makes expected values and unusual values easy to spot.
![scalar_select.png](./images/scalar_select.png)
Figure 2: Scalar visualization function area
Figure 2 shows the scalar visualization function area, which allows you to view scalar information by selecting different tags, different dimensions of the horizontal axis, and smoothness.
- Tag: Select the required tags to view the corresponding scalar information.
- Horizontal Axis: Select any of Step, Relative Time, and Absolute Time as the horizontal axis of the scalar curve.
- Smoothness: Adjust the smoothness to smooth the scalar curve.
- Scalar Synthesis: Synthesize two scalar curves and display them in a chart to facilitate comparison between the two curves or view the synthesized chart.
![scalar_compound.png](./images/scalar_compound.png)
Figure 3: Scalar synthesis of Accuracy and Loss curves
Figure 3 shows the scalar synthesis of the Accuracy and Loss curves. The function area of scalar synthesis is similar to that of scalar visualization. Different from the scalar visualization function area, the scalar synthesis function allows you to select a maximum of two tags at a time to synthesize and display their curves.
#### Parameter Distribution Visualization
The parameter distribution in a form of a histogram displays tensors specified by a user.
![histogram.png](./images/histogram.png)
Figure 4: Histogram
Figure 4 shows tensors recorded by a user in a form of a histogram. Click the upper right corner to zoom in the histogram.
![histogram_func.png](./images/histogram_func.png)
Figure 5: Function area of the parameter distribution histogram
Figure 5 shows the function area of the parameter distribution histogram, including:
- Tag selection: Select the required tags to view the corresponding histogram.
- Vertical axis: Select any of `Step`, `Relative time`, and `Absolute time` as the data displayed on the vertical axis of the histogram.
- Angle of view: Select either `Front` or `Top`. `Front` view refers to viewing the histogram from the front view. In this case, data between different steps is overlapped. `Top` view refers to viewing the histogram at an angle of 45 degrees. In this case, data between different steps can be presented.
#### Computational Graph Visualization
Computational graph visualization is used to display the graph structure, data flow direction, and control flow direction of a computational graph. It supports visualization of summary log files and pb files generated by `save_graphs` configuration in `context`.
![graph.png](./images/graph.png)
Figure 6: Computational graph display area
Figure 6 shows the network structure of a computational graph. As shown in the figure, an operator is selected in the display area. The operator has two inputs and one output (the solid line indicates the data flow direction of the operator).
![graph_sidebar.png](./images/graph_sidebar.png)
Figure 7: Computational graph function area
Figure 7 shows the function area of the computational graph, including:
- File selection box: View the computational graphs of different files.
- Search box: Enter a node name and press Enter to view the node.
- Thumbnail: Display the thumbnail of the entire network structure. When viewing an extra large image structure, you can view the currently browsed area.
- Node information: Display the basic information of the selected node, including the node name, properties, input node, and output node.
- Legend: Display the meaning of each icon in the computational graph.
#### Dataset Graph Visualization
Dataset graph visualization is used to display data processing and augmentation information of a single model training.
![data_function.png](./images/data_function.png)
Figure 8: Dataset graph function area
Figure 8 shows the dataset graph function area which includes the following content:
- Legend: Display the meaning of each icon in the data lineage graph.
- Data Processing Pipeline: Display the data processing pipeline used for training. Select a single node in the graph to view details.
- Node Information: Display basic information about the selected node, including names and parameters of the data processing and augmentation operators.
#### Image Visualization
Image visualization is used to display images specified by users.
![image.png](./images/image_vi.png)
Figure 9: Image visualization
Figure 9 shows how to view images of different steps by sliding the Step slider.
![image_function.png](./images/image_function.png)
Figure 10: Image visualization function area
Figure 10 shows the function area of image visualization. You can view image information by selecting different tags, brightness, and contrast.
- Tag: Select the required tags to view the corresponding image information.
- Brightness Adjustment: Adjust the brightness of all displayed images.
- Contrast Adjustment: Adjust the contrast of all displayed images.
#### Tensor Visualization
Tensor visualization is used to display tensors in the form of table and histogram.
![tensor_function.png](./images/tensor_function.png)
Figure 11: Tensor visualization function area
Figure 11 shows the function area of tensor visualization.
- Tag selection: Select the required tags to view the corresponding table data or histogram.
- View: Select `Table` or `Histogram` to display tensor data. In the `Histogram` view, there are the options of `Vertical axis` and `Angle of view`.
- Vertical axis: Select any of `Step`, `Relative time`, and `Absolute time` as the data displayed on the vertical axis of the histogram.
- Angle of view: Select either `Front` or `Top`. `Front` view refers to viewing the histogram from the front view. In this case, data between different steps is overlapped. `Top` view refers to viewing the histogram at an angle of 45 degrees. In this case, data between different steps can be presented.
![tensor_table.png](./images/tensor_table.png)
Figure 12: Table display
Figure 12 shows tensors recorded by a user in the form of a table, which provides the following functions:
- Click the small square button on the right side of the table to zoom in on the table.
- The white box in the table shows which dimensions of the tensor data are currently displayed, where a colon `:` represents all values of that dimension. Enter an index or `:` in the box and press `Enter`, or click the tick button next to it, to query tensor data for specific dimensions.
Assuming a certain dimension has size 32, the index range is -32 to 31. Note: tensor data of 0 to 2 dimensions can be queried. Querying tensor data of more than two dimensions is not supported; in other words, a query cannot contain more than two colons `:`.
- Query the tensor data of a specific step by dragging the hollow circle below the table.
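The index rules above can be sketched in plain Python. This is only an illustration: `parse_query` and the bracketed query string format are hypothetical and are not part of MindInsight; the sketch just encodes the two rules stated in the text (an index for a dimension of size `n` lies in `-n` to `n - 1`, and at most two entries may be `:`).

```python
def parse_query(query, shape):
    """Validate a table query such as "[0, :, :]" against a tensor shape.

    Hypothetical helper: returns a tuple of ints and slices that could be
    used to index an array of the given shape.
    """
    parts = [p.strip() for p in query.strip("[] ").split(",")]
    if len(parts) != len(shape):
        raise ValueError("query must have one entry per dimension")
    if parts.count(":") > 2:
        raise ValueError("at most two dimensions may be ':'")
    result = []
    for part, size in zip(parts, shape):
        if part == ":":
            result.append(slice(None))  # all values of this dimension
            continue
        index = int(part)
        if not -size <= index <= size - 1:
            raise ValueError(f"index {index} is outside [-{size}, {size - 1}]")
        result.append(index)
    return tuple(result)


# A dimension of size 32 accepts indices from -32 to 31.
print(parse_query("[-32, :, :]", (32, 4, 4)))
```

A query with three colons, or with index 32 on a dimension of size 32, raises `ValueError` in this sketch.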
![tensor_histogram.png](./images/tensor_histogram.png)
Figure 13: Histogram display
Figure 13 shows tensors recorded by the user in the form of a histogram. Click the upper right corner to zoom in on the histogram.
### Model Lineage
Model lineage visualization is used to display the model parameter information of all trainings.
![image.png](./images/lineage_label.png)
Figure 14: Model parameter selection area
Figure 14 shows the model parameter selection area, which lists the model parameter tags that can be viewed. You can select required tags to view the corresponding model parameters.
![image.png](./images/lineage_model_chart.png)
Figure 15: Model lineage function area
Figure 15 shows the model lineage function area, which visualizes the model parameter information. You can select a specific area in the column to display the model information within the area.
![image.png](./images/lineage_model_table.png)
Figure 16: Model list
Figure 16 shows all model information in groups. You can sort the model information in ascending or descending order by specified column.
### Dataset Lineage
Dataset lineage visualization is used to display data processing and augmentation information of all model trainings.
![data_label.png](./images/data_label.png)
Figure 17: Data processing and augmentation operator selection area
Figure 17 shows the data processing and augmentation operator selection area, which lists names of data processing and augmentation operators that can be viewed. You can select required tags to view related parameters.
![data_chart.png](./images/data_chart.png)
Figure 18: Dataset lineage function area
Figure 18 shows the dataset lineage function area, which visualizes the parameter information used for data processing and augmentation. You can select a specific area in the column to display the parameter information within the area.
![data_table.png](./images/data_table.png)
Figure 19: Dataset lineage list
Figure 19 shows the data processing and augmentation information of all model trainings.
> If the user filters the model lineage and then switches to the dataset lineage page, the line chart shows the most recently filtered column of the model lineage.
### Scalars Comparison
Scalars comparison is used to compare scalar curves between multiple trainings.
![multi_scalars.png](./images/multi_scalars.png)
Figure 20: Scalars comparison curve area
Figure 20 shows the scalar curve comparison between multiple trainings. The horizontal coordinate indicates the training step, and the vertical coordinate indicates the scalar value.
Buttons from left to right in the upper right corner of the figure are used to display the chart in full screen, switch the Y-axis scale, enable or disable the rectangle selection, roll back the chart step by step, and restore the chart.
- Full-screen Display: Display the scalar curve in full screen. Click the button again to restore it.
- Switch Y-axis Scale: Perform logarithmic conversion on the Y-axis coordinate.
- Enable/Disable Rectangle Selection: Draw a rectangle to select and zoom in on a part of the chart. You can perform rectangle selection again on the zoomed-in chart.
- Step-by-step Rollback: Undo the zoom operations one at a time after repeatedly selecting and zooming in on the same area.
- Restore Chart: Restore the chart to its original state.
![multi_scalars_select.png](./images/multi_scalars_select.png)
Figure 21: Scalars comparison function area
Figure 21 shows the scalars comparison function area, which allows you to view scalar information by selecting different trainings or tags, different dimensions of the horizontal axis, and smoothness.
- Training: Select or filter the required trainings to view the corresponding scalar information.
- Tag: Select the required tags to view the corresponding scalar information.
- Horizontal Axis: Select any of Step, Relative Time, and Absolute Time as the horizontal axis of the scalar curve.
- Smoothness: Adjust the smoothness to smooth the scalar curve.
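As an illustration of what a smoothness control typically does, the sketch below applies an exponential moving average to a scalar curve. This is an assumption made for illustration only: the text does not state which smoothing algorithm MindInsight uses, and `smooth` and its `weight` parameter are hypothetical names.

```python
def smooth(values, weight):
    """Exponential moving average of a scalar curve.

    weight is in [0, 1): 0 leaves the curve unchanged, and values closer
    to 1 give a smoother curve. Illustrative only, not MindInsight's code.
    """
    if not 0 <= weight < 1:
        raise ValueError("weight must be in [0, 1)")
    smoothed = []
    last = None
    for value in values:
        # The first point is kept as-is; later points blend with the history.
        last = value if last is None else weight * last + (1 - weight) * value
        smoothed.append(last)
    return smoothed


loss = [2.0, 1.0, 1.5, 0.5]
print(smooth(loss, 0.5))
```

With a weight of 0 the curve is returned unchanged, which corresponds to the slider being at its minimum in this model.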
## Specifications
To limit the time spent listing summaries, MindInsight lists at most 999 summary items.
To limit memory usage, MindInsight limits the number of tags and steps:
- There are at most 300 tags in each training dashboard. The total number of scalar tags, image tags, computational graph tags, parameter distribution (histogram) tags, and tensor tags cannot exceed 300. In particular, there are at most 10 computational graph tags and 6 tensor tags. When the number of tags exceeds the limit, MindInsight keeps the most recently processed tags.
- There are at most 1000 steps for each scalar tag in each training dashboard. When the number of steps exceeds the limit, MindInsight samples the steps randomly to meet it.
- There are at most 10 steps for each image tag in each training dashboard. When the number of steps exceeds the limit, MindInsight samples the steps randomly to meet it.
- There are at most 50 steps for each parameter distribution (histogram) tag in each training dashboard. When the number of steps exceeds the limit, MindInsight samples the steps randomly to meet it.
- There are at most 20 steps for each tensor tag in each training dashboard. When the number of steps exceeds the limit, MindInsight samples the steps randomly to meet it.
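The per-tag step limits above can be pictured with the following sketch. The limits are taken from the list; the sampling itself is an assumption, since the text only says that steps are sampled randomly. `STEP_LIMITS` and `sample_steps` are hypothetical names, not MindInsight APIs.

```python
import random

# Per-tag step limits as listed above.
STEP_LIMITS = {"scalar": 1000, "image": 10, "histogram": 50, "tensor": 20}


def sample_steps(steps, kind, rng=random):
    """Return at most STEP_LIMITS[kind] steps, chosen uniformly at random.

    Steps are returned in ascending order so the curve can still be drawn.
    """
    steps = list(steps)
    limit = STEP_LIMITS[kind]
    if len(steps) <= limit:
        return steps
    return sorted(rng.sample(steps, limit))


kept = sample_steps(range(5000), "scalar")
print(len(kept))  # 1000
```

The practical consequence is that a long training run shows a random subset of its steps, so two refreshes of the same dashboard need not show exactly the same points.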
To ensure performance, MindInsight implements scalars comparison with a cache mechanism and the following restrictions:
- Scalars comparison supports only trainings in the cache.
- The cache retains at most the 15 latest trainings (sorted by modification time).
- At most 5 trainings can be selected for scalars comparison at the same time.
Because `TensorSummary` records complete tensor data, the amount of data is usually large. To limit memory usage and ensure performance, MindInsight limits the size of tensors and the number of values returned to and displayed on the front end:
- MindInsight supports loading tensors that contain up to 10 million values.
- After a tensor is loaded, a maximum of 100,000 values can be viewed in the table view of tensor visualization. If the query on the selected dimensions returns more values than this limit, the values cannot be displayed.
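A quick way to check a tensor against these two limits is to count values from its shape. The sketch below assumes a query is given as one entry per dimension, either an integer index or the string `":"`; `can_load` and `can_display` are hypothetical helpers and not MindInsight functions.

```python
from math import prod

MAX_LOADED_VALUES = 10_000_000  # largest tensor that can be loaded
MAX_TABLE_VALUES = 100_000      # most values shown in the table view


def can_load(shape):
    """True if a tensor of this shape is within the loading limit."""
    return prod(shape) <= MAX_LOADED_VALUES


def can_display(shape, query):
    """True if the values selected by the query fit in the table view.

    Only dimensions queried with ":" contribute to the selected count;
    an integer index selects a single position along its dimension.
    """
    selected = prod(size for size, q in zip(shape, query) if q == ":")
    return selected <= MAX_TABLE_VALUES


shape = (32, 1, 256, 256)
print(can_load(shape))                       # True: 2,097,152 values
print(can_display(shape, (0, 0, ":", ":")))  # True: 65,536 values selected
```

Under these assumptions, a query of `(":", ":")` on a 1000 x 1000 tensor selects 1,000,000 values and would not be displayable.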
\ No newline at end of file
......@@ -4,7 +4,9 @@ Training Process Visualization
.. toctree::
:maxdepth: 1
dashboard_and_lineage
summary_record
dashboard
lineage_and_scalars_comparision
system_metrics
performance_profiling
mindinsight_commands
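# Training Dashboard
<!-- TOC -->
- [Training Dashboard](#training-dashboard)
- [Overview](#overview)
- [Scalar Visualization](#scalar-visualization)
- [Parameter Distribution Visualization](#parameter-distribution-visualization)
- [Computational Graph Visualization](#computational-graph-visualization)
- [Dataset Graph Visualization](#dataset-graph-visualization)
- [Image Visualization](#image-visualization)
- [Tensor Visualization](#tensor-visualization)
- [Notices](#notices)
<!-- /TOC -->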
<a href="https://gitee.com/mindspore/docs/blob/master/tutorials/source_zh_cn/advanced_use/dashboard.md" target="_blank"><img src="../_static/logo_source.png"></a>&nbsp;&nbsp;
<a href="https://gitee.com/mindspore/docs/tree/master/tutorials/notebook/mindinsight" target="_blank"><img src="../_static/logo_notebook.png"></a>
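## Overview
The training dashboard is an important part of MindInsight's visualization component. Its tags include scalar visualization, parameter distribution visualization, computational graph visualization, dataset graph visualization, and image visualization, among others.
Select a specific training from the training list to open the training dashboard.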
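## Scalar Visualization
Scalar visualization is used to display the change trend of scalars during training.
![scalar.png](./images/scalar.png)
Figure 1: Scalar trend chart
Figure 1 shows how the loss value changes during neural network training. The horizontal coordinate indicates the training step, and the vertical coordinate indicates the loss value.
Buttons from left to right in the upper right corner of the figure are used to display the chart in full screen, switch the Y-axis scale, enable or disable the rectangle selection, roll back the chart step by step, and restore the chart.
- Full-screen Display: Display the scalar curve in full screen. Click the button again to restore it.
- Switch Y-axis Scale: Perform logarithmic conversion on the Y-axis coordinate.
- Enable/Disable Rectangle Selection: Draw a rectangle to select and zoom in on a part of the chart. You can perform rectangle selection again on the zoomed-in chart.
- Step-by-step Rollback: Undo the zoom operations one at a time after repeatedly selecting and zooming in on the same area.
- Restore Chart: Restore the chart to its original state after multiple rectangle selections.
In the lower right corner of the figure, you can set a threshold and highlight it, or delete the threshold. As shown in the figure, the threshold is set to less than 0.3, and the part highlighted in red is the part beyond the threshold, so that expected values or abnormal values can be seen at a glance.
![scalar_select.png](./images/scalar_select.png)
Figure 2: Scalar visualization function area
Figure 2 shows the scalar visualization function area, which allows you to view scalar information by selecting different tags, different dimensions of the horizontal axis, and smoothness.
- Tag: Select the required tags (multiple selection is supported) to view the corresponding scalar information.
- Horizontal Axis: Select any of Step, Relative Time, and Absolute Time as the horizontal axis of the scalar curve.
- Smoothness: Adjust the smoothness to smooth the scalar curve.
- Scalar Synthesis: Select two scalar curves and combine them in one chart, to compare the two curves or to view the combined chart.
![scalar_compound.png](./images/scalar_compound.png)
Figure 3: Scalar synthesis of the Accuracy and Loss curves
Figure 3 shows the scalar synthesis of the Accuracy and Loss curves. The function area of scalar synthesis is similar to that of scalar visualization. The difference is that at most two tags can be selected at a time, and their curves are combined and displayed.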
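## Parameter Distribution Visualization
The parameter distribution view displays tensors specified by the user in the form of a histogram.
![histogram.png](./images/histogram.png)
Figure 4: Histogram display
Figure 4 shows tensors recorded by the user in the form of a histogram. Click the upper right corner to zoom in on the histogram.
![histogram_func.png](./images/histogram_func.png)
Figure 5: Parameter distribution function area
Figure 5 shows the function area of the parameter distribution view, which includes the following:
- Tag selection: Select the required tags (multiple selection is supported) to view the corresponding histogram.
- Vertical axis: Select any of `Step`, `Relative time`, and `Absolute time` as the data displayed on the vertical axis of the histogram.
- Angle of view: Select either `Front` or `Top`. `Front` view shows the histogram from the front, in which case data from different steps overlaps. `Top` view shows the histogram from above at a 45-degree angle, in which case the differences between data from different steps can be seen.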
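## Computational Graph Visualization
Computational graph visualization is used to display the graph structure, data flow, and control flow of a computational graph. It supports summary log files and `pb` files exported through the `save_graphs` parameter of `context`.
![graph.png](./images/graph.png)
Figure 6: Computational graph display area
Figure 6 shows the network structure of a computational graph. As shown in the figure, when an operator is selected in the display area (the operator circled in red), you can see that it has two inputs and one output (solid lines indicate the data flow of the operator).
![graph_sidebar.png](./images/graph_sidebar.png)
Figure 7: Computational graph function area
Figure 7 shows the function area of computational graph visualization, which includes the following:
- File selection box: Select a file to view its computational graph.
- Search box: Enter a node name and press Enter to display the node.
- Thumbnail: Display a thumbnail of the entire network structure, which helps locate the currently viewed area when the graph is very large.
- Node Information: Display basic information about the selected node, including its name, attributes, input nodes, and output nodes.
- Legend: Display the meaning of each icon in the computational graph.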
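## Dataset Graph Visualization
Dataset graph visualization is used to display the data processing and augmentation information of a single model training.
![data_function.png](./images/data_function.png)
Figure 8: Dataset graph function area
Figure 8 shows the dataset graph function area, which includes the following:
- Legend: Display the meaning of each icon in the data lineage graph.
- Data Processing Pipeline: Display the data processing pipeline used for training. Select a single node in the graph to view its details.
- Node Information: Display basic information about the selected node, including names and parameters of the data processing and augmentation operators.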
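## Image Visualization
Image visualization is used to display images specified by the user.
![image.png](./images/image_vi.png)
Figure 9: Image visualization
Figure 9 shows how to view images of different steps by sliding the Step slider.
![image_function.png](./images/image_function.png)
Figure 10: Image visualization function area
Figure 10 shows the function area of image visualization. You can view image information by selecting different tags, brightness, and contrast.
- Tag: Select the required tags (multiple selection is supported) to view the corresponding image information.
- Brightness Adjustment: Adjust the brightness of all displayed images.
- Contrast Adjustment: Adjust the contrast of all displayed images.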
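## Tensor Visualization
Tensor visualization is used to display tensors as a table or a histogram.
![tensor_function.png](./images/tensor_function.png)
Figure 11: Tensor visualization function area
Figure 11 shows the function area of tensor visualization, which includes the following:
- Tag selection: Select the required tags (multiple selection is supported) to view the corresponding table data or histogram.
- View: Select `Table` or `Histogram` to display tensor data. In the `Histogram` view, the `Vertical axis` and `Angle of view` options are available.
- Vertical axis: Select any of `Step`, `Relative time`, and `Absolute time` as the data displayed on the vertical axis of the histogram.
- Angle of view: Select either `Front` or `Top`. `Front` view shows the histogram from the front, in which case data from different steps overlaps. `Top` view shows the histogram from above at a 45-degree angle, in which case the differences between data from different steps can be seen.
![tensor_table.png](./images/tensor_table.png)
Figure 12: Table display
Figure 12 shows tensors recorded by the user in the form of a table, which provides the following functions:
- Click the small square button on the right side of the table to zoom in on the table.
- The white box in the table shows which dimensions of the tensor data are currently displayed, where the colon `:` represents all values of a dimension. Enter an index or `:` in the box, then press `Enter` or click the tick button behind it to query the tensor data of specific dimensions.
If a dimension has size 32, its index range is -32 to 31. Note: Tensor data of 0 to 2 dimensions can be queried. Tensor data of more than two dimensions cannot be queried; in other words, a query cannot contain more than two colons `:`.
- Drag the hollow circle below the table to query the tensor data of a specific step.
![tensor_histogram.png](./images/tensor_histogram.png)
Figure 13: Histogram display
Figure 13 shows tensors recorded by the user in the form of a histogram. Click the upper right corner to zoom in on the histogram.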
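## Notices
1. Currently, MindSpore supports exporting the computational graph after operator fusion only on the Ascend 910 AI processor.
2. When Summary operators are used to collect data during training, the `HistogramSummary` operator affects performance, so use it as sparingly as possible.
3. To limit memory usage, MindInsight limits the number of tags and steps:
- There are at most 300 tags in each training dashboard. The total number of scalar tags, image tags, computational graph tags, parameter distribution (histogram) tags, and tensor tags cannot exceed 300. In particular, there are at most 10 computational graph tags and 6 tensor tags in each training dashboard. When the number of tags exceeds this limit, MindInsight keeps the 300 most recently processed tags, following its processing order.
- There are at most 1000 steps for each scalar tag in each training dashboard. When the number of steps exceeds this limit, the data is sampled randomly to meet it.
- There are at most 10 steps for each image tag in each training dashboard. When the number of steps exceeds this limit, the data is sampled randomly to meet it.
- There are at most 50 steps for each parameter distribution (histogram) tag in each training dashboard. When the number of steps exceeds this limit, the data is sampled randomly to meet it.
- There are at most 20 steps for each tensor tag in each training dashboard. When the number of steps exceeds this limit, the data is sampled randomly to meet it.
4. Because `TensorSummary` records complete tensor data, the amount of data is usually large. To limit memory usage and ensure performance, MindInsight limits the size of tensors and the number of values returned to the front end for display:
- MindInsight supports loading tensors that contain up to 10 million values.
- After a tensor is loaded, a maximum of 100,000 values can be viewed in the table view of tensor visualization. If the query on the selected dimensions returns more values than this limit, the values cannot be displayed.
5. Because tensor visualization (`TensorSummary`) records raw tensor data, it requires a large amount of storage space. Before using `TensorSummary` and during training, check that the system has sufficient storage space.
The storage space used by tensor visualization can be reduced as follows:
1) Avoid using `TensorSummary` to record large tensors.
2) Reduce the number of `TensorSummary` operators in the network.
After using the function, promptly clean up training logs that are no longer needed to free up disk space.
Note: The space used by `TensorSummary` can be estimated as follows:
Size of one `TensorSummary` record = number of values in the tensor * 4 bytes. Assuming that the tensor recorded by `TensorSummary` has shape 32 * 1 * 256 * 256, one `TensorSummary` record requires about 32 * 1 * 256 * 256 * 4 bytes = 8,388,608 bytes = 8 MiB. Assuming further that `collect_freq` of `SummaryCollector` is set to 1 and that 50 iterations are trained, recording these 50 sets of data requires about 50 * 8 MiB = 400 MiB. Note that the actual storage space used is slightly larger than 400 MiB because of overhead such as data structures.
6. When `TensorSummary` is used, the training log file is large because complete tensor data is recorded, so MindInsight needs more time to parse it. Please wait patiently.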
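The `TensorSummary` storage estimate in the notices above can be reproduced with a few lines of Python. The 4 bytes per value and the example shape come from the text; `tensor_summary_bytes` is a hypothetical helper name, and the result ignores the data-structure overhead mentioned above.

```python
from math import prod

BYTES_PER_VALUE = 4  # per the estimate in the notices above
MIB = 2 ** 20


def tensor_summary_bytes(shape, records=1):
    """Approximate bytes needed to store `records` tensors of this shape."""
    return prod(shape) * BYTES_PER_VALUE * records


shape = (32, 1, 256, 256)
one_record = tensor_summary_bytes(shape)          # 8,388,608 bytes
fifty_records = tensor_summary_bytes(shape, 50)   # collect_freq=1, 50 iterations
print(one_record / MIB, fifty_records / MIB)      # 8.0 400.0
```

Running the same calculation for your own tensor shapes and iteration counts gives a lower bound on the disk space to reserve before training.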
\ No newline at end of file
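# Lineage and Scalars Comparison
<!-- TOC -->
- [Lineage and Scalars Comparison](#lineage-and-scalars-comparison)
- [Overview](#overview)
- [Model Lineage](#model-lineage)
- [Dataset Lineage](#dataset-lineage)
- [Scalars Comparison](#scalars-comparison)
- [Notices](#notices)
<!-- /TOC -->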
<a href="https://gitee.com/mindspore/docs/blob/master/tutorials/source_zh_cn/advanced_use/lineage_and_scalars_comparision.md" target="_blank"><img src="../_static/logo_source.png"></a>&nbsp;&nbsp;
<a href="https://gitee.com/mindspore/docs/tree/master/tutorials/notebook/mindinsight" target="_blank"><img src="../_static/logo_notebook.png"></a>
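## Overview
Like the training dashboard, model lineage, dataset lineage, and scalars comparison are important parts of MindInsight's visualization component. When visualizing training data, you can compare scalar trend charts in the scalars comparison dashboard to find problems, and then use the lineage function to locate their causes. This gives users an efficient way to tune data augmentation and deep neural networks.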
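## Model Lineage
Model lineage visualization is used to display the model parameter information of all trainings.
![image.png](./images/lineage_label.png)
Figure 1: Model parameter selection area
Figure 1 shows the model parameter selection area, which lists the model parameter tags that can be viewed. You can select the required tags to view the corresponding model parameters.
![image.png](./images/lineage_model_chart.png)
Figure 2: Model lineage function area
Figure 2 shows the model lineage function area, which visualizes the model parameter information. You can select a specific area in a column to display the model information within that area.
![image.png](./images/lineage_model_table.png)
Figure 3: Model list
Figure 3 shows all model information in groups. You can sort the model information in ascending or descending order by a specified column.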
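## Dataset Lineage
Dataset lineage visualization is used to display the data processing and augmentation information of all trainings.
![data_label.png](./images/data_label.png)
Figure 4: Data processing and augmentation operator selection area
Figure 4 shows the data processing and augmentation operator selection area, which lists the names of the data processing and augmentation operators that can be viewed. You can select the required tags to view the related parameters.
![data_chart.png](./images/data_chart.png)
Figure 5: Dataset lineage function area
Figure 5 shows the dataset lineage function area, which visualizes the parameter information used for data processing and augmentation. You can select a specific area in a column to display the parameter information within that area.
![data_table.png](./images/data_table.png)
Figure 6: Dataset lineage list
Figure 6 shows the data processing and augmentation information of all model trainings.
> If the user filters the model lineage and then switches to the dataset lineage page, the line chart shows the most recently filtered column of the model lineage.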
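## Scalars Comparison
Scalars comparison is used to compare scalar curves between multiple trainings.
![multi_scalars.png](./images/multi_scalars.png)
Figure 7: Scalars comparison curve area
Figure 7 shows the scalar curve comparison between multiple trainings. The horizontal coordinate indicates the training step, and the vertical coordinate indicates the scalar value.
Buttons from left to right in the upper right corner of the figure are used to display the chart in full screen, switch the Y-axis scale, enable or disable the rectangle selection, roll back the chart step by step, and restore the chart.
- Full-screen Display: Display the scalar curve in full screen. Click the button again to restore it.
- Switch Y-axis Scale: Perform logarithmic conversion on the Y-axis coordinate.
- Enable/Disable Rectangle Selection: Draw a rectangle to select and zoom in on a part of the chart. You can perform rectangle selection again on the zoomed-in chart.
- Step-by-step Rollback: Undo the zoom operations one at a time after repeatedly selecting and zooming in on the same area.
- Restore Chart: Restore the chart to its original state after multiple rectangle selections.
![multi_scalars_select.png](./images/multi_scalars_select.png)
Figure 8: Scalars comparison function area
Figure 8 shows the scalars comparison function area, which allows you to compare scalars by selecting different trainings or tags, different dimensions of the horizontal axis, and smoothness.
- Training: Select the required trainings (multiple selection is supported) by ticking them or filtering by keyword.
- Tag: Select the required tags (multiple selection is supported) to view the corresponding scalar information.
- Horizontal Axis: Select any of Step, Relative Time, and Absolute Time as the horizontal axis of the scalar curve.
- Smoothness: Adjust the smoothness to smooth the scalar curve.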
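## Notices
To ensure performance, MindInsight loads the scalar curve data of trainings for scalars comparison through a cache mechanism, with the following restrictions:
- Scalars comparison supports only trainings in the cache.
- The cache retains at most the 15 latest trainings (sorted by modification time).
- At most 5 trainings can be compared at the same time.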
\ No newline at end of file
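# Training Dashboard and Lineage
# Collecting Summary Data
<!-- TOC -->
- [Training Dashboard and Lineage](#training-dashboard-and-lineage)
- [Collecting Summary Data](#collecting-summary-data)
- [Overview](#overview)
- [Operation Process](#operation-process)
- [Preparing the Training Script](#preparing-the-training-script)
- [Collecting Summary Data](#collecting-summary-data)
- [Notices](#notices)
- [Visualization Components](#visualization-components)
- [Training Dashboard](#training-dashboard)
- [Scalar Visualization](#scalar-visualization)
- [Parameter Distribution Visualization](#parameter-distribution-visualization)
- [Computational Graph Visualization](#computational-graph-visualization)
- [Dataset Graph Visualization](#dataset-graph-visualization)
- [Image Visualization](#image-visualization)
- [Tensor Visualization](#tensor-visualization)
- [Model Lineage](#model-lineage)
- [Dataset Lineage](#dataset-lineage)
- [Scalars Comparison](#scalars-comparison)
- [Specifications](#specifications)
- [Method 1: Automatically Collecting Data Through SummaryCollector](#method-1-automatically-collecting-data-through-summarycollector)
- [Method 2: Custom Collection of Network Data with Summary Operators and SummaryCollector](#method-2-custom-collection-of-network-data-with-summary-operators-and-summarycollector)
- [Method 3: Recording Data with a Custom Callback](#method-3-recording-data-with-a-custom-callback)
- [Notices](#notices)
<!-- /TOC -->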
<a href="https://gitee.com/mindspore/docs/blob/master/tutorials/source_zh_cn/advanced_use/dashboard_and_lineage.md" target="_blank"><img src="../_static/logo_source.png"></a>&nbsp;&nbsp;
<a href="https://gitee.com/mindspore/docs/blob/master/tutorials/source_zh_cn/advanced_use/summary_record.md" target="_blank"><img src="../_static/logo_source.png"></a>&nbsp;&nbsp;
<a href="https://gitee.com/mindspore/docs/tree/master/tutorials/notebook/mindinsight" target="_blank"><img src="../_static/logo_notebook.png"></a>
## Overview
......@@ -37,13 +27,11 @@
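## Preparing the Training Script
### Collecting Summary Data
Currently, MindSpore supports saving scalars, images, computational graphs, model hyperparameters, and other information to summary log files and displaying them in the visualization interface.
MindSpore currently supports three methods of recording data to summary log files.
**Method 1: Automatically collecting data through `SummaryCollector`**
### Method 1: Automatically Collecting Data Through SummaryCollector
Through its `Callback` mechanism, MindSpore provides `SummaryCollector`, a `Callback` that makes it quick and easy to collect common information such as the computational graph, loss value, learning rate, and parameter weights.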
......@@ -132,7 +120,7 @@ ds_eval = create_dataset('./dataset_path')
model.eval(ds_eval, callbacks=[summary_collector])
```
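**Method 2: Custom collection of network data with Summary operators and `SummaryCollector`**
### Method 2: Custom Collection of Network Data with Summary Operators and SummaryCollector
In addition to `SummaryCollector`, which automatically collects some common data, MindSpore provides Summary operators that let you collect other data in the network, such as the input of each convolutional layer or the loss value in the loss function. The steps below show how to record such data.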
......@@ -237,7 +225,7 @@ summary_collector = SummaryCollector(summary_dir='./summary_dir', collect_freq=1
model.train(epoch=2, train_dataset=train_ds, callbacks=[summary_collector])
```
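**Method 3: Recording data with a custom Callback**
### Method 3: Recording Data with a Custom Callback
MindSpore supports custom Callbacks and allows a custom Callback to record data to summary log files,
which can then be viewed on the visualization page.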
......@@ -286,11 +274,12 @@ model.train(cnn_network, callbacks=[confusion_martrix])
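Among the saved files, `ms_output_after_hwopt.pb` is the computational graph after operator fusion, which can be viewed on the visualization page.
### Notices
1. Currently, MindSpore supports exporting the computational graph after operator fusion only on the Ascend 910 AI processor.
2. When Summary operators are used to collect data during training, the `HistogramSummary` operator affects performance, so use it as sparingly as possible.
3. Multiple `SummaryRecord` instances cannot be used at the same time (`SummaryRecord` is used in `SummaryCollector`).
## Notices
1. To limit the time spent listing summaries, MindInsight lists at most 999 summary items.
2. Multiple `SummaryRecord` instances cannot be used at the same time (`SummaryRecord` is used in `SummaryCollector`).
If two or more `SummaryCollector` instances are used in the callback list of `model.train` or `model.eval`, this counts as using multiple `SummaryRecord` instances at the same time, and data recording fails.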
......@@ -299,7 +288,7 @@ model.train(cnn_network, callbacks=[confusion_martrix])
Correct code:
```python3
...
summary_collector = SummaryCollector('./summary_dir')
summary_collector = SummaryCollecotor('./summary_dir')
model.train(epoch=2, train_dataset=train_dataset, callbacks=[summary_collector])
...
......@@ -309,8 +298,8 @@ model.train(cnn_network, callbacks=[confusion_martrix])
Incorrect code:
```python3
...
summary_collector1 = SummaryCollector('./summary_dir1')
summary_collector2 = SummaryCollector('./summary_dir2')
summary_collector1 = SummaryCollecotor('./summary_dir1')
summary_collector2 = SummaryCollecotor('./summary_dir2')
model.train(epoch=2, train_dataset=train_dataset, callbacks=[summary_collector1, summary_collector2])
```
......@@ -319,265 +308,6 @@ model.train(cnn_network, callbacks=[confusion_martrix])
...
# Note: the 'ConfusionMatrixCallback' is user-defined, and it uses SummaryRecord to record data.
confusion_callback = ConfusionMatrixCallback('./summary_dir1')
summary_collector = SummaryCollector('./summary_dir2')
summary_collector = SummaryCollecotor('./summary_dir2')
model.train(epoch=2, train_dataset=train_dataset, callbacks=[confusion_callback, summary_collector])
```
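4. Because tensor visualization (`TensorSummary`) records raw tensor data, it requires a large amount of storage space. Before using `TensorSummary` and during training, check that the system has sufficient storage space.
The storage space used by tensor visualization can be reduced as follows:
1) Avoid using `TensorSummary` to record large tensors.
2) Reduce the number of `TensorSummary` operators in the network.
After using the function, promptly clean up training logs that are no longer needed to free up disk space.
Note: The space used by `TensorSummary` can be estimated as follows:
Size of one `TensorSummary` record = number of values in the tensor * 4 bytes. Assuming that the tensor recorded by `TensorSummary` has shape 32 * 1 * 256 * 256, one `TensorSummary` record requires about 32 * 1 * 256 * 256 * 4 bytes = 8,388,608 bytes = 8 MiB. Assuming further that `collect_freq` of `SummaryCollector` is set to 1 and that 50 iterations are trained, recording these 50 sets of data requires about 50 * 8 MiB = 400 MiB. Note that the actual storage space used is slightly larger than 400 MiB because of overhead such as data structures.
5. When `TensorSummary` is used, the training log file is large because complete tensor data is recorded, so MindInsight needs more time to parse it. Please wait patiently.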
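## Visualization Components
### Training Dashboard
Select a specific training from the training list to open the training dashboard.
#### Scalar Visualization
Scalar visualization is used to display the change trend of scalars during training.
![scalar.png](./images/scalar.png)
Figure 1: Scalar trend chart
Figure 1 shows how the loss value changes during neural network training. The horizontal coordinate indicates the training step, and the vertical coordinate indicates the loss value.
Buttons from left to right in the upper right corner of the figure are used to display the chart in full screen, switch the Y-axis scale, enable or disable the rectangle selection, roll back the chart step by step, and restore the chart.
- Full-screen Display: Display the scalar curve in full screen. Click the button again to restore it.
- Switch Y-axis Scale: Perform logarithmic conversion on the Y-axis coordinate.
- Enable/Disable Rectangle Selection: Draw a rectangle to select and zoom in on a part of the chart. You can perform rectangle selection again on the zoomed-in chart.
- Step-by-step Rollback: Undo the zoom operations one at a time after repeatedly selecting and zooming in on the same area.
- Restore Chart: Restore the chart to its original state after multiple rectangle selections.
In the lower right corner of the figure, you can set a threshold and highlight it, or delete the threshold. As shown in the figure, the threshold is set to less than 0.3, and the part highlighted in red is the part beyond the threshold, so that expected values or abnormal values can be seen at a glance.
![scalar_select.png](./images/scalar_select.png)
Figure 2: Scalar visualization function area
Figure 2 shows the scalar visualization function area, which allows you to view scalar information by selecting different tags, different dimensions of the horizontal axis, and smoothness.
- Tag: Select the required tags (multiple selection is supported) to view the corresponding scalar information.
- Horizontal Axis: Select any of Step, Relative Time, and Absolute Time as the horizontal axis of the scalar curve.
- Smoothness: Adjust the smoothness to smooth the scalar curve.
- Scalar Synthesis: Select two scalar curves and combine them in one chart, to compare the two curves or to view the combined chart.
![scalar_compound.png](./images/scalar_compound.png)
Figure 3: Scalar synthesis of the Accuracy and Loss curves
Figure 3 shows the scalar synthesis of the Accuracy and Loss curves. The function area of scalar synthesis is similar to that of scalar visualization. The difference is that at most two tags can be selected at a time, and their curves are combined and displayed.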
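#### Parameter Distribution Visualization
The parameter distribution view displays tensors specified by the user in the form of a histogram.
![histogram.png](./images/histogram.png)
Figure 4: Histogram display
Figure 4 shows tensors recorded by the user in the form of a histogram. Click the upper right corner to zoom in on the histogram.
![histogram_func.png](./images/histogram_func.png)
Figure 5: Parameter distribution function area
Figure 5 shows the function area of the parameter distribution view, which includes the following:
- Tag selection: Select the required tags (multiple selection is supported) to view the corresponding histogram.
- Vertical axis: Select any of `Step`, `Relative time`, and `Absolute time` as the data displayed on the vertical axis of the histogram.
- Angle of view: Select either `Front` or `Top`. `Front` view shows the histogram from the front, in which case data from different steps overlaps. `Top` view shows the histogram from above at a 45-degree angle, in which case the differences between data from different steps can be seen.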
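#### Computational Graph Visualization
Computational graph visualization is used to display the graph structure, data flow, and control flow of a computational graph. It supports summary log files and `pb` files exported through the `save_graphs` parameter of `context`.
![graph.png](./images/graph.png)
Figure 6: Computational graph display area
Figure 6 shows the network structure of a computational graph. As shown in the figure, when an operator is selected in the display area (the operator circled in red), you can see that it has two inputs and one output (solid lines indicate the data flow of the operator).
![graph_sidebar.png](./images/graph_sidebar.png)
Figure 7: Computational graph function area
Figure 7 shows the function area of computational graph visualization, which includes the following:
- File selection box: Select a file to view its computational graph.
- Search box: Enter a node name and press Enter to display the node.
- Thumbnail: Display a thumbnail of the entire network structure, which helps locate the currently viewed area when the graph is very large.
- Node Information: Display basic information about the selected node, including its name, attributes, input nodes, and output nodes.
- Legend: Display the meaning of each icon in the computational graph.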
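#### Dataset Graph Visualization
Dataset graph visualization is used to display the data processing and augmentation information of a single model training.
![data_function.png](./images/data_function.png)
Figure 8: Dataset graph function area
Figure 8 shows the dataset graph function area, which includes the following:
- Legend: Display the meaning of each icon in the data lineage graph.
- Data Processing Pipeline: Display the data processing pipeline used for training. Select a single node in the graph to view its details.
- Node Information: Display basic information about the selected node, including names and parameters of the data processing and augmentation operators.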
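#### Image Visualization
Image visualization is used to display images specified by the user.
![image.png](./images/image_vi.png)
Figure 9: Image visualization
Figure 9 shows how to view images of different steps by sliding the Step slider.
![image_function.png](./images/image_function.png)
Figure 10: Image visualization function area
Figure 10 shows the function area of image visualization. You can view image information by selecting different tags, brightness, and contrast.
- Tag: Select the required tags (multiple selection is supported) to view the corresponding image information.
- Brightness Adjustment: Adjust the brightness of all displayed images.
- Contrast Adjustment: Adjust the contrast of all displayed images.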
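#### Tensor Visualization
Tensor visualization is used to display tensors as a table or a histogram.
![tensor_function.png](./images/tensor_function.png)
Figure 11: Tensor visualization function area
Figure 11 shows the function area of tensor visualization, which includes the following:
- Tag selection: Select the required tags (multiple selection is supported) to view the corresponding table data or histogram.
- View: Select `Table` or `Histogram` to display tensor data. In the `Histogram` view, the `Vertical axis` and `Angle of view` options are available.
- Vertical axis: Select any of `Step`, `Relative time`, and `Absolute time` as the data displayed on the vertical axis of the histogram.
- Angle of view: Select either `Front` or `Top`. `Front` view shows the histogram from the front, in which case data from different steps overlaps. `Top` view shows the histogram from above at a 45-degree angle, in which case the differences between data from different steps can be seen.
![tensor_table.png](./images/tensor_table.png)
Figure 12: Table display
Figure 12 shows tensors recorded by the user in the form of a table, which provides the following functions:
- Click the small square button on the right side of the table to zoom in on the table.
- The white box in the table shows which dimensions of the tensor data are currently displayed, where the colon `:` represents all values of a dimension. Enter an index or `:` in the box, then press `Enter` or click the tick button behind it to query the tensor data of specific dimensions.
If a dimension has size 32, its index range is -32 to 31. Note: Tensor data of 0 to 2 dimensions can be queried. Tensor data of more than two dimensions cannot be queried; in other words, a query cannot contain more than two colons `:`.
- Drag the hollow circle below the table to query the tensor data of a specific step.
![tensor_histogram.png](./images/tensor_histogram.png)
Figure 13: Histogram display
Figure 13 shows tensors recorded by the user in the form of a histogram. Click the upper right corner to zoom in on the histogram.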
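### Model Lineage
Model lineage visualization is used to display the model parameter information of all trainings.
![image.png](./images/lineage_label.png)
Figure 14: Model parameter selection area
Figure 14 shows the model parameter selection area, which lists the model parameter tags that can be viewed. You can select the required tags to view the corresponding model parameters.
![image.png](./images/lineage_model_chart.png)
Figure 15: Model lineage function area
Figure 15 shows the model lineage function area, which visualizes the model parameter information. You can select a specific area in a column to display the model information within that area.
![image.png](./images/lineage_model_table.png)
Figure 16: Model list
Figure 16 shows all model information in groups. You can sort the model information in ascending or descending order by a specified column.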
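### Dataset Lineage
Dataset lineage visualization is used to display the data processing and augmentation information of all trainings.
![data_label.png](./images/data_label.png)
Figure 17: Data processing and augmentation operator selection area
Figure 17 shows the data processing and augmentation operator selection area, which lists the names of the data processing and augmentation operators that can be viewed. You can select the required tags to view the related parameters.
![data_chart.png](./images/data_chart.png)
Figure 18: Dataset lineage function area
Figure 18 shows the dataset lineage function area, which visualizes the parameter information used for data processing and augmentation. You can select a specific area in a column to display the parameter information within that area.
![data_table.png](./images/data_table.png)
Figure 19: Dataset lineage list
Figure 19 shows the data processing and augmentation information of all model trainings.
> If the user filters the model lineage and then switches to the dataset lineage page, the line chart shows the most recently filtered column of the model lineage.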
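### Scalars Comparison
Scalars comparison is used to compare scalar curves between multiple trainings.
![multi_scalars.png](./images/multi_scalars.png)
Figure 20: Scalars comparison curve area
Figure 20 shows the scalar curve comparison between multiple trainings. The horizontal coordinate indicates the training step, and the vertical coordinate indicates the scalar value.
Buttons from left to right in the upper right corner of the figure are used to display the chart in full screen, switch the Y-axis scale, enable or disable the rectangle selection, roll back the chart step by step, and restore the chart.
- Full-screen Display: Display the scalar curve in full screen. Click the button again to restore it.
- Switch Y-axis Scale: Perform logarithmic conversion on the Y-axis coordinate.
- Enable/Disable Rectangle Selection: Draw a rectangle to select and zoom in on a part of the chart. You can perform rectangle selection again on the zoomed-in chart.
- Step-by-step Rollback: Undo the zoom operations one at a time after repeatedly selecting and zooming in on the same area.
- Restore Chart: Restore the chart to its original state after multiple rectangle selections.
![multi_scalars_select.png](./images/multi_scalars_select.png)
Figure 21: Scalars comparison function area
Figure 21 shows the scalars comparison function area, which allows you to compare scalars by selecting different trainings or tags, different dimensions of the horizontal axis, and smoothness.
- Training: Select the required trainings (multiple selection is supported) by ticking them or filtering by keyword.
- Tag: Select the required tags (multiple selection is supported) to view the corresponding scalar information.
- Horizontal Axis: Select any of Step, Relative Time, and Absolute Time as the horizontal axis of the scalar curve.
- Smoothness: Adjust the smoothness to smooth the scalar curve.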
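## Specifications
To limit the time spent listing summaries, MindInsight lists at most 999 summary items.
To limit memory usage, MindInsight limits the number of tags and steps:
- There are at most 300 tags in each training dashboard. The total number of scalar tags, image tags, computational graph tags, parameter distribution (histogram) tags, and tensor tags cannot exceed 300. In particular, there are at most 10 computational graph tags and 6 tensor tags in each training dashboard. When the number of tags exceeds this limit, MindInsight keeps the 300 most recently processed tags, following its processing order.
- There are at most 1000 steps for each scalar tag in each training dashboard. When the number of steps exceeds this limit, the data is sampled randomly to meet it.
- There are at most 10 steps for each image tag in each training dashboard. When the number of steps exceeds this limit, the data is sampled randomly to meet it.
- There are at most 50 steps for each parameter distribution (histogram) tag in each training dashboard. When the number of steps exceeds this limit, the data is sampled randomly to meet it.
- There are at most 20 steps for each tensor tag in each training dashboard. When the number of steps exceeds this limit, the data is sampled randomly to meet it.
To ensure performance, MindInsight loads the scalar curve data of trainings for scalars comparison through a cache mechanism, with the following restrictions:
- Scalars comparison supports only trainings in the cache.
- The cache retains at most the 15 latest trainings (sorted by modification time).
- At most 5 trainings can be compared at the same time.
Because `TensorSummary` records complete tensor data, the amount of data is usually large. To limit memory usage and ensure performance, MindInsight limits the size of tensors and the number of values returned to the front end for display:
- MindInsight supports loading tensors that contain up to 10 million values.
- After a tensor is loaded, a maximum of 100,000 values can be viewed in the table view of tensor visualization. If the query on the selected dimensions returns more values than this limit, the values cannot be displayed.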
\ No newline at end of file
```
\ No newline at end of file
......@@ -4,7 +4,9 @@
.. toctree::
:maxdepth: 1
dashboard_and_lineage
summary_record
dashboard
lineage_and_scalars_comparision
system_metrics
performance_profiling
mindinsight_commands