When the training result deviates from the expectation on Ascend, the inputs and outputs of operators can be dumped for debugging through Asynchronous Data Dump.
> `comm_ops` operators are not supported by Asynchronous Data Dump. The `comm_ops` operators can be found in the [Operator List](https://www.mindspore.cn/docs/en/master/operator_list.html).
1. Enable saving of graph IR: `context.set_context(save_graphs=True)`.
2. Execute the training script.
3. Open `hwopt_d_end_graph_{graph id}.ir` in the directory where you executed the script and find the names of the operators you want to dump.
...
...
}
```
> - In non-data-sink mode, `iteration` should be set to 0, and the data of every iteration will be dumped.
> - In data sink mode, `iteration` needs to be increased by 1. For example, the data of `GetNext` will be dumped in iteration 0, and the data of the compute graph will be dumped in iteration 1.
5. Set the environment variables.
```bash
...
...
export DATA_DUMP_CONFIG_PATH=data_dump.json
```
> - Set the environment variables before executing the training script. Setting environment variables during training will not take effect.
> - Dump environment variables need to be configured before calling `mindspore.communication.management.init`.
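To make the ordering constraint concrete, here is a minimal sketch of the top of an Ascend training script. It assumes graph mode on Ascend and reuses the `data_dump.json` file from the configuration step; only `DATA_DUMP_CONFIG_PATH` is shown, and any other dump variables elided in step 5 would be handled the same way. Exporting the variables in the shell before launching, as shown in step 5, remains the documented approach; the sketch only illustrates that any assignment must be in place before `init()` runs.

```python
import os

# The dump environment variables must be visible before
# mindspore.communication.management.init() is called, so they are set at
# the very top of the script, before any MindSpore initialization.
os.environ["DATA_DUMP_CONFIG_PATH"] = "data_dump.json"

from mindspore import context
from mindspore.communication.management import init

# Step 1 of this guide: save graph IR files so operator names can be looked
# up in hwopt_d_end_graph_{graph id}.ir.
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", save_graphs=True)

# Initialize communication only after the dump variables are in place.
init()
```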