When the training result deviates from the expectation on Ascend, the input and output of the operator can be dumped for debugging through Asynchronous Data Dump.
> `comm_ops` operators are not supported by Asynchronous Data Dump. `comm_ops` can be found in [Operator List](https://www.mindspore.cn/docs/en/master/operator_list.html).
1. Turn on the switch to save graph IR: `context.set_context(save_graphs=True)`.
2. Execute the training script.
3. Open `hwopt_d_end_graph_{graph id}.ir` in the directory where you executed the script and find the names of the operators you want to dump.
...
@@ -244,6 +246,9 @@ When the training result deviates from the expectation on Ascend, the input and
}
}
```
```
> - Iteration should be set to 0 in non-data-sink mode, and data of every iteration will be dumped.
> - Iteration should be increased by 1 in data sink mode. For example, data of GetNext will be dumped in iteration 0 and data of the compute graph will be dumped in iteration 1.
5. Set environment variables.
```bash
...
@@ -252,9 +257,8 @@ When the training result deviates from the expectation on Ascend, the input and
export DATA_DUMP_CONFIG_PATH=data_dump.json
```
> - Set the environment variables before executing the training script. Environment variables set during training do not take effect.
> - Configure the dump environment variables before calling `mindspore.communication.management.init`.
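
The ordering constraints above (variables set before the script starts, and before `mindspore.communication.management.init` is called) can be enforced from a small launcher. The following is a minimal sketch, not part of MindSpore: the helper names `find_graph_ir_files`, `build_dump_env`, and `run_training`, and the script name `train.py`, are hypothetical. It sets only `DATA_DUMP_CONFIG_PATH`, the variable shown in this section; any other dump variables your setup requires would need to be added in the same way.

```python
import glob
import os
import subprocess
import sys


def find_graph_ir_files(directory="."):
    """Return the hwopt_d_end_graph_*.ir files found in `directory`, sorted by name.

    These files are written when the training script runs with
    context.set_context(save_graphs=True).
    """
    pattern = os.path.join(directory, "hwopt_d_end_graph_*.ir")
    return sorted(glob.glob(pattern))


def build_dump_env(config_path, base_env=None):
    """Return a copy of the environment with DATA_DUMP_CONFIG_PATH set.

    Only the variable visible in this section is set here (assumption:
    the caller adds whatever other dump variables are needed).
    """
    if not os.path.isfile(config_path):
        raise FileNotFoundError(f"dump config not found: {config_path}")
    env = dict(os.environ if base_env is None else base_env)
    env["DATA_DUMP_CONFIG_PATH"] = config_path
    return env


def run_training(script, config_path):
    """Run the training script in a child process.

    The child starts with the dump variable already in its environment,
    so it is in place before any MindSpore initialization happens.
    """
    env = build_dump_env(config_path)
    return subprocess.run([sys.executable, script], env=env, check=True)
```

A call such as `run_training("train.py", "data_dump.json")` would start the hypothetical `train.py` with the variable in place, and `find_graph_ir_files(".")` lists the IR files to inspect for operator names after a run with `save_graphs=True`.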