# Autograd in the C++ Frontend

> Original: <https://pytorch.org/tutorials/advanced/cpp_autograd.html>

The `autograd` package is crucial for building highly flexible and dynamic neural networks in PyTorch. Most of the autograd APIs in the PyTorch Python frontend are also available in the C++ frontend, allowing easy translation of autograd code from Python to C++.

In this tutorial, we will look at several examples of doing autograd in the PyTorch C++ frontend. Note that this tutorial assumes that you already have a basic understanding of autograd in the Python frontend. If that's not the case, please first read [Autograd: Automatic Differentiation](https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html).

## Basic Autograd Operations

(Adapted from [this tutorial](https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html#autograd-automatic-differentiation).)

Create a tensor and set `torch::requires_grad()` to track computation with it:

```cpp
auto x = torch::ones({2, 2}, torch::requires_grad());
std::cout << x << std::endl;

```

Out:

```
1 1
1 1
[ CPUFloatType{2,2} ]

```

Do a tensor operation:

```cpp
auto y = x + 2;
std::cout << y << std::endl;

```

Out:

```
 3  3
 3  3
[ CPUFloatType{2,2} ]

```

`y` was created as a result of an operation, so it has a `grad_fn`:

```cpp
std::cout << y.grad_fn()->name() << std::endl;

```

Out:

```
AddBackward1

```

`y`上执行更多操作

```py
auto z = y * y * 3;
auto out = z.mean();

std::cout << z << std::endl;
std::cout << z.grad_fn()->name() << std::endl;
std::cout << out << std::endl;
std::cout << out.grad_fn()->name() << std::endl;

```

Out:

```
 27  27
 27  27
[ CPUFloatType{2,2} ]
MulBackward1
27
[ CPUFloatType{} ]
MeanBackward0

```

`.requires_grad_( ... )` changes an existing tensor's `requires_grad` flag in-place.

```cpp
auto a = torch::randn({2, 2});
a = ((a * 3) / (a - 1));
std::cout << a.requires_grad() << std::endl;

a.requires_grad_(true);
std::cout << a.requires_grad() << std::endl;

auto b = (a * a).sum();
std::cout << b.grad_fn()->name() << std::endl;

```

Out:

```
false
true
SumBackward0

```

Now let's backprop. Because `out` contains a single scalar, `out.backward()` is equivalent to `out.backward(torch::tensor(1.))`.

```cpp
out.backward();

```

Print the gradients `d(out)/dx`:

```cpp
std::cout << x.grad() << std::endl;

```

Out:

```
 4.5000  4.5000
 4.5000  4.5000
[ CPUFloatType{2,2} ]

```

You should have got a matrix of `4.5`. For an explanation of how this value is obtained, please see the corresponding section in [this tutorial](https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html#gradients).
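
For reference, here is a short derivation of that value (a sketch added to this note, using the setup above where `x` is a 2×2 tensor of ones):

$$
o = \frac{1}{4}\sum_i z_i, \qquad z_i = 3\,(x_i + 2)^2
$$

$$
\frac{\partial o}{\partial x_i} = \frac{3}{2}\,(x_i + 2)\Big|_{x_i = 1} = \frac{9}{2} = 4.5
$$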

Now let's take a look at an example of the vector-Jacobian product:

```cpp
x = torch::randn(3, torch::requires_grad());

y = x * 2;
while (y.norm().item<double>() < 1000) {
  y = y * 2;
}

std::cout << y << std::endl;
std::cout << y.grad_fn()->name() << std::endl;

```

Out:

```
-1021.4020
  314.6695
 -613.4944
[ CPUFloatType{3} ]
MulBackward1

```

If we want the vector-Jacobian product, pass the vector to `backward` as an argument:

```cpp
auto v = torch::tensor({0.1, 1.0, 0.0001}, torch::kFloat);
y.backward(v);

std::cout << x.grad() << std::endl;

```

Out:

```
  102.4000
 1024.0000
    0.1024
[ CPUFloatType{3} ]

```
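
As a quick sanity check (a derivation added to this note, not part of the original tutorial): in this particular run the loop kept doubling `y` until `y = 1024 * x` (the exact power of two depends on the random initialization), so the Jacobian is simply $1024$ times the identity, and `backward(v)` computes the vector-Jacobian product

$$
x.\mathrm{grad} = J^\top v = 1024\,v = \begin{pmatrix} 102.4 \\ 1024.0 \\ 0.1024 \end{pmatrix}.
$$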

You can also stop autograd from tracking history on tensors that require gradients, by putting `torch::NoGradGuard` in a code block:

```cpp
std::cout << x.requires_grad() << std::endl;
std::cout << x.pow(2).requires_grad() << std::endl;

{
  torch::NoGradGuard no_grad;
  std::cout << x.pow(2).requires_grad() << std::endl;
}

```

Out:

```
true
true
false

```

Or by using `.detach()` to get a new tensor with the same content but that does not require gradients:

```cpp
std::cout << x.requires_grad() << std::endl;
y = x.detach();
std::cout << y.requires_grad() << std::endl;
std::cout << x.eq(y).all().item<bool>() << std::endl;

```

Out:

```
true
false
true

```

For more information on the C++ tensor autograd API, such as `grad` / `requires_grad` / `is_leaf` / `backward` / `detach` / `detach_` / `register_hook` / `retain_grad`, please see the [corresponding C++ API docs](https://pytorch.org/cppdocs/api/classat_1_1_tensor.html).

## Computing Higher-Order Gradients in C++

One of the applications of higher-order gradients is calculating a gradient penalty. Let's look at an example of it using `torch::autograd::grad`:

```cpp
#include <torch/torch.h>

auto model = torch::nn::Linear(4, 3);

auto input = torch::randn({3, 4}).requires_grad_(true);
auto output = model(input);

// Calculate loss
auto target = torch::randn({3, 3});
auto loss = torch::nn::MSELoss()(output, target);

// Use norm of gradients as penalty
auto grad_output = torch::ones_like(output);
auto gradient = torch::autograd::grad({output}, {input}, /*grad_outputs=*/{grad_output}, /*create_graph=*/true)[0];
auto gradient_penalty = torch::pow((gradient.norm(2, /*dim=*/1) - 1), 2).mean();

// Add gradient penalty to loss
auto combined_loss = loss + gradient_penalty;
combined_loss.backward();

std::cout << input.grad() << std::endl;

```

Out:

```
-0.1042 -0.0638  0.0103  0.0723
-0.2543 -0.1222  0.0071  0.0814
-0.1683 -0.1052  0.0355  0.1024
[ CPUFloatType{3,4} ]

```

Please see the documentation for `torch::autograd::backward` ([link](https://pytorch.org/cppdocs/api/function_namespacetorch_1_1autograd_1afa9b5d4329085df4b6b3d4b4be48914b.html)) and `torch::autograd::grad` ([link](https://pytorch.org/cppdocs/api/function_namespacetorch_1_1autograd_1a1e03c42b14b40c306f9eb947ef842d9c.html)) for more information on how to use them.

## Using Custom Autograd Functions in C++

(Adapted from [this tutorial](https://pytorch.org/docs/stable/notes/extending.html#extending-torch-autograd).)

`torch::autograd`添加新的基本操作需要为每个操作实现一个新的`torch::autograd::Function`子类。 `torch::autograd::Function`用于`torch::autograd`计算结果和梯度,并对操作历史进行编码。 每个新功能都需要您实现两种方法:`forward``backward`,有关详细要求,请参见[此链接](https://pytorch.org/cppdocs/api/structtorch_1_1autograd_1_1_function.html)

Below you can find code for the `Linear` function from `torch::nn`:

```cpp
#include <torch/torch.h>

using namespace torch::autograd;

// Inherit from Function
class LinearFunction : public Function<LinearFunction> {
 public:
  // Note that both forward and backward are static functions

  // bias is an optional argument
  static torch::Tensor forward(
      AutogradContext *ctx, torch::Tensor input, torch::Tensor weight, torch::Tensor bias = torch::Tensor()) {
    ctx->save_for_backward({input, weight, bias});
    auto output = input.mm(weight.t());
    if (bias.defined()) {
      output += bias.unsqueeze(0).expand_as(output);
    }
    return output;
  }

  static tensor_list backward(AutogradContext *ctx, tensor_list grad_outputs) {
    auto saved = ctx->get_saved_variables();
    auto input = saved[0];
    auto weight = saved[1];
    auto bias = saved[2];

    auto grad_output = grad_outputs[0];
    auto grad_input = grad_output.mm(weight);
    auto grad_weight = grad_output.t().mm(input);
    auto grad_bias = torch::Tensor();
    if (bias.defined()) {
      grad_bias = grad_output.sum(0);
    }

    return {grad_input, grad_weight, grad_bias};
  }
};

```

Then we can use `LinearFunction` in the following way:

```cpp
auto x = torch::randn({2, 3}).requires_grad_();
auto weight = torch::randn({4, 3}).requires_grad_();
auto y = LinearFunction::apply(x, weight);
y.sum().backward();

std::cout << x.grad() << std::endl;
std::cout << weight.grad() << std::endl;

```

Out:

```
 0.5314  1.2807  1.4864
 0.5314  1.2807  1.4864
[ CPUFloatType{2,3} ]
 3.7608  0.9101  0.0073
 3.7608  0.9101  0.0073
 3.7608  0.9101  0.0073
 3.7608  0.9101  0.0073
[ CPUFloatType{4,3} ]

```
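
A short explanation of why every row of each gradient above is identical (a derivation added to this note): calling `y.sum().backward()` makes the incoming `grad_output` a matrix of all ones, so the formulas in `backward` above reduce to

$$
X.\mathrm{grad} = \mathbf{1}_{2\times 4}\,W, \qquad W.\mathrm{grad} = \mathbf{1}_{4\times 2}\,X,
$$

and multiplying by a matrix of ones simply sums the rows of `weight` (respectively of `x`), so the same row is repeated in each result.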

Here, we give an additional example of a function that is parametrized by non-tensor arguments:

```cpp
#include <torch/torch.h>

using namespace torch::autograd;

class MulConstant : public Function<MulConstant> {
 public:
  static torch::Tensor forward(AutogradContext *ctx, torch::Tensor tensor, double constant) {
    // ctx is a context object that can be used to stash information
    // for backward computation
    ctx->saved_data["constant"] = constant;
    return tensor * constant;
  }

  static tensor_list backward(AutogradContext *ctx, tensor_list grad_outputs) {
    // We return as many input gradients as there were arguments.
    // Gradients of non-tensor arguments to forward must be `torch::Tensor()`.
    return {grad_outputs[0] * ctx->saved_data["constant"].toDouble(), torch::Tensor()};
  }
};

```

Then we can use `MulConstant` in the following way:

```cpp
auto x = torch::randn({2}).requires_grad_();
auto y = MulConstant::apply(x, 5.5);
y.sum().backward();

std::cout << x.grad() << std::endl;

```

Out:

```
 5.5000
 5.5000
[ CPUFloatType{2} ]

```

For more information on `torch::autograd::Function`, please see [its documentation](https://pytorch.org/cppdocs/api/structtorch_1_1autograd_1_1_function.html).

## Translating Autograd Code from Python to C++

On a high level, the simplest way to use autograd in C++ is to have working autograd code in Python first, and then translate your autograd code from Python to C++ using the following table (a small end-to-end sketch of the mapping follows the table):

| Python | C++ |
| --- | --- |
| `torch.autograd.backward` | `torch::autograd::backward` ([link](https://pytorch.org/cppdocs/api/function_namespacetorch_1_1autograd_1afa9b5d4329085df4b6b3d4b4be48914b.html)) |
| `torch.autograd.grad` | `torch::autograd::grad` ([link](https://pytorch.org/cppdocs/api/function_namespacetorch_1_1autograd_1a1e03c42b14b40c306f9eb947ef842d9c.html)) |
| `torch.Tensor.detach` | `torch::Tensor::detach` ([link](https://pytorch.org/cppdocs/api/classat_1_1_tensor.html#_CPPv4NK2at6Tensor6detachEv)) |
| `torch.Tensor.detach_` | `torch::Tensor::detach_` ([link](https://pytorch.org/cppdocs/api/classat_1_1_tensor.html#_CPPv4NK2at6Tensor7detach_Ev)) |
| `torch.Tensor.backward` | `torch::Tensor::backward` ([link](https://pytorch.org/cppdocs/api/classat_1_1_tensor.html#_CPPv4NK2at6Tensor8backwardERK6Tensorbb)) |
| `torch.Tensor.register_hook` | `torch::Tensor::register_hook` ([link](https://pytorch.org/cppdocs/api/classat_1_1_tensor.html#_CPPv4I0ENK2at6Tensor13register_hookE18hook_return_void_tI1TERR1T)) |
| `torch.Tensor.requires_grad_` | `torch::Tensor::requires_grad_` ([link](https://pytorch.org/cppdocs/api/classat_1_1_tensor.html#_CPPv4NK2at6Tensor14requires_grad_Eb)) |
| `torch.Tensor.retain_grad` | `torch::Tensor::retain_grad` ([link](https://pytorch.org/cppdocs/api/classat_1_1_tensor.html#_CPPv4NK2at6Tensor11retain_gradEv)) |
| `torch.Tensor.grad` | `torch::Tensor::grad` ([link](https://pytorch.org/cppdocs/api/classat_1_1_tensor.html#_CPPv4NK2at6Tensor4gradEv)) |
| `torch.Tensor.grad_fn` | `torch::Tensor::grad_fn` ([link](https://pytorch.org/cppdocs/api/classat_1_1_tensor.html#_CPPv4NK2at6Tensor7grad_fnEv)) |
| `torch.Tensor.set_data` | `torch::Tensor::set_data` ([link](https://pytorch.org/cppdocs/api/classat_1_1_tensor.html#_CPPv4NK2at6Tensor8set_dataERK6Tensor)) |
| `torch.Tensor.data` | `torch::Tensor::data` ([link](https://pytorch.org/cppdocs/api/classat_1_1_tensor.html#_CPPv4NK2at6Tensor4dataEv)) |
| `torch.Tensor.output_nr` | `torch::Tensor::output_nr` ([link](https://pytorch.org/cppdocs/api/classat_1_1_tensor.html#_CPPv4NK2at6Tensor9output_nrEv)) |
| `torch.Tensor.is_leaf` | `torch::Tensor::is_leaf` ([link](https://pytorch.org/cppdocs/api/classat_1_1_tensor.html#_CPPv4NK2at6Tensor7is_leafEv)) |
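
As a small end-to-end illustration of this mapping (a sketch written for this note, not taken from the original tutorial), the Python lines shown in the comments below translate almost one-to-one into the C++ calls listed above:

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
  // Python: x = torch.randn(3, requires_grad=True)
  auto x = torch::randn(3, torch::requires_grad());

  // Python: y = (x * x).sum()
  auto y = (x * x).sum();

  // Python: y.backward()
  y.backward();

  // Python: print(x.grad); print(x.is_leaf)
  std::cout << x.grad() << std::endl;
  std::cout << x.is_leaf() << std::endl;

  // Python: z = x.detach()  # same values, no gradient tracking
  auto z = x.detach();
  std::cout << z.requires_grad() << std::endl;
}
```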

After translation, most of your Python autograd code should just work in C++. If that's not the case, please file a bug report at [GitHub issues](https://github.com/pytorch/pytorch/issues) and we will fix it as soon as possible.

## Conclusion

You should now have a good overview of PyTorch's C++ autograd API. You can find the code examples shown in this note [here](https://github.com/pytorch/examples/tree/master/cpp/autograd). As always, if you run into any problems or have questions, you can reach us on our [forum](https://discuss.pytorch.org/) or via [GitHub issues](https://github.com/pytorch/pytorch/issues).