Unverified commit 426810a2 authored by Xuehai Pan, committed by GitHub

Fix ZeRO parameter initialization for tensors with `requires_grad=True` (#4138)

* Fix ZeRO parameter initialization for tensors with `requires_grad=True`

* Simplify detach logic

---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Parent 9723a879
@@ -241,7 +241,7 @@ def zero_wrapper_for_fp_tensor_constructor(fn: Callable, target_fp_dtype: torch.
         kwargs['device'] = torch.device(get_accelerator().device_name(os.environ["LOCAL_RANK"]))
     tensor: Tensor = fn(*args, **kwargs)
     if tensor.is_floating_point():
-        tensor = tensor.to(target_fp_dtype)
+        tensor.data = tensor.data.to(target_fp_dtype)
     return tensor
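The change matters because `tensor.to(dtype)` is an out-of-place, autograd-tracked operation: when the input was created with `requires_grad=True`, the result is a new non-leaf tensor, which breaks its use as a parameter. Assigning through `.data` swaps in the casted storage while keeping the original tensor object a leaf with `requires_grad=True`. A minimal sketch of the difference (standalone illustration, not DeepSpeed code):

```python
import torch

# A float tensor created with requires_grad=True is a leaf tensor.
t = torch.ones(2, 2, requires_grad=True)

# Out-of-place cast: .to() is recorded by autograd, so the result
# is a new tensor with a grad_fn, i.e. no longer a leaf.
cast = t.to(torch.float16)
print(cast.requires_grad)  # True
print(cast.is_leaf)        # False

# Swapping .data instead: the original tensor object keeps its
# identity, stays a leaf, keeps requires_grad, and now holds
# fp16 storage.
t.data = t.data.to(torch.float16)
print(t.is_leaf)           # True
print(t.requires_grad)     # True
print(t.dtype)             # torch.float16
```

This is why the fix casts via `tensor.data` rather than rebinding `tensor` to the result of `.to()`.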