BaiXuePrincess / Paddle (forked from PaddlePaddle / Paddle)
Unverified commit 134f9c3e
Authored by JZ-LIANG on Sep 16, 2022; committed by GitHub on Sep 16, 2022

[Auto Parallel] Bugfix allreduce fuse for MP (#46086)

* bugfix
* bugfix
* typos fixed

Parent: 4fba3d5e
Showing 3 changed files with 55 additions and 54 deletions (+55 −54)

python/paddle/distributed/auto_parallel/engine.py                               +40 −40
python/paddle/distributed/auto_parallel/interface.py                             +9 −8
python/paddle/distributed/passes/auto_parallel_data_parallel_optimization.py     +6 −6
python/paddle/distributed/auto_parallel/engine.py

```diff
@@ -636,12 +636,12 @@ class Engine:
         Evaluate the loss and metrics of the model on evaluation data.
 
         Args:
-            eval_data (Dataset): An instance of paddle paddle.io.Dataset. Default: None.
+            valid_data (Dataset): An instance of paddle paddle.io.Dataset. Default: None.
-            eval_sample_split (int, optional): Each sample of the eval dataset is assumed
+            valid_sample_split (int, optional): Each sample of the eval dataset is assumed
                 to be a (input, label) pair by default and has two items. If each sample has
-                more than two items, eval_sample_split specifies how to split these items into
+                more than two items, valid_sample_split specifies how to split these items into
                 input and label. The items before it are input and the left are label. Default: None.
-            batch_size (int, optional): The batch size of eval_data. The user's data will
+            batch_size (int, optional): The batch size of valid_data. The user's data will
                 be used directly without batching if set to None. Default: 1.
             steps (int, optional): It is the total number of steps (batches of samples) to draw before
                 stopping evaluation. If None, evaluation will run until the `valid_data` dataset is exhausted.
```
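For readers unfamiliar with the Engine API, a minimal usage sketch of the renamed evaluate arguments follows. The import path and Engine construction mirror the Paddle 2.4-era auto-parallel examples and are assumptions rather than part of this commit; only the argument names valid_data, valid_sample_split, batch_size, and steps come from the docstring shown above.

```python
# Hedged sketch: call Engine.evaluate with the argument names documented in
# the diff above. The auto.Engine import path and constructor are assumed
# from contemporaneous Paddle auto-parallel examples.
import numpy as np
import paddle
from paddle.distributed.fleet import auto
from paddle.io import Dataset


class RandomDataset(Dataset):
    # Toy dataset: each sample is an (input, label) pair, the default layout
    # that valid_sample_split assumes.
    def __init__(self, num_samples=16, feature_dim=8):
        self.inputs = np.random.rand(num_samples, feature_dim).astype("float32")
        self.labels = np.random.randint(0, 2, (num_samples, 1)).astype("int64")

    def __getitem__(self, idx):
        return self.inputs[idx], self.labels[idx]

    def __len__(self):
        return len(self.inputs)


model = paddle.nn.Linear(8, 2)
loss = paddle.nn.CrossEntropyLoss()
metrics = paddle.metric.Accuracy()

engine = auto.Engine(model, loss, metrics=metrics)
# batch_size batches valid_data; steps=None evaluates until the dataset
# is exhausted, matching the docstring defaults.
engine.evaluate(valid_data=RandomDataset(),
                valid_sample_split=None,
                batch_size=4,
                steps=None)
```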
python/paddle/distributed/auto_parallel/interface.py

```diff
@@ -12,6 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+import paddle
 from paddle.fluid import core
 from .process_mesh import ProcessMesh
 from .process_mesh import get_current_process_mesh
```
python/paddle/distributed/passes/auto_parallel_data_parallel_optimization.py

```diff
@@ -111,14 +111,9 @@ class DataParallelOptimizationPass(PassBase):
         if not self._could_be_fuse():
             return []
 
-        with open('./before_program.txt.' + str(paddle.distributed.get_rank()),
-                  'w') as f:
-            f.write(str(default_main_program()))
         grad_group = self._group_grads()
         self._update_program(grad_group)
-        with open('./after_program.txt.' + str(paddle.distributed.get_rank()),
-                  'w') as f:
-            f.write(str(default_main_program()))
 
         return grad_group
 
     def _analyze_program(self):
@@ -569,6 +564,11 @@ class GradientsGroup(object):
                 self.remove_scale_op_indices.append(i + 1)
 
         if len(self.gradients) == 1:
+            # TODO Remove this is a temporary hack for Tensor Parallel. the logic
+            # for find grad_op should be more general.
+            if self.ops[grad_op_idx].type == "c_allreduce_sum":
+                grad_op_idx -= 1
+
             grad_op = self.ops[grad_op_idx]
             assert grad_var.name in grad_op.output_arg_names, "grad [{}] should be output of {}".format(
                 grad_var.name, str(grad_op))
```
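To make the second hunk concrete, here is a small self-contained sketch (not Paddle's implementation) of the lookup it fixes: under tensor (model) parallelism a c_allreduce_sum op is inserted right after the op that produces a parameter gradient, so the pass steps back one op before asserting that the gradient is an output of the producing op. The Op class and op list below are simplified stand-ins for Paddle's operator IR.

```python
# Simplified illustration of the MP fix in GradientsGroup: if the op at
# grad_op_idx is the c_allreduce_sum inserted by tensor (model) parallelism,
# step back one index so we land on the op that actually produces the grad.
# The Op dataclass is a stand-in for Paddle's operator IR, not its API.
from dataclasses import dataclass, field


@dataclass
class Op:
    type: str
    output_arg_names: list = field(default_factory=list)


def find_grad_producer(ops, grad_op_idx, grad_var_name):
    # Temporary hack from the diff: skip the trailing allreduce inserted by MP.
    if ops[grad_op_idx].type == "c_allreduce_sum":
        grad_op_idx -= 1
    grad_op = ops[grad_op_idx]
    assert grad_var_name in grad_op.output_arg_names, \
        "grad [{}] should be output of {}".format(grad_var_name, grad_op)
    return grad_op


# Example: matmul_v2_grad produces linear_0.w_0@GRAD, then MP appends an
# allreduce over the same gradient; the producer lookup must skip it.
ops = [
    Op("matmul_v2_grad", ["linear_0.w_0@GRAD"]),
    Op("c_allreduce_sum", ["linear_0.w_0@GRAD"]),
]
print(find_grad_producer(ops, grad_op_idx=1,
                         grad_var_name="linear_0.w_0@GRAD").type)
# -> matmul_v2_grad
```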