PaddlePaddle / PaddleClas
Commit 7ff257ea
Authored on Nov 21, 2022 by HydrogenSulfate

fix random seed bug for pksampler in DDP

Parent: 0e5cbd2b
Showing 2 changed files with 27 additions and 4 deletions (+27 -4)

ppcls/data/dataloader/pk_sampler.py  +7 -0
ppcls/engine/engine.py  +20 -4
ppcls/data/dataloader/pk_sampler.py @ 7ff257ea

@@ -18,7 +18,9 @@ from __future__ import division
 from collections import defaultdict
 import numpy as np
+import paddle.distributed as dist
 from paddle.io import DistributedBatchSampler
 from ppcls.utils import logger

@@ -94,6 +96,11 @@ class PKSampler(DistributedBatchSampler):
                 format(diff))
     def __iter__(self):
+        # shuffling label_list manually in distributed environment
+        if self.nranks > 1:
+            cur_rank = dist.get_rank()
+            np.random.RandomState(42 + cur_rank).shuffle(self.label_list)
         label_per_batch = self.batch_size // self.sample_per_id
         for _ in range(len(self)):
             batch_index = []
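The effect of seeding the shuffle with `42 + cur_rank` can be illustrated outside of Paddle. The sketch below is not PaddleClas code: `shuffled_labels` is a hypothetical helper, and the rank is passed in by hand rather than read from `dist.get_rank()`. Under those assumptions, it suggests how each rank ends up with its own reproducible ordering of the label list.

```python
import numpy as np


def shuffled_labels(label_list, rank, base_seed=42):
    """Return a copy of label_list shuffled with a rank-dependent seed.

    Hypothetical helper for illustration only; the commit shuffles
    self.label_list in place inside PKSampler.__iter__.
    """
    labels = list(label_list)
    # A dedicated RandomState keeps this shuffle independent of the
    # global NumPy random state.
    np.random.RandomState(base_seed + rank).shuffle(labels)
    return labels


labels = list(range(100))
rank0 = shuffled_labels(labels, rank=0)  # simulated rank 0
rank1 = shuffled_labels(labels, rank=1)  # simulated rank 1
```

Because the seed depends only on the rank, repeating the call for a given rank reproduces the same order, while two ranks draw from differently seeded generators.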
ppcls/engine/engine.py @ 7ff257ea

@@ -126,16 +126,18 @@ class Engine(object):
                 self.config["DataLoader"], "Train", self.device, self.use_dali)
             if self.config["DataLoader"].get('UnLabelTrain', None) is not None:
                 self.unlabel_train_dataloader = build_dataloader(
-                    self.config["DataLoader"], "UnLabelTrain", self.device, self.use_dali)
+                    self.config["DataLoader"], "UnLabelTrain", self.device,
+                    self.use_dali)
             else:
                 self.unlabel_train_dataloader = None
-            self.iter_per_epoch = len(self.train_dataloader) - 1 if platform.system(
+            self.iter_per_epoch = len(
+                self.train_dataloader) - 1 if platform.system(
             ) == "Windows" else len(self.train_dataloader)
             if self.config["Global"].get("iter_per_epoch", None):
                 # set max iteration per epoch manually, when training by iteration(s), such as XBM, FixMatch.
-                self.iter_per_epoch = self.config["Global"].get("iter_per_epoch")
+                self.iter_per_epoch = self.config["Global"].get(
+                    "iter_per_epoch")
             self.iter_per_epoch = self.iter_per_epoch // self.update_freq * self.update_freq
         if self.mode == "eval" or (self.mode == "train" and
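The iteration-count logic in this hunk can be read as a small pure function. The following is only a sketch under assumptions: `compute_iter_per_epoch` and its parameter names are hypothetical, with `num_batches` standing in for `len(self.train_dataloader)`, `is_windows` for the `platform.system() == "Windows"` check, and `configured` for `Global.iter_per_epoch`.

```python
def compute_iter_per_epoch(num_batches, update_freq, is_windows=False,
                           configured=None):
    """Hypothetical restatement of the iter_per_epoch logic in the hunk."""
    # On Windows the engine uses one iteration fewer than the loader length.
    iters = num_batches - 1 if is_windows else num_batches
    # A truthy configured value overrides the loader-derived count.
    if configured:
        iters = configured
    # Round down to the nearest multiple of update_freq.
    return iters // update_freq * update_freq


default_case = compute_iter_per_epoch(10, update_freq=4)
windows_case = compute_iter_per_epoch(10, update_freq=3, is_windows=True)
override_case = compute_iter_per_epoch(100, update_freq=4, configured=50)
```

With these inputs the three calls evaluate to 8, 9, and 48 respectively, since the final step always truncates to a multiple of `update_freq`.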
@@ -329,6 +331,20 @@ class Engine(object):
                 )) > 0:
                     self.train_loss_func = paddle.DataParallel(self.train_loss_func)
+            # set different seed in different GPU manually in distributed environment
+            if seed is None:
+                logger.warning(
+                    "The random seed cannot be None in a distributed environment. Global.seed has been set to 42 by default"
+                )
+                self.config["Global"]["seed"] = seed = 42
+            logger.info(
+                f"Set random seed to ({seed} + $PADDLE_TRAINER_ID) for different trainer"
+            )
+            paddle.seed(seed + dist.get_rank())
+            np.random.seed(seed + dist.get_rank())
+            random.seed(seed + dist.get_rank())
         # build postprocess for infer
         if self.mode == 'infer':
             self.preprocess_func = create_operators(self.config["Infer"][
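The per-trainer seeding added here follows a simple pattern: fall back to 42 when no seed is configured, then offset the seed by the rank. The sketch below reproduces that pattern for Python's `random` and NumPy only, so that it runs without Paddle. `seed_for_rank` and `sample_after_seeding` are hypothetical helpers, and the rank is supplied as an argument instead of coming from `dist.get_rank()`; the `paddle.seed` call from the commit is omitted on purpose.

```python
import random

import numpy as np


def seed_for_rank(seed, rank, default_seed=42):
    """Seed the global Python and NumPy generators with seed + rank.

    Hypothetical helper mirroring the hunk; the commit additionally
    calls paddle.seed with the same value.
    """
    if seed is None:
        # The commit falls back to 42 when Global.seed is not set.
        seed = default_seed
    random.seed(seed + rank)
    np.random.seed(seed + rank)
    return seed + rank


def sample_after_seeding(seed, rank):
    """Draw one value from each generator right after seeding."""
    seed_for_rank(seed, rank)
    return random.random(), float(np.random.rand())


same_rank_a = sample_after_seeding(42, rank=0)
same_rank_b = sample_after_seeding(None, rank=0)  # None falls back to 42
other_rank = sample_after_seeding(42, rank=1)
```

Here `same_rank_a` and `same_rank_b` are identical because both resolve to seed 42 on rank 0, whereas `other_rank` comes from generators seeded with 43.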