add reward model
forward_demo.py
0 → 100644
src/rlhf/__init__.py
0 → 100644
src/rlhf/lora.py
0 → 100644
src/rlhf/optimizer.py
0 → 100644
src/rlhf/palm.py
0 → 100644
src/rlhf/ppo.py
0 → 100644
This diff is collapsed.
src/rlhf/reward.py
0 → 100644
src/rlhf/utils.py
0 → 100644
train_rm.py
0 → 100644