Opened April 16, 2018 by saxon_zh (Guest) · 12 of 13 tasks completed

fluid support asynchronous training

Created by: jacquesqiao

Project

https://github.com/PaddlePaddle/Paddle/projects/61

Design

  • Add async update design doc. https://github.com/PaddlePaddle/Paddle/pull/9932
  • Add distributed training overview doc. https://github.com/PaddlePaddle/Paddle/pull/9937

Operators

  • VariableResponse supports deserializing a variable into a local scope. #10060
  • Refine listen_and_serv op: separate RunSyncLoop into its own method, preparing for RunAsyncLoop. #10080
  • Split optimization ops on the pserver into independent blocks. #10123
  • Create a sub-scope when it is necessary. #10124
  • Add a RunAsyncUpdate (no barrier and no lock) to listen_and_serv_op (see the sketch after this list). #9997 (closed)
    • Prepare the optimization block and PrepareContext for each parameter.
    • Add a BlockQueue for each parameter block; the queue stores the gradient VariableMessages for this parameter sent by trainers.
    • Add a thread for each parameter to run its optimization block.
    • The thread reads a gradient from its BlockQueue, creates a sub-scope to deserialize it into, and then uses this sub-scope to run the optimization block.
    • Add one thread to serve parameter get requests from the global scope for trainers. (Maybe we need a thread pool to speed up the get process, but it seems the gRPC interface can only work in one thread; this needs a test.)
  • send_vars to and read_vars from the pserver without send_barrier and get_barrier.
  • Use multiple threads to do the update. #10228
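
The per-parameter update loop in the RunAsyncUpdate item above is essentially a producer/consumer pattern: the RPC layer pushes each incoming gradient into that parameter's queue, and a dedicated thread pops gradients and runs the optimization block against a sub-scope. Below is a minimal, self-contained Python sketch of that pattern; the names (ParamWorker, deserialize_into_subscope, run_optimize_block) are hypothetical stand-ins for the real C++ internals of listen_and_serv_op.

```python
# Minimal sketch of the per-parameter async update loop described above.
# The helpers below are hypothetical stand-ins for the C++ pieces in
# listen_and_serv_op, not the actual Fluid API.
import queue
import threading


def deserialize_into_subscope(variable_message):
    # Stand-in: Fluid creates a sub-scope and deserializes the received
    # gradient VariableMessage into it (see the VariableResponse item).
    return {"grad": variable_message}


def run_optimize_block(param_name, sub_scope):
    # Stand-in: Fluid runs this parameter's optimization block with the
    # sub-scope, updating the parameter held in the global scope.
    pass


class ParamWorker:
    """One gradient queue ("BlockQueue") and one update thread per parameter."""

    def __init__(self, param_name):
        self.param_name = param_name
        self.grad_queue = queue.Queue()
        threading.Thread(target=self._update_loop, daemon=True).start()

    def push_gradient(self, variable_message):
        # Called by the RPC layer when a trainer sends a gradient;
        # no barrier and no lock are shared across parameters.
        self.grad_queue.put(variable_message)

    def _update_loop(self):
        while True:
            msg = self.grad_queue.get()  # block until a gradient arrives
            sub_scope = deserialize_into_subscope(msg)
            run_optimize_block(self.param_name, sub_scope)
```

A separate thread (or, as the note above suggests, possibly a thread pool if gRPC allows it) would answer trainers' get requests by reading parameters directly from the global scope.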

Transpiler #9997 (closed)

  • Dist-transpile the async trainer program. There is no need to add the .trainer_n suffix to the gradient block in async mode.
  • Dist-transpile the async pserver program. There is no need to aggregate gradient blocks (see the usage sketch below).
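
On the user side, the async path mainly changes how the program is transpiled: no .trainer_n gradient suffixes and no aggregation ops on the pserver. A hedged usage sketch follows, assuming DistributeTranspiler exposes a sync_mode switch for this; the endpoints, trainer count, and trainer_id are placeholders, and the optimizer/program setup that must precede transpile() is omitted.

```python
# Hedged sketch: transpile a fluid program for async distributed training,
# assuming DistributeTranspiler takes a sync_mode flag.
import paddle.fluid as fluid

t = fluid.DistributeTranspiler()
t.transpile(
    trainer_id=0,
    pservers="127.0.0.1:6170,127.0.0.1:6171",
    trainers=2,
    sync_mode=False,  # async: no send/get barriers, no gradient aggregation
)

trainer_prog = t.get_trainer_program()
pserver_prog = t.get_pserver_program("127.0.0.1:6170")
```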

Consider

  • We need to consider how to add learning rate decay in asynchronous training. Do we need lr_decay?

Benchmark

  • benchmark of fluid async training #10180 (closed)
Reference: paddlepaddle/Paddle#9941