- 11 Mar, 2020 6 commits
-
-
Authored by Tunji Ruwase
Merge branch 'olruwase/scheduler_optimizer_bug' of github.com:microsoft/DeepSpeed into olruwase/scheduler_optimizer_bug
-
Authored by Olatunji Ruwase
-
Authored by Tunji Ruwase
Merge branch 'olruwase/scheduler_optimizer_bug' of github.com:microsoft/DeepSpeed into olruwase/scheduler_optimizer_bug
-
Authored by Tunji Ruwase
-
Authored by Shaden Smith
-
Authored by Olatunji Ruwase
-
- 10 Mar, 2020 4 commits
-
-
Authored by Cola
-
Authored by Olatunji Ruwase
-
Authored by Tunji Ruwase
-
Authored by Tunji Ruwase
-
- 09 Mar, 2020 1 commit
-
-
Authored by Incomplete
* Add --no_sudo to run without sudo
* Add --pip_mirror to set the pip mirror
* Default to running pip without sudo
* Typo
* Add --pip_sudo to Dockerfile and azure-pipelines.yml
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 07 Mar, 2020 1 commit
-
-
Authored by Olatunji Ruwase
-
- 04 Mar, 2020 2 commits
-
-
Authored by Jeff Rasley
-
Authored by Jeff Rasley
* add support for deepspeed env file to pass custom env values
* simplify deepspeed config example
-
- 28 Feb, 2020 2 commits
-
-
Authored by Jeff Rasley
-
Authored by Jeff Rasley
* add text about mpirun
-
- 27 Feb, 2020 3 commits
-
-
Authored by Jeff Rasley
* add mpirun support for openmpi 4.0
* add master addr support from args
* switch mpi detection to use mpi4py
* set constant for default distributed port
* Make sure deepspeed_mpi exists in args
-
Authored by Jeff Rasley
-
Authored by Jeff Rasley
* add auto-detect to torch dist init
* update tests to infer distributed init status
* prevent crash if dist_init_required is True but already initialized
* only init if safe to do so (forgot to add this file in prev commit)
-
- 26 Feb, 2020 1 commit
-
-
Authored by zenlytix
* Update scripts to handle cases where you have other VMs in your sub
* Support subs with other VMs and fix for PDSH permission error
* Minor fix to support subs with other VMs
* Added shutdown with or without delete VM option
In Azure, deallocating a VM is like shutting the machine down (and it prevents billing). You can restart a deallocated VM. To fully drop the VM, delete is used. With the "-d" option, this command fully deletes the VM. Without any argument, it just deallocates (shuts down) the VM.
-
- 25 Feb, 2020 3 commits
-
-
Authored by zenlytix
* Update scripts to handle cases where you have other VMs in your sub
* Support subs with other VMs and fix for PDSH permission error
* Minor fix to support subs with other VMs
-
Authored by Jeff Rasley
-
Authored by Shaden Smith
-
- 24 Feb, 2020 1 commit
-
-
Authored by Shaden Smith
* Removes DeepSpeedDataSource
* dropping unused imports
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 22 Feb, 2020 1 commit
-
-
Authored by Olatunji Ruwase
* Support legacy optimizer fusion as config option
* Configure for legacy optimizer fusion
* Update configuration jsons for new apex
-
- 21 Feb, 2020 1 commit
-
-
Authored by Jeff Rasley
Also a fix for #94
-
- 20 Feb, 2020 2 commits
-
-
Authored by Jeff Rasley
Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com>
-
Authored by Shaden Smith
-
- 15 Feb, 2020 3 commits
-
-
Authored by kouml
* add install requirements command line
* add pillow library to fix version
* modify to uppercase
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Authored by Jeff Rasley
bug fixes for adamw/lamb and corresponding tests
-
Authored by Shaden Smith
* Porting BingBertSquad test
* Updating default paths.
* Enable model tests.
* Updating DeepSpeedExamples submodule
* Adding BingBertSquad's log uploads.
* Messed up the submodule again :-)
-
- 14 Feb, 2020 1 commit
-
-
Authored by Jeff Rasley
* Set up CI with Azure Pipelines for docker build/push
-
- 13 Feb, 2020 4 commits
-
-
Authored by Rahul Prasad
-
Authored by Jeff Rasley
* bump tf version in dockerfile
* Update install.sh
-
Authored by Shaden Smith
-
Authored by eltonzheng
-
- 11 Feb, 2020 4 commits
-
-
Authored by Gaurav Menghani
* Fix broken link for the 1Cycle doc.
* Removed the 1Cycle link from README.md.
-
Authored by Shaden Smith
-
Authored by Shaden Smith
-
Authored by Shaden Smith
* Importing 1Cycle tutorial.
* image paths
* Added LR schedule figure
* line wrap
* lowercase name
* Updating README links
* typo
-