MegEngine 天元 / MegEngine

v0.6.0

New Features

Add matidx support to the NHWC warp perspective operator.
Add a CUDA implementation of the remap operator.
Support building whl packages for iOS.
Quantized MegEngine models support Tensor Core acceleration.
Add replica_mode to Parameter to specify whether synchronization is required.
Add a local_grad parameter to the collective_comm operators.

Optimization

Continue to optimize NCHW44 performance on CPU, yielding a 5%-30% performance improvement on production (business-line) models.
Add more midout support to further reduce binary size.

Bug Fixes

Fix slow execution of megbrain when compiled with VS2015.
Fix an occasional memory allocation problem on the CPU side where free < total.
Fix a performance problem on ARM Linux caused by GCC failing to inline small functions.
Fix a calculation error in the CUDA warp perspective operator when batch * img_size exceeds INT_MAX.
Fix a calculation error in CUDA elemwise with int8 broadcast; NCHW4 models are not affected.
Fix the indexing calculation logic of the psroi_pooling operator.
Fix several errors in JIT gradient computation.
Fix compilation problems under GCC 7.
Fix some NCHW → NCHWxx (such as NCHW4) converter bugs.
Fix gradient computation problems for reduce and gather.
Fix failure to correctly load FBS-format models that contain multiple graphs (the internal MDL model format is not affected).
Fix a possible undefined reference problem in the warp perspective operator when midout is enabled.
Fix sensitive words introduced by enabling exceptions.
Fix incorrect contiguous ids generated when the categories in the annotations are out of order.
Fix incorrect destruction behavior of MGE_PLASMA_STORE_MANAGER when a process contains multiple dataloader instances.
Fix failure to load quantized int8 pkl models.
Fix the API description (msgstr) of nn.flatten. @ChaiMind
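
The warp perspective fix in the list above involves batch * img_size exceeding INT_MAX. As a rough illustration of that class of error only (this is not MegEngine's kernel code), the sketch below compares a flat offset computed in 32-bit signed arithmetic with the same offset computed in wider arithmetic. The function names and the image dimensions are hypothetical, and two's-complement wraparound is assumed; in C and C++ signed overflow is formally undefined behavior.

```python
INT_MAX = 2**31 - 1  # largest value of a 32-bit signed integer


def wrap_int32(value: int) -> int:
    """Reduce value modulo 2**32 and reinterpret it as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value > INT_MAX else value


def flat_offset_int32(batch_idx: int, img_size: int) -> int:
    """Offset as a 32-bit index computation would produce it (assumed wraparound)."""
    return wrap_int32(batch_idx * img_size)


def flat_offset_int64(batch_idx: int, img_size: int) -> int:
    """Offset computed without overflow; Python ints stand in for a 64-bit index."""
    return batch_idx * img_size


# Hypothetical image: 1920 x 1080 pixels, 3 channels.
img_size = 1920 * 1080 * 3

for batch_idx in (10, 400):
    narrow = flat_offset_int32(batch_idx, img_size)
    wide = flat_offset_int64(batch_idx, img_size)
    status = "ok" if narrow == wide else "overflow"
    print(batch_idx, narrow, wide, status)
```

With these assumed dimensions, the offset for batch index 400 is larger than INT_MAX, so the 32-bit result turns negative while the wide result stays correct. Performing the multiplication in a 64-bit type is the usual way to avoid this kind of error.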

Thanks to our Contributors

Many thanks to @ChaiMind for submitting a PR to this release. We look forward to more developers joining us to build MegEngine together!

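About the Project

MegEngine is a fast, scalable, easy-to-use deep learning framework with support for automatic differentiation.

This repository is a mirror of the GitHub repository. Upstream project:

https://github.com/MegEngine/MegEngine

Apache License 2.0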