新功能
- IndexingMultiAxisVec添加bool类型支持
- 添加windows和macOS打包功能
- 添加adaptive pooling算子
- mlir jit添加llvm-lit支持
- 添加weights preprocess开关控制是否在inference阶段缓存weights 预处理后的结果提升性能
- 添加MGB_USE_ATLAS_ASYNC_API宏控制开启异步API调用
- 按用户使用习惯 更新 dtype promotion的规则
- 在broadcast增加参数检查
- device的__repr__方法增加物理单元信息
问题修复
- 修复cpuinfo在arm linux下的编译warning
- 修复Imperative Runtime在出错情况下由于没有正确set exception导致卡住的问题
- 修复dump_with_testcase脚本在打开--output-strip-info选项时如果文件不存在会crash的问题
- 修复NCHW->NCHW4的pass对float类型的处理
- 在float转io16c32的pass中添加对deconv的处理
- 由于xcode的问题,在ios下关闭thread_local的支持
- 修复多机训练中,ParamPackSplit出现的refcnt计数问题
- 修复多线程下多个模型使用同一个compnode且开启record功能导致出错的问题
- 修复NCHW→NCHWxx的pass在处理conv_bias 且bias为空情况下的问题
- 修复jit.trace发生错误后使得后续trace完全不可用的问题
- 修复bool.sum()
- 修复graph binding 错误处理导致graph被错误回收的问题
- 修复jit.trace 对topk/warp/nms等op的处理
- 修复LocalConv2d算子对group的支持
- 修复 dump 中使用 optimize_for_inference 时的bug
- 修复NMSKeep、topk 、warp_perspective 被 trace 时的 bug
兼容性破坏
- 调整部分Function API的命名、参数或 import 路径,删除重复API
New Features
- Add bool dtype support for IndexingMultiAxisVec
- Add Windows and macOS packaging support
- Add an adaptive pooling operator
- Add llvm-lit support for the MLIR JIT
- Add a weights preprocess switch that controls whether the results of weights preprocessing are cached during inference to improve performance
- Add the MGB_USE_ATLAS_ASYNC_API macro to control whether asynchronous API calls are enabled
- Update the dtype promotion rules to match common user habits
- Add parameter checks to broadcast
- Add physical unit information to the device's __repr__ output
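
For readers unfamiliar with adaptive pooling, the following is a minimal pure-Python sketch of the general idea in one dimension: the caller chooses the output length and the window boundaries are derived from it. The function name `adaptive_avg_pool1d` and the floor/ceil window convention are assumptions chosen for illustration; they are not taken from this release and may differ from how the operator is implemented here.

```python
def adaptive_avg_pool1d(values, output_size):
    """Average-pool `values` down to exactly `output_size` elements.

    Hypothetical helper for illustration only. Window i covers the
    half-open index range [floor(i*n/out), ceil((i+1)*n/out)).
    """
    n = len(values)
    if n == 0 or output_size <= 0:
        raise ValueError("need a non-empty input and a positive output size")

    pooled = []
    for i in range(output_size):
        start = (i * n) // output_size               # floor division
        end = -((-(i + 1) * n) // output_size)       # ceiling division
        window = values[start:end]                   # never empty by construction
        pooled.append(sum(window) / len(window))
    return pooled


print(adaptive_avg_pool1d([1, 2, 3, 4, 5, 6], 3))  # [1.5, 3.5, 5.5]
print(adaptive_avg_pool1d([1, 2, 3, 4, 5], 2))     # [2.0, 4.0]
```

When the input length is not a multiple of the output length, neighbouring windows overlap (the second call above uses index 2 in both windows), which is what lets the output size be fixed independently of the input size.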
Bug Fixes
- Fix a cpuinfo compile warning on ARM Linux
- Fix a hang in the Imperative Runtime caused by the exception not being set correctly when an error occurs
- Fix a crash in the dump_with_testcase script when --output-strip-info is enabled and the file does not exist
- Fix the handling of float types in the NCHW → NCHW4 pass
- Handle deconv in the float to io16c32 pass
- Disable thread_local support on iOS because of an Xcode issue
- Fix a refcnt counting problem in ParamPackSplit during multi-machine training
- Fix an error that occurred when multiple models shared the same compnode with the record feature enabled under multithreading
- Fix the NCHW → NCHWxx pass when handling conv_bias with an empty bias
- Fix an issue where an error in jit.trace made all subsequent traces unusable
- Fix bool.sum()
- Fix graph binding error handling that caused the graph to be reclaimed incorrectly
- Fix jit.trace handling of ops such as topk, warp, and nms
- Fix group support in the LocalConv2d operator
- Fix a bug when using optimize_for_inference in dump
- Fix bugs in NMSKeep, topk, and warp_perspective when they are traced
Breaking Changes
- Adjust the names, parameters, or import paths of some functional APIs; delete duplicated APIs