MegEngine 天元 / MegEngine
Commit 5257991e
Authored Jun 19, 2020 by Megvii Engine Team
Committed Jun 22, 2020 by Xu Xinran
fix(jit): fix jit doc and add NCHW44_DOT
GitOrigin-RevId: 5f5feae8e727dd111615022f2a21c7ede647156a
Parent: cdf25c4a
Showing 1 changed file with 32 additions and 10 deletions
python_module/megengine/jit/__init__.py (+32, -10)
@@ -442,17 +442,38 @@ class trace:
         Serialize trace to file system.
         :param fpath: positional only argument. Path of output file.
-        :param arg_names: names of the input tensors in the traced function
-        :param append: whether output is appended to ``fpath``
-        :param f16_io_f32_comp: whether to use float16 for I/O between oprs and use
+        :param arg_names: names of the input tensors in the traced function.
+        :param append: whether output is appended to ``fpath``.
+        :param optimize_for_inference: whether to enable optimize_for_inference
+            pass before dump.
+        :param enable_io16xc32: whether to use float16 for I/O between oprs and use
             float32 as internal computation precision. Note the output var would be
-            changed to float16
-        :param f16_io_comp: whether to use float16 for both I/O and computation
-            precision
-        :param use_nhwcd4: whether to use NHWCD4 data format. This is faster on some
-            OpenCL devices
-        :param fuse_conv_bias_nonlinearity: whether to fuse conv+bias+nonlinearty
-            into one opr. This is supported only in NHWCD4 format.
+            changed to float16.
+        :param enable_ioc16: whether to use float16 for both I/O and computation
+            precision.
+        :param enable_hwcd4: whether to use NHWCD4 data layout. This is faster on some
+            OpenCL backend.
+        :param enable_nchw88: whether to use NCHW4 data layout. it currently
+            used in X86 AVX backend.
+        :param enable_nchw44: whether to use NCHW4 data layout. it currently
+            used in arm backend.
+        :param enable_nchw44_dot: whether to use NCHW4 data layout. it currently
+            used in armv8.2+dotprod backend.
+        :param enable_nchw4: whether to use NCHW4 data layout. it currently
+            used in nvidia backend(based on cudnn).
+        :param enable_nchw32 whether to use NCHW32 data layout. it currently
+            used in nvidia backend with tensorcore(based on cudnn).
+        :param enable_chwn4 whether to use CHWN4 data layout. it currently
+            used in nvidia backend with tensorcore.
+        :param enable_fuse_conv_bias_nonlinearity: whether to fuse conv+bias+nonlinearty
+            into one opr.
+        :param enable_fuse_conv_bias_with_z: whether to fuse conv_bias with z
+            input for inference on nvidia backend(this optimization pass will
+            result in mismatch of the precision of output of training and
+            inference)
         """
         if self._status != self._FINISHED:
             raise ValueError("not traced")
@@ -475,6 +496,7 @@ class trace:
             "enable_nchw88": "use_nchw88",
             "enable_nchw32": "use_nchw32",
             "enable_nchw44": "use_nchw44",
+            "enable_nchw44_dot": "use_nchw44_dot",
             "enable_chwn4": "use_chwn4",
             "enable_fuse_conv_bias_nonlinearity": "fuse_conv_bias_nonlinearity",
             "enable_fuse_conv_bias_with_z": "fuse_conv_bias_with_z",
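
The second hunk extends a table that pairs each new `enable_*` keyword with an older `use_*` / `fuse_*` name. How `trace.dump` consumes that table is not visible in this diff, so the following is only a minimal sketch of one way such a rename table could be applied. The names `ENABLE_TO_LEGACY` and `translate_options` are hypothetical and do not come from MegEngine, and the table holds only the seven entries shown in the hunk (the entries elided above and below the hunk are left out).

```python
# Hypothetical sketch: ENABLE_TO_LEGACY and translate_options are illustrative
# names, not MegEngine APIs. Only the entries visible in the hunk are listed.
ENABLE_TO_LEGACY = {
    "enable_nchw88": "use_nchw88",
    "enable_nchw32": "use_nchw32",
    "enable_nchw44": "use_nchw44",
    "enable_nchw44_dot": "use_nchw44_dot",  # the entry added by this commit
    "enable_chwn4": "use_chwn4",
    "enable_fuse_conv_bias_nonlinearity": "fuse_conv_bias_nonlinearity",
    "enable_fuse_conv_bias_with_z": "fuse_conv_bias_with_z",
}


def translate_options(**kwargs):
    """Map truthy enable_* flags to their older names; reject unknown keys."""
    unknown = set(kwargs) - set(ENABLE_TO_LEGACY)
    if unknown:
        raise ValueError("unknown options: {}".format(sorted(unknown)))
    # Flags passed as False (or any falsy value) are simply dropped.
    return {ENABLE_TO_LEGACY[key]: True for key, value in kwargs.items() if value}


print(translate_options(enable_nchw44_dot=True))  # {'use_nchw44_dot': True}
```

Rejecting unknown keys up front is a design choice of this sketch: a misspelled flag fails loudly instead of being silently ignored at dump time.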