[Add Building with Intel oneAPI (#4920)](https://gitcode.net/wjd2002/ncnn/-/commit/4c861a0d1a4569c0a8d7d14e9e163e71686e0745) (2023-08-06T21:41:12+08:00, mizu-bai <shiragawa4519@outlook.com>)

[Add POWER8 VSX toolchains (#4853)](https://gitcode.net/wjd2002/ncnn/-/commit/0a8cf31a0583026f115e243dcced1fe901cdbbe3) (2023-08-06T22:16:34+08:00, JeremyRand <244188+JeremyRand@users.noreply.github.com>)
* Add POWER8 VSX toolchains
POWER8, though slower than POWER9, is still used in the wild; on POWER8 hardware, these toolchains should still be much faster than building without VSX optimizations.
* VSX toolchains: set -cpu arg in QEMU CI tests

[fix pnnx ghost reshape shape expression inputs, fix intmax overflow on fuse/e...](https://gitcode.net/wjd2002/ncnn/-/commit/60fedae38b2eeab557a846e4bdcb30b697778dae) (2023-08-07T17:28:15+08:00, nihui <nihuini@tencent.com>)

[pnnx fuse expression for scalar-like attribute and unbind chain (#4928)](https://gitcode.net/wjd2002/ncnn/-/commit/285d0793d402763556f7f412dd1f0936a689b587) (2023-08-10T14:24:24+08:00, nihui <nihuini@tencent.com>)

[binaryop implicit broadcast B with 1 dimension rank for outer axis (#4930)](https://gitcode.net/wjd2002/ncnn/-/commit/4abadd2ffb75bf209ded4254771be077fc1847b6) (2023-08-10T21:29:49+08:00, nihui <nihuini@tencent.com>)

[Bump pypa/cibuildwheel from 2.13.1 to 2.15.0 (#4926)](https://gitcode.net/wjd2002/ncnn/-/commit/ffe1510c2f90134628a9751f600d59c10a98682d) (2023-08-11T11:11:21+08:00, dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>)
Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.13.1 to 2.15.0.
- [Release notes](https://github.com/pypa/cibuildwheel/releases)
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md)
- [Commits](https://github.com/pypa/cibuildwheel/compare/v2.13.1...v2.15.0)
---
updated-dependencies:
- dependency-name: pypa/cibuildwheel
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[feat(benchmark/benchncnn.cpp): support user defined case (#4782)](https://gitcode.net/wjd2002/ncnn/-/commit/a24787b32b32acb2d6d365a6bdd8426d92ad74d0) (2023-08-11T11:17:24+08:00, tpoisonooo <khj.application@aliyun.com>)

[Support mac platform static library compilation (#4859)](https://gitcode.net/wjd2002/ncnn/-/commit/75e10c6e6157b9c632199a22ae4951a292aa725d) (2023-08-11T11:19:18+08:00, 佰阅 <43716063+Baiyuetribe@users.noreply.github.com>)

[prefer faster and larger device local only memory on amd integrated graphics, heap budget value follows the same strategy as blob allocator (#4936)](https://gitcode.net/wjd2002/ncnn/-/commit/e80fcbca8f67cf107beebb4dd0333856879dc6fa) (2023-08-12T19:43:30+08:00, nihui <nihuini@tencent.com>)
[support torch.t to ncnn (#4940)](https://gitcode.net/wjd2002/ncnn/-/commit/070a6d40f27525427dd1c12153019a21f8fe9ac4) (2023-08-14T15:46:31+08:00, WXB <64680548+XiaBing992@users.noreply.github.com>)

[Add logxxx to log comp xxx rewriter where xxx = sigmoid or softmax (#4925)](https://gitcode.net/wjd2002/ncnn/-/commit/fed3b43c730d3ef6154beef1f1bb1c8f8fc68de1) (2023-08-15T17:21:39+08:00, lrw0424 <28592483@qq.com>)
* Add logxxx to log comp xxx rewriter
* Use pattern matching for LogSigmoid and LogSoftmax
* Add conversion passes for functional counterparts
* Update documentation

[pnnx convert torch maximum minimum and torch max min as expression (#4944)](https://gitcode.net/wjd2002/ncnn/-/commit/93e395dc4b8f24b30d64e0ac08448223df10ce1b) (2023-08-15T17:22:44+08:00, nihui <nihuini@tencent.com>)
* reset device check dtype kind int
* placeholder for ncnn sign
* convert torch maximum minimum
* torch.max as expression
* torch.min as expression

[update python ci version (#4946)](https://gitcode.net/wjd2002/ncnn/-/commit/00da9251b1d986ca0d20a34efd7a79b897731ff0) (2023-08-15T23:12:50+08:00, nihui <nihuini@tencent.com>)

[\[docs\] Clean comments and prints when find vulkan (#4948)](https://gitcode.net/wjd2002/ncnn/-/commit/cbd838f670c94a589f60820e1cde0dc0af38bbb3) (2023-08-15T23:43:49+08:00, Zhuo Zhang <imzhuo@foxmail.com>)

[require c++17 for building with new protobuf (#4947)](https://gitcode.net/wjd2002/ncnn/-/commit/39721eeb9400e33f4708a36f6eb8f61e2ad3d53c) (2023-08-16T11:48:56+08:00, nihui <nihuini@tencent.com>)

[fix _mm512_i32gather_epi32 and other scatter/gather routines have incorrect s...](https://gitcode.net/wjd2002/ncnn/-/commit/6b657a39cbee172a17b7ca8d66171197a17fd611) (2023-08-19T22:56:19+08:00, 青菜萝卜冬瓜 <i@mail.chainsx.cn>)

[fix build with toolchain defined _L _U constants (#4957)](https://gitcode.net/wjd2002/ncnn/-/commit/cb674ac5eddb32f0709a60c81f71d2cbc6bc89da) (2023-08-21T10:48:45+08:00, nihui <nihuini@tencent.com>)
**[how to build ncnn library](https://github.com/Tencent/ncnn/wiki/how-to-build) on Linux / Windows / macOS / Raspberry Pi3, Pi4 / Android / NVIDIA Jetson / iOS / WebAssembly / AllWinner D1 / Loongson 2K1000**
**[how to build ncnn library](https://github.com/Tencent/ncnn/wiki/how-to-build) on Linux / Windows / macOS / Raspberry Pi3, Pi4 / POWER / Android / NVIDIA Jetson / iOS / WebAssembly / AllWinner D1 / Loongson 2K1000**
-[Build for Linux / NVIDIA Jetson / Raspberry Pi3, Pi4 / POWER9](https://github.com/Tencent/ncnn/wiki/how-to-build#build-for-linux)
-[Build for Linux / NVIDIA Jetson / Raspberry Pi3, Pi4 / POWER](https://github.com/Tencent/ncnn/wiki/how-to-build#build-for-linux)
-[Build for Windows x64 using VS2017](https://github.com/Tencent/ncnn/wiki/how-to-build#build-for-windows-x64-using-visual-studio-community-2017)
-[Build for macOS](https://github.com/Tencent/ncnn/wiki/how-to-build#build-for-macos)
-[Build for ARM Cortex-A family with cross-compiling](https://github.com/Tencent/ncnn/wiki/how-to-build#build-for-arm-cortex-a-family-with-cross-compiling)
-[Build for Windows x64 using Visual Studio Community 2017](#build-for-windows-x64-using-visual-studio-community-2017)
-[Build for macOS](#build-for-macos)
...
...
@@ -88,9 +89,9 @@ You can add `-GNinja` to `cmake` above to use Ninja build system (invoke build u
For Raspberry Pi 3 on a 32-bit OS, add `-DCMAKE_TOOLCHAIN_FILE=../toolchains/pi3.toolchain.cmake` to cmake. You can also consider disabling Vulkan support, as the Vulkan drivers for Raspberry Pi are still not mature; that said, it doesn't hurt to build the support in and simply not use it.
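As a sketch, a Raspberry Pi 3 configure-and-build sequence using that toolchain file might look like the following (the `-DNCNN_VULKAN=OFF` flag is an assumption here, shown only to illustrate opting out of Vulkan support):

```shell
cd ncnn
mkdir -p build && cd build
# Cross/native toolchain file for Pi 3 on 32-bit OS; Vulkan left out as discussed above
cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/pi3.toolchain.cmake -DNCNN_VULKAN=OFF ..
make -j$(nproc)
```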
#### POWER9
#### POWER
With Clang 13 or higher:
For POWER9 with Clang 13 or higher:
```shell
cd ncnn
...
...
@@ -102,7 +103,17 @@ make -j$(nproc)
Earlier versions of Clang may fail to build ncnn due to [Bug 49864](https://github.com/llvm/llvm-project/issues/49864). To use GCC instead, switch to the `power9le-linux-gnu-vsx.toolchain.cmake` toolchain file. Note that, according to benchmarks, Clang produces noticeably faster CPU inference than GCC for POWER9 targets.
Note that the POWER9 toolchain files only support little-endian mode.
To target POWER8 instead of POWER9, use the `power8le-linux-gnu-vsx.clang.toolchain.cmake` or `power8le-linux-gnu-vsx.toolchain.cmake` toolchain file. POWER8 will be slower than POWER9.
Note that the POWER toolchain files only support little-endian mode.
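To make the toolchain-file choice concrete, here is a minimal sketch: the POWER9 and POWER8 builds differ only in which of the toolchain files named above is passed to cmake.

```shell
cd ncnn
mkdir -p build && cd build
# GCC build for POWER9, using the toolchain file mentioned above
cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/power9le-linux-gnu-vsx.toolchain.cmake ..
# For POWER8 with Clang, substitute the toolchain file:
#   -DCMAKE_TOOLCHAIN_FILE=../toolchains/power8le-linux-gnu-vsx.clang.toolchain.cmake
make -j$(nproc)
```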
#### Intel oneAPI
Besides the prerequisites in this section, the Intel oneAPI Base Toolkit and HPC Toolkit should be installed. Both are freely available from https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit.html and https://www.intel.com/content/www/us/en/developer/tools/oneapi/hpc-toolkit.html.
Intel oneAPI offers two compiler families: the classic `icc/icpc` and the LLVM-based `icx/icpx`. To build with either, set `CC=icc CXX=icpc` or `CC=icx CXX=icpx` before the `cmake` command. When compiling with `icc/icpc`, cmake will warn that the `xop`, `avx512`, and `bf16` extensions are not supported by the compiler; `icx/icpx` handles them without issue.
Both compiler families have been tested and pass the ncnn benchmark successfully; the results are included in the ncnn benchmark readme. Generally, `icx/icpx` is likely to deliver better performance than `icc/icpc`, and quantized models can benefit from the extensions that `icx/icpx` supports.
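For illustration, selecting the LLVM-based compilers could look like this; the `/opt/intel/oneapi/setvars.sh` path is the default oneAPI install location and may differ on your system:

```shell
# Load the oneAPI compiler environment (adjust the prefix if installed elsewhere)
source /opt/intel/oneapi/setvars.sh
cd ncnn
mkdir -p build && cd build
# Pick icx/icpx; use CC=icc CXX=icpc instead for the classic compilers
CC=icx CXX=icpx cmake ..
make -j$(nproc)
```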