1. 20 Mar, 2021 (1 commit)
  2. 09 Feb, 2021 (1 commit)
  3. 04 Feb, 2021 (1 commit)
  4. 18 Nov, 2020 (1 commit)
  5. 26 May, 2020 (1 commit)
  6. 03 Mar, 2020 (2 commits)
  7. 14 Jan, 2020 (1 commit)
  8. 02 Dec, 2019 (1 commit)
  9. 21 Oct, 2019 (1 commit)
    • Merge pull request #14827 from YashasSamaga:cuda4dnn-csl-low · 613c12e5
      Committed by Yashas Samaga B L
      CUDA backend for the DNN module
      
      * stub cuda4dnn design
      
      * minor fixes for tests and doxygen
      
      * add csl public api directory to module headers
      
      * add low-level CSL components
      
      * add high-level CSL components
      
      * integrate csl::Tensor into backbone code
      
      * switch to CPU iff unsupported; otherwise, fail on error
      
      * add fully connected layer
      
      * add softmax layer
      
      * add activation layers
      
      * support arbitrary rank TensorDescriptor
      
      * pass input wrappers to `initCUDA()`
      
      * add 1d/2d/3d-convolution
      
      * add pooling layer
      
      * reorganize and refactor code
      
      * fixes for gcc, clang and doxygen; remove cxx14/17 code
      
      * add blank_layer
      
      * add LRN layer
      
      * add rounding modes for pooling layer
      
      * split tensor.hpp into tensor.hpp and tensor_ops.hpp
      
      * add concat layer
      
      * add scale layer
      
      * add batch normalization layer
      
      * split math.cu into activations.cu and math.hpp
      
      * add eltwise layer
      
      * add flatten layer
      
      * add tensor transform api
      
      * add asymmetric padding support for convolution layer
      
      * add reshape layer
      
      * fix rebase issues
      
      * add permute layer
      
      * add padding support for concat layer
      
      * refactor and reorganize code
      
      * add normalize layer
      
      * optimize bias addition in scale layer
      
      * add prior box layer
      
      * fix and optimize normalize layer
      
      * add asymmetric padding support for pooling layer
      
      * add event API
      
      * improve pooling performance for some padding scenarios
      
      * avoid over-allocation of compute resources to kernels
      
      * improve prior box performance
      
      * enable layer fusion
      
      * add const layer
      
      * add resize layer
      
      * add slice layer
      
      * add padding layer
      
      * add deconvolution layer
      
      * fix channelwise ReLU initialization
      
      * add vector traits
      
      * add vectorized versions of relu, clipped_relu, power
      
      * add vectorized concat kernels
      
      * improve concat_with_offsets performance
      
      * vectorize scale and bias kernels
      
      * add support for multi-billion element tensors
      
      * vectorize prior box kernels
      
      * fix address alignment check
      
      * improve bias addition performance of conv/deconv/fc layers
      
      * restructure code for supporting multiple targets
      
      * add DNN_TARGET_CUDA_FP64
      
      * add DNN_TARGET_FP16
      
      * improve vectorization
      
      * add region layer
      
      * improve tensor API, add dynamic ranks
      
      1. use ManagedPtr instead of a Tensor in backend wrapper
      2. add new methods to tensor classes
        - size_range: computes the combined size for a given axis range
        - tensor span/view can be constructed from a raw pointer and shape
      3. the tensor classes can change their rank at runtime (previously rank was fixed at compile-time)
      4. remove device code from tensor classes (as it is unused)
      5. enforce strict conditions on tensor class APIs to improve debugging ability
      
      * fix parametric relu activation
      
      * add squeeze/unsqueeze tensor API
      
      * add reorg layer
      
      * optimize permute and enable 2d permute
      
      * enable 1d and 2d slice
      
      * add split layer
      
      * add shuffle channel layer
      
      * allow tensors of different ranks in reshape primitive
      
      * patch SliceOp to allow Crop Layer
      
      * allow extra shape inputs in reshape layer
      
      * use `std::move_backward` instead of `std::move` for insert in resizable_static_array
      
      * improve workspace management
      
      * add spatial LRN
      
      * add nms (cpu) to region layer
      
      * add max pooling with argmax (and a fix to limits.hpp)
      
      * add max unpooling layer
      
      * rename DNN_TARGET_CUDA_FP32 to DNN_TARGET_CUDA
      
      * update supportBackend to be more rigorous
      
      * remove stray include that was breaking the non-CUDA build
      
      * include op_cuda.hpp outside the #if condition
      
      * refactoring, fixes and many optimizations
      
      * drop DNN_TARGET_CUDA_FP64
      
      * fix gcc errors
      
      * increase max. tensor rank limit to six
      
      * add Interp layer
      
      * drop custom layers; use BackendNode
      
      * vectorize activation kernels
      
      * fixes for gcc
      
      * remove wrong assertion
      
      * fix broken assertion in unpooling primitive
      
      * fix build errors in non-CUDA build
      
      * completely remove workspace from public API
      
      * fix permute layer
      
      * enable accuracy and perf. tests for DNN_TARGET_CUDA
      
      * add asynchronous forward
      
      * vectorize eltwise ops
      
      * vectorize fill kernel
      
      * fixes for gcc
      
      * remove CSL headers from public API
      
      * remove csl header source group from cmake
      
      * update min. cudnn version in cmake
      
      * add numerically stable FP32 log1pexp
      
      * refactor code
      
      * add FP16 specialization to cudnn based tensor addition
      
      * vectorize scale1 and bias1 + minor refactoring
      
      * fix doxygen build
      
      * fix invalid alignment assertion
      
      * clear backend wrappers before allocateLayers
      
      * ignore memory lock failures
      
      * do not allocate internal blobs
      
      * integrate NVTX
      
      * add numerically stable half precision log1pexp
      
      * fix indentation, follow coding style, improve docs
      
      * remove accidental modification of IE code
      
      * Revert "add asynchronous forward"
      
      This reverts commit 1154b9da9da07e9b52f8a81bdcea48cf31c56f70.
      
      * [cmake] throw error for unsupported CC versions
      
      * fix rebase issues
      
      * add more docs, refactor code, fix bugs
      
      * minor refactoring and fixes
      
      * resolve warnings/errors from clang
      
      * remove haveCUDA() checks from supportBackend()
      
      * remove NVTX integration
      
      * changes based on review comments
      
      * avoid exception when no CUDA device is present
      
      * add color code for CUDA in Net::dump
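      Once merged, this backend is selected through the standard DNN API. A
      minimal sketch (model and image file names are placeholders):

          #include <opencv2/dnn.hpp>
          #include <opencv2/imgcodecs.hpp>

          int main()
          {
              // Any model readable by cv::dnn::readNet; the file name is a placeholder.
              cv::dnn::Net net = cv::dnn::readNet("model.onnx");

              // Route inference through the CUDA backend added by this PR.
              // DNN_TARGET_CUDA runs FP32 kernels; DNN_TARGET_CUDA_FP16 selects the
              // half-precision path (DNN_TARGET_CUDA_FP64 was dropped before merging).
              net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
              net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA);

              net.setInput(cv::dnn::blobFromImage(cv::imread("image.jpg")));
              cv::Mat out = net.forward();
              return 0;
          }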
  10. 07 Aug, 2019 (1 commit)
    • Merge pull request #15184 from l-bat:IE_R2 · 0e1ef8f8
      Committed by Lubov Batanina
      Support new IE API (#15184)
      
      * Add support OpenVINO R2 for layers
      
      * Add Core API
      
      * Fix tests
      
      * Fix expectNoFallbacksFromIE for ONNX nets
      
      * Remove deprecated API
      
      * Remove td
      
      * Remove TargetDevice
      
      * Fix Async
      
      * Add test
      
      * Fix detectMyriadX
      
      * Fix test
      
      * Fix warning
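      The updated Core API is reached through the same backend selectors; a
      minimal sketch, with placeholder IR model files:

          #include <opencv2/dnn.hpp>

          int main()
          {
              // IR files produced by OpenVINO's Model Optimizer (placeholders).
              cv::dnn::Net net = cv::dnn::readNet("model.xml", "model.bin");

              // Use the Inference Engine backend this PR updates; the target can be
              // DNN_TARGET_CPU, DNN_TARGET_OPENCL, or DNN_TARGET_MYRIAD, among others.
              net.setPreferableBackend(cv::dnn::DNN_BACKEND_INFERENCE_ENGINE);
              net.setPreferableTarget(cv::dnn::DNN_TARGET_CPU);
              return 0;
          }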
  11. 14 Jun, 2019 (1 commit)
  12. 16 Apr, 2019 (1 commit)
  13. 03 Apr, 2019 (1 commit)
  14. 19 Feb, 2019 (1 commit)
  15. 14 Feb, 2019 (1 commit)
  16. 12 Feb, 2019 (1 commit)
  17. 11 Feb, 2019 (1 commit)
  18. 07 Feb, 2019 (1 commit)
  19. 17 Jan, 2019 (1 commit)
  20. 26 Sep, 2018 (1 commit)
  21. 06 Sep, 2018 (1 commit)
  22. 21 Aug, 2018 (1 commit)
  23. 14 Aug, 2018 (1 commit)
  24. 13 Aug, 2018 (1 commit)
  25. 24 Jul, 2018 (1 commit)
  26. 04 Jun, 2018 (1 commit)
  27. 23 May, 2018 (1 commit)
  28. 16 May, 2018 (1 commit)
  29. 12 Apr, 2018 (1 commit)
  30. 10 Apr, 2018 (1 commit)
  31. 28 Mar, 2018 (1 commit)
  32. 22 Feb, 2018 (1 commit)
  33. 05 Jan, 2018 (1 commit)
  34. 09 Nov, 2017 (1 commit)
  35. 11 Oct, 2017 (1 commit)
  36. 28 Jun, 2017 (2 commits)
    • dnn: added trace macros · ed103833
      Committed by Alexander Alekhin
    • another round of dnn optimization (#9011) · 8b3d6603
      Committed by Vadim Pisarevsky
      * another round of dnn optimization:
      * increased malloc alignment across OpenCV from 16 to 64 bytes to make it AVX2 and even AVX-512 friendly
      * improved SIMD optimization of pooling layer, optimized average pooling
      * cleaned up convolution layer implementation
      * made activation layers "attachable" to all other layers, including the fully connected and addition layers.
      * fixed a bug in the fusion algorithm: "LayerData::consumers" should not be cleared, because it describes the topology.
      * greatly optimized permutation layer, which improved SSD performance
      * parallelized element-wise binary/ternary/... ops (sum, prod, max)
      
      * also, added missing copyrights to many of the layer implementation files
      
      * temporarily disabled (again) the check for intermediate blob consistency; fixed warnings from various builders
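      The 64-byte alignment in the first bullet matches one cache line and the
      width of an AVX-512 register. OpenCV applies it inside its own
      fastMalloc/fastFree wrappers; a standalone sketch of the same idea using
      standard C++17 aligned allocation:

          #include <cstdlib>

          int main()
          {
              // 64-byte alignment keeps buffers AVX2/AVX-512 friendly.
              // std::aligned_alloc requires the size to be a multiple of the alignment.
              float* buf = static_cast<float*>(std::aligned_alloc(64, 1024 * sizeof(float)));
              if (!buf) return 1;

              // ... SIMD kernels can now use aligned loads/stores on buf ...

              std::free(buf);
              return 0;
          }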
  37. 26 Jun, 2017 (1 commit)