Unverified commit 59c3b281, authored by Ali Jahani, committed by GitHub

GPU-Support: Mask-RCNN + Minor GPU fixes (#2714)

* fixed cpu mask rcnn+preparation for gpu
* fix-limit gpu memory to 30% of total memory per worker
Co-authored-by: Nikita Manovich <nikita.manovich@intel.com>
Parent daedff42
......@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Added
- CVAT-3D: support lidar data on the server side (<https://github.com/openvinotoolkit/cvat/pull/2534>)
- GPU support for Mask-RCNN and improvement in its deployment time (<https://github.com/openvinotoolkit/cvat/pull/2714>)
- CVAT-3D: Load all frames corresponding to the job instance
(<https://github.com/openvinotoolkit/cvat/pull/2645>)
- Intelligent scissors with OpenCV javascript (<https://github.com/openvinotoolkit/cvat/pull/2689>)
......@@ -23,7 +24,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Updated HTTPS install README section (cleanup and described more robust deploy)
- Logstash is improved for using with configurable elasticsearch outputs (<https://github.com/openvinotoolkit/cvat/pull/2531>)
- Bumped nuclio version to 1.5.16
- Bumped nuclio version to 1.5.16 (<https://github.com/openvinotoolkit/cvat/pull/2578>)
- All methods for interactive segmentation accept negative points as well
- Persistent queue added to logstash (<https://github.com/openvinotoolkit/cvat/pull/2744>)
......@@ -36,7 +37,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
-
### Fixed
- More robust execution of nuclio GPU functions by limiting the GPU memory consumption per worker (<https://github.com/openvinotoolkit/cvat/pull/2714>)
- Kibana startup initialization (<https://github.com/openvinotoolkit/cvat/pull/2659>)
- The cursor jumps to the end of the line when renaming a task (<https://github.com/openvinotoolkit/cvat/pull/2669>)
- SSLCertVerificationError when remote source is used (<https://github.com/openvinotoolkit/cvat/pull/2683>)
......
......@@ -122,10 +122,10 @@ You develop CVAT under WSL (Windows subsystem for Linux) following next steps.
### DL models as serverless functions
Install [nuclio platform](https://github.com/nuclio/nuclio):
Follow this [guide](/cvat/apps/documentation/installation_automatic_annotation.md) to install Nuclio:
- You have to install `nuctl` command line tool to build and deploy serverless
functions. Download [the latest release](https://github.com/nuclio/nuclio/blob/development/docs/reference/nuctl/nuctl.md#download).
functions.
- The simplest way to explore Nuclio is to run its graphical user interface (GUI)
of the Nuclio dashboard. All you need in order to run the dashboard is Docker. See
[nuclio documentation](https://github.com/nuclio/nuclio#quick-start-steps)
......
......@@ -80,7 +80,7 @@ For more information about supported formats look at the
| [f-BRS](/serverless/pytorch/saic-vul/fbrs/nuclio) | interactor | PyTorch | X | |
| [Inside-Outside Guidance](/serverless/pytorch/shiyinzhang/iog/nuclio) | interactor | PyTorch | X | |
| [Faster RCNN](/serverless/tensorflow/faster_rcnn_inception_v2_coco/nuclio) | detector | TensorFlow | X | X |
| [Mask RCNN](/serverless/tensorflow/matterport/mask_rcnn/nuclio) | detector | TensorFlow | X | |
| [Mask RCNN](/serverless/tensorflow/matterport/mask_rcnn/nuclio) | detector | TensorFlow | X | X |
<!--lint enable maximum-line-length-->
......
......@@ -290,7 +290,7 @@ docker-compose -f docker-compose.yml \
### Semi-automatic and automatic annotation
Please follow [instructions](/cvat/apps/documentation/installation_automatic_annotation.md)
Please follow this [guide](/cvat/apps/documentation/installation_automatic_annotation.md).
### Stop all containers
......
......@@ -53,47 +53,80 @@
- See [deploy_cpu.sh](/serverless/deploy_cpu.sh) for more examples.
#### GPU Support
You will need to install Nvidia Container Toolkit and make sure your docker supports GPU. Follow [Nvidia docker instructions](https://www.tensorflow.org/install/docker#gpu_support).
Also you will need to add `--resource-limit nvidia.com/gpu=1` to the nuclio deployment command.
You will need to install [Nvidia Container Toolkit](https://www.tensorflow.org/install/docker#gpu_support).
You will also need to add `--resource-limit nvidia.com/gpu=1 --triggers '{"myHttpTrigger": {"maxWorkers": 1}}'` to
the nuclio deployment command. You can increase `maxWorkers` if you have enough GPU memory.
For example, the command below deploys a function that runs on the GPU:
```bash
nuctl deploy tf-faster-rcnn-inception-v2-coco-gpu \
--project-name cvat --path "serverless/tensorflow/faster_rcnn_inception_v2_coco/nuclio" --platform local \
--base-image tensorflow/tensorflow:2.1.1-gpu \
--desc "Faster RCNN from Tensorflow Object Detection GPU API" \
--image cvat/tf.faster_rcnn_inception_v2_coco_gpu \
nuctl deploy --project-name cvat \
--path `pwd`/tensorflow/matterport/mask_rcnn/nuclio \
--platform local --base-image tensorflow/tensorflow:1.15.5-gpu-py3 \
--desc "GPU based implementation of Mask RCNN on Python 3, Keras, and TensorFlow." \
--image cvat/tf.matterport.mask_rcnn_gpu \
--triggers '{"myHttpTrigger": {"maxWorkers": 1}}' \
--resource-limit nvidia.com/gpu=1
```
**Note:**
- Since the model is loaded during deployment, the number of GPU functions you can deploy will be limited to your GPU memory.
- The number of functions you can deploy on the GPU is limited by your GPU memory.
- See [deploy_gpu.sh](/serverless/deploy_gpu.sh) script for more examples.
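
As a rough way to reason about the memory note above, the sketch below estimates how many workers could share one GPU. It assumes that each worker runs as a separate process capped at a fixed fraction of GPU memory (the GPU functions in this change set `gpu_fraction = 0.333`). The helper name `max_gpu_workers` and the `reserved_fraction` parameter are hypothetical, and real limits also depend on the model size and driver overhead.

```python
import math


def max_gpu_workers(gpu_fraction: float, reserved_fraction: float = 0.0) -> int:
    """Estimate how many workers fit on one GPU.

    gpu_fraction: share of GPU memory each worker may claim (0 < value <= 1).
    reserved_fraction: share kept free for other processes (hypothetical knob).
    """
    if not 0.0 < gpu_fraction <= 1.0:
        raise ValueError("gpu_fraction must be in (0, 1]")
    if not 0.0 <= reserved_fraction < 1.0:
        raise ValueError("reserved_fraction must be in [0, 1)")
    usable = 1.0 - reserved_fraction
    # Small epsilon guards against floating-point results such as 2.9999999.
    return math.floor(usable / gpu_fraction + 1e-9)


if __name__ == "__main__":
    print(max_gpu_workers(0.333))       # workers that fit with the default fraction
    print(max_gpu_workers(0.333, 0.1))  # same, keeping 10% of the memory free
```

Under these assumptions, the sum of `maxWorkers` across all GPU functions deployed on the same device should stay at or below this estimate.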
####Debugging Nuclio Functions:
**Troubleshooting Nuclio Functions:**
- You can open the nuclio dashboard at [localhost:8070](http://localhost:8070). Make sure your functions are up and running without errors.
- Test your deployed DL model as a serverless function. The command below should work on Linux and Mac OS.
```bash
image=$(curl https://upload.wikimedia.org/wikipedia/en/7/7d/Lenna_%28test_image%29.png --output - | base64 | tr -d '\n')
cat << EOF > /tmp/input.json
{"image": "$image"}
EOF
cat /tmp/input.json | nuctl invoke openvino.omz.public.yolo-v3-tf -c 'application/json'
```
- To check for internal server errors, run `docker ps -a` to see the list of containers. Find the container that you are interested, e.g. `nuclio-nuclio-tf-faster-rcnn-inception-v2-coco-gpu`. Then check its logs by
<details>
```bash
docker logs <name of your container>
20.07.17 12:07:44.519 nuctl.platform.invoker (I) Executing function {"method": "POST", "url": "http://:57308", "headers": {"Content-Type":["application/json"],"X-Nuclio-Log-Level":["info"],"X-Nuclio-Target":["openvino.omz.public.yolo-v3-tf"]}}
20.07.17 12:07:45.275 nuctl.platform.invoker (I) Got response {"status": "200 OK"}
20.07.17 12:07:45.275 nuctl (I) >>> Start of function logs
20.07.17 12:07:45.275 ino.omz.public.yolo-v3-tf (I) Run yolo-v3-tf model {"worker_id": "0", "time": 1594976864570.9353}
20.07.17 12:07:45.275 nuctl (I) <<< End of function logs
> Response headers:
Date = Fri, 17 Jul 2020 09:07:45 GMT
Content-Type = application/json
Content-Length = 100
Server = nuclio
> Response body:
[
{
"confidence": "0.9992254",
"label": "person",
"points": [
39,
124,
408,
512
],
"type": "rectangle"
}
]
```
</details>
- To check for internal server errors, run `docker ps -a` to see the list of containers.
Find the container that you are interested in, e.g., `nuclio-nuclio-tf-faster-rcnn-inception-v2-coco-gpu`.
Then check its logs with `docker logs <name of your container>`,
e.g.,
```bash
docker logs nuclio-nuclio-tf-faster-rcnn-inception-v2-coco-gpu
```
- If you would like to debug a code inside a container, you can use vscode to directly attach to a container [instructions](https://code.visualstudio.com/docs/remote/attach-container). To apply your changes, make sure to restart the container.
- To debug code inside a container, you can use VS Code to attach to the container (see the [instructions](https://code.visualstudio.com/docs/remote/attach-container)).
To apply your changes, make sure to restart the container.
```bash
docker restart <name_of_the_container>
```
> **⚠ WARNING:**
> Do not use the nuclio dashboard to stop the container: on any modification it rebuilds the container, and you will lose your changes.
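
The test request shown earlier in this section can also be assembled from Python. This is a minimal sketch using only the standard library. The helper names `build_payload` and `summarize_detections` are hypothetical, and the response shape is assumed to match the sample output above (a JSON list of objects with `confidence`, `label`, `points`, and `type`).

```python
import base64
import json


def build_payload(image_bytes: bytes) -> str:
    """Return the JSON request body: {"image": "<base64 without newlines>"}."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded})


def summarize_detections(response_body: str) -> list:
    """Reduce a response body to (label, confidence, type) tuples.

    Assumes the body is a JSON list shaped like the sample output above,
    where confidence is serialized as a string.
    """
    detections = json.loads(response_body)
    return [(d["label"], float(d["confidence"]), d["type"]) for d in detections]


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file read in binary mode.
    print(build_payload(b"\x89PNG placeholder")[:40])
```

The string returned by `build_payload` can be written to `/tmp/input.json` and piped to `nuctl invoke` in the same way as the shell example.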
......@@ -8,8 +8,18 @@ nuctl create project cvat
nuctl deploy --project-name cvat \
--path "$SCRIPT_DIR/tensorflow/faster_rcnn_inception_v2_coco/nuclio" \
--platform local --base-image tensorflow/tensorflow:2.1.1-gpu \
--desc "Faster RCNN from Tensorflow Object Detection GPU API" \
--desc "GPU based Faster RCNN from Tensorflow Object Detection API" \
--image cvat/tf.faster_rcnn_inception_v2_coco_gpu \
--resource-limit nvidia.com/gpu=1
--triggers '{"myHttpTrigger": {"maxWorkers": 1}}' \
--resource-limit nvidia.com/gpu=1 --verbose
nuctl deploy --project-name cvat \
--path "$SCRIPT_DIR/tensorflow/matterport/mask_rcnn/nuclio" \
--platform local --base-image tensorflow/tensorflow:1.15.5-gpu-py3 \
--desc "GPU based implementation of Mask RCNN on Python 3, Keras, and TensorFlow." \
--image cvat/tf.matterport.mask_rcnn_gpu \
--triggers '{"myHttpTrigger": {"maxWorkers": 1}}' \
--resource-limit nvidia.com/gpu=1 --verbose
nuctl get function
......@@ -15,9 +15,10 @@ class ModelLoader:
serialized_graph = fid.read()
od_graph_def.ParseFromString(serialized_graph)
tf.import_graph_def(od_graph_def, name='')
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
gpu_fraction = 0.333
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_fraction,
allow_growth=True)
config = tf.ConfigProto(gpu_options=gpu_options)
self.session = tf.Session(graph=detection_graph, config=config)
self.image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
......
......@@ -102,22 +102,19 @@ spec:
value: /opt/nuclio/Mask_RCNN
build:
image: cvat/tf.matterport.mask_rcnn
baseImage: tensorflow/tensorflow:2.1.0-py3
baseImage: tensorflow/tensorflow:1.13.1-py3
directives:
postCopy:
- kind: WORKDIR
value: /opt/nuclio
- kind: RUN
value: apt update && apt install --no-install-recommends -y git curl libsm6 libxext6 libgl1-mesa-glx
value: apt update && apt install --no-install-recommends -y git curl
- kind: RUN
value: git clone https://github.com/matterport/Mask_RCNN.git
value: git clone --depth 1 https://github.com/matterport/Mask_RCNN.git
- kind: RUN
value: curl -L https://github.com/matterport/Mask_RCNN/releases/download/v2.0/mask_rcnn_coco.h5 -o Mask_RCNN/mask_rcnn_coco.h5
- kind: RUN
value: pip3 install scipy cython matplotlib scikit-image opencv-python-headless h5py \
imgaug IPython[all] tensorflow==1.13.1 keras==2.1.0 pillow pyyaml
- kind: RUN
value: pip3 install pycocotools
value: pip3 install numpy cython pyyaml keras==2.1.0 scikit-image Pillow
triggers:
myHttpTrigger:
......
# Copyright (C) 2018-2020 Intel Corporation
# Copyright (C) 2020-2021 Intel Corporation
#
# SPDX-License-Identifier: MIT
......@@ -6,24 +6,13 @@ import os
import numpy as np
import sys
from skimage.measure import find_contours, approximate_polygon
# workaround for tf.placeholder() is not compatible with eager execution
# https://github.com/tensorflow/tensorflow/issues/18165
import tensorflow as tf
tf.compat.v1.disable_eager_execution()
#import tensorflow.compat.v1 as tf
# tf.disable_v2_behavior()
# The directory should contain a clone of
# https://github.com/matterport/Mask_RCNN repository and
# downloaded mask_rcnn_coco.h5 model.
MASK_RCNN_DIR = os.path.abspath(os.environ.get('MASK_RCNN_DIR'))
if MASK_RCNN_DIR:
sys.path.append(MASK_RCNN_DIR) # To find local version of the library
sys.path.append(os.path.join(MASK_RCNN_DIR, 'samples/coco'))
from mrcnn import model as modellib
import coco
from mrcnn.config import Config
class ModelLoader:
def __init__(self, labels):
......@@ -31,12 +20,21 @@ class ModelLoader:
if COCO_MODEL_PATH is None:
raise OSError('Model path env not found in the system.')
class InferenceConfig(coco.CocoConfig):
# Set batch size to 1 since we'll be running inference on
# one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
class InferenceConfig(Config):
NAME = "coco"
NUM_CLASSES = 1 + 80 # COCO has 80 classes
GPU_COUNT = 1
IMAGES_PER_GPU = 1
# Limit gpu memory to 30% to allow for other nuclio gpu functions. Increase fraction as you like
import keras.backend.tensorflow_backend as ktf
def get_session(gpu_fraction=0.333):
gpu_options = tf.GPUOptions(
per_process_gpu_memory_fraction=gpu_fraction,
allow_growth=True)
return tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
ktf.set_session(get_session())
# Print config details
self.config = InferenceConfig()
self.config.display()
......@@ -54,7 +52,7 @@ class ModelLoader:
for i in range(len(output["rois"])):
score = output["scores"][i]
class_id = output["class_ids"][i]
mask = output["masks"][:,:,i]
mask = output["masks"][:, :, i]
if score >= threshold:
mask = mask.astype(np.uint8)
contours = find_contours(mask, MASK_THRESHOLD)
......@@ -74,6 +72,4 @@ class ModelLoader:
"type": "polygon",
})
return results
return results
\ No newline at end of file