Commit 55bfb6d5 authored by gineshidalgo99

OpenPose accepts variable output size for images

Parent 264b3209
Unix:
- Caffe:
- Version 1.0.0, extracted from GitHub on 08/20/2017 from the current master branch.
- Version 1.0.0, extracted from GitHub on 09/30/2017 from the current master branch.
- Link: https://github.com/BVLC/caffe
Windows:
......
......@@ -464,6 +464,14 @@ BOOST_PYTHON_MODULE(_caffe) {
.add_property("count", static_cast<int (Blob<Dtype>::*)() const>(
&Blob<Dtype>::count))
.def("reshape", bp::raw_function(&Blob_Reshape))
#ifndef CPU_ONLY
.add_property("_gpu_data_ptr",
reinterpret_cast<uintptr_t (Blob<Dtype>::*)()>(
&Blob<Dtype>::mutable_gpu_data))
.add_property("_gpu_diff_ptr",
reinterpret_cast<uintptr_t (Blob<Dtype>::*)()>(
&Blob<Dtype>::mutable_gpu_diff))
#endif
.add_property("data", bp::make_function(&Blob<Dtype>::mutable_cpu_data,
NdarrayCallPolicies()))
.add_property("diff", bp::make_function(&Blob<Dtype>::mutable_cpu_diff,
......
......@@ -92,7 +92,7 @@ class Classifier(caffe.Net):
# For oversampling, average predictions across crops.
if oversample:
predictions = predictions.reshape((len(predictions) / 10, 10, -1))
predictions = predictions.reshape((len(predictions) // 10, 10, -1))
predictions = predictions.mean(1)
return predictions
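The change above swaps `/` for the explicit floor-division operator `//`: under Python 3, `/` is true division and returns a float, which NumPy rejects as a reshape dimension. A minimal sketch of why the fix is needed:

```python
import numpy as np

# e.g. 2 images x 10 oversampled crops, 5 classes each
predictions = np.zeros((20, 5))

n_images_float = len(predictions) / 10   # 2.0 -> float, breaks reshape on Python 3
n_images = len(predictions) // 10        # 2   -> int, safe for reshape

reshaped = predictions.reshape((n_images, 10, -1))
assert reshaped.shape == (2, 10, 5)
```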
......@@ -11,10 +11,11 @@ OpenPose is a **library for real-time multi-person keypoint detection and multi-
## Latest News
- Jul 2017: **Windows**, new [**portable demo**](doc/installation.md#installation---demo) **and** [**easier library installation**](doc/installation.md#installation---library)!
- Sep 2017: **CMake** installer!
- Jul 2017: [**Windows portable demo**](doc/installation.md#installation---demo)!
- Jul 2017: **Hands** released!
- Jun 2017: **Face** released!
- May 2017: **Windows** version released!
- May 2017: **Windows** version!
- Apr 2017: **Body** released!
- Check all the [release notes](doc/release_notes.md).
......
......@@ -107,8 +107,8 @@ We enumerate some of the most important flags, check the `Flags Detailed Descrip
- `--part_to_show`: Prediction channel to visualize.
- `--no_display`: Do not open a display window. Useful for servers and/or to slightly speed up OpenPose.
- `--num_gpu 2 --num_gpu_start 1`: Parallelize over this number of GPUs, starting at the desired device id. By default it uses all the available GPUs.
- `--net_resolution 656x368 --resolution 1280x720`: For HD input (default values).
- `--net_resolution 496x368 --resolution 640x480`: For VGA input.
- `--net_resolution 656x368`: For HD input (default value).
- `--net_resolution 496x368`: For VGA input.
- `--model_pose MPI`: Model to use; affects the number of keypoints, speed and accuracy.
- `--logging_level 3`: Logging messages threshold, range [0,255]: 0 will output any message & 255 will output none. Current messages in the range [1-4], 1 for low priority messages and 4 for important ones.
......@@ -135,7 +135,7 @@ Each flag is divided into flag name, default value, and description.
3. OpenPose
- DEFINE_string(model_folder, "models/", "Folder path (absolute or relative) where the models (pose, face, ...) are located.");
- DEFINE_string(resolution, "1280x720", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the default images resolution.");
- DEFINE_string(output_resolution, "-1x-1", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the input image resolution.");
- DEFINE_int32(num_gpu, -1, "The number of GPU devices to use. If negative, it will use all the available GPUs in your machine.");
- DEFINE_int32(num_gpu_start, 0, "GPU device start number.");
- DEFINE_int32(keypoint_scale, 0, "Scaling of the (x,y) coordinates of the final pose data array, i.e. the scale of the (x,y) coordinates that will be saved with the `write_keypoint` & `write_keypoint_json` flags. Select `0` to scale it to the original source resolution, `1`to scale it to the net output size (set with `net_resolution`), `2` to scale it to the final output size (set with `resolution`), `3` to scale it in the range [0,1], and 4 for range [-1,1]. Non related with `scale_number` and `scale_gap`.");
......
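The `keypoint_scale` flag described above maps the (x,y) coordinates into different ranges. A hedged Python sketch of the two normalized modes (`3` for [0,1], `4` for [-1,1]); the helper name is illustrative, not part of the OpenPose API (the real logic lives in the `keypointScaler` module):

```python
def scale_keypoint(x, y, width, height, mode):
    """Rescale one keypoint as keypoint_scale modes 3 and 4 describe.

    mode 3 -> range [0, 1], mode 4 -> range [-1, 1].
    Illustrative sketch only, not the C++ implementation.
    """
    if mode == 3:
        return x / width, y / height
    if mode == 4:
        return 2.0 * x / width - 1.0, 2.0 * y / height - 1.0
    raise ValueError("only the normalized modes are sketched here")

# A point at the center of a 1280x720 frame:
assert scale_keypoint(640, 360, 1280, 720, 3) == (0.5, 0.5)
assert scale_keypoint(640, 360, 1280, 720, 4) == (0.0, 0.0)
```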
......@@ -113,8 +113,18 @@ OpenPose Library - Release Notes
## Current version (future OpenPose 1.1.1)
## Current version (future OpenPose 1.2.0)
1. Main improvements:
1. COCO JSON file outputs 0 as score for non-detected keypoints.
2. Added an example of user asynchronous output with OpenPose and cleaned all `tutorial_wrapper/` examples.
3. Added a `-1` option for `net_resolution` in order to auto-select the best possible aspect ratio given the user input.
4. Output images can keep the input size; OpenPose can now change its output size for each image, so a fixed size is no longer required.
1. FrameDisplayer accepts variable-size images by rescaling every time a frame with a bigger width or height is displayed (gui module).
2. OpOutputToCvMat & GuiInfoAdder no longer need to know the output size at construction time; it is deduced from each image.
3. CvMatToOpOutput and Renderers allow keeping the input resolution as the output resolution for images (core module).
2. Functions or parameters renamed:
1. To allow OpenPose to change its output size and initial size:
1. Flag `resolution` renamed as `output_resolution`.
2. FrameDisplayer, GuiInfoAdder and Gui constructors arguments modified (gui module).
3. OpOutputToCvMat constructor removed (core module).
4. New Renderer classes to split GpuRenderers from CpuRenderers.
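The FrameDisplayer behaviour in item 4.1 above can be modeled as a window that grows whenever a bigger frame arrives. This is an illustrative sketch of that rule, not the actual gui-module code:

```python
class VariableSizeDisplay:
    """Track a display size that grows to fit the largest frame seen."""

    def __init__(self, width=0, height=0):
        self.width, self.height = width, height

    def update(self, frame_width, frame_height):
        # Rescale (enlarge) only when the incoming frame exceeds the
        # current window in either dimension, as the release note says.
        resized = frame_width > self.width or frame_height > self.height
        self.width = max(self.width, frame_width)
        self.height = max(self.height, frame_height)
        return resized

display = VariableSizeDisplay()
assert display.update(640, 480)      # first frame sizes the window
assert display.update(1280, 720)     # bigger frame enlarges it
assert not display.update(640, 480)  # smaller frame keeps current size
```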
......@@ -51,8 +51,8 @@ DEFINE_bool(process_real_time, false, "Enable to keep the orig
" too long, it will skip frames. If it is too fast, it will slow it down.");
// OpenPose
DEFINE_string(model_folder, "models/", "Folder path (absolute or relative) where the models (pose, face, ...) are located.");
DEFINE_string(resolution, "1280x720", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" default images resolution.");
DEFINE_string(output_resolution, "-1x-1", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" input image resolution.");
DEFINE_int32(num_gpu, -1, "The number of GPU devices to use. If negative, it will use all the available GPUs in your"
" machine.");
DEFINE_int32(num_gpu_start, 0, "GPU device start number.");
......@@ -172,7 +172,7 @@ int openPoseDemo()
// Applying user defined configuration - Google flags to program variables
// outputSize
const auto outputSize = op::flagsToPoint(FLAGS_resolution, "1280x720");
const auto outputSize = op::flagsToPoint(FLAGS_output_resolution, "-1x-1");
// netInputSize
const auto netInputSize = op::flagsToPoint(FLAGS_net_resolution, "-1x368");
// faceNetInputSize
......
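The `op::flagsToPoint` calls above turn a `"WIDTHxHEIGHT"` flag string into a point, with `-1` components meaning "deduce automatically". A hedged Python sketch of that parsing (illustrative only, not the C++ implementation):

```python
def flags_to_point(flag_value, default="-1x-1"):
    """Parse a "WIDTHxHEIGHT" flag string into an (x, y) pair.

    Mirrors the idea behind op::flagsToPoint with inputs like
    "1280x720" or "-1x-1"; an empty flag falls back to the default.
    """
    text = flag_value if flag_value else default
    width_str, _, height_str = text.partition("x")
    return int(width_str), int(height_str)

assert flags_to_point("1280x720") == (1280, 720)
assert flags_to_point("-1x-1") == (-1, -1)  # -1 means "deduce from input"
assert flags_to_point("") == (-1, -1)       # empty flag -> default
```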
......@@ -21,13 +21,13 @@ JSON_FOLDER=../evaluation/coco_val_jsons/
OP_BIN=./build/examples/openpose/openpose.bin
# 1 scale
$OP_BIN --image_dir $IMAGE_FOLDER --write_coco_json ${JSON_FOLDER}1.json --no_display --render_pose 0 --frame_last 3558
$OP_BIN --image_dir $IMAGE_FOLDER --write_coco_json ${JSON_FOLDER}1.json --no_display --render_pose 0 --frame_last 3558 --output_resolution "1280x720"
# # 3 scales
# $OP_BIN --image_dir $IMAGE_FOLDER --write_coco_json ${JSON_FOLDER}1_3.json --no_display --render_pose 0 --scale_number 3 --scale_gap 0.25 --frame_last 3558
# $OP_BIN --image_dir $IMAGE_FOLDER --write_coco_json ${JSON_FOLDER}1_3.json --no_display --render_pose 0 --scale_number 3 --scale_gap 0.25 --frame_last 3558 --output_resolution "1280x720"
# # 4 scales
# $OP_BIN --num_gpu 1 --image_dir $IMAGE_FOLDER --write_coco_json ${JSON_FOLDER}1_4.json --no_display --render_pose 0 --num_gpu 1 --scale_number 4 --scale_gap 0.25 --net_resolution "1312x736" --frame_last 3558
# $OP_BIN --num_gpu 1 --image_dir $IMAGE_FOLDER --write_coco_json ${JSON_FOLDER}1_4.json --no_display --render_pose 0 --num_gpu 1 --scale_number 4 --scale_gap 0.25 --net_resolution "1312x736" --frame_last 3558 --output_resolution "1280x720"
# Debugging - Rendered frames saved
# $OP_BIN --image_dir $IMAGE_FOLDER --write_images ${JSON_FOLDER}frameOutput --no_display
......@@ -209,10 +209,9 @@ namespace op
if (displayGui)
{
// Construct hand renderer
const auto handRenderer = std::make_shared<HandRenderer>(finalOutputSize, wrapperStructHand.renderThreshold,
wrapperStructHand.alphaKeypoint,
wrapperStructHand.alphaHeatMap,
wrapperStructHand.renderMode);
const auto handRenderer = std::make_shared<HandCpuRenderer>(wrapperStructHand.renderThreshold,
wrapperStructHand.alphaKeypoint,
wrapperStructHand.alphaHeatMap);
// Add worker
cpuRenderers.emplace_back(std::make_shared<WHandRenderer<TDatumsPtr>>(handRenderer));
}
......@@ -226,7 +225,7 @@ namespace op
if (displayGui)
{
mPostProcessingWs = mergeWorkers(mPostProcessingWs, cpuRenderers);
const auto opOutputToCvMat = std::make_shared<OpOutputToCvMat>(finalOutputSize);
const auto opOutputToCvMat = std::make_shared<OpOutputToCvMat>();
mPostProcessingWs.emplace_back(std::make_shared<WOpOutputToCvMat<TDatumsPtr>>(opOutputToCvMat));
}
// Re-scale pose if desired
......@@ -249,10 +248,10 @@ namespace op
spWGui = nullptr;
if (displayGui)
{
const auto guiInfoAdder = std::make_shared<GuiInfoAdder>(finalOutputSize, gpuNumber, displayGui);
const auto guiInfoAdder = std::make_shared<GuiInfoAdder>(gpuNumber, displayGui);
mOutputWs.emplace_back(std::make_shared<WGuiInfoAdder<TDatumsPtr>>(guiInfoAdder));
const auto gui = std::make_shared<Gui>(
false, finalOutputSize, mThreadManager.getIsRunningSharedPtr()
finalOutputSize, false, mThreadManager.getIsRunningSharedPtr()
);
spWGui = {std::make_shared<WGui<TDatumsPtr>>(gui)};
}
......
......@@ -37,8 +37,8 @@ DEFINE_string(net_resolution, "656x368", "Multiples of 16. If it
" any of the dimensions, OP will choose the optimal aspect ratio depending on the user's"
" input value. E.g. the default `-1x368` is equivalent to `656x368` in 16:9 resolutions,"
" e.g. full HD (1980x1080) and HD (1280x720) resolutions.");
DEFINE_string(resolution, "1280x720", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" default images resolution.");
DEFINE_string(output_resolution, "-1x-1", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" input image resolution.");
DEFINE_int32(num_gpu_start, 0, "GPU device start number.");
DEFINE_double(scale_gap, 0.3, "Scale gap between scales. No effect unless scale_number > 1. Initial scale is always 1."
" If you want to change the initial scale, you actually want to multiply the"
......@@ -67,7 +67,7 @@ int openPoseTutorialPose1()
op::log("", op::Priority::Low, __LINE__, __FUNCTION__, __FILE__);
// Step 2 - Read Google flags (user defined configuration)
// outputSize
const auto outputSize = op::flagsToPoint(FLAGS_resolution, "1280x720");
const auto outputSize = op::flagsToPoint(FLAGS_output_resolution, "-1x-1");
// netInputSize
const auto netInputSize = op::flagsToPoint(FLAGS_net_resolution, "-1x368");
// netOutputSize
......@@ -86,11 +86,9 @@ int openPoseTutorialPose1()
op::CvMatToOpOutput cvMatToOpOutput{outputSize};
op::PoseExtractorCaffe poseExtractorCaffe{netInputSize, netOutputSize, outputSize, FLAGS_scale_number, poseModel,
FLAGS_model_folder, FLAGS_num_gpu_start};
op::PoseRenderer poseRenderer{netOutputSize, outputSize, poseModel, nullptr, (float)FLAGS_render_threshold,
!FLAGS_disable_blending, (float)FLAGS_alpha_pose};
op::OpOutputToCvMat opOutputToCvMat{outputSize};
const op::Point<int> windowedSize = outputSize;
op::FrameDisplayer frameDisplayer{windowedSize, "OpenPose Tutorial - Example 1"};
op::PoseCpuRenderer poseRenderer{poseModel, (float)FLAGS_render_threshold, !FLAGS_disable_blending, (float)FLAGS_alpha_pose};
op::OpOutputToCvMat opOutputToCvMat;
op::FrameDisplayer frameDisplayer{"OpenPose Tutorial - Example 1", outputSize};
// Step 4 - Initialize resources on desired thread (in this case single thread, i.e. we init resources here)
poseExtractorCaffe.initializationOnThread();
poseRenderer.initializationOnThread();
......
......@@ -37,8 +37,8 @@ DEFINE_string(net_resolution, "656x368", "Multiples of 16. If it
" any of the dimensions, OP will choose the optimal aspect ratio depending on the user's"
" input value. E.g. the default `-1x368` is equivalent to `656x368` in 16:9 resolutions,"
" e.g. full HD (1980x1080) and HD (1280x720) resolutions.");
DEFINE_string(resolution, "1280x720", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" default images resolution.");
DEFINE_string(output_resolution, "-1x-1", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" input image resolution.");
DEFINE_int32(num_gpu_start, 0, "GPU device start number.");
DEFINE_double(scale_gap, 0.3, "Scale gap between scales. No effect unless scale_number > 1. Initial scale is always 1."
" If you want to change the initial scale, you actually want to multiply the"
......@@ -72,7 +72,7 @@ int openPoseTutorialPose2()
op::log("", op::Priority::Low, __LINE__, __FUNCTION__, __FILE__);
// Step 2 - Read Google flags (user defined configuration)
// outputSize
const auto outputSize = op::flagsToPoint(FLAGS_resolution, "1280x720");
const auto outputSize = op::flagsToPoint(FLAGS_output_resolution, "-1x-1");
// netInputSize
const auto netInputSize = op::flagsToPoint(FLAGS_net_resolution, "-1x368");
// netOutputSize
......@@ -92,15 +92,14 @@ int openPoseTutorialPose2()
std::shared_ptr<op::PoseExtractor> poseExtractorPtr = std::make_shared<op::PoseExtractorCaffe>(netInputSize, netOutputSize, outputSize,
FLAGS_scale_number, poseModel,
FLAGS_model_folder, FLAGS_num_gpu_start);
op::PoseRenderer poseRenderer{netOutputSize, outputSize, poseModel, poseExtractorPtr, (float)FLAGS_render_threshold,
!FLAGS_disable_blending, (float)FLAGS_alpha_pose, (float)FLAGS_alpha_heatmap};
poseRenderer.setElementToRender(FLAGS_part_to_show);
op::OpOutputToCvMat opOutputToCvMat{outputSize};
const op::Point<int> windowedSize = outputSize;
op::FrameDisplayer frameDisplayer{windowedSize, "OpenPose Tutorial - Example 2"};
op::PoseGpuRenderer poseGpuRenderer{netOutputSize, poseModel, poseExtractorPtr, (float)FLAGS_render_threshold,
!FLAGS_disable_blending, (float)FLAGS_alpha_pose, (float)FLAGS_alpha_heatmap};
poseGpuRenderer.setElementToRender(FLAGS_part_to_show);
op::OpOutputToCvMat opOutputToCvMat;
op::FrameDisplayer frameDisplayer{"OpenPose Tutorial - Example 2", outputSize};
// Step 4 - Initialize resources on desired thread (in this case single thread, i.e. we init resources here)
poseExtractorPtr->initializationOnThread();
poseRenderer.initializationOnThread();
poseGpuRenderer.initializationOnThread();
// ------------------------- POSE ESTIMATION AND RENDERING -------------------------
// Step 1 - Read and load image, error if empty (possibly wrong path)
......@@ -119,7 +118,7 @@ int openPoseTutorialPose2()
const auto poseKeypoints = poseExtractorPtr->getPoseKeypoints();
const auto scaleNetToOutput = poseExtractorPtr->getScaleNetToOutput();
// Step 4 - Render pose
poseRenderer.renderPose(outputArray, poseKeypoints, scaleNetToOutput);
poseGpuRenderer.renderPose(outputArray, poseKeypoints, scaleNetToOutput);
// Step 5 - OpenPose output format to cv::Mat
auto outputImage = opOutputToCvMat.formatToCvMat(outputArray);
......
......@@ -37,8 +37,8 @@ DEFINE_string(image_dir, "", "Process a directory of
DEFINE_bool(process_real_time, false, "Enable to keep the original source frame rate (e.g. for video). If the processing time is"
" too long, it will skip frames. If it is too fast, it will slow it down.");
// OpenPose
DEFINE_string(resolution, "1280x720", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" default images resolution.");
DEFINE_string(output_resolution, "-1x-1", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" input image resolution.");
// Consumer
DEFINE_bool(fullscreen, false, "Run in full-screen mode (press f during runtime to toggle).");
......@@ -53,7 +53,7 @@ int openPoseTutorialThread1()
op::ConfigureLog::setPriorityThreshold((op::Priority)FLAGS_logging_level);
// Step 2 - Read Google flags (user defined configuration)
// outputSize
auto outputSize = op::flagsToPoint(FLAGS_resolution, "1280x720");
const auto outputSize = op::flagsToPoint(FLAGS_output_resolution, "-1x-1");
// producerType
const auto producerSharedPtr = op::flagsToProducer(FLAGS_image_dir, FLAGS_video, FLAGS_camera, FLAGS_camera_resolution, FLAGS_camera_fps);
const auto displayProducerFpsMode = (FLAGS_process_real_time ? op::ProducerFpsMode::OriginalFps : op::ProducerFpsMode::RetrievalFps);
......@@ -65,14 +65,6 @@ int openPoseTutorialThread1()
videoSeekSharedPtr->second = 0;
const op::Point<int> producerSize{(int)producerSharedPtr->get(CV_CAP_PROP_FRAME_WIDTH),
(int)producerSharedPtr->get(CV_CAP_PROP_FRAME_HEIGHT)};
if (outputSize.x == -1 || outputSize.y == -1)
{
if (producerSize.area() > 0)
outputSize = producerSize;
else
op::error("Output resolution = input resolution not valid for image reading (size might change between images).",
__LINE__, __FUNCTION__, __FILE__);
}
// Step 4 - Setting thread workers && manager
typedef std::vector<op::Datum> TypedefDatumsNoPtr;
typedef std::shared_ptr<TypedefDatumsNoPtr> TypedefDatums;
......@@ -82,7 +74,7 @@ int openPoseTutorialThread1()
auto DatumProducer = std::make_shared<op::DatumProducer<TypedefDatumsNoPtr>>(producerSharedPtr);
auto wDatumProducer = std::make_shared<op::WDatumProducer<TypedefDatums, TypedefDatumsNoPtr>>(DatumProducer);
// GUI (Display)
auto gui = std::make_shared<op::Gui>(FLAGS_fullscreen, outputSize, threadManager.getIsRunningSharedPtr());
auto gui = std::make_shared<op::Gui>(outputSize, FLAGS_fullscreen, threadManager.getIsRunningSharedPtr());
auto wGui = std::make_shared<op::WGui<TypedefDatums>>(gui);
// ------------------------- CONFIGURING THREADING -------------------------
......
......@@ -38,8 +38,8 @@ DEFINE_string(image_dir, "", "Process a directory of
DEFINE_bool(process_real_time, false, "Enable to keep the original source frame rate (e.g. for video). If the processing time is"
" too long, it will skip frames. If it is too fast, it will slow it down.");
// OpenPose
DEFINE_string(resolution, "1280x720", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" default images resolution.");
DEFINE_string(output_resolution, "-1x-1", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" input image resolution.");
// Consumer
DEFINE_bool(fullscreen, false, "Run in full-screen mode (press f during runtime to toggle).");
......@@ -87,7 +87,7 @@ int openPoseTutorialThread2()
op::ConfigureLog::setPriorityThreshold((op::Priority)FLAGS_logging_level);
// Step 2 - Read Google flags (user defined configuration)
// outputSize
auto outputSize = op::flagsToPoint(FLAGS_resolution, "1280x720");
const auto outputSize = op::flagsToPoint(FLAGS_output_resolution, "-1x-1");
// producerType
const auto producerSharedPtr = op::flagsToProducer(FLAGS_image_dir, FLAGS_video, FLAGS_camera, FLAGS_camera_resolution, FLAGS_camera_fps);
const auto displayProducerFpsMode = (FLAGS_process_real_time ? op::ProducerFpsMode::OriginalFps : op::ProducerFpsMode::RetrievalFps);
......@@ -99,14 +99,6 @@ int openPoseTutorialThread2()
videoSeekSharedPtr->second = 0;
const op::Point<int> producerSize{(int)producerSharedPtr->get(CV_CAP_PROP_FRAME_WIDTH),
(int)producerSharedPtr->get(CV_CAP_PROP_FRAME_HEIGHT)};
if (outputSize.x == -1 || outputSize.y == -1)
{
if (producerSize.area() > 0)
outputSize = producerSize;
else
op::error("Output resolution = input resolution not valid for image reading (size might change between images).",
__LINE__, __FUNCTION__, __FILE__);
}
// Step 4 - Setting thread workers && manager
typedef std::vector<op::Datum> TypedefDatumsNoPtr;
typedef std::shared_ptr<TypedefDatumsNoPtr> TypedefDatums;
......@@ -118,7 +110,7 @@ int openPoseTutorialThread2()
// Specific WUserClass
auto wUserClass = std::make_shared<WUserClass>();
// GUI (Display)
auto gui = std::make_shared<op::Gui>(FLAGS_fullscreen, outputSize, threadManager.getIsRunningSharedPtr());
auto gui = std::make_shared<op::Gui>(outputSize, FLAGS_fullscreen, threadManager.getIsRunningSharedPtr());
auto wGui = std::make_shared<op::WGui<TypedefDatums>>(gui);
// ------------------------- CONFIGURING THREADING -------------------------
......
......@@ -51,8 +51,8 @@ DEFINE_bool(process_real_time, false, "Enable to keep the orig
" too long, it will skip frames. If it is too fast, it will slow it down.");
// OpenPose
DEFINE_string(model_folder, "models/", "Folder path (absolute or relative) where the models (pose, face, ...) are located.");
DEFINE_string(resolution, "1280x720", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" default images resolution.");
DEFINE_string(output_resolution, "-1x-1", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" input image resolution.");
DEFINE_int32(num_gpu, -1, "The number of GPU devices to use. If negative, it will use all the available GPUs in your"
" machine.");
DEFINE_int32(num_gpu_start, 0, "GPU device start number.");
......@@ -239,7 +239,7 @@ int openPoseTutorialWrapper3()
// Applying user defined configuration - Google flags to program variables
// outputSize
const auto outputSize = op::flagsToPoint(FLAGS_resolution, "1280x720");
const auto outputSize = op::flagsToPoint(FLAGS_output_resolution, "-1x-1");
// netInputSize
const auto netInputSize = op::flagsToPoint(FLAGS_net_resolution, "-1x368");
// faceNetInputSize
......
......@@ -35,8 +35,8 @@ DEFINE_int32(logging_level, 3, "The logging level. Inte
DEFINE_string(image_dir, "examples/media/", "Process a directory of images. Read all standard formats (jpg, png, bmp, etc.).");
// OpenPose
DEFINE_string(model_folder, "models/", "Folder path (absolute or relative) where the models (pose, face, ...) are located.");
DEFINE_string(resolution, "1280x720", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" default images resolution.");
DEFINE_string(output_resolution, "-1x-1", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" input image resolution.");
DEFINE_int32(num_gpu, -1, "The number of GPU devices to use. If negative, it will use all the available GPUs in your"
" machine.");
DEFINE_int32(num_gpu_start, 0, "GPU device start number.");
......@@ -319,7 +319,7 @@ int openPoseTutorialWrapper2()
// Applying user defined configuration - Google flags to program variables
// outputSize
const auto outputSize = op::flagsToPoint(FLAGS_resolution, "1280x720");
const auto outputSize = op::flagsToPoint(FLAGS_output_resolution, "-1x-1");
// netInputSize
const auto netInputSize = op::flagsToPoint(FLAGS_net_resolution, "-1x368");
// faceNetInputSize
......
......@@ -35,8 +35,8 @@ DEFINE_int32(logging_level, 3, "The logging level. Inte
DEFINE_string(image_dir, "examples/media/", "Process a directory of images. Read all standard formats (jpg, png, bmp, etc.).");
// OpenPose
DEFINE_string(model_folder, "models/", "Folder path (absolute or relative) where the models (pose, face, ...) are located.");
DEFINE_string(resolution, "1280x720", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" default images resolution.");
DEFINE_string(output_resolution, "-1x-1", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" input image resolution.");
DEFINE_int32(num_gpu, -1, "The number of GPU devices to use. If negative, it will use all the available GPUs in your"
" machine.");
DEFINE_int32(num_gpu_start, 0, "GPU device start number.");
......@@ -278,7 +278,7 @@ int openPoseTutorialWrapper1()
// Applying user defined configuration - Google flags to program variables
// outputSize
const auto outputSize = op::flagsToPoint(FLAGS_resolution, "1280x720");
const auto outputSize = op::flagsToPoint(FLAGS_output_resolution, "-1x-1");
// netInputSize
const auto netInputSize = op::flagsToPoint(FLAGS_net_resolution, "-1x368");
// faceNetInputSize
......
......@@ -35,8 +35,8 @@ DEFINE_int32(logging_level, 3, "The logging level. Inte
" low priority messages and 4 for important ones.");
// OpenPose
DEFINE_string(model_folder, "models/", "Folder path (absolute or relative) where the models (pose, face, ...) are located.");
DEFINE_string(resolution, "1280x720", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" default images resolution.");
DEFINE_string(output_resolution, "-1x-1", "The image resolution (display and output). Use \"-1x-1\" to force the program to use the"
" input image resolution.");
DEFINE_int32(num_gpu, -1, "The number of GPU devices to use. If negative, it will use all the available GPUs in your"
" machine.");
DEFINE_int32(num_gpu_start, 0, "GPU device start number.");
......@@ -153,7 +153,7 @@ int openpose3d()
// Applying user defined configuration - Google flags to program variables
// outputSize
const auto outputSize = op::flagsToPoint(FLAGS_resolution, "1280x720");
const auto outputSize = op::flagsToPoint(FLAGS_output_resolution, "-1x-1");
// netInputSize
const auto netInputSize = op::flagsToPoint(FLAGS_net_resolution, "-1x368");
// faceNetInputSize
......
......@@ -9,7 +9,8 @@ namespace op
class OP_API CvMatToOpOutput
{
public:
CvMatToOpOutput(const Point<int>& outputResolution, const bool generateOutput = true);
// Use outputResolution <= {0,0} to keep input resolution
CvMatToOpOutput(const Point<int>& outputResolution = Point<int>{0, 0}, const bool generateOutput = true);
std::tuple<double, Array<float>> format(const cv::Mat& cvInputData) const;
......
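The new `CvMatToOpOutput` constructor above uses `outputResolution <= {0,0}` as the "keep input resolution" convention. An illustrative model of that rule (Python sketch, not the C++ code):

```python
def resolve_output_size(requested, input_size):
    """Apply the CvMatToOpOutput convention: any requested dimension <= 0
    (e.g. the default {0, 0} or the "-1x-1" flag) keeps the input size."""
    out_w = requested[0] if requested[0] > 0 else input_size[0]
    out_h = requested[1] if requested[1] > 0 else input_size[1]
    return out_w, out_h

assert resolve_output_size((0, 0), (1920, 1080)) == (1920, 1080)
assert resolve_output_size((-1, -1), (640, 480)) == (640, 480)
assert resolve_output_size((1280, 720), (640, 480)) == (1280, 720)
```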
#ifndef OPENPOSE_CORE_GPU_RENDERER_HPP
#define OPENPOSE_CORE_GPU_RENDERER_HPP
#include <atomic>
#include <tuple>
#include <openpose/core/common.hpp>
#include <openpose/core/renderer.hpp>
namespace op
{
class OP_API GpuRenderer : public Renderer
{
public:
explicit GpuRenderer(const float renderThreshold, const float alphaKeypoint,
const float alphaHeatMap, const bool blendOriginalFrame = true,
const unsigned int elementToRender = 0u, const unsigned int numberElementsToRender = 0u);
~GpuRenderer();
std::tuple<std::shared_ptr<float*>, std::shared_ptr<bool>, std::shared_ptr<std::atomic<unsigned int>>,
std::shared_ptr<std::atomic<unsigned long long>>, std::shared_ptr<const unsigned int>>
getSharedParameters();
void setSharedParametersAndIfLast(const std::tuple<std::shared_ptr<float*>, std::shared_ptr<bool>,
std::shared_ptr<std::atomic<unsigned int>>,
std::shared_ptr<std::atomic<unsigned long long>>,
std::shared_ptr<const unsigned int>>& tuple,
const bool isLast);
protected:
std::shared_ptr<float*> spGpuMemory;
void cpuToGpuMemoryIfNotCopiedYet(const float* const cpuMemory, const unsigned long long memoryVolume);
void gpuToCpuMemoryIfLastRenderer(float* cpuMemory, const unsigned long long memoryVolume);
private:
std::shared_ptr<std::atomic<unsigned long long>> spVolume;
bool mIsFirstRenderer;
bool mIsLastRenderer;
std::shared_ptr<bool> spGpuMemoryAllocated;
DELETE_COPY(GpuRenderer);
};
}
#endif // OPENPOSE_CORE_GPU_RENDERER_HPP
......@@ -8,6 +8,7 @@
#include <openpose/core/cvMatToOpOutput.hpp>
#include <openpose/core/datum.hpp>
#include <openpose/core/enumClasses.hpp>
#include <openpose/core/gpuRenderer.hpp>
#include <openpose/core/keypointScaler.hpp>
#include <openpose/core/macros.hpp>
#include <openpose/core/net.hpp>
......
......@@ -9,12 +9,7 @@ namespace op
class OP_API OpOutputToCvMat
{
public:
explicit OpOutputToCvMat(const Point<int>& outputResolution);
cv::Mat formatToCvMat(const Array<float>& outputData) const;
private:
const std::array<int, 3> mOutputResolution;
};
}
......
......@@ -2,7 +2,6 @@
#define OPENPOSE_CORE_RENDERER_HPP
#include <atomic>
#include <tuple>
#include <openpose/core/common.hpp>
namespace op
......@@ -10,23 +9,17 @@ namespace op
class OP_API Renderer
{
public:
explicit Renderer(const unsigned long long volume, const float alphaKeypoint, const float alphaHeatMap,
const unsigned int elementToRender = 0u, const unsigned int numberElementsToRender = 0u);
~Renderer();
void initializationOnThread();
explicit Renderer(const float renderThreshold, const float alphaKeypoint, const float alphaHeatMap,
const bool blendOriginalFrame = true, const unsigned int elementToRender = 0u,
const unsigned int numberElementsToRender = 0u);
void increaseElementToRender(const int increment);
void setElementToRender(const int elementToRender);
std::tuple<std::shared_ptr<float*>, std::shared_ptr<bool>, std::shared_ptr<std::atomic<unsigned int>>,
std::shared_ptr<const unsigned int>> getSharedParameters();
bool getBlendOriginalFrame() const;
void setSharedParametersAndIfLast(const std::tuple<std::shared_ptr<float*>, std::shared_ptr<bool>,
std::shared_ptr<std::atomic<unsigned int>>,
std::shared_ptr<const unsigned int>>& tuple, const bool isLast);
void setBlendOriginalFrame(const bool blendOriginalFrame);
float getAlphaKeypoint() const;
......@@ -36,22 +29,20 @@ namespace op
void setAlphaHeatMap(const float alphaHeatMap);
bool getShowGooglyEyes() const;
void setShowGooglyEyes(const bool showGooglyEyes);
protected:
std::shared_ptr<float*> spGpuMemoryPtr;
const float mRenderThreshold;
std::atomic<bool> mBlendOriginalFrame;
std::shared_ptr<std::atomic<unsigned int>> spElementToRender;
std::shared_ptr<const unsigned int> spNumberElementsToRender;
void cpuToGpuMemoryIfNotCopiedYet(const float* const cpuMemory);
void gpuToCpuMemoryIfLastRenderer(float* cpuMemory);
std::atomic<bool> mShowGooglyEyes;
private:
const unsigned long long mVolume;
float mAlphaKeypoint;
float mAlphaHeatMap;
bool mIsFirstRenderer;
bool mIsLastRenderer;
std::shared_ptr<bool> spGpuMemoryAllocated;
DELETE_COPY(Renderer);
};
......
#ifndef OPENPOSE_FACE_FACE_CPU_RENDERER_HPP
#define OPENPOSE_FACE_FACE_CPU_RENDERER_HPP
#include <openpose/core/common.hpp>
#include <openpose/core/renderer.hpp>
#include <openpose/face/faceParameters.hpp>
#include <openpose/face/faceRenderer.hpp>
namespace op
{
class OP_API FaceCpuRenderer : public Renderer, public FaceRenderer
{
public:
FaceCpuRenderer(const float renderThreshold, const float alphaKeypoint = FACE_DEFAULT_ALPHA_KEYPOINT,
const float alphaHeatMap = FACE_DEFAULT_ALPHA_HEAT_MAP);
void renderFace(Array<float>& outputData, const Array<float>& faceKeypoints);
DELETE_COPY(FaceCpuRenderer);
};
}
#endif // OPENPOSE_FACE_FACE_CPU_RENDERER_HPP
#ifndef OPENPOSE_FACE_FACE_GPU_RENDERER_HPP
#define OPENPOSE_FACE_FACE_GPU_RENDERER_HPP
#include <openpose/core/common.hpp>
#include <openpose/core/gpuRenderer.hpp>
#include <openpose/face/faceParameters.hpp>
#include <openpose/face/faceRenderer.hpp>
namespace op
{
class OP_API FaceGpuRenderer : public GpuRenderer, public FaceRenderer
{
public:
FaceGpuRenderer(const float renderThreshold,
const float alphaKeypoint = FACE_DEFAULT_ALPHA_KEYPOINT,
const float alphaHeatMap = FACE_DEFAULT_ALPHA_HEAT_MAP);
~FaceGpuRenderer();
void initializationOnThread();
void renderFace(Array<float>& outputData, const Array<float>& faceKeypoints);
private:
float* pGpuFace; // GPU aux memory
DELETE_COPY(FaceGpuRenderer);
};
}
#endif // OPENPOSE_FACE_FACE_GPU_RENDERER_HPP
......@@ -2,38 +2,15 @@
#define OPENPOSE_FACE_FACE_RENDERER_HPP
#include <openpose/core/common.hpp>
#include <openpose/core/enumClasses.hpp>
#include <openpose/core/renderer.hpp>
#include <openpose/face/faceParameters.hpp>
#include <openpose/thread/worker.hpp>
namespace op
{
class OP_API FaceRenderer : public Renderer
class OP_API FaceRenderer
{
public:
FaceRenderer(const Point<int>& frameSize, const float renderThreshold,
const float alphaKeypoint = FACE_DEFAULT_ALPHA_KEYPOINT,
const float alphaHeatMap = FACE_DEFAULT_ALPHA_HEAT_MAP,
const RenderMode renderMode = RenderMode::Cpu);
virtual void initializationOnThread(){};
~FaceRenderer();
void initializationOnThread();
void renderFace(Array<float>& outputData, const Array<float>& faceKeypoints);
private:
const float mRenderThreshold;
const Point<int> mFrameSize;
const RenderMode mRenderMode;
float* pGpuFace; // GPU aux memory
void renderFaceCpu(Array<float>& outputData, const Array<float>& faceKeypoints);
void renderFaceGpu(Array<float>& outputData, const Array<float>& faceKeypoints);
DELETE_COPY(FaceRenderer);
virtual void renderFace(Array<float>& outputData, const Array<float>& faceKeypoints) = 0;
};
}
......
......@@ -5,6 +5,8 @@
#include <openpose/face/faceDetector.hpp>
#include <openpose/face/faceExtractor.hpp>
#include <openpose/face/faceParameters.hpp>
#include <openpose/face/faceCpuRenderer.hpp>
#include <openpose/face/faceGpuRenderer.hpp>
#include <openpose/face/faceRenderer.hpp>
#include <openpose/face/renderFace.hpp>
#include <openpose/face/wFaceDetector.hpp>
......
......@@ -11,7 +11,6 @@ namespace op
{
FullScreen, /**< Full screen mode. */
Windowed, /**< Windowed mode, depending on the frame output size. */
// NoDisplay, /**< Not displaying the output. */
};
}
......
......@@ -15,11 +15,11 @@ namespace op
public:
/**
* Constructor of the FrameDisplayer class.
* @param fullScreen bool from which the FrameDisplayer::GuiDisplayMode property mGuiDisplayMode will be set, i.e. specifying the type of initial display (it can be changed later).
* @param windowedSize const Point<int> with the windowed output resolution (width and height).
* @param windowedName const std::string value with the resulting OpenCV display name. Shown at the top-left part of the window.
* @param initialWindowedSize const Point<int> with the initial window output resolution (width and height).
* @param fullScreen bool from which the FrameDisplayer::GuiDisplayMode property mGuiDisplayMode will be set, i.e. specifying the type of initial display (it can be changed later).
*/
FrameDisplayer(const Point<int>& windowedSize, const std::string& windowedName = "OpenPose Display", const bool fullScreen = false);
FrameDisplayer(const std::string& windowedName = "OpenPose Display", const Point<int>& initialWindowedSize = Point<int>{}, const bool fullScreen = false);
// Due to OpenCV visualization issues (all visualization functions must be in the same thread)
void initializationOnThread();
......@@ -44,8 +44,8 @@ namespace op
void displayFrame(const cv::Mat& frame, const int waitKeyValue = -1);
private:
const Point<int> mWindowedSize;
const std::string mWindowName;
Point<int> mWindowedSize;
GuiDisplayMode mGuiDisplayMode;
};
}
......
......@@ -4,19 +4,19 @@
#include <atomic>
#include <opencv2/core/core.hpp> // cv::Mat
#include <openpose/core/common.hpp>
#include <openpose/core/renderer.hpp>
#include <openpose/gui/enumClasses.hpp>
#include <openpose/gui/frameDisplayer.hpp>
#include <openpose/pose/poseExtractor.hpp>
#include <openpose/pose/poseRenderer.hpp>
namespace op
{
class OP_API Gui
{
public:
Gui(const bool fullScreen, const Point<int>& outputSize, const std::shared_ptr<std::atomic<bool>>& isRunningSharedPtr,
Gui(const Point<int>& outputSize, const bool fullScreen, const std::shared_ptr<std::atomic<bool>>& isRunningSharedPtr,
const std::shared_ptr<std::pair<std::atomic<bool>, std::atomic<int>>>& videoSeekSharedPtr = nullptr,
const std::vector<std::shared_ptr<PoseExtractor>>& poseExtractors = {}, const std::vector<std::shared_ptr<PoseRenderer>>& poseRenderers = {});
const std::vector<std::shared_ptr<PoseExtractor>>& poseExtractors = {}, const std::vector<std::shared_ptr<Renderer>>& renderers = {});
void initializationOnThread();
......@@ -27,7 +27,7 @@ namespace op
FrameDisplayer mFrameDisplayer;
// Other variables
std::vector<std::shared_ptr<PoseExtractor>> mPoseExtractors;
std::vector<std::shared_ptr<PoseRenderer>> mPoseRenderers;
std::vector<std::shared_ptr<Renderer>> mRenderers;
std::shared_ptr<std::atomic<bool>> spIsRunning;
std::shared_ptr<std::pair<std::atomic<bool>, std::atomic<int>>> spVideoSeek;
};
......
......@@ -10,14 +10,12 @@ namespace op
class OP_API GuiInfoAdder
{
public:
GuiInfoAdder(const Point<int>& outputSize, const int numberGpus, const bool guiEnabled = false);
GuiInfoAdder(const int numberGpus, const bool guiEnabled = false);
void addInfo(cv::Mat& cvOutputData, const Array<float>& poseKeypoints, const unsigned long long id, const std::string& elementRenderedName);
private:
// Const variables
const Point<int> mOutputSize;
const int mBorderMargin;
const int mNumberGpus;
const bool mGuiEnabled;
// Other variables
......
#ifndef OPENPOSE_HAND_HAND_CPU_RENDERER_HPP
#define OPENPOSE_HAND_HAND_CPU_RENDERER_HPP
#include <openpose/core/common.hpp>
#include <openpose/core/renderer.hpp>
#include <openpose/hand/handParameters.hpp>
#include <openpose/hand/handRenderer.hpp>
namespace op
{
class OP_API HandCpuRenderer : public Renderer, public HandRenderer
{
public:
HandCpuRenderer(const float renderThreshold, const float alphaKeypoint = HAND_DEFAULT_ALPHA_KEYPOINT,
const float alphaHeatMap = HAND_DEFAULT_ALPHA_HEAT_MAP);
void renderHand(Array<float>& outputData, const std::array<Array<float>, 2>& handKeypoints);
DELETE_COPY(HandCpuRenderer);
};
}
#endif // OPENPOSE_HAND_HAND_CPU_RENDERER_HPP
#ifndef OPENPOSE_HAND_HAND_GPU_RENDERER_HPP
#define OPENPOSE_HAND_HAND_GPU_RENDERER_HPP
#include <openpose/core/common.hpp>
#include <openpose/core/gpuRenderer.hpp>
#include <openpose/hand/handParameters.hpp>
#include <openpose/hand/handRenderer.hpp>
namespace op
{
class OP_API HandGpuRenderer : public GpuRenderer, public HandRenderer
{
public:
HandGpuRenderer(const float renderThreshold,
const float alphaKeypoint = HAND_DEFAULT_ALPHA_KEYPOINT,
const float alphaHeatMap = HAND_DEFAULT_ALPHA_HEAT_MAP);
~HandGpuRenderer();
void initializationOnThread();
void renderHand(Array<float>& outputData, const std::array<Array<float>, 2>& handKeypoints);
private:
float* pGpuHand; // GPU aux memory
DELETE_COPY(HandGpuRenderer);
};
}
#endif // OPENPOSE_HAND_HAND_GPU_RENDERER_HPP
......@@ -2,38 +2,15 @@
#define OPENPOSE_HAND_HAND_RENDERER_HPP
#include <openpose/core/common.hpp>
#include <openpose/core/enumClasses.hpp>
#include <openpose/core/renderer.hpp>
#include <openpose/hand/handParameters.hpp>
#include <openpose/thread/worker.hpp>
namespace op
{
class OP_API HandRenderer : public Renderer
class OP_API HandRenderer
{
public:
HandRenderer(const Point<int>& frameSize, const float renderThreshold,
const float alphaKeypoint = HAND_DEFAULT_ALPHA_KEYPOINT,
const float alphaHeatMap = HAND_DEFAULT_ALPHA_HEAT_MAP,
const RenderMode renderMode = RenderMode::Cpu);
virtual void initializationOnThread(){};
~HandRenderer();
void initializationOnThread();
void renderHand(Array<float>& outputData, const std::array<Array<float>, 2>& handKeypoints);
private:
const float mRenderThreshold;
const Point<int> mFrameSize;
const RenderMode mRenderMode;
float* pGpuHand; // GPU aux memory
void renderHandCpu(Array<float>& outputData, const std::array<Array<float>, 2>& handKeypoints) const;
void renderHandGpu(Array<float>& outputData, const std::array<Array<float>, 2>& handKeypoints);
DELETE_COPY(HandRenderer);
virtual void renderHand(Array<float>& outputData, const std::array<Array<float>, 2>& handKeypoints) = 0;
};
}
......
......@@ -6,6 +6,8 @@
#include <openpose/hand/handDetectorFromTxt.hpp>
#include <openpose/hand/handExtractor.hpp>
#include <openpose/hand/handParameters.hpp>
#include <openpose/hand/handCpuRenderer.hpp>
#include <openpose/hand/handGpuRenderer.hpp>
#include <openpose/hand/handRenderer.hpp>
#include <openpose/hand/renderHand.hpp>
#include <openpose/hand/wHandDetector.hpp>
......
......@@ -5,10 +5,12 @@
#include <openpose/pose/bodyPartConnectorBase.hpp>
#include <openpose/pose/bodyPartConnectorCaffe.hpp>
#include <openpose/pose/enumClasses.hpp>
#include <openpose/pose/poseCpuRenderer.hpp>
#include <openpose/pose/poseExtractor.hpp>
#include <openpose/pose/poseExtractorCaffe.hpp>
#include <openpose/pose/poseRenderer.hpp>
#include <openpose/pose/poseGpuRenderer.hpp>
#include <openpose/pose/poseParameters.hpp>
#include <openpose/pose/poseRenderer.hpp>
#include <openpose/pose/renderPose.hpp>
#include <openpose/pose/wPoseExtractor.hpp>
#include <openpose/pose/wPoseRenderer.hpp>
......
#ifndef OPENPOSE_POSE_POSE_CPU_RENDERER_HPP
#define OPENPOSE_POSE_POSE_CPU_RENDERER_HPP
#include <openpose/core/common.hpp>
#include <openpose/core/renderer.hpp>
#include <openpose/pose/enumClasses.hpp>
#include <openpose/pose/poseExtractor.hpp>
#include <openpose/pose/poseParameters.hpp>
#include <openpose/pose/poseRenderer.hpp>
namespace op
{
class OP_API PoseCpuRenderer : public Renderer, public PoseRenderer
{
public:
PoseCpuRenderer(const PoseModel poseModel, const float renderThreshold, const bool blendOriginalFrame = true,
const float alphaKeypoint = POSE_DEFAULT_ALPHA_KEYPOINT,
const float alphaHeatMap = POSE_DEFAULT_ALPHA_HEAT_MAP);
std::pair<int, std::string> renderPose(Array<float>& outputData, const Array<float>& poseKeypoints,
const float scaleNetToOutput = -1.f);
private:
DELETE_COPY(PoseCpuRenderer);
};
}
#endif // OPENPOSE_POSE_POSE_CPU_RENDERER_HPP
#ifndef OPENPOSE_POSE_POSE_GPU_RENDERER_HPP
#define OPENPOSE_POSE_POSE_GPU_RENDERER_HPP
#include <openpose/core/common.hpp>
#include <openpose/core/gpuRenderer.hpp>
#include <openpose/pose/enumClasses.hpp>
#include <openpose/pose/poseExtractor.hpp>
#include <openpose/pose/poseParameters.hpp>
#include <openpose/pose/poseRenderer.hpp>
namespace op
{
class OP_API PoseGpuRenderer : public GpuRenderer, public PoseRenderer
{
public:
PoseGpuRenderer(const Point<int>& heatMapsSize, const PoseModel poseModel,
const std::shared_ptr<PoseExtractor>& poseExtractor, const float renderThreshold,
const bool blendOriginalFrame = true, const float alphaKeypoint = POSE_DEFAULT_ALPHA_KEYPOINT,
const float alphaHeatMap = POSE_DEFAULT_ALPHA_HEAT_MAP,
const unsigned int elementToRender = 0u);
~PoseGpuRenderer();
void initializationOnThread();
std::pair<int, std::string> renderPose(Array<float>& outputData, const Array<float>& poseKeypoints,
const float scaleNetToOutput = -1.f);
private:
const Point<int> mHeatMapsSize;
const std::shared_ptr<PoseExtractor> spPoseExtractor;
// Init with thread
float* pGpuPose; // GPU aux memory
DELETE_COPY(PoseGpuRenderer);
};
}
#endif // OPENPOSE_POSE_POSE_GPU_RENDERER_HPP
......@@ -2,53 +2,24 @@
#define OPENPOSE_POSE_POSE_RENDERER_HPP
#include <openpose/core/common.hpp>
#include <openpose/core/enumClasses.hpp>
#include <openpose/core/renderer.hpp>
#include <openpose/pose/enumClasses.hpp>
#include <openpose/pose/poseExtractor.hpp>
#include <openpose/pose/poseParameters.hpp>
namespace op
{
class OP_API PoseRenderer : public Renderer
class OP_API PoseRenderer
{
public:
PoseRenderer(const Point<int>& heatMapsSize, const Point<int>& outputSize, const PoseModel poseModel,
const std::shared_ptr<PoseExtractor>& poseExtractor, const float renderThreshold,
const bool blendOriginalFrame = true, const float alphaKeypoint = POSE_DEFAULT_ALPHA_KEYPOINT,
const float alphaHeatMap = POSE_DEFAULT_ALPHA_HEAT_MAP, const unsigned int elementToRender = 0u,
const RenderMode renderMode = RenderMode::Gpu);
PoseRenderer(const PoseModel poseModel);
~PoseRenderer();
virtual void initializationOnThread(){};
void initializationOnThread();
virtual std::pair<int, std::string> renderPose(Array<float>& outputData, const Array<float>& poseKeypoints, const float scaleNetToOutput = -1.f) = 0;
bool getBlendOriginalFrame() const;
bool getShowGooglyEyes() const;
void setBlendOriginalFrame(const bool blendOriginalFrame);
void setShowGooglyEyes(const bool showGooglyEyes);
std::pair<int, std::string> renderPose(Array<float>& outputData, const Array<float>& poseKeypoints, const float scaleNetToOutput = -1.f);
private:
const float mRenderThreshold;
const Point<int> mHeatMapsSize;
const Point<int> mOutputSize;
protected:
const PoseModel mPoseModel;
const std::map<unsigned int, std::string> mPartIndexToName;
const std::shared_ptr<PoseExtractor> spPoseExtractor;
const RenderMode mRenderMode;
std::atomic<bool> mBlendOriginalFrame;
std::atomic<bool> mShowGooglyEyes;
// Init with thread
float* pGpuPose; // GPU aux memory
std::pair<int, std::string> renderPoseCpu(Array<float>& outputData, const Array<float>& poseKeypoints, const float scaleNetToOutput = -1.f);
std::pair<int, std::string> renderPoseGpu(Array<float>& outputData, const Array<float>& poseKeypoints, const float scaleNetToOutput = -1.f);
private:
DELETE_COPY(PoseRenderer);
};
......
......@@ -4,6 +4,7 @@ cuda_add_library(core
cvMatToOpOutput.cpp
datum.cpp
defineTemplates.cpp
gpuRenderer.cpp
keypointScaler.cpp
maximumBase.cpp
maximumBase.cu
......
......@@ -5,7 +5,7 @@ namespace op
{
CvMatToOpOutput::CvMatToOpOutput(const Point<int>& outputResolution, const bool generateOutput) :
mGenerateOutput{generateOutput},
mOutputSize3D{{3, outputResolution.y, outputResolution.x}}
mOutputSize3D{3, outputResolution.y, outputResolution.x}
{
}
......@@ -18,18 +18,32 @@ namespace op
error("Wrong input element (empty cvInputData).", __LINE__, __FUNCTION__, __FILE__);
if (cvInputData.channels() != 3)
error("Input images must be 3-channel BGR.", __LINE__, __FUNCTION__, __FILE__);
// scaleInputToOutput - Scale between input and desired output size
double scaleInputToOutput;
Point<int> outputResolution;
// Output = mOutputSize3D size
if (mOutputSize3D[1] > 0 && mOutputSize3D[2] > 0)
{
outputResolution = Point<int>{mOutputSize3D[2], mOutputSize3D[1]};
scaleInputToOutput = resizeGetScaleFactor(Point<int>{cvInputData.cols, cvInputData.rows},
outputResolution);
}
// Output = input size
else
{
outputResolution = Point<int>{cvInputData.cols, cvInputData.rows};
scaleInputToOutput = 1.;
}
// outputData - Rescale keeping the aspect ratio and transform the output image to float
const Point<int> outputResolution{mOutputSize3D[2], mOutputSize3D[1]};
const double scaleInputToOutput = resizeGetScaleFactor(Point<int>{cvInputData.cols, cvInputData.rows}, outputResolution);
const cv::Mat frameWithOutputSize = resizeFixedAspectRatio(cvInputData, scaleInputToOutput, outputResolution);
Array<float> outputData;
if (mGenerateOutput)
{
outputData.reset(mOutputSize3D);
const cv::Mat frameWithOutputSize = resizeFixedAspectRatio(cvInputData, scaleInputToOutput,
outputResolution);
outputData.reset({3, outputResolution.y, outputResolution.x});
uCharCvMatToFloatPtr(outputData.getPtr(), frameWithOutputSize, false);
}
// Return result
return std::make_tuple(scaleInputToOutput, outputData);
}
catch (const std::exception& e)
......
#ifndef CPU_ONLY
#include <cuda.h>
#include <cuda_runtime_api.h>
#endif
#include <openpose/core/gpuRenderer.hpp>
namespace op
{
void checkAndIncreaseGpuMemory(std::shared_ptr<float*>& gpuMemoryPtr,
std::shared_ptr<std::atomic<unsigned long long>>& currentVolumePtr,
const unsigned long long memoryVolume)
{
try
{
#ifndef CPU_ONLY
if (*currentVolumePtr < memoryVolume)
{
*currentVolumePtr = memoryVolume;
cudaFree(*gpuMemoryPtr);
cudaMalloc((void**)(gpuMemoryPtr.get()), *currentVolumePtr * sizeof(float));
}
#endif
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
GpuRenderer::GpuRenderer(const float renderThreshold, const float alphaKeypoint,
const float alphaHeatMap, const bool blendOriginalFrame,
const unsigned int elementToRender, const unsigned int numberElementsToRender) :
Renderer{renderThreshold, alphaKeypoint, alphaHeatMap, blendOriginalFrame, elementToRender,
numberElementsToRender},
spGpuMemory{std::make_shared<float*>()},
spVolume{std::make_shared<std::atomic<unsigned long long>>(0)},
mIsFirstRenderer{true},
mIsLastRenderer{true},
spGpuMemoryAllocated{std::make_shared<bool>(false)}
{
}
GpuRenderer::~GpuRenderer()
{
try
{
#ifndef CPU_ONLY
if (mIsLastRenderer)
cudaFree(*spGpuMemory);
#endif
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
std::tuple<std::shared_ptr<float*>, std::shared_ptr<bool>, std::shared_ptr<std::atomic<unsigned int>>,
std::shared_ptr<std::atomic<unsigned long long>>, std::shared_ptr<const unsigned int>>
GpuRenderer::getSharedParameters()
{
try
{
mIsLastRenderer = false;
return std::make_tuple(spGpuMemory, spGpuMemoryAllocated, spElementToRender, spVolume, spNumberElementsToRender);
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
return std::make_tuple(nullptr, nullptr, nullptr, nullptr, nullptr);
}
}
void GpuRenderer::setSharedParametersAndIfLast(const std::tuple<std::shared_ptr<float*>, std::shared_ptr<bool>,
std::shared_ptr<std::atomic<unsigned int>>,
std::shared_ptr<std::atomic<unsigned long long>>,
std::shared_ptr<const unsigned int>>& tuple,
const bool isLast)
{
try
{
mIsFirstRenderer = false;
mIsLastRenderer = isLast;
spGpuMemory = std::get<0>(tuple);
spGpuMemoryAllocated = std::get<1>(tuple);
spElementToRender = std::get<2>(tuple);
spVolume = std::get<3>(tuple);
spNumberElementsToRender = std::get<4>(tuple);
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
void GpuRenderer::cpuToGpuMemoryIfNotCopiedYet(const float* const cpuMemory, const unsigned long long memoryVolume)
{
try
{
#ifndef CPU_ONLY
if (!*spGpuMemoryAllocated)
{
checkAndIncreaseGpuMemory(spGpuMemory, spVolume, memoryVolume);
cudaMemcpy(*spGpuMemory, cpuMemory, memoryVolume * sizeof(float), cudaMemcpyHostToDevice);
*spGpuMemoryAllocated = true;
}
#else
error("GPU rendering not available if `CPU_ONLY` is set.", __LINE__, __FUNCTION__, __FILE__);
UNUSED(cpuMemory);
#endif
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
void GpuRenderer::gpuToCpuMemoryIfLastRenderer(float* cpuMemory, const unsigned long long memoryVolume)
{
try
{
#ifndef CPU_ONLY
if (*spGpuMemoryAllocated && mIsLastRenderer)
{
if (*spVolume < memoryVolume)
error("CPU is asking for more memory than was copied into the GPU.",
__LINE__, __FUNCTION__, __FILE__);
cudaMemcpy(cpuMemory, *spGpuMemory, memoryVolume * sizeof(float), cudaMemcpyDeviceToHost);
*spGpuMemoryAllocated = false;
}
#else
error("GPU rendering not available if `CPU_ONLY` is set.", __LINE__, __FUNCTION__, __FILE__);
UNUSED(cpuMemory);
#endif
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
}
......@@ -3,11 +3,6 @@
namespace op
{
OpOutputToCvMat::OpOutputToCvMat(const Point<int>& outputResolution) :
mOutputResolution{outputResolution.x, outputResolution.y, 3}
{
}
cv::Mat OpOutputToCvMat::formatToCvMat(const Array<float>& outputData) const
{
try
......@@ -15,10 +10,12 @@ namespace op
// Security checks
if (outputData.empty())
error("Wrong input element (empty outputData).", __LINE__, __FUNCTION__, __FILE__);
// outputData to cvMat
cv::Mat cvMat;
floatPtrToUCharCvMat(cvMat, outputData.getConstPtr(), mOutputResolution);
const std::array<int, 3> outputResolution{outputData.getSize(2), outputData.getSize(1),
outputData.getSize(0)};
floatPtrToUCharCvMat(cvMat, outputData.getConstPtr(), outputResolution);
// Return cvMat
return cvMat;
}
catch (const std::exception& e)
......
#ifndef CPU_ONLY
#include <cuda.h>
#include <cuda_runtime_api.h>
#endif
#include <openpose/core/renderer.hpp>
namespace op
{
Renderer::Renderer(const unsigned long long volume, const float alphaKeypoint, const float alphaHeatMap,
const unsigned int elementToRender, const unsigned int numberElementsToRender) :
spGpuMemoryPtr{std::make_shared<float*>()},
Renderer::Renderer(const float renderThreshold, const float alphaKeypoint, const float alphaHeatMap,
const bool blendOriginalFrame, const unsigned int elementToRender,
const unsigned int numberElementsToRender) :
mRenderThreshold{renderThreshold},
mBlendOriginalFrame{blendOriginalFrame},
spElementToRender{std::make_shared<std::atomic<unsigned int>>(elementToRender)},
spNumberElementsToRender{std::make_shared<const unsigned int>(numberElementsToRender)},
mVolume{volume},
mShowGooglyEyes{false},
mAlphaKeypoint{alphaKeypoint},
mAlphaHeatMap{alphaHeatMap},
mIsFirstRenderer{true},
mIsLastRenderer{true},
spGpuMemoryAllocated{std::make_shared<bool>(false)}
mAlphaHeatMap{alphaHeatMap}
{
}
Renderer::~Renderer()
{
try
{
#ifndef CPU_ONLY
if (mIsLastRenderer)
cudaFree(*spGpuMemoryPtr);
#endif
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
void Renderer::initializationOnThread()
{
try
{
#ifndef CPU_ONLY
if (mIsFirstRenderer)
cudaMalloc((void**)(spGpuMemoryPtr.get()), mVolume * sizeof(float));
#endif
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
void Renderer::increaseElementToRender(const int increment)
{
try
......@@ -79,33 +44,24 @@ namespace op
}
}
std::tuple<std::shared_ptr<float*>, std::shared_ptr<bool>, std::shared_ptr<std::atomic<unsigned int>>,
std::shared_ptr<const unsigned int>> Renderer::getSharedParameters()
float Renderer::getAlphaKeypoint() const
{
try
{
mIsLastRenderer = false;
return std::make_tuple(spGpuMemoryPtr, spGpuMemoryAllocated, spElementToRender, spNumberElementsToRender);
return mAlphaKeypoint;
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
return std::make_tuple(nullptr, nullptr, nullptr, nullptr);
return 0.f;
}
}
void Renderer::setSharedParametersAndIfLast(const std::tuple<std::shared_ptr<float*>, std::shared_ptr<bool>,
std::shared_ptr<std::atomic<unsigned int>>,
std::shared_ptr<const unsigned int>>& tuple, const bool isLast)
void Renderer::setAlphaKeypoint(const float alphaKeypoint)
{
try
{
mIsFirstRenderer = false;
mIsLastRenderer = isLast;
spGpuMemoryPtr = std::get<0>(tuple);
spGpuMemoryAllocated = std::get<1>(tuple);
spElementToRender = std::get<2>(tuple);
spNumberElementsToRender = std::get<3>(tuple);
mAlphaKeypoint = alphaKeypoint;
}
catch (const std::exception& e)
{
......@@ -113,11 +69,11 @@ namespace op
}
}
float Renderer::getAlphaKeypoint() const
float Renderer::getAlphaHeatMap() const
{
try
{
return mAlphaKeypoint;
return mAlphaHeatMap;
}
catch (const std::exception& e)
{
......@@ -126,11 +82,11 @@ namespace op
}
}
void Renderer::setAlphaKeypoint(const float alphaKeypoint)
void Renderer::setAlphaHeatMap(const float alphaHeatMap)
{
try
{
mAlphaKeypoint = alphaKeypoint;
mAlphaHeatMap = alphaHeatMap;
}
catch (const std::exception& e)
{
......@@ -138,24 +94,24 @@ namespace op
}
}
float Renderer::getAlphaHeatMap() const
bool Renderer::getBlendOriginalFrame() const
{
try
{
return mAlphaHeatMap;
return mBlendOriginalFrame;
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
return 0.f;
return false;
}
}
void Renderer::setAlphaHeatMap(const float alphaHeatMap)
void Renderer::setBlendOriginalFrame(const bool blendOriginalFrame)
{
try
{
mAlphaHeatMap = alphaHeatMap;
mBlendOriginalFrame = blendOriginalFrame;
}
catch (const std::exception& e)
{
......@@ -163,41 +119,24 @@ namespace op
}
}
void Renderer::cpuToGpuMemoryIfNotCopiedYet(const float* const cpuMemory)
bool Renderer::getShowGooglyEyes() const
{
try
{
#ifndef CPU_ONLY
if (!*spGpuMemoryAllocated)
{
cudaMemcpy(*spGpuMemoryPtr, cpuMemory, mVolume * sizeof(float), cudaMemcpyHostToDevice);
*spGpuMemoryAllocated = true;
}
#else
error("GPU rendering not available if `CPU_ONLY` is set.", __LINE__, __FUNCTION__, __FILE__);
UNUSED(cpuMemory);
#endif
return mShowGooglyEyes;
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
return false;
}
}
void Renderer::gpuToCpuMemoryIfLastRenderer(float* cpuMemory)
void Renderer::setShowGooglyEyes(const bool showGooglyEyes)
{
try
{
#ifndef CPU_ONLY
if (*spGpuMemoryAllocated && mIsLastRenderer)
{
cudaMemcpy(cpuMemory, *spGpuMemoryPtr, mVolume * sizeof(float), cudaMemcpyDeviceToHost);
*spGpuMemoryAllocated = false;
}
#else
error("GPU rendering not available if `CPU_ONLY` is set.", __LINE__, __FUNCTION__, __FILE__);
UNUSED(cpuMemory);
#endif
mShowGooglyEyes = showGooglyEyes;
}
catch (const std::exception& e)
{
......
......@@ -2,7 +2,8 @@ set(SOURCES
defineTemplates.cpp
faceDetector.cpp
faceExtractor.cpp
faceRenderer.cpp
faceCpuRenderer.cpp
faceGpuRenderer.cpp
renderFace.cpp
renderFace.cu
)
......
#include <openpose/face/renderFace.hpp>
#include <openpose/face/faceCpuRenderer.hpp>
namespace op
{
FaceCpuRenderer::FaceCpuRenderer(const float renderThreshold, const float alphaKeypoint,
const float alphaHeatMap) :
Renderer{renderThreshold, alphaKeypoint, alphaHeatMap}
{
}
void FaceCpuRenderer::renderFace(Array<float>& outputData, const Array<float>& faceKeypoints)
{
try
{
// Security checks
if (outputData.empty())
error("Empty Array<float> outputData.", __LINE__, __FUNCTION__, __FILE__);
// CPU rendering
renderFaceKeypointsCpu(outputData, faceKeypoints, mRenderThreshold);
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
}
......@@ -4,20 +4,17 @@
#endif
#include <openpose/face/renderFace.hpp>
#include <openpose/utilities/cuda.hpp>
#include <openpose/face/faceRenderer.hpp>
#include <openpose/face/faceGpuRenderer.hpp>
namespace op
{
FaceRenderer::FaceRenderer(const Point<int>& frameSize, const float renderThreshold, const float alphaKeypoint,
const float alphaHeatMap, const RenderMode renderMode) :
Renderer{(unsigned long long)(frameSize.area() * 3), alphaKeypoint, alphaHeatMap},
mRenderThreshold{renderThreshold},
mFrameSize{frameSize},
mRenderMode{renderMode}
FaceGpuRenderer::FaceGpuRenderer(const float renderThreshold, const float alphaKeypoint,
const float alphaHeatMap) :
GpuRenderer{renderThreshold, alphaKeypoint, alphaHeatMap}
{
}
FaceRenderer::~FaceRenderer()
FaceGpuRenderer::~FaceGpuRenderer()
{
try
{
......@@ -32,12 +29,11 @@ namespace op
}
}
void FaceRenderer::initializationOnThread()
void FaceGpuRenderer::initializationOnThread()
{
try
{
log("Starting initialization on thread.", Priority::Low, __LINE__, __FUNCTION__, __FILE__);
Renderer::initializationOnThread();
// GPU memory allocation for rendering
#ifndef CPU_ONLY
cudaMalloc((void**)(&pGpuFace), POSE_MAX_PEOPLE * FACE_NUMBER_PARTS * 3 * sizeof(float));
......@@ -50,62 +46,32 @@ namespace op
}
}
void FaceRenderer::renderFace(Array<float>& outputData, const Array<float>& faceKeypoints)
void FaceGpuRenderer::renderFace(Array<float>& outputData, const Array<float>& faceKeypoints)
{
try
{
// Security checks
if (outputData.empty())
error("Empty Array<float> outputData.", __LINE__, __FUNCTION__, __FILE__);
// CPU rendering
if (mRenderMode == RenderMode::Cpu)
renderFaceCpu(outputData, faceKeypoints);
// GPU rendering
else
renderFaceGpu(outputData, faceKeypoints);
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
void FaceRenderer::renderFaceCpu(Array<float>& outputData, const Array<float>& faceKeypoints)
{
try
{
renderFaceKeypointsCpu(outputData, faceKeypoints, mRenderThreshold);
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
void FaceRenderer::renderFaceGpu(Array<float>& outputData, const Array<float>& faceKeypoints)
{
try
{
// GPU rendering
#ifndef CPU_ONLY
const auto elementRendered = spElementToRender->load(); // I prefer std::round(T&) over intRound(T) for std::atomic
const auto numberPeople = faceKeypoints.getSize(0);
const Point<int> frameSize{outputData.getSize(2), outputData.getSize(1)};
if (numberPeople > 0 && elementRendered == 0)
{
cpuToGpuMemoryIfNotCopiedYet(outputData.getPtr());
cpuToGpuMemoryIfNotCopiedYet(outputData.getPtr(), outputData.getVolume());
// Draw faceKeypoints
cudaMemcpy(pGpuFace, faceKeypoints.getConstPtr(),
faceKeypoints.getSize(0) * FACE_NUMBER_PARTS * 3 * sizeof(float),
cudaMemcpyHostToDevice);
renderFaceKeypointsGpu(*spGpuMemoryPtr, mFrameSize, pGpuFace, faceKeypoints.getSize(0),
renderFaceKeypointsGpu(*spGpuMemory, frameSize, pGpuFace, faceKeypoints.getSize(0),
mRenderThreshold, getAlphaKeypoint());
// CUDA check
cudaCheck(__LINE__, __FUNCTION__, __FILE__);
}
// GPU memory to CPU if last renderer
gpuToCpuMemoryIfLastRenderer(outputData.getPtr());
gpuToCpuMemoryIfLastRenderer(outputData.getPtr(), outputData.getVolume());
cudaCheck(__LINE__, __FUNCTION__, __FILE__);
// CPU_ONLY mode
#else
......
#include <opencv2/opencv.hpp> // cv::imshow, cv::waitKey, cv::namedWindow, cv::setWindowProperty
#include <opencv2/highgui/highgui.hpp> // cv::imshow, cv::waitKey, cv::namedWindow, cv::setWindowProperty
#include <openpose/gui/frameDisplayer.hpp>
namespace op
{
FrameDisplayer::FrameDisplayer(const Point<int>& windowedSize, const std::string& windowedName, const bool fullScreen) :
mWindowedSize{windowedSize},
FrameDisplayer::FrameDisplayer(const std::string& windowedName, const Point<int>& initialWindowedSize, const bool fullScreen) :
mWindowName{windowedName},
mWindowedSize{initialWindowedSize},
mGuiDisplayMode{(fullScreen ? GuiDisplayMode::FullScreen : GuiDisplayMode::Windowed)}
{
try
{
// If initial window size = 0 --> initialize to 640x480
if (mWindowedSize.x <= 0 || mWindowedSize.y <= 0)
mWindowedSize = Point<int>{640, 480};
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
void FrameDisplayer::initializationOnThread()
......@@ -72,6 +83,14 @@ namespace op
{
try
{
// If frame > window size --> Resize window
if (mWindowedSize.x < frame.cols || mWindowedSize.y < frame.rows)
{
mWindowedSize.x = std::max(mWindowedSize.x, frame.cols);
mWindowedSize.y = std::max(mWindowedSize.y, frame.rows);
cv::resizeWindow(mWindowName, mWindowedSize.x, mWindowedSize.y);
cv::waitKey(1); // This will most likely show a white image (the program probably does not have time to render within 1 ms)
}
cv::imshow(mWindowName, frame);
if (waitKeyValue != -1)
cv::waitKey(waitKeyValue);
......
......@@ -18,7 +18,7 @@ namespace op
if (!helpCvMat.empty())
{
const auto fullScreen = false;
FrameDisplayer frameDisplayer{Point<int>{helpCvMat.cols, helpCvMat.rows}, OPEN_POSE_TEXT + " - GUI Help", fullScreen};
FrameDisplayer frameDisplayer{OPEN_POSE_TEXT + " - GUI Help", Point<int>{helpCvMat.cols, helpCvMat.rows}, fullScreen};
frameDisplayer.displayFrame(helpCvMat, 33);
}
}
......@@ -28,8 +28,8 @@ namespace op
}
}
void handleWaitKey(bool& guiPaused, FrameDisplayer& mFrameDisplayer, std::vector<std::shared_ptr<PoseExtractor>>& mPoseExtractors,
std::vector<std::shared_ptr<PoseRenderer>>& mPoseRenderers, std::shared_ptr<std::atomic<bool>>& isRunningSharedPtr,
void handleWaitKey(bool& guiPaused, FrameDisplayer& frameDisplayer, std::vector<std::shared_ptr<PoseExtractor>>& poseExtractors,
std::vector<std::shared_ptr<Renderer>>& renderers, std::shared_ptr<std::atomic<bool>>& isRunningSharedPtr,
std::shared_ptr<std::pair<std::atomic<bool>, std::atomic<int>>>& spVideoSeek)
{
try
......@@ -54,7 +54,7 @@ namespace op
showGuiHelp();
// Switch full screen - normal screen
else if (castedKey=='f')
mFrameDisplayer.switchGuiDisplayMode();
frameDisplayer.switchGuiDisplayMode();
// ------------------------- Producer-Related ------------------------- //
// Pause
else if (castedKey==' ')
......@@ -81,45 +81,45 @@ namespace op
// Enable/disable blending
else if (castedKey=='b')
{
for (auto& poseRenderer : mPoseRenderers)
poseRenderer->setBlendOriginalFrame(!poseRenderer->getBlendOriginalFrame());
for (auto& renderer : renderers)
renderer->setBlendOriginalFrame(!renderer->getBlendOriginalFrame());
}
// ------------------------- OpenPose-Related ------------------------- //
// Modifying thresholds
else if (castedKey=='-' || castedKey=='=')
for (auto& poseExtractor : mPoseExtractors)
for (auto& poseExtractor : poseExtractors)
poseExtractor->increase(PoseProperty::NMSThreshold, 0.005f * (castedKey=='-' ? -1 : 1));
else if (castedKey=='_' || castedKey=='+')
for (auto& poseExtractor : mPoseExtractors)
for (auto& poseExtractor : poseExtractors)
poseExtractor->increase(PoseProperty::ConnectMinSubsetScore, 0.005f * (castedKey=='_' ? -1 : 1));
else if (castedKey=='[' || castedKey==']')
for (auto& poseExtractor : mPoseExtractors)
for (auto& poseExtractor : poseExtractors)
poseExtractor->increase(PoseProperty::ConnectInterThreshold, 0.005f * (castedKey=='[' ? -1 : 1));
else if (castedKey=='{' || castedKey=='}')
for (auto& poseExtractor : mPoseExtractors)
for (auto& poseExtractor : poseExtractors)
poseExtractor->increase(PoseProperty::ConnectInterMinAboveThreshold, (castedKey=='{' ? -1 : 1));
else if (castedKey==';' || castedKey=='\'')
for (auto& poseExtractor : mPoseExtractors)
for (auto& poseExtractor : poseExtractors)
poseExtractor->increase(PoseProperty::ConnectMinSubsetCnt, (castedKey==';' ? -1 : 1));
// ------------------------- Miscellaneous ------------------------- //
// Show googly eyes
else if (castedKey=='g')
for (auto& poseRenderer : mPoseRenderers)
poseRenderer->setShowGooglyEyes(!poseRenderer->getShowGooglyEyes());
for (auto& renderer : renderers)
renderer->setShowGooglyEyes(!renderer->getShowGooglyEyes());
// ------------------------- OpenPose-Related ------------------------- //
else if (castedKey==',' || castedKey=='.')
{
const auto increment = (castedKey=='.' ? 1 : -1);
for (auto& poseRenderer : mPoseRenderers)
poseRenderer->increaseElementToRender(increment);
for (auto& renderer : renderers)
renderer->increaseElementToRender(increment);
}
else
{
const std::string key2part = "0123456789qwertyuiopasd";
const auto newElementToRender = key2part.find(castedKey);
if (newElementToRender != std::string::npos)
for (auto& poseRenderer : mPoseRenderers)
poseRenderer->setElementToRender((int)newElementToRender);
for (auto& renderer : renderers)
renderer->setElementToRender((int)newElementToRender);
}
}
}
......@@ -129,19 +129,19 @@ namespace op
}
}
void handleUserInput(FrameDisplayer& mFrameDisplayer, std::vector<std::shared_ptr<PoseExtractor>>& mPoseExtractors,
std::vector<std::shared_ptr<PoseRenderer>>& mPoseRenderers, std::shared_ptr<std::atomic<bool>>& isRunningSharedPtr,
void handleUserInput(FrameDisplayer& frameDisplayer, std::vector<std::shared_ptr<PoseExtractor>>& poseExtractors,
std::vector<std::shared_ptr<Renderer>>& renderers, std::shared_ptr<std::atomic<bool>>& isRunningSharedPtr,
std::shared_ptr<std::pair<std::atomic<bool>, std::atomic<int>>>& spVideoSeek)
{
try
{
// The handleUserInput must be always performed, even if no tDatum is detected
bool guiPaused = false;
handleWaitKey(guiPaused, mFrameDisplayer, mPoseExtractors, mPoseRenderers, isRunningSharedPtr, spVideoSeek);
handleWaitKey(guiPaused, frameDisplayer, poseExtractors, renderers, isRunningSharedPtr, spVideoSeek);
while (guiPaused)
{
std::this_thread::sleep_for(std::chrono::milliseconds{1});
handleWaitKey(guiPaused, mFrameDisplayer, mPoseExtractors, mPoseRenderers, isRunningSharedPtr, spVideoSeek);
handleWaitKey(guiPaused, frameDisplayer, poseExtractors, renderers, isRunningSharedPtr, spVideoSeek);
}
}
catch (const std::exception& e)
......@@ -150,12 +150,12 @@ namespace op
}
}
Gui::Gui(const bool fullScreen, const Point<int>& outputSize, const std::shared_ptr<std::atomic<bool>>& isRunningSharedPtr,
Gui::Gui(const Point<int>& outputSize, const bool fullScreen, const std::shared_ptr<std::atomic<bool>>& isRunningSharedPtr,
const std::shared_ptr<std::pair<std::atomic<bool>, std::atomic<int>>>& videoSeekSharedPtr,
const std::vector<std::shared_ptr<PoseExtractor>>& poseExtractors, const std::vector<std::shared_ptr<PoseRenderer>>& poseRenderers) :
mFrameDisplayer{outputSize, OPEN_POSE_TEXT, fullScreen},
const std::vector<std::shared_ptr<PoseExtractor>>& poseExtractors, const std::vector<std::shared_ptr<Renderer>>& renderers) :
mFrameDisplayer{OPEN_POSE_TEXT, outputSize, fullScreen},
mPoseExtractors{poseExtractors},
mPoseRenderers{poseRenderers},
mRenderers{renderers},
spIsRunning{isRunningSharedPtr},
spVideoSeek{videoSeekSharedPtr}
{
......@@ -178,7 +178,7 @@ namespace op
mFrameDisplayer.displayFrame(cvOutputData, -1);
// Handle user input
handleUserInput(mFrameDisplayer, mPoseExtractors, mPoseRenderers, spIsRunning, spVideoSeek);
handleUserInput(mFrameDisplayer, mPoseExtractors, mRenderers, spIsRunning, spVideoSeek);
}
catch (const std::exception& e)
{
......
......@@ -49,9 +49,7 @@ namespace op
}
}
GuiInfoAdder::GuiInfoAdder(const Point<int>& outputSize, const int numberGpus, const bool guiEnabled) :
mOutputSize{outputSize},
mBorderMargin{intRound(fastMax(mOutputSize.x, mOutputSize.y) * 0.025)},
GuiInfoAdder::GuiInfoAdder(const int numberGpus, const bool guiEnabled) :
mNumberGpus{numberGpus},
mGuiEnabled{guiEnabled},
mFpsCounter{0u},
......@@ -68,6 +66,8 @@ namespace op
// Security checks
if (cvOutputData.empty())
error("Wrong input element (empty cvOutputData).", __LINE__, __FUNCTION__, __FILE__);
// Size
const auto borderMargin = intRound(fastMax(cvOutputData.cols, cvOutputData.rows) * 0.025);
// Update fps
updateFps(mLastId, mFps, mFpsCounter, mFpsQueue, id, mNumberGpus);
// Used colors
......@@ -77,7 +77,7 @@ namespace op
std::snprintf(charArrayAux, 15, "%4.1f fps", mFps);
// Recording inverse: sec/gpu
// std::snprintf(charArrayAux, 15, "%4.2f s/gpu", (mFps != 0. ? mNumberGpus/mFps : 0.));
putTextOnCvMat(cvOutputData, charArrayAux, {intRound(mOutputSize.x - mBorderMargin), mBorderMargin},
putTextOnCvMat(cvOutputData, charArrayAux, {intRound(cvOutputData.cols - borderMargin), borderMargin},
white, true);
// Part to show
// Allowing some buffer when changing the part to show (if >= 2 GPUs)
......@@ -96,13 +96,13 @@ namespace op
putTextOnCvMat(cvOutputData, "OpenPose - " +
(!mLastElementRenderedName.empty() ?
mLastElementRenderedName : (mGuiEnabled ? "'h' for help" : "")),
{mBorderMargin, mBorderMargin}, white, false);
{borderMargin, borderMargin}, white, false);
// Frame number
putTextOnCvMat(cvOutputData, "Frame: " + std::to_string(id),
{mBorderMargin, (int)(mOutputSize.y - mBorderMargin)}, white, false);
{borderMargin, (int)(cvOutputData.rows - borderMargin)}, white, false);
// Number people
putTextOnCvMat(cvOutputData, "People: " + std::to_string(poseKeypoints.getSize(0)),
{(int)(mOutputSize.x - mBorderMargin), (int)(mOutputSize.y - mBorderMargin)}, white, true);
{(int)(cvOutputData.cols - borderMargin), (int)(cvOutputData.rows - borderMargin)}, white, true);
}
catch (const std::exception& e)
{
......
......@@ -3,7 +3,8 @@ set(SOURCES
handDetector.cpp
handDetectorFromTxt.cpp
handExtractor.cpp
handRenderer.cpp
handCpuRenderer.cpp
handGpuRenderer.cpp
renderHand.cpp
renderHand.cu)
......
#include <openpose/hand/renderHand.hpp>
#include <openpose/hand/handCpuRenderer.hpp>
namespace op
{
HandCpuRenderer::HandCpuRenderer(const float renderThreshold, const float alphaKeypoint,
const float alphaHeatMap) :
Renderer{renderThreshold, alphaKeypoint, alphaHeatMap}
{
}
void HandCpuRenderer::renderHand(Array<float>& outputData, const std::array<Array<float>, 2>& handKeypoints)
{
try
{
// Security checks
if (outputData.empty())
error("Empty Array<float> outputData.", __LINE__, __FUNCTION__, __FILE__);
if (handKeypoints[0].getSize(0) != handKeypoints[1].getSize(0))
error("Wrong hand format: handKeypoints[0].getSize(0) != handKeypoints[1].getSize(0).", __LINE__, __FUNCTION__, __FILE__);
// CPU rendering
renderHandKeypointsCpu(outputData, handKeypoints, mRenderThreshold);
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
}
......@@ -2,23 +2,19 @@
#include <cuda.h>
#include <cuda_runtime_api.h>
#endif
#include <openpose/hand/handParameters.hpp>
#include <openpose/hand/renderHand.hpp>
#include <openpose/utilities/cuda.hpp>
#include <openpose/hand/handRenderer.hpp>
#include <openpose/hand/handGpuRenderer.hpp>
namespace op
{
HandRenderer::HandRenderer(const Point<int>& frameSize, const float renderThreshold, const float alphaKeypoint,
const float alphaHeatMap, const RenderMode renderMode) :
Renderer{(unsigned long long)(frameSize.area() * 3), alphaKeypoint, alphaHeatMap},
mRenderThreshold{renderThreshold},
mFrameSize{frameSize},
mRenderMode{renderMode}
HandGpuRenderer::HandGpuRenderer(const float renderThreshold, const float alphaKeypoint,
const float alphaHeatMap) :
GpuRenderer{renderThreshold, alphaKeypoint, alphaHeatMap}
{
}
HandRenderer::~HandRenderer()
HandGpuRenderer::~HandGpuRenderer()
{
try
{
......@@ -33,12 +29,11 @@ namespace op
}
}
void HandRenderer::initializationOnThread()
void HandGpuRenderer::initializationOnThread()
{
try
{
log("Starting initialization on thread.", Priority::Low, __LINE__, __FUNCTION__, __FILE__);
Renderer::initializationOnThread();
// GPU memory allocation for rendering
#ifndef CPU_ONLY
cudaMalloc((void**)(&pGpuHand), HAND_MAX_HANDS * HAND_NUMBER_PARTS * 3 * sizeof(float));
......@@ -51,7 +46,7 @@ namespace op
}
}
void HandRenderer::renderHand(Array<float>& outputData, const std::array<Array<float>, 2>& handKeypoints)
void HandGpuRenderer::renderHand(Array<float>& outputData, const std::array<Array<float>, 2>& handKeypoints)
{
try
{
......@@ -60,56 +55,26 @@ namespace op
error("Empty Array<float> outputData.", __LINE__, __FUNCTION__, __FILE__);
if (handKeypoints[0].getSize(0) != handKeypoints[1].getSize(0))
error("Wrong hand format: handKeypoints[0].getSize(0) != handKeypoints[1].getSize(0).", __LINE__, __FUNCTION__, __FILE__);
// CPU rendering
if (mRenderMode == RenderMode::Cpu)
renderHandCpu(outputData, handKeypoints);
// GPU rendering
else
renderHandGpu(outputData, handKeypoints);
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
void HandRenderer::renderHandCpu(Array<float>& outputData, const std::array<Array<float>, 2>& handKeypoints) const
{
try
{
renderHandKeypointsCpu(outputData, handKeypoints, mRenderThreshold);
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
void HandRenderer::renderHandGpu(Array<float>& outputData, const std::array<Array<float>, 2>& handKeypoints)
{
try
{
// GPU rendering
#ifndef CPU_ONLY
const auto elementRendered = spElementToRender->load(); // I prefer std::round(T&) over intRound(T) for std::atomic
const auto numberPeople = handKeypoints[0].getSize(0);
const Point<int> frameSize{outputData.getSize(2), outputData.getSize(1)};
// GPU rendering
if (numberPeople > 0 && elementRendered == 0)
{
cpuToGpuMemoryIfNotCopiedYet(outputData.getPtr());
cpuToGpuMemoryIfNotCopiedYet(outputData.getPtr(), outputData.getVolume());
// Draw handKeypoints
const auto handArea = handKeypoints[0].getSize(1)*handKeypoints[0].getSize(2);
const auto handVolume = numberPeople * handArea;
cudaMemcpy(pGpuHand, handKeypoints[0].getConstPtr(), handVolume * sizeof(float), cudaMemcpyHostToDevice);
cudaMemcpy(pGpuHand + handVolume, handKeypoints[1].getConstPtr(), handVolume * sizeof(float), cudaMemcpyHostToDevice);
renderHandKeypointsGpu(*spGpuMemoryPtr, mFrameSize, pGpuHand, 2 * numberPeople, mRenderThreshold);
renderHandKeypointsGpu(*spGpuMemory, frameSize, pGpuHand, 2 * numberPeople, mRenderThreshold);
// CUDA check
cudaCheck(__LINE__, __FUNCTION__, __FILE__);
}
// GPU memory to CPU if last renderer
gpuToCpuMemoryIfLastRenderer(outputData.getPtr());
gpuToCpuMemoryIfLastRenderer(outputData.getPtr(), outputData.getVolume());
cudaCheck(__LINE__, __FUNCTION__, __FILE__);
// CPU_ONLY mode
#else
......
......@@ -3,8 +3,10 @@ set(SOURCES
bodyPartConnectorBase.cu
bodyPartConnectorCaffe.cpp
defineTemplates.cpp
poseCpuRenderer.cpp
poseExtractor.cpp
poseExtractorCaffe.cpp
poseGpuRenderer.cpp
poseParameters.cpp
poseRenderer.cpp
renderPose.cpp
......
#include <openpose/pose/renderPose.hpp>
#include <openpose/pose/poseCpuRenderer.hpp>
namespace op
{
PoseCpuRenderer::PoseCpuRenderer(const PoseModel poseModel, const float renderThreshold,
const bool blendOriginalFrame, const float alphaKeypoint,
const float alphaHeatMap) :
Renderer{renderThreshold, alphaKeypoint, alphaHeatMap, blendOriginalFrame},
PoseRenderer{poseModel}
{
}
std::pair<int, std::string> PoseCpuRenderer::renderPose(Array<float>& outputData,
const Array<float>& poseKeypoints,
const float scaleNetToOutput)
{
try
{
// Security checks
if (outputData.empty())
error("Empty Array<float> outputData.", __LINE__, __FUNCTION__, __FILE__);
// CPU rendering
const auto elementRendered = spElementToRender->load();
std::string elementRenderedName;
// Draw poseKeypoints
if (elementRendered == 0)
renderPoseKeypointsCpu(outputData, poseKeypoints, mPoseModel, mRenderThreshold, mBlendOriginalFrame);
// Draw heat maps / PAFs
else
{
UNUSED(scaleNetToOutput);
error("CPU rendering only available for drawing keypoints, no heat maps nor PAFs.",
__LINE__, __FUNCTION__, __FILE__);
}
// Return result
return std::make_pair(elementRendered, elementRenderedName);
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
return std::make_pair(-1, "");
}
}
}
......@@ -26,6 +26,8 @@ namespace op
const auto resizeScaleCheck = resizeScale / (mNetOutputSize.y/(float)netInputSize.y);
if (1+1e-6 < resizeScaleCheck || resizeScaleCheck < 1-1e-6)
error("Net input and output size must be proportional. resizeScaleCheck = " + std::to_string(resizeScaleCheck), __LINE__, __FUNCTION__, __FILE__);
// Layers parameters
spBodyPartConnectorCaffe->setPoseModel(mPoseModel);
}
catch (const std::exception& e)
{
......@@ -60,7 +62,6 @@ namespace op
// Pose extractor blob and layer
spPoseBlob = {std::make_shared<caffe::Blob<float>>(1,1,1,1)};
spBodyPartConnectorCaffe->setPoseModel(mPoseModel);
spBodyPartConnectorCaffe->Reshape({spHeatMapsBlob.get(), spPeaksBlob.get()}, {spPoseBlob.get()});
cudaCheck(__LINE__, __FUNCTION__, __FILE__);
......@@ -104,7 +105,10 @@ namespace op
// Get scale net to output
const auto scaleProducerToNetInput = resizeGetScaleFactor(inputDataSize, mNetOutputSize);
const Point<int> netSize{intRound(scaleProducerToNetInput*inputDataSize.x), intRound(scaleProducerToNetInput*inputDataSize.y)};
mScaleNetToOutput = {(float)resizeGetScaleFactor(netSize, mOutputSize)};
if (mOutputSize.x > 0 && mOutputSize.y > 0)
mScaleNetToOutput = {(float)resizeGetScaleFactor(netSize, mOutputSize)};
else
mScaleNetToOutput = {(float)resizeGetScaleFactor(netSize, inputDataSize)};
// 4. Connecting body parts
spBodyPartConnectorCaffe->setScaleNetToOutput(mScaleNetToOutput);
......
#ifndef CPU_ONLY
#include <cuda.h>
#include <cuda_runtime_api.h>
#endif
#include <openpose/pose/poseParameters.hpp>
#include <openpose/pose/renderPose.hpp>
#include <openpose/utilities/cuda.hpp>
#include <openpose/pose/poseGpuRenderer.hpp>
namespace op
{
PoseGpuRenderer::PoseGpuRenderer(const Point<int>& heatMapsSize, const PoseModel poseModel,
const std::shared_ptr<PoseExtractor>& poseExtractor, const float renderThreshold,
const bool blendOriginalFrame, const float alphaKeypoint, const float alphaHeatMap,
const unsigned int elementToRender) :
// #body elements to render = #body parts (size()) + #body part pair connections + 3 (+whole pose +whole heatmaps +PAFs)
// POSE_BODY_PART_MAPPING crashes on Windows, replaced by getPoseBodyPartMapping
GpuRenderer{renderThreshold, alphaKeypoint, alphaHeatMap, blendOriginalFrame, elementToRender,
(unsigned int)(getPoseBodyPartMapping(poseModel).size() + POSE_BODY_PART_PAIRS[(int)poseModel].size()/2 + 3)}, // mNumberElementsToRender
PoseRenderer{poseModel},
mHeatMapsSize{heatMapsSize},
spPoseExtractor{poseExtractor},
pGpuPose{nullptr}
{
}
PoseGpuRenderer::~PoseGpuRenderer()
{
try
{
// Free CUDA pointers - Note that if pointers are 0 (i.e. nullptr), no operation is performed.
#ifndef CPU_ONLY
cudaFree(pGpuPose);
#endif
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
void PoseGpuRenderer::initializationOnThread()
{
try
{
log("Starting initialization on thread.", Priority::Low, __LINE__, __FUNCTION__, __FILE__);
// GPU memory allocation for rendering
#ifndef CPU_ONLY
cudaMalloc((void**)(&pGpuPose), POSE_MAX_PEOPLE * POSE_NUMBER_BODY_PARTS[(int)mPoseModel] * 3 * sizeof(float));
cudaCheck(__LINE__, __FUNCTION__, __FILE__);
#endif
log("Finished initialization on thread.", Priority::Low, __LINE__, __FUNCTION__, __FILE__);
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
std::pair<int, std::string> PoseGpuRenderer::renderPose(Array<float>& outputData, const Array<float>& poseKeypoints,
const float scaleNetToOutput)
{
try
{
// Security checks
if (outputData.empty())
error("Empty Array<float> outputData.", __LINE__, __FUNCTION__, __FILE__);
// GPU rendering
const auto elementRendered = spElementToRender->load();
std::string elementRenderedName;
#ifndef CPU_ONLY
const auto numberPeople = poseKeypoints.getSize(0);
if (numberPeople > 0 || elementRendered != 0 || !mBlendOriginalFrame)
{
cpuToGpuMemoryIfNotCopiedYet(outputData.getPtr(), outputData.getVolume());
cudaCheck(__LINE__, __FUNCTION__, __FILE__);
const auto numberBodyParts = POSE_NUMBER_BODY_PARTS[(int)mPoseModel];
const auto numberBodyPartsPlusBkg = numberBodyParts+1;
const Point<int> frameSize{outputData.getSize(2), outputData.getSize(1)};
// Draw poseKeypoints
if (elementRendered == 0)
{
if (!poseKeypoints.empty())
cudaMemcpy(pGpuPose, poseKeypoints.getConstPtr(), numberPeople * numberBodyParts * 3 * sizeof(float),
cudaMemcpyHostToDevice);
renderPoseKeypointsGpu(*spGpuMemory, mPoseModel, numberPeople, frameSize, pGpuPose,
mRenderThreshold, mShowGooglyEyes, mBlendOriginalFrame, getAlphaKeypoint());
}
else
{
if (scaleNetToOutput == -1.f)
error("Invalid scaleNetToOutput.", __LINE__, __FUNCTION__, __FILE__);
// Draw specific body part or bkg
if (elementRendered <= numberBodyPartsPlusBkg)
{
elementRenderedName = mPartIndexToName.at(elementRendered-1);
renderPoseHeatMapGpu(*spGpuMemory, mPoseModel, frameSize, spPoseExtractor->getHeatMapCpuConstPtr(),
mHeatMapsSize, scaleNetToOutput, elementRendered,
(mBlendOriginalFrame ? getAlphaHeatMap() : 1.f));
}
// Draw all heat maps
else if (elementRendered == numberBodyPartsPlusBkg+1)
{
elementRenderedName = "Heatmaps";
renderPoseHeatMapsGpu(*spGpuMemory, mPoseModel, frameSize, spPoseExtractor->getHeatMapCpuConstPtr(),
mHeatMapsSize, scaleNetToOutput, (mBlendOriginalFrame ? getAlphaHeatMap() : 1.f));
}
// Draw PAFs (Part Affinity Fields)
else if (elementRendered == numberBodyPartsPlusBkg+2)
{
elementRenderedName = "PAFs (Part Affinity Fields)";
renderPosePAFsGpu(*spGpuMemory, mPoseModel, frameSize, spPoseExtractor->getHeatMapCpuConstPtr(),
mHeatMapsSize, scaleNetToOutput, (mBlendOriginalFrame ? getAlphaHeatMap() : 1.f));
}
// Draw affinity between 2 body parts
else
{
const auto affinityPart = (elementRendered-numberBodyPartsPlusBkg-3)*2;
const auto affinityPartMapped = POSE_MAP_IDX[(int)mPoseModel].at(affinityPart);
elementRenderedName = mPartIndexToName.at(affinityPartMapped);
elementRenderedName = elementRenderedName.substr(0, elementRenderedName.find("("));
renderPosePAFGpu(*spGpuMemory, mPoseModel, frameSize, spPoseExtractor->getHeatMapCpuConstPtr(),
mHeatMapsSize, scaleNetToOutput, affinityPartMapped,
(mBlendOriginalFrame ? getAlphaHeatMap() : 1.f));
}
}
}
// GPU memory to CPU if last renderer
gpuToCpuMemoryIfLastRenderer(outputData.getPtr(), outputData.getVolume());
cudaCheck(__LINE__, __FUNCTION__, __FILE__);
// CPU_ONLY mode
#else
error("GPU rendering not available if `CPU_ONLY` is set.", __LINE__, __FUNCTION__, __FILE__);
UNUSED(elementRendered);
UNUSED(outputData);
UNUSED(poseKeypoints);
UNUSED(scaleNetToOutput);
#endif
// Return result
return std::make_pair(elementRendered, elementRenderedName);
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
return std::make_pair(-1, "");
}
}
}
#ifndef CPU_ONLY
#include <cuda.h>
#include <cuda_runtime_api.h>
#endif
#include <openpose/pose/poseParameters.hpp>
#include <openpose/pose/renderPose.hpp>
#include <openpose/utilities/cuda.hpp>
#include <openpose/pose/poseRenderer.hpp>
namespace op
......@@ -38,246 +33,9 @@ namespace op
}
}
PoseRenderer::PoseRenderer(const Point<int>& heatMapsSize, const Point<int>& outputSize, const PoseModel poseModel,
const std::shared_ptr<PoseExtractor>& poseExtractor, const float renderThreshold,
const bool blendOriginalFrame, const float alphaKeypoint, const float alphaHeatMap,
const unsigned int elementToRender, const RenderMode renderMode) :
// #body elements to render = #body parts (size()) + #body part pair connections + 3 (+whole pose +whole heatmaps +PAFs)
// POSE_BODY_PART_MAPPING crashes on Windows, replaced by getPoseBodyPartMapping
Renderer{(unsigned long long)(outputSize.area() * 3), alphaKeypoint, alphaHeatMap, elementToRender,
(unsigned int)(getPoseBodyPartMapping(poseModel).size() + POSE_BODY_PART_PAIRS[(int)poseModel].size()/2 + 3)}, // mNumberElementsToRender
mRenderThreshold{renderThreshold},
mHeatMapsSize{heatMapsSize},
mOutputSize{outputSize},
PoseRenderer::PoseRenderer(const PoseModel poseModel) :
mPoseModel{poseModel},
mPartIndexToName{createPartToName(poseModel)},
spPoseExtractor{poseExtractor},
mRenderMode{renderMode},
mBlendOriginalFrame{blendOriginalFrame},
mShowGooglyEyes{false},
pGpuPose{nullptr}
mPartIndexToName{createPartToName(poseModel)}
{
}
PoseRenderer::~PoseRenderer()
{
try
{
// Free CUDA pointers - Note that if pointers are 0 (i.e. nullptr), no operation is performed.
#ifndef CPU_ONLY
cudaFree(pGpuPose);
#endif
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
void PoseRenderer::initializationOnThread()
{
try
{
log("Starting initialization on thread.", Priority::Low, __LINE__, __FUNCTION__, __FILE__);
Renderer::initializationOnThread();
// GPU memory allocation for rendering
#ifndef CPU_ONLY
cudaMalloc((void**)(&pGpuPose), POSE_MAX_PEOPLE * POSE_NUMBER_BODY_PARTS[(int)mPoseModel] * 3 * sizeof(float));
cudaCheck(__LINE__, __FUNCTION__, __FILE__);
#endif
log("Finished initialization on thread.", Priority::Low, __LINE__, __FUNCTION__, __FILE__);
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
bool PoseRenderer::getBlendOriginalFrame() const
{
try
{
return mBlendOriginalFrame;
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
return false;
}
}
bool PoseRenderer::getShowGooglyEyes() const
{
try
{
return mShowGooglyEyes;
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
return false;
}
}
void PoseRenderer::setBlendOriginalFrame(const bool blendOriginalFrame)
{
try
{
mBlendOriginalFrame = blendOriginalFrame;
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
void PoseRenderer::setShowGooglyEyes(const bool showGooglyEyes)
{
try
{
mShowGooglyEyes = showGooglyEyes;
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
}
}
std::pair<int, std::string> PoseRenderer::renderPose(Array<float>& outputData, const Array<float>& poseKeypoints,
const float scaleNetToOutput)
{
try
{
// Security checks
if (outputData.empty())
error("Empty Array<float> outputData.", __LINE__, __FUNCTION__, __FILE__);
// CPU rendering
if (mRenderMode == RenderMode::Cpu)
return renderPoseCpu(outputData, poseKeypoints, scaleNetToOutput);
// GPU rendering
else
return renderPoseGpu(outputData, poseKeypoints, scaleNetToOutput);
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
return std::make_pair(-1, "");
}
}
std::pair<int, std::string> PoseRenderer::renderPoseCpu(Array<float>& outputData, const Array<float>& poseKeypoints,
const float scaleNetToOutput)
{
try
{
const auto elementRendered = spElementToRender->load();
std::string elementRenderedName;
// CPU rendering
// Draw poseKeypoints
if (elementRendered == 0)
renderPoseKeypointsCpu(outputData, poseKeypoints, mPoseModel, mRenderThreshold, mBlendOriginalFrame);
// Draw heat maps / PAFs
else
{
UNUSED(scaleNetToOutput);
error("CPU rendering only available for drawing keypoints, no heat maps nor PAFs.", __LINE__, __FUNCTION__, __FILE__);
}
// Return result
return std::make_pair(elementRendered, elementRenderedName);
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
return std::make_pair(-1, "");
}
}
std::pair<int, std::string> PoseRenderer::renderPoseGpu(Array<float>& outputData, const Array<float>& poseKeypoints,
const float scaleNetToOutput)
{
try
{
const auto elementRendered = spElementToRender->load();
std::string elementRenderedName;
// GPU rendering
#ifndef CPU_ONLY
const auto numberPeople = poseKeypoints.getSize(0);
if (numberPeople > 0 || elementRendered != 0 || !mBlendOriginalFrame)
{
cpuToGpuMemoryIfNotCopiedYet(outputData.getPtr());
cudaCheck(__LINE__, __FUNCTION__, __FILE__);
const auto numberBodyParts = POSE_NUMBER_BODY_PARTS[(int)mPoseModel];
const auto numberBodyPartsPlusBkg = numberBodyParts+1;
// Draw poseKeypoints
if (elementRendered == 0)
{
if (!poseKeypoints.empty())
cudaMemcpy(pGpuPose, poseKeypoints.getConstPtr(), numberPeople * numberBodyParts * 3 * sizeof(float),
cudaMemcpyHostToDevice);
renderPoseKeypointsGpu(*spGpuMemoryPtr, mPoseModel, numberPeople, mOutputSize, pGpuPose,
mRenderThreshold, mShowGooglyEyes, mBlendOriginalFrame, getAlphaKeypoint());
}
else
{
if (scaleNetToOutput == -1.f)
error("Invalid scaleNetToOutput.", __LINE__, __FUNCTION__, __FILE__);
// Draw specific body part or bkg
if (elementRendered <= numberBodyPartsPlusBkg)
{
elementRenderedName = mPartIndexToName.at(elementRendered-1);
renderPoseHeatMapGpu(*spGpuMemoryPtr, mPoseModel, mOutputSize, spPoseExtractor->getHeatMapCpuConstPtr(),
mHeatMapsSize, scaleNetToOutput, elementRendered,
(mBlendOriginalFrame ? getAlphaHeatMap() : 1.f));
}
// Draw all heat maps
else if (elementRendered == numberBodyPartsPlusBkg+1)
{
elementRenderedName = "Heatmaps";
renderPoseHeatMapsGpu(*spGpuMemoryPtr, mPoseModel, mOutputSize, spPoseExtractor->getHeatMapCpuConstPtr(),
mHeatMapsSize, scaleNetToOutput, (mBlendOriginalFrame ? getAlphaHeatMap() : 1.f));
}
// Draw PAFs (Part Affinity Fields)
else if (elementRendered == numberBodyPartsPlusBkg+2)
{
elementRenderedName = "PAFs (Part Affinity Fields)";
renderPosePAFsGpu(*spGpuMemoryPtr, mPoseModel, mOutputSize, spPoseExtractor->getHeatMapCpuConstPtr(),
mHeatMapsSize, scaleNetToOutput, (mBlendOriginalFrame ? getAlphaHeatMap() : 1.f));
}
// Draw affinity between 2 body parts
else
{
const auto affinityPart = (elementRendered-numberBodyPartsPlusBkg-3)*2;
const auto affinityPartMapped = POSE_MAP_IDX[(int)mPoseModel].at(affinityPart);
elementRenderedName = mPartIndexToName.at(affinityPartMapped);
elementRenderedName = elementRenderedName.substr(0, elementRenderedName.find("("));
renderPosePAFGpu(*spGpuMemoryPtr, mPoseModel, mOutputSize, spPoseExtractor->getHeatMapCpuConstPtr(),
mHeatMapsSize, scaleNetToOutput, affinityPartMapped,
(mBlendOriginalFrame ? getAlphaHeatMap() : 1.f));
}
}
}
// GPU memory to CPU if last renderer
gpuToCpuMemoryIfLastRenderer(outputData.getPtr());
cudaCheck(__LINE__, __FUNCTION__, __FILE__);
// CPU_ONLY mode
#else
error("GPU rendering not available if `CPU_ONLY` is set.", __LINE__, __FUNCTION__, __FILE__);
UNUSED(elementRendered);
UNUSED(outputData);
UNUSED(poseKeypoints);
UNUSED(scaleNetToOutput);
#endif
// Return result
return std::make_pair(elementRendered, elementRenderedName);
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
return std::make_pair(-1, "");
}
}
}
......@@ -101,6 +101,7 @@
<ClInclude Include="..\..\include\openpose\core\cvMatToOpOutput.hpp" />
<ClInclude Include="..\..\include\openpose\core\datum.hpp" />
<ClInclude Include="..\..\include\openpose\core\enumClasses.hpp" />
<ClInclude Include="..\..\include\openpose\core\gpuRenderer.hpp" />
<ClInclude Include="..\..\include\openpose\core\headers.hpp" />
<ClInclude Include="..\..\include\openpose\core\keypointScaler.hpp" />
<ClInclude Include="..\..\include\openpose\core\macros.hpp" />
......@@ -123,8 +124,10 @@
<ClInclude Include="..\..\include\openpose\experimental\headers.hpp" />
<ClInclude Include="..\..\include\openpose\experimental\producer\headers.hpp" />
<ClInclude Include="..\..\include\openpose\experimental\producer\wPeoplePoseLoader.hpp" />
<ClInclude Include="..\..\include\openpose\face\faceCpuRenderer.hpp" />
<ClInclude Include="..\..\include\openpose\face\faceDetector.hpp" />
<ClInclude Include="..\..\include\openpose\face\faceExtractor.hpp" />
<ClInclude Include="..\..\include\openpose\face\faceGpuRenderer.hpp" />
<ClInclude Include="..\..\include\openpose\face\faceParameters.hpp" />
<ClInclude Include="..\..\include\openpose\face\faceRenderer.hpp" />
<ClInclude Include="..\..\include\openpose\face\headers.hpp" />
......@@ -158,9 +161,11 @@
<ClInclude Include="..\..\include\openpose\gui\headers.hpp" />
<ClInclude Include="..\..\include\openpose\gui\wGui.hpp" />
<ClInclude Include="..\..\include\openpose\gui\wGuiInfoAdder.hpp" />
<ClInclude Include="..\..\include\openpose\hand\handCpuRenderer.hpp" />
<ClInclude Include="..\..\include\openpose\hand\handDetector.hpp" />
<ClInclude Include="..\..\include\openpose\hand\handDetectorFromTxt.hpp" />
<ClInclude Include="..\..\include\openpose\hand\handExtractor.hpp" />
<ClInclude Include="..\..\include\openpose\hand\handGpuRenderer.hpp" />
<ClInclude Include="..\..\include\openpose\hand\handParameters.hpp" />
<ClInclude Include="..\..\include\openpose\hand\handRenderer.hpp" />
<ClInclude Include="..\..\include\openpose\hand\headers.hpp" />
......@@ -176,8 +181,10 @@
<ClInclude Include="..\..\include\openpose\pose\bodyPartConnectorCaffe.hpp" />
<ClInclude Include="..\..\include\openpose\pose\enumClasses.hpp" />
<ClInclude Include="..\..\include\openpose\pose\headers.hpp" />
<ClInclude Include="..\..\include\openpose\pose\poseCpuRenderer.hpp" />
<ClInclude Include="..\..\include\openpose\pose\poseExtractor.hpp" />
<ClInclude Include="..\..\include\openpose\pose\poseExtractorCaffe.hpp" />
<ClInclude Include="..\..\include\openpose\pose\poseGpuRenderer.hpp" />
<ClInclude Include="..\..\include\openpose\pose\poseParameters.hpp" />
<ClInclude Include="..\..\include\openpose\pose\poseRenderer.hpp" />
<ClInclude Include="..\..\include\openpose\pose\renderPose.hpp" />
......@@ -236,6 +243,7 @@
<ClCompile Include="..\..\src\openpose\core\cvMatToOpOutput.cpp" />
<ClCompile Include="..\..\src\openpose\core\datum.cpp" />
<ClCompile Include="..\..\src\openpose\core\defineTemplates.cpp" />
<ClCompile Include="..\..\src\openpose\core\gpuRenderer.cpp" />
<ClCompile Include="..\..\src\openpose\core\keypointScaler.cpp" />
<ClCompile Include="..\..\src\openpose\core\maximumBase.cpp" />
<ClCompile Include="..\..\src\openpose\core\maximumCaffe.cpp" />
......@@ -249,9 +257,10 @@
<ClCompile Include="..\..\src\openpose\core\resizeAndMergeBase.cpp" />
<ClCompile Include="..\..\src\openpose\core\resizeAndMergeCaffe.cpp" />
<ClCompile Include="..\..\src\openpose\face\defineTemplates.cpp" />
<ClCompile Include="..\..\src\openpose\face\faceCpuRenderer.cpp" />
<ClCompile Include="..\..\src\openpose\face\faceDetector.cpp" />
<ClCompile Include="..\..\src\openpose\face\faceExtractor.cpp" />
<ClCompile Include="..\..\src\openpose\face\faceRenderer.cpp" />
<ClCompile Include="..\..\src\openpose\face\faceGpuRenderer.cpp" />
<ClCompile Include="..\..\src\openpose\face\renderFace.cpp" />
<ClCompile Include="..\..\src\openpose\filestream\cocoJsonSaver.cpp" />
<ClCompile Include="..\..\src\openpose\filestream\defineTemplates.cpp" />
......@@ -268,16 +277,19 @@
<ClCompile Include="..\..\src\openpose\gui\gui.cpp" />
<ClCompile Include="..\..\src\openpose\gui\guiInfoAdder.cpp" />
<ClCompile Include="..\..\src\openpose\hand\defineTemplates.cpp" />
<ClCompile Include="..\..\src\openpose\hand\handCpuRenderer.cpp" />
<ClCompile Include="..\..\src\openpose\hand\handDetector.cpp" />
<ClCompile Include="..\..\src\openpose\hand\handDetectorFromTxt.cpp" />
<ClCompile Include="..\..\src\openpose\hand\handExtractor.cpp" />
<ClCompile Include="..\..\src\openpose\hand\handRenderer.cpp" />
<ClCompile Include="..\..\src\openpose\hand\handGpuRenderer.cpp" />
<ClCompile Include="..\..\src\openpose\hand\renderHand.cpp" />
<ClCompile Include="..\..\src\openpose\pose\bodyPartConnectorBase.cpp" />
<ClCompile Include="..\..\src\openpose\pose\bodyPartConnectorCaffe.cpp" />
<ClCompile Include="..\..\src\openpose\pose\defineTemplates.cpp" />
<ClCompile Include="..\..\src\openpose\pose\poseCpuRenderer.cpp" />
<ClCompile Include="..\..\src\openpose\pose\poseExtractor.cpp" />
<ClCompile Include="..\..\src\openpose\pose\poseExtractorCaffe.cpp" />
<ClCompile Include="..\..\src\openpose\pose\poseGpuRenderer.cpp" />
<ClCompile Include="..\..\src\openpose\pose\poseParameters.cpp" />
<ClCompile Include="..\..\src\openpose\pose\poseRenderer.cpp" />
<ClCompile Include="..\..\src\openpose\pose\renderPose.cpp" />
......@@ -303,6 +315,10 @@
<ClCompile Include="..\..\src\openpose\wrapper\wrapperStructOutput.cpp" />
<ClCompile Include="..\..\src\openpose\wrapper\wrapperStructPose.cpp" />
</ItemGroup>
<ItemGroup>
<None Include="..\..\include\openpose\utilities\cuda.hu" />
<None Include="..\..\include\openpose\utilities\render.hu" />
</ItemGroup>
<ItemGroup>
<CudaCompile Include="..\..\src\openpose\core\maximumBase.cu" />
<CudaCompile Include="..\..\src\openpose\core\nmsBase.cu" />
......@@ -312,10 +328,6 @@
<CudaCompile Include="..\..\src\openpose\pose\bodyPartConnectorBase.cu" />
<CudaCompile Include="..\..\src\openpose\pose\renderPose.cu" />
</ItemGroup>
<ItemGroup>
<None Include="..\..\include\openpose\utilities\cuda.hu" />
<None Include="..\..\include\openpose\utilities\render.hu" />
</ItemGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
<ImportGroup Label="ExtensionTargets">
<Import Project="$(VCTargetsPath)\BuildCustomizations\CUDA 8.0.targets" />
......