Unverified commit 8518d053 authored by pkulzc, committed by GitHub

Open source MnasFPN and minor fixes to OD API (#8484)

310447280  by lzc:

    Internal change

310420845  by Zhichao Lu:

    Open source the internal Context RCNN code.

--
310362339  by Zhichao Lu:

    Internal change

310259448  by lzc:

    Update required TF version for OD API.

--
310252159  by Zhichao Lu:

    Port patch_ops_test to TF1/TF2 and TPUs.

--
310247180  by Zhichao Lu:

    Ignore keypoint heatmap loss in the regions/bounding boxes with target keypoint
    class but no valid keypoint annotations.

--
310178294  by Zhichao Lu:

    Opensource MnasFPN
    https://arxiv.org/abs/1912.01106

--
310094222  by lzc:

    Internal changes.

--
310085250  by lzc:

    Internal Change.

--
310016447  by huizhongc:

    Remove unrecognized classes from labeled_classes.

--
310009470  by rathodv:

    Mark batcher.py as TF1 only.

--
310001984  by rathodv:

    Update core/preprocessor.py to be compatible with TF1/TF2.

--
309455035  by Zhichao Lu:

    Makes the freezable_batch_norm_test run w/ v2 behavior.

    The main change: in v2, updates happen right away when running batchnorm in training mode, so we need to restore the weights between batchnorm calls to make sure the numerical checks all start from the same place.

--
309425881  by Zhichao Lu:

    Make TF1/TF2 optimizer builder tests explicit.

--
309408646  by Zhichao Lu:

    Make dataset builder tests TF1 and TF2 compatible.

--
309246305  by Zhichao Lu:

    Added the functionality of combining the person keypoints and object detection
    annotations in the binary that converts the COCO raw data to TfRecord.

--
309125076  by Zhichao Lu:

    Convert target_assigner_utils to TF1/TF2.

--
308966359  by huizhongc:

    Support SSD training with partially labeled groundtruth.

--
308937159  by rathodv:

    Update core/target_assigner.py to be compatible with TF1/TF2.

--
308774302  by Zhichao Lu:

    Internal

--
308732860  by rathodv:

    Make core/prefetcher.py compatible with TF1 only.

--
308726984  by rathodv:

    Update core/multiclass_nms_test.py to be TF1/TF2 compatible.

--
308714718  by rathodv:

    Update core/region_similarity_calculator_test.py to be TF1/TF2 compatible.

--
308707960  by rathodv:

    Update core/minibatch_sampler_test.py to be TF1/TF2 compatible.

--
308700595  by rathodv:

    Update core/losses_test.py to be TF1/TF2 compatible and remove losses_test_v2.py

--
308361472  by rathodv:

    Update core/matcher_test.py to be TF1/TF2 compatible.

--
308335846  by Zhichao Lu:

    Updated the COCO evaluation logic and populated the groundtruth area
    information through. This change matches the groundtruth format expected by
    the COCO keypoint evaluation.

--
308256924  by rathodv:

    Update core/keypoints_ops_test.py to be TF1/TF2 compatible.

--
308256826  by rathodv:

    Update class_agnostic_nms_test.py to be TF1/TF2 compatible.

--
308256112  by rathodv:

    Update box_list_ops_test.py to be TF1/TF2 compatible.

--
308159360  by Zhichao Lu:

    Internal change

308145008  by Zhichao Lu:

    Added 'image/class/confidence' field in the TFExample decoder.

--
307651875  by rathodv:

    Refactor core/box_list.py to support TF1/TF2.

--
307651798  by rathodv:

    Modify box_coder.py base class to work with TF1/TF2

--
307651652  by rathodv:

    Refactor core/balanced_positive_negative_sampler.py to support TF1/TF2.

--
307651571  by rathodv:

    Modify BoxCoders tests to use test_case:execute method to allow testing with TF1.X and TF2.X

--
307651480  by rathodv:

    Modify Matcher tests to use test_case:execute method to allow testing with TF1.X and TF2.X

--
307651409  by rathodv:

    Modify AnchorGenerator tests to use test_case:execute method to allow testing with TF1.X and TF2.X

--
307651314  by rathodv:

    Refactor model_builder to support TF1 or TF2 models based on TensorFlow version.

--
307092053  by Zhichao Lu:

    Use manager to save checkpoint.
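
    A minimal sketch of the pattern, assuming a tf.train.CheckpointManager wraps the checkpoint object (the directory and max_to_keep below are placeholder values, not the OD API's actual configuration):

```python
import tensorflow as tf

# Sketch only: the manager rotates old checkpoints automatically, so callers
# no longer juggle checkpoint file names themselves.
ckpt = tf.train.Checkpoint(step=tf.Variable(0, dtype=tf.int64))
manager = tf.train.CheckpointManager(ckpt, directory='/tmp/ckpts',  # placeholder path
                                     max_to_keep=5)
save_path = manager.save()  # e.g. '/tmp/ckpts/ckpt-1'
```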

--
307071352  by ronnyvotel:

    Fixing keypoint visibilities. Now by default, the visibility is marked True if the keypoint is labeled (regardless of whether it is visible or not).
    Also, if visibilities are not present in the dataset, they will be created based on whether the keypoint coordinates are finite (vis = True) or NaN (vis = False).
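
    A minimal sketch of the default-visibility logic described above (the function name is hypothetical; the real implementation lives in the decoding utilities):

```python
import tensorflow as tf

def visibilities_from_coordinates(keypoints):
  """Hypothetical helper: a keypoint is visible iff its coordinates are finite.

  Args:
    keypoints: float32 tensor [num_instances, num_keypoints, 2]; unlabeled
      keypoints are encoded as NaN.

  Returns:
    bool tensor [num_instances, num_keypoints].
  """
  return tf.reduce_all(tf.math.is_finite(keypoints), axis=-1)

kpts = tf.constant([[[0.1, 0.2], [float('nan'), float('nan')]]])
print(visibilities_from_coordinates(kpts))  # [[ True False]]
```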

--
307069557  by Zhichao Lu:

    Internal change to add a few fields related to postprocessing parameters in
    center_net.proto and propagate those parameters to the keypoint postprocessing
    functions.

--
307012091  by Zhichao Lu:

    Make Adam Optimizer's epsilon proto configurable.

    Potential issue: tf.compat.v1's AdamOptimizer has a default epsilon of 1e-08 ([doc-link](https://www.tensorflow.org/api_docs/python/tf/compat/v1/train/AdamOptimizer)), whereas tf.keras's Adam optimizer has a default epsilon of 1e-07 ([doc-link](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Adam)).
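
    A minimal illustration of the mismatch; making epsilon explicit in the proto keeps both code paths aligned:

```python
import tensorflow.compat.v1 as tf1
import tensorflow as tf

v1_opt = tf1.train.AdamOptimizer(learning_rate=1e-3)      # epsilon defaults to 1e-08
keras_opt = tf.keras.optimizers.Adam(learning_rate=1e-3)  # epsilon defaults to 1e-07

# Setting epsilon explicitly removes the dependence on the differing defaults:
keras_opt_aligned = tf.keras.optimizers.Adam(learning_rate=1e-3, epsilon=1e-08)
```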

--
306858598  by Zhichao Lu:

    Internal changes to update the CenterNet model:
    1) Modified eval job loss computation to avoid averaging over batches with zero loss.
    2) Updated the CenterNet keypoint heatmap target assigner to apply box size to the heatmap Gaussian standard deviation.
    3) Updated the CenterNet meta arch keypoint losses computation to apply weights outside of loss function.

--
306731223  by jonathanhuang:

    Internal change.

--
306549183  by rathodv:

    Internal Update.

--
306542930  by rathodv:

    Internal Update

--
306322697  by rathodv:

    Internal.

--
305345036  by Zhichao Lu:

    Adding COCO Camera Traps Json to tf.Example beam code

--
304104869  by lzc:

    Internal changes.

--
304068971  by jonathanhuang:

    Internal change.

--
304050469  by Zhichao Lu:

    Internal change.

--
303880642  by huizhongc:

    Support parsing partially labeled groundtruth.

--
303841743  by Zhichao Lu:

    Deprecate nms_on_host in SSDMetaArch.

--
303803204  by rathodv:

    Internal change.

--
303793895  by jonathanhuang:

    Internal change.

--
303467631  by rathodv:

    Py3 update for detection inference test.

--
303444542  by rathodv:

    Py3 update to metrics module

--
303421960  by rathodv:

    Update json_utils to python3.

--
302787583  by ronnyvotel:

    Coco results generator for submission to the coco test server.

--
302719091  by Zhichao Lu:

    Internal change to add the ResNet50 image feature extractor for CenterNet model.

--
302116230  by Zhichao Lu:

    Added the functions to overlay the heatmaps with images in visualization util
    library.

--
301888316  by Zhichao Lu:

    Fix checkpoint_filepath not defined error.

--
301840312  by ronnyvotel:

    Adding keypoint_scores to visualizations.

--
301683475  by ronnyvotel:

    Introducing the ability to preprocess `keypoint_visibilities`.

    Some data augmentation ops such as random crop can filter instances and keypoints. It's important to also filter keypoint visibilities, so that the groundtruth tensors are always in alignment.

--
301532344  by Zhichao Lu:

    Don't use tf.divide since "Quantization not yet supported for op: DIV"
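
    A small illustration of the workaround, assuming the divisor is a constant (the actual changed call sites may differ):

```python
import tensorflow as tf

x = tf.constant([2.0, 4.0])
scale = 255.0  # constant divisor

y_div = tf.divide(x, scale)  # lowers to a DIV op that the quantizer rejects
y_mul = x * (1.0 / scale)    # multiply by the constant reciprocal instead
```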

--
301480348  by ronnyvotel:

    Introducing keypoint evaluation into model lib v2.
    Also, making some fixes to coco keypoint evaluation.

--
301454018  by Zhichao Lu:

    Added the image summary to visualize the train/eval input images and eval's
    prediction/groundtruth side-by-side image.

--
301317527  by Zhichao Lu:

    Updated the random_absolute_pad_image function in the preprocessor library to
    support the keypoints argument.

--
301300324  by Zhichao Lu:

    Apply name change (experimental_run_v2 -> run) for all callers in TensorFlow.

--
301297115  by ronnyvotel:

    Utility function for setting keypoint visibilities based on keypoint coordinates.

--
301248885  by Zhichao Lu:

    Allow MultiWorkerMirroredStrategy (MWMS) use by adding checkpoint handling with temporary directories in model_lib_v2. Added the missing WeakKeyDictionary cfer_fn_cache field in CollectiveAllReduceStrategyExtended.

--
301224559  by Zhichao Lu:

    1) Fixes model_lib to also use keypoints while preparing model groundtruth.
    2) Tests model_lib with newly added keypoint metrics config.

--
300836556  by Zhichao Lu:

    Internal changes to add keypoint estimation parameters in CenterNet proto.

--
300795208  by Zhichao Lu:

    Updated the eval_util library to populate the keypoint groundtruth to
    eval_dict.

--
299474766  by Zhichao Lu:

    Modifies eval_util to create Keypoint Evaluator objects when configured in eval config.

--
299453920  by Zhichao Lu:

    Add swish activation as a hyperparams option.

--
299240093  by ronnyvotel:

    Keypoint postprocessing for CenterNetMetaArch.

--
299176395  by Zhichao Lu:

    Internal change.

--
299135608  by Zhichao Lu:

    Internal changes to refactor the CenterNet model in preparation for keypoint estimation tasks.

--
298915482  by Zhichao Lu:

    Make dataset_builder aware of input_context for distributed training.

--
298713595  by Zhichao Lu:

    Handling data with negative size boxes.

--
298695964  by Zhichao Lu:

    Expose change_coordinate_frame as a config parameter; fix multiclass_scores optional field.

--
298492150  by Zhichao Lu:

    Rename optimizer_builder_test_v2.py -> optimizer_builder_v2_test.py

--
298476471  by Zhichao Lu:

    Internal changes to support CenterNet keypoint estimation.

--
298365851  by ronnyvotel:

    Fixing a bug where groundtruth_keypoint_weights were being padded with a dynamic dimension.

--
297843700  by Zhichao Lu:

    Internal change.

--
297706988  by lzc:

    Internal change.

--
297705287  by ronnyvotel:

    Creating the "snapping" behavior in CenterNet, where regressed keypoints are refined with updated candidate keypoints from a heatmap.
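
    A minimal sketch of the snapping idea for a single keypoint type (the real code also applies score and distance thresholds, which are omitted here):

```python
import tensorflow as tf

def snap_to_candidates(regressed, candidates):
  """Replace each regressed keypoint with its nearest heatmap candidate.

  Args:
    regressed: float32 tensor [num_instances, 2], regressed (y, x) keypoints.
    candidates: float32 tensor [num_candidates, 2], decoded heatmap peaks.

  Returns:
    float32 tensor [num_instances, 2] of snapped keypoints.
  """
  # Pairwise squared distances: [num_instances, num_candidates].
  dists = tf.reduce_sum(
      tf.square(regressed[:, tf.newaxis, :] - candidates[tf.newaxis, :, :]),
      axis=-1)
  nearest = tf.argmin(dists, axis=1)
  return tf.gather(candidates, nearest)
```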

--
297700447  by Zhichao Lu:

    Improve checkpoint checking logic with TF2 loop.

--
297686094  by Zhichao Lu:

    Convert "import tensorflow as tf" to "import tensorflow.compat.v1".

--
297670468  by lzc:

    Internal change.

--
297241327  by Zhichao Lu:

    Convert "import tensorflow as tf" to "import tensorflow.compat.v1".

--
297205959  by Zhichao Lu:

    Internal changes to refactor the CenterNet object detection target assigner into a separate library.

--
297143806  by Zhichao Lu:

    Convert "import tensorflow as tf" to "import tensorflow.compat.v1".

--
297129625  by Zhichao Lu:

    Explicitly replace "import tensorflow" with "tensorflow.compat.v1" for TF2.x migration

--
297117070  by Zhichao Lu:

    Explicitly replace "import tensorflow" with "tensorflow.compat.v1" for TF2.x migration

--
297030190  by Zhichao Lu:

    Add configuration options for visualizing keypoint edges

--
296359649  by Zhichao Lu:

    Support DepthwiseConv2dNative (of separable conv) in weight equalization loss.

--
296290582  by Zhichao Lu:

    Internal change.

--
296093857  by Zhichao Lu:

    Internal changes to add general target assigner utilities.

--
295975116  by Zhichao Lu:

    Fix visualize_boxes_and_labels_on_image_array to show max_boxes_to_draw correctly.

--
295819711  by Zhichao Lu:

    Adds a flag to visualize_boxes_and_labels_on_image_array to skip the drawing of axis aligned bounding boxes.

--
295811929  by Zhichao Lu:

    Keypoint support in random_square_crop_by_scale.

--
295788458  by rathodv:

    Remove unused checkpoint to reduce repo size on github

--
295787184  by Zhichao Lu:

    Enable visualization of edges between keypoints

--
295763508  by Zhichao Lu:

    [Context RCNN] Add an option to enable/disable the cropping feature in the
    post-processing step in the meta architecture.

--
295605344  by Zhichao Lu:

    internal change.

--
294926050  by ronnyvotel:

    Adding per-keypoint groundtruth weights. These weights are intended to be used as multipliers in a keypoint loss function.

    Groundtruth keypoint weights are constructed as follows (see the sketch after this list):
    - Initialize the weight for each keypoint type based on user-specified weights in the input_reader proto
    - Mask out (i.e. make zero) all keypoint weights that are not visible.
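
    A sketch of those two steps; the shapes and names are illustrative only, not the API's actual signature:

```python
import tensorflow as tf

def keypoint_weights(per_type_weights, visibilities):
  """Broadcast per-keypoint-type weights and zero out invisible keypoints.

  Args:
    per_type_weights: float32 tensor [num_keypoints], e.g. the user-specified
      weights from the input_reader proto.
    visibilities: bool tensor [num_instances, num_keypoints].

  Returns:
    float32 tensor [num_instances, num_keypoints].
  """
  # Broadcasting expands the per-type weights over instances; the cast masks
  # out (zeros) every keypoint that is not visible.
  return per_type_weights * tf.cast(visibilities, tf.float32)
```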

--
294829061  by lzc:

    Internal change.

--
294566503  by Zhichao Lu:

    Changed internal CenterNet Model configuration.

--
294346662  by ronnyvotel:

    Using NaN values in keypoint coordinates that are not visible.

--
294333339  by Zhichao Lu:

    Change experimental_distribute_dataset -> experimental_distribute_dataset_from_function

--
293928752  by Zhichao Lu:

    Internal change

--
293909384  by Zhichao Lu:

    Add capabilities to train 1024x1024 CenterNet models.

--
293637554  by ronnyvotel:

    Adding keypoint visibilities to TfExampleDecoder.

--
293501558  by lzc:

    Internal change.

--
293252851  by Zhichao Lu:

    Change tf.gfile.GFile to tf.io.gfile.GFile.

--
292730217  by Zhichao Lu:

    Internal change.

--
292456563  by lzc:

    Internal changes.

--
292355612  by Zhichao Lu:

    Use tf.gather and tf.scatter_nd instead of matrix ops.
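
    The general pattern, as a sketch (not the actual changed call sites): row selection and sparse writes are cheaper with tf.gather/tf.scatter_nd than with dense matrix multiplications.

```python
import tensorflow as tf

values = tf.constant([[1., 2.], [3., 4.], [5., 6.]])

# Row selection via tf.gather instead of multiplying by a one-hot matrix.
rows = tf.gather(values, indices=[2, 0])  # [[5., 6.], [1., 2.]]

# Sparse write via tf.scatter_nd instead of assembling a dense update matrix.
scattered = tf.scatter_nd(indices=[[2], [0]],
                          updates=rows,
                          shape=[3, 2])   # rows 2 and 0 filled, row 1 zeros
```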

--
292245265  by rathodv:

    Internal

--
291989323  by richardmunoz:

    Refactor out building a DataDecoder from building a tf.data.Dataset.

--
291950147  by Zhichao Lu:

    Flip bounding boxes in arbitrary shaped tensors.
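
    A sketch of the flip for normalized [ymin, xmin, ymax, xmax] boxes; because it only touches the last axis, the same code handles [N, 4], [batch, N, 4], and so on:

```python
import tensorflow as tf

def flip_boxes_left_right(boxes):
  """Horizontally flip boxes given as [..., 4] = [ymin, xmin, ymax, xmax]."""
  ymin, xmin, ymax, xmax = tf.split(boxes, num_or_size_splits=4, axis=-1)
  # A left-right flip in normalized coordinates maps x to 1 - x, and the
  # old xmax becomes the new xmin (and vice versa).
  return tf.concat([ymin, 1.0 - xmax, ymax, 1.0 - xmin], axis=-1)
```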

--
291401052  by huizhongc:

    Fix the multiscale grid anchor generator to allow fully convolutional inference. When exporting a model with identity_resizer as the image_resizer, there is an incorrect box offset in the detection results. We add the anchor offset to address this problem.

--
291298871  by Zhichao Lu:

    Py3 compatibility changes.

--
290957957  by Zhichao Lu:

    Hourglass feature extractor for CenterNet.

--
290564372  by Zhichao Lu:

    Internal change.

--
290155278  by rathodv:

    Remove Dataset Explorer.

--
290155153  by Zhichao Lu:

    Internal change

--
290122054  by Zhichao Lu:

    Unify the format in the faster_rcnn.proto

--
290116084  by Zhichao Lu:

    Deprecate tensorflow.contrib.

--
290100672  by Zhichao Lu:

    Update MobilenetV3 SSD candidates

--
289926392  by Zhichao Lu:

    Internal change

--
289553440  by Zhichao Lu:

    [Object Detection API] Fix the comments about the dimension of the rpn_box_encodings from 4-D to 3-D.

--
288994128  by lzc:

    Internal changes.

--
288942194  by lzc:

    Internal change.

--
288746124  by Zhichao Lu:

    Configurable channel mean/standard deviation in CenterNet feature extractors.

--
288552509  by rathodv:

    Internal.

--
288541285  by rathodv:

    Internal update.

--
288396396  by Zhichao Lu:

    Make object detection import contrib explicitly

--
288255791  by rathodv:

    Internal

--
288078600  by Zhichao Lu:

    Fix model_lib_v2 test

--
287952244  by rathodv:

    Internal

--
287921774  by Zhichao Lu:

    internal change

--
287906173  by Zhichao Lu:

    internal change

--
287889407  by jonathanhuang:

    PY3 compatibility

--
287889042  by rathodv:

    Internal

--
287876178  by Zhichao Lu:

    Internal change.

--
287770490  by Zhichao Lu:

    Add CenterNet proto and builder

--
287694213  by Zhichao Lu:

    Support for running multiple steps per tf.function call.
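
    A minimal sketch of the pattern (model and loss_fn are stand-ins, not the OD API's actual training loop): looping with tf.range inside the traced function folds N optimizer steps into one graph call, amortizing the per-call dispatch overhead.

```python
import tensorflow as tf

def make_train_fn(model, optimizer, loss_fn):

  @tf.function
  def train_steps(iterator, num_steps):
    for _ in tf.range(num_steps):  # traced into a single tf.while_loop
      features, labels = next(iterator)
      with tf.GradientTape() as tape:
        loss = loss_fn(model(features, training=True), labels)
      grads = tape.gradient(loss, model.trainable_variables)
      optimizer.apply_gradients(zip(grads, model.trainable_variables))

  return train_steps
```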

--
287377183  by jonathanhuang:

    PY3 compatibility

--
287371344  by rathodv:

    Support loading keypoint labels and ids.

--
287368213  by rathodv:

    Add protos supporting keypoint evaluation.

--
286673200  by rathodv:

    dataset_tools PY3 migration

--
286635106  by Zhichao Lu:

    Update code for upcoming tf.contrib removal

--
286479439  by Zhichao Lu:

    Internal change

--
286311711  by Zhichao Lu:

    Skeleton of context model within TFODAPI

--
286005546  by Zhichao Lu:

    Fix Faster-RCNN training when using keep_aspect_ratio_resizer with pad_to_max_dimension

--
285906400  by derekjchow:

    Internal change

--
285822795  by Zhichao Lu:

    Add CenterNet meta arch target assigners.

--
285447238  by Zhichao Lu:

    Internal changes.

--
285016927  by Zhichao Lu:

    Make _dummy_computation a tf.function. This fixes breakage caused by
    cl/284256438

--
284827274  by Zhichao Lu:

    Convert to python 3.

--
284645593  by rathodv:

    Internal change

--
284639893  by rathodv:

    Add missing documentation for keypoints in eval_util.py.

--
284323712  by Zhichao Lu:

    Internal changes.

--
284295290  by Zhichao Lu:

    Updating input config proto and dataset builder to include context fields

    Updating standard_fields and tf_example_decoder to include context features

--
284226821  by derekjchow:

    Update exporter.

--
284211030  by Zhichao Lu:

    API changes in CenterNet informed by the experiments with the hourglass network.

--
284190451  by Zhichao Lu:

    Add support for CenterNet losses in protos and builders.

--
284093961  by lzc:

    Internal changes.

--
284028174  by Zhichao Lu:

    Internal change

--
284014719  by derekjchow:

    Do not pad top_down feature maps unnecessarily.

--
284005765  by Zhichao Lu:

    Add new pad_to_multiple_resizer
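
    The core computation such a resizer needs, as a sketch (the real resizer also handles masks and returns the true image shape):

```python
import tensorflow as tf

def pad_to_multiple(image, multiple):
  """Zero-pad an [H, W, C] image so H and W become multiples of `multiple`."""
  shape = tf.shape(image)
  height, width = shape[0], shape[1]
  # Round each dimension up to the next multiple.
  target_h = tf.cast(tf.math.ceil(height / multiple), tf.int32) * multiple
  target_w = tf.cast(tf.math.ceil(width / multiple), tf.int32) * multiple
  return tf.image.pad_to_bounding_box(image, 0, 0, target_h, target_w)
```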

--
283858233  by Zhichao Lu:

    Make target assigner work when under tf.function.

--
283836611  by Zhichao Lu:

    Make config getters more general.

--
283808990  by Zhichao Lu:

    Internal change

--
283754588  by Zhichao Lu:

    Internal changes.

--
282460301  by Zhichao Lu:

    Add ability to restore v2 style checkpoints.
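
    A minimal sketch of object-based (V2-style) restoration; the model and directory below are placeholders:

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(4)])    # stand-in model
ckpt = tf.train.Checkpoint(model=model)
latest = tf.train.latest_checkpoint('/path/to/train_dir')  # placeholder path
if latest:
  ckpt.restore(latest).expect_partial()  # tolerate unmatched optimizer slots
```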

--
281605842  by lzc:

    Add option to disable loss computation in OD API eval job.

--
280298212  by Zhichao Lu:

    Add backwards compatible change

--
280237857  by Zhichao Lu:

    internal change

--

PiperOrigin-RevId: 310447280
Parent ac5fff19
![TensorFlow Requirement: 1.x](https://img.shields.io/badge/TensorFlow%20Requirement-1.x-brightgreen)
![TensorFlow Requirement: 1.15](https://img.shields.io/badge/TensorFlow%20Requirement-1.15-brightgreen)
![TensorFlow 2 Not Supported](https://img.shields.io/badge/TensorFlow%202%20Not%20Supported-%E2%9C%95-red.svg)
# Tensorflow Object Detection API
@@ -31,7 +31,7 @@ https://scholar.googleusercontent.com/scholar.bib?q=info:l291WsrB-hQJ:scholar.go
| Name | GitHub |
| --- | --- |
| Jonathan Huang | [jch1](https://github.com/jch1) |
| Vivek Rathod | [tombstone](https://github.com/tombstone) |
| Ronny Votel | [ronnyvotel](https://github.com/ronnyvotel) |
| Derek Chow | [derekjchow](https://github.com/derekjchow) |
@@ -40,7 +40,6 @@ https://scholar.googleusercontent.com/scholar.bib?q=info:l291WsrB-hQJ:scholar.go
| Alireza Fathi | [afathi3](https://github.com/afathi3) |
| Zhichao Lu | [pkulzc](https://github.com/pkulzc) |
## Table of contents
Setup:
@@ -105,6 +104,25 @@ reporting an issue.
## Release information
### May 7th, 2020
We have released a mobile model with the
[MnasFPN head](https://arxiv.org/abs/1912.01106).
* MnasFPN with MobileNet-V2 backbone is the most accurate (26.6 mAP at 183ms on
Pixel 1) mobile detection model we have released to date. With depth-multiplier,
MnasFPN with MobileNet-V2 backbone is 1.8 mAP higher than MobileNet-V3-Large
with SSDLite (23.8 mAP vs 22.0 mAP) at similar latency (120ms) on Pixel 1.
We have released model definition, model checkpoints trained on
the COCO14 dataset and a converted TFLite model.
<b>Thanks to contributors</b>: Bo Chen, Golnaz Ghiasi, Hanxiao Liu,
Tsung-Yi Lin, Dmitry Kalenichenko, Hartwig Adam, Quoc Le, Zhichao Lu,
Jonathan Huang.
### Nov 13th, 2019
We have released MobileNetEdgeTPU SSDLite model.
......
@@ -24,52 +24,51 @@ from object_detection.utils import test_case
class FlexibleGridAnchorGeneratorTest(test_case.TestCase):
def test_construct_single_anchor(self):
anchor_strides = [(32, 32),]
anchor_offsets = [(16, 16),]
base_sizes = [(128.0,)]
aspect_ratios = [(1.0,)]
im_height = 64
im_width = 64
feature_map_shape_list = [(2, 2)]
def graph_fn():
anchor_strides = [(32, 32),]
anchor_offsets = [(16, 16),]
base_sizes = [(128.0,)]
aspect_ratios = [(1.0,)]
im_height = 64
im_width = 64
feature_map_shape_list = [(2, 2)]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
return anchor_corners
anchor_corners_out = self.execute(graph_fn, [])
exp_anchor_corners = [[-48, -48, 80, 80],
[-48, -16, 80, 112],
[-16, -48, 112, 80],
[-16, -16, 112, 112]]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
with self.test_session():
anchor_corners_out = anchor_corners.eval()
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_single_anchor_unit_dimensions(self):
anchor_strides = [(32, 32),]
anchor_offsets = [(16, 16),]
base_sizes = [(32.0,)]
aspect_ratios = [(1.0,)]
im_height = 1
im_width = 1
feature_map_shape_list = [(2, 2)]
def graph_fn():
anchor_strides = [(32, 32),]
anchor_offsets = [(16, 16),]
base_sizes = [(32.0,)]
aspect_ratios = [(1.0,)]
im_height = 1
im_width = 1
feature_map_shape_list = [(2, 2)]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
return anchor_corners
# Positive offsets are produced.
exp_anchor_corners = [[0, 0, 32, 32],
[0, 32, 32, 64],
[32, 0, 64, 32],
[32, 32, 64, 64]]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
with self.test_session():
anchor_corners_out = anchor_corners.eval()
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
anchor_corners_out = self.execute(graph_fn, [])
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_normalized_anchors_fails_with_unit_dimensions(self):
anchor_generator = fg.FlexibleGridAnchorGenerator(
@@ -80,27 +79,27 @@ class FlexibleGridAnchorGeneratorTest(test_case.TestCase):
feature_map_shape_list=[(2, 2)], im_height=1, im_width=1)
def test_construct_single_anchor_in_normalized_coordinates(self):
anchor_strides = [(32, 32),]
anchor_offsets = [(16, 16),]
base_sizes = [(128.0,)]
aspect_ratios = [(1.0,)]
im_height = 64
im_width = 128
feature_map_shape_list = [(2, 2)]
def graph_fn():
anchor_strides = [(32, 32),]
anchor_offsets = [(16, 16),]
base_sizes = [(128.0,)]
aspect_ratios = [(1.0,)]
im_height = 64
im_width = 128
feature_map_shape_list = [(2, 2)]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=True)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
return anchor_corners
exp_anchor_corners = [[-48./64, -48./128, 80./64, 80./128],
[-48./64, -16./128, 80./64, 112./128],
[-16./64, -48./128, 112./64, 80./128],
[-16./64, -16./128, 112./64, 112./128]]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=True)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
with self.test_session():
anchor_corners_out = anchor_corners.eval()
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
anchor_corners_out = self.execute(graph_fn, [])
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_num_anchors_per_location(self):
anchor_strides = [(32, 32), (64, 64)]
@@ -115,29 +114,28 @@ class FlexibleGridAnchorGeneratorTest(test_case.TestCase):
self.assertEqual(anchor_generator.num_anchors_per_location(), [6, 6])
def test_construct_single_anchor_dynamic_size(self):
anchor_strides = [(32, 32),]
anchor_offsets = [(0, 0),]
base_sizes = [(128.0,)]
aspect_ratios = [(1.0,)]
im_height = tf.constant(64)
im_width = tf.constant(64)
feature_map_shape_list = [(2, 2)]
def graph_fn():
anchor_strides = [(32, 32),]
anchor_offsets = [(0, 0),]
base_sizes = [(128.0,)]
aspect_ratios = [(1.0,)]
im_height = tf.constant(64)
im_width = tf.constant(64)
feature_map_shape_list = [(2, 2)]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
return anchor_corners
# Zero offsets are used.
exp_anchor_corners = [[-64, -64, 64, 64],
[-64, -32, 64, 96],
[-32, -64, 96, 64],
[-32, -32, 96, 96]]
anchor_generator = fg.FlexibleGridAnchorGenerator(
base_sizes, aspect_ratios, anchor_strides, anchor_offsets,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
with self.test_session():
anchor_corners_out = anchor_corners.eval()
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
anchor_corners_out = self.execute_cpu(graph_fn, [])
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_single_anchor_with_odd_input_dimension(self):
......
@@ -212,7 +212,7 @@ class MultipleGridAnchorGenerator(anchor_generator.AnchorGenerator):
min_im_shape = tf.minimum(im_height, im_width)
scale_height = min_im_shape / im_height
scale_width = min_im_shape / im_width
if not tf.contrib.framework.is_tensor(self._base_anchor_size):
if not tf.is_tensor(self._base_anchor_size):
base_anchor_size = [
scale_height * tf.constant(self._base_anchor_size[0],
dtype=tf.float32),
......
@@ -20,6 +20,8 @@ described in:
T.-Y. Lin, P. Goyal, R. Girshick, K. He, P. Dollar
"""
import tensorflow as tf
from object_detection.anchor_generators import grid_anchor_generator
from object_detection.core import anchor_generator
from object_detection.core import box_list_ops
@@ -85,8 +87,10 @@ class MultiscaleGridAnchorGenerator(anchor_generator.AnchorGenerator):
def _generate(self, feature_map_shape_list, im_height=1, im_width=1):
"""Generates a collection of bounding boxes to be used as anchors.
Currently we require the input image shape to be statically defined. That
is, im_height and im_width should be integers rather than tensors.
For training, we require the input image shape to be statically defined.
That is, im_height and im_width should be integers rather than tensors.
For inference, im_height and im_width can be either integers (for fixed
image size), or tensors (for arbitrary image size).
Args:
feature_map_shape_list: list of pairs of convnet layer resolutions in the
@@ -124,6 +128,9 @@
anchor_offset[0] = stride / 2.0
if im_width % 2.0**level == 0 or im_width == 1:
anchor_offset[1] = stride / 2.0
if tf.is_tensor(im_height) and tf.is_tensor(im_width):
anchor_offset[0] = stride / 2.0
anchor_offset[1] = stride / 2.0
ag = grid_anchor_generator.GridAnchorGenerator(
scales,
aspect_ratios,
......
@@ -24,54 +24,55 @@ from object_detection.utils import test_case
class MultiscaleGridAnchorGeneratorTest(test_case.TestCase):
def test_construct_single_anchor(self):
min_level = 5
max_level = 5
anchor_scale = 4.0
aspect_ratios = [1.0]
scales_per_octave = 1
im_height = 64
im_width = 64
feature_map_shape_list = [(2, 2)]
def graph_fn():
min_level = 5
max_level = 5
anchor_scale = 4.0
aspect_ratios = [1.0]
scales_per_octave = 1
im_height = 64
im_width = 64
feature_map_shape_list = [(2, 2)]
anchor_generator = mg.MultiscaleGridAnchorGenerator(
min_level, max_level, anchor_scale, aspect_ratios, scales_per_octave,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
return anchor_corners
exp_anchor_corners = [[-48, -48, 80, 80],
[-48, -16, 80, 112],
[-16, -48, 112, 80],
[-16, -16, 112, 112]]
anchor_generator = mg.MultiscaleGridAnchorGenerator(
min_level, max_level, anchor_scale, aspect_ratios, scales_per_octave,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
with self.test_session():
anchor_corners_out = anchor_corners.eval()
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
anchor_corners_out = self.execute(graph_fn, [])
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_single_anchor_unit_dimensions(self):
min_level = 5
max_level = 5
anchor_scale = 1.0
aspect_ratios = [1.0]
scales_per_octave = 1
im_height = 1
im_width = 1
feature_map_shape_list = [(2, 2)]
def graph_fn():
min_level = 5
max_level = 5
anchor_scale = 1.0
aspect_ratios = [1.0]
scales_per_octave = 1
im_height = 1
im_width = 1
feature_map_shape_list = [(2, 2)]
anchor_generator = mg.MultiscaleGridAnchorGenerator(
min_level, max_level, anchor_scale, aspect_ratios, scales_per_octave,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
return anchor_corners
# Positive offsets are produced.
exp_anchor_corners = [[0, 0, 32, 32],
[0, 32, 32, 64],
[32, 0, 64, 32],
[32, 32, 64, 64]]
anchor_generator = mg.MultiscaleGridAnchorGenerator(
min_level, max_level, anchor_scale, aspect_ratios, scales_per_octave,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
with self.test_session():
anchor_corners_out = anchor_corners.eval()
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
anchor_corners_out = self.execute(graph_fn, [])
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_normalized_anchors_fails_with_unit_dimensions(self):
anchor_generator = mg.MultiscaleGridAnchorGenerator(
@@ -82,28 +83,29 @@ class MultiscaleGridAnchorGeneratorTest(test_case.TestCase):
feature_map_shape_list=[(2, 2)], im_height=1, im_width=1)
def test_construct_single_anchor_in_normalized_coordinates(self):
min_level = 5
max_level = 5
anchor_scale = 4.0
aspect_ratios = [1.0]
scales_per_octave = 1
im_height = 64
im_width = 128
feature_map_shape_list = [(2, 2)]
def graph_fn():
min_level = 5
max_level = 5
anchor_scale = 4.0
aspect_ratios = [1.0]
scales_per_octave = 1
im_height = 64
im_width = 128
feature_map_shape_list = [(2, 2)]
anchor_generator = mg.MultiscaleGridAnchorGenerator(
min_level, max_level, anchor_scale, aspect_ratios, scales_per_octave,
normalize_coordinates=True)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
return anchor_corners
exp_anchor_corners = [[-48./64, -48./128, 80./64, 80./128],
[-48./64, -16./128, 80./64, 112./128],
[-16./64, -48./128, 112./64, 80./128],
[-16./64, -16./128, 112./64, 112./128]]
anchor_generator = mg.MultiscaleGridAnchorGenerator(
min_level, max_level, anchor_scale, aspect_ratios, scales_per_octave,
normalize_coordinates=True)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
with self.test_session():
anchor_corners_out = anchor_corners.eval()
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
anchor_corners_out = self.execute(graph_fn, [])
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_num_anchors_per_location(self):
min_level = 5
@@ -117,30 +119,34 @@ class MultiscaleGridAnchorGeneratorTest(test_case.TestCase):
self.assertEqual(anchor_generator.num_anchors_per_location(), [6, 6])
def test_construct_single_anchor_dynamic_size(self):
min_level = 5
max_level = 5
anchor_scale = 4.0
aspect_ratios = [1.0]
scales_per_octave = 1
im_height = tf.constant(64)
im_width = tf.constant(64)
feature_map_shape_list = [(2, 2)]
# Zero offsets are used.
def graph_fn():
min_level = 5
max_level = 5
anchor_scale = 4.0
aspect_ratios = [1.0]
scales_per_octave = 1
im_height = tf.constant(64)
im_width = tf.constant(64)
feature_map_shape_list = [(2, 2)]
anchor_generator = mg.MultiscaleGridAnchorGenerator(
min_level, max_level, anchor_scale, aspect_ratios, scales_per_octave,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
return anchor_corners
exp_anchor_corners = [[-64, -64, 64, 64],
[-64, -32, 64, 96],
[-32, -64, 96, 64],
[-32, -32, 96, 96]]
anchor_generator = mg.MultiscaleGridAnchorGenerator(
min_level, max_level, anchor_scale, aspect_ratios, scales_per_octave,
normalize_coordinates=False)
anchors_list = anchor_generator.generate(
feature_map_shape_list, im_height=im_height, im_width=im_width)
anchor_corners = anchors_list[0].get()
with self.test_session():
anchor_corners_out = anchor_corners.eval()
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
# Add anchor offset.
anchor_offset = 2.0**5 / 2.0
exp_anchor_corners = [
[b + anchor_offset for b in a] for a in exp_anchor_corners
]
anchor_corners_out = self.execute(graph_fn, [])
self.assertAllClose(anchor_corners_out, exp_anchor_corners)
def test_construct_single_anchor_with_odd_input_dimension(self):
......
@@ -14,80 +14,99 @@
# ==============================================================================
"""Tests for object_detection.box_coder.faster_rcnn_box_coder."""
import numpy as np
import tensorflow as tf
from object_detection.box_coders import faster_rcnn_box_coder
from object_detection.core import box_list
from object_detection.utils import test_case
class FasterRcnnBoxCoderTest(tf.test.TestCase):
class FasterRcnnBoxCoderTest(test_case.TestCase):
def test_get_correct_relative_codes_after_encoding(self):
boxes = [[10.0, 10.0, 20.0, 15.0], [0.2, 0.1, 0.5, 0.4]]
anchors = [[15.0, 12.0, 30.0, 18.0], [0.1, 0.0, 0.7, 0.9]]
boxes = np.array([[10.0, 10.0, 20.0, 15.0], [0.2, 0.1, 0.5, 0.4]],
np.float32)
anchors = np.array([[15.0, 12.0, 30.0, 18.0], [0.1, 0.0, 0.7, 0.9]],
np.float32)
expected_rel_codes = [[-0.5, -0.416666, -0.405465, -0.182321],
[-0.083333, -0.222222, -0.693147, -1.098612]]
boxes = box_list.BoxList(tf.constant(boxes))
anchors = box_list.BoxList(tf.constant(anchors))
coder = faster_rcnn_box_coder.FasterRcnnBoxCoder()
rel_codes = coder.encode(boxes, anchors)
with self.test_session() as sess:
rel_codes_out, = sess.run([rel_codes])
self.assertAllClose(rel_codes_out, expected_rel_codes)
def graph_fn(boxes, anchors):
boxes = box_list.BoxList(boxes)
anchors = box_list.BoxList(anchors)
coder = faster_rcnn_box_coder.FasterRcnnBoxCoder()
rel_codes = coder.encode(boxes, anchors)
return rel_codes
rel_codes_out = self.execute(graph_fn, [boxes, anchors])
self.assertAllClose(rel_codes_out, expected_rel_codes, rtol=1e-04,
atol=1e-04)
def test_get_correct_relative_codes_after_encoding_with_scaling(self):
boxes = [[10.0, 10.0, 20.0, 15.0], [0.2, 0.1, 0.5, 0.4]]
anchors = [[15.0, 12.0, 30.0, 18.0], [0.1, 0.0, 0.7, 0.9]]
scale_factors = [2, 3, 4, 5]
boxes = np.array([[10.0, 10.0, 20.0, 15.0], [0.2, 0.1, 0.5, 0.4]],
np.float32)
anchors = np.array([[15.0, 12.0, 30.0, 18.0], [0.1, 0.0, 0.7, 0.9]],
np.float32)
expected_rel_codes = [[-1., -1.25, -1.62186, -0.911608],
[-0.166667, -0.666667, -2.772588, -5.493062]]
boxes = box_list.BoxList(tf.constant(boxes))
anchors = box_list.BoxList(tf.constant(anchors))
coder = faster_rcnn_box_coder.FasterRcnnBoxCoder(
scale_factors=scale_factors)
rel_codes = coder.encode(boxes, anchors)
with self.test_session() as sess:
rel_codes_out, = sess.run([rel_codes])
self.assertAllClose(rel_codes_out, expected_rel_codes)
def graph_fn(boxes, anchors):
scale_factors = [2, 3, 4, 5]
boxes = box_list.BoxList(boxes)
anchors = box_list.BoxList(anchors)
coder = faster_rcnn_box_coder.FasterRcnnBoxCoder(
scale_factors=scale_factors)
rel_codes = coder.encode(boxes, anchors)
return rel_codes
rel_codes_out = self.execute(graph_fn, [boxes, anchors])
self.assertAllClose(rel_codes_out, expected_rel_codes, rtol=1e-04,
atol=1e-04)
def test_get_correct_boxes_after_decoding(self):
anchors = [[15.0, 12.0, 30.0, 18.0], [0.1, 0.0, 0.7, 0.9]]
rel_codes = [[-0.5, -0.416666, -0.405465, -0.182321],
[-0.083333, -0.222222, -0.693147, -1.098612]]
anchors = np.array([[15.0, 12.0, 30.0, 18.0], [0.1, 0.0, 0.7, 0.9]],
np.float32)
rel_codes = np.array([[-0.5, -0.416666, -0.405465, -0.182321],
[-0.083333, -0.222222, -0.693147, -1.098612]],
np.float32)
expected_boxes = [[10.0, 10.0, 20.0, 15.0], [0.2, 0.1, 0.5, 0.4]]
anchors = box_list.BoxList(tf.constant(anchors))
coder = faster_rcnn_box_coder.FasterRcnnBoxCoder()
boxes = coder.decode(rel_codes, anchors)
with self.test_session() as sess:
boxes_out, = sess.run([boxes.get()])
self.assertAllClose(boxes_out, expected_boxes)
def graph_fn(rel_codes, anchors):
anchors = box_list.BoxList(anchors)
coder = faster_rcnn_box_coder.FasterRcnnBoxCoder()
boxes = coder.decode(rel_codes, anchors)
return boxes.get()
boxes_out = self.execute(graph_fn, [rel_codes, anchors])
self.assertAllClose(boxes_out, expected_boxes, rtol=1e-04,
atol=1e-04)
def test_get_correct_boxes_after_decoding_with_scaling(self):
anchors = [[15.0, 12.0, 30.0, 18.0], [0.1, 0.0, 0.7, 0.9]]
rel_codes = [[-1., -1.25, -1.62186, -0.911608],
[-0.166667, -0.666667, -2.772588, -5.493062]]
scale_factors = [2, 3, 4, 5]
anchors = np.array([[15.0, 12.0, 30.0, 18.0], [0.1, 0.0, 0.7, 0.9]],
np.float32)
rel_codes = np.array([[-1., -1.25, -1.62186, -0.911608],
[-0.166667, -0.666667, -2.772588, -5.493062]],
np.float32)
expected_boxes = [[10.0, 10.0, 20.0, 15.0], [0.2, 0.1, 0.5, 0.4]]
anchors = box_list.BoxList(tf.constant(anchors))
coder = faster_rcnn_box_coder.FasterRcnnBoxCoder(
scale_factors=scale_factors)
boxes = coder.decode(rel_codes, anchors)
with self.test_session() as sess:
boxes_out, = sess.run([boxes.get()])
self.assertAllClose(boxes_out, expected_boxes)
def graph_fn(rel_codes, anchors):
scale_factors = [2, 3, 4, 5]
anchors = box_list.BoxList(anchors)
coder = faster_rcnn_box_coder.FasterRcnnBoxCoder(
scale_factors=scale_factors)
boxes = coder.decode(rel_codes, anchors).get()
return boxes
boxes_out = self.execute(graph_fn, [rel_codes, anchors])
self.assertAllClose(expected_boxes, boxes_out, rtol=1e-04,
atol=1e-04)
def test_very_small_Width_nan_after_encoding(self):
boxes = [[10.0, 10.0, 10.0000001, 20.0]]
anchors = [[15.0, 12.0, 30.0, 18.0]]
boxes = np.array([[10.0, 10.0, 10.0000001, 20.0]], np.float32)
anchors = np.array([[15.0, 12.0, 30.0, 18.0]], np.float32)
expected_rel_codes = [[-0.833333, 0., -21.128731, 0.510826]]
boxes = box_list.BoxList(tf.constant(boxes))
anchors = box_list.BoxList(tf.constant(anchors))
coder = faster_rcnn_box_coder.FasterRcnnBoxCoder()
rel_codes = coder.encode(boxes, anchors)
with self.test_session() as sess:
rel_codes_out, = sess.run([rel_codes])
self.assertAllClose(rel_codes_out, expected_rel_codes)
def graph_fn(boxes, anchors):
boxes = box_list.BoxList(boxes)
anchors = box_list.BoxList(anchors)
coder = faster_rcnn_box_coder.FasterRcnnBoxCoder()
rel_codes = coder.encode(boxes, anchors)
return rel_codes
rel_codes_out = self.execute(graph_fn, [boxes, anchors])
self.assertAllClose(rel_codes_out, expected_rel_codes, rtol=1e-04,
atol=1e-04)
if __name__ == '__main__':
......
@@ -14,126 +14,137 @@
# ==============================================================================
"""Tests for object_detection.box_coder.keypoint_box_coder."""
import numpy as np
import tensorflow as tf
from object_detection.box_coders import keypoint_box_coder
from object_detection.core import box_list
from object_detection.core import standard_fields as fields
from object_detection.utils import test_case
class KeypointBoxCoderTest(tf.test.TestCase):
class KeypointBoxCoderTest(test_case.TestCase):
def test_get_correct_relative_codes_after_encoding(self):
boxes = [[10., 10., 20., 15.],
[0.2, 0.1, 0.5, 0.4]]
keypoints = [[[15., 12.], [10., 15.]],
[[0.5, 0.3], [0.2, 0.4]]]
boxes = np.array([[10., 10., 20., 15.],
[0.2, 0.1, 0.5, 0.4]], np.float32)
keypoints = np.array([[[15., 12.], [10., 15.]],
[[0.5, 0.3], [0.2, 0.4]]], np.float32)
num_keypoints = len(keypoints[0])
anchors = [[15., 12., 30., 18.],
[0.1, 0.0, 0.7, 0.9]]
anchors = np.array([[15., 12., 30., 18.],
[0.1, 0.0, 0.7, 0.9]], np.float32)
expected_rel_codes = [
[-0.5, -0.416666, -0.405465, -0.182321,
-0.5, -0.5, -0.833333, 0.],
[-0.083333, -0.222222, -0.693147, -1.098612,
0.166667, -0.166667, -0.333333, -0.055556]
]
boxes = box_list.BoxList(tf.constant(boxes))
boxes.add_field(fields.BoxListFields.keypoints, tf.constant(keypoints))
anchors = box_list.BoxList(tf.constant(anchors))
coder = keypoint_box_coder.KeypointBoxCoder(num_keypoints)
rel_codes = coder.encode(boxes, anchors)
with self.test_session() as sess:
rel_codes_out, = sess.run([rel_codes])
self.assertAllClose(rel_codes_out, expected_rel_codes)
def graph_fn(boxes, keypoints, anchors):
boxes = box_list.BoxList(boxes)
boxes.add_field(fields.BoxListFields.keypoints, keypoints)
anchors = box_list.BoxList(anchors)
coder = keypoint_box_coder.KeypointBoxCoder(num_keypoints)
rel_codes = coder.encode(boxes, anchors)
return rel_codes
rel_codes_out = self.execute(graph_fn, [boxes, keypoints, anchors])
self.assertAllClose(rel_codes_out, expected_rel_codes, rtol=1e-04,
atol=1e-04)
def test_get_correct_relative_codes_after_encoding_with_scaling(self):
boxes = [[10., 10., 20., 15.],
[0.2, 0.1, 0.5, 0.4]]
keypoints = [[[15., 12.], [10., 15.]],
[[0.5, 0.3], [0.2, 0.4]]]
boxes = np.array([[10., 10., 20., 15.],
[0.2, 0.1, 0.5, 0.4]], np.float32)
keypoints = np.array([[[15., 12.], [10., 15.]],
[[0.5, 0.3], [0.2, 0.4]]], np.float32)
num_keypoints = len(keypoints[0])
anchors = [[15., 12., 30., 18.],
[0.1, 0.0, 0.7, 0.9]]
scale_factors = [2, 3, 4, 5]
anchors = np.array([[15., 12., 30., 18.],
[0.1, 0.0, 0.7, 0.9]], np.float32)
expected_rel_codes = [
[-1., -1.25, -1.62186, -0.911608,
-1.0, -1.5, -1.666667, 0.],
[-0.166667, -0.666667, -2.772588, -5.493062,
0.333333, -0.5, -0.666667, -0.166667]
]
boxes = box_list.BoxList(tf.constant(boxes))
boxes.add_field(fields.BoxListFields.keypoints, tf.constant(keypoints))
anchors = box_list.BoxList(tf.constant(anchors))
coder = keypoint_box_coder.KeypointBoxCoder(
num_keypoints, scale_factors=scale_factors)
rel_codes = coder.encode(boxes, anchors)
with self.test_session() as sess:
rel_codes_out, = sess.run([rel_codes])
self.assertAllClose(rel_codes_out, expected_rel_codes)
def graph_fn(boxes, keypoints, anchors):
scale_factors = [2, 3, 4, 5]
boxes = box_list.BoxList(boxes)
boxes.add_field(fields.BoxListFields.keypoints, keypoints)
anchors = box_list.BoxList(anchors)
coder = keypoint_box_coder.KeypointBoxCoder(
num_keypoints, scale_factors=scale_factors)
rel_codes = coder.encode(boxes, anchors)
return rel_codes
rel_codes_out = self.execute(graph_fn, [boxes, keypoints, anchors])
self.assertAllClose(rel_codes_out, expected_rel_codes, rtol=1e-04,
atol=1e-04)
def test_get_correct_boxes_after_decoding(self):
anchors = [[15., 12., 30., 18.],
[0.1, 0.0, 0.7, 0.9]]
rel_codes = [
anchors = np.array([[15., 12., 30., 18.],
[0.1, 0.0, 0.7, 0.9]], np.float32)
rel_codes = np.array([
[-0.5, -0.416666, -0.405465, -0.182321,
-0.5, -0.5, -0.833333, 0.],
[-0.083333, -0.222222, -0.693147, -1.098612,
0.166667, -0.166667, -0.333333, -0.055556]
]
], np.float32)
expected_boxes = [[10., 10., 20., 15.],
[0.2, 0.1, 0.5, 0.4]]
expected_keypoints = [[[15., 12.], [10., 15.]],
[[0.5, 0.3], [0.2, 0.4]]]
num_keypoints = len(expected_keypoints[0])
anchors = box_list.BoxList(tf.constant(anchors))
coder = keypoint_box_coder.KeypointBoxCoder(num_keypoints)
boxes = coder.decode(rel_codes, anchors)
with self.test_session() as sess:
boxes_out, keypoints_out = sess.run(
[boxes.get(), boxes.get_field(fields.BoxListFields.keypoints)])
self.assertAllClose(boxes_out, expected_boxes)
self.assertAllClose(keypoints_out, expected_keypoints)
def graph_fn(rel_codes, anchors):
anchors = box_list.BoxList(anchors)
coder = keypoint_box_coder.KeypointBoxCoder(num_keypoints)
boxes = coder.decode(rel_codes, anchors)
return boxes.get(), boxes.get_field(fields.BoxListFields.keypoints)
boxes_out, keypoints_out = self.execute(graph_fn, [rel_codes, anchors])
self.assertAllClose(keypoints_out, expected_keypoints, rtol=1e-04,
atol=1e-04)
self.assertAllClose(boxes_out, expected_boxes, rtol=1e-04,
atol=1e-04)
def test_get_correct_boxes_after_decoding_with_scaling(self):
anchors = [[15., 12., 30., 18.],
[0.1, 0.0, 0.7, 0.9]]
rel_codes = [
anchors = np.array([[15., 12., 30., 18.],
[0.1, 0.0, 0.7, 0.9]], np.float32)
rel_codes = np.array([
[-1., -1.25, -1.62186, -0.911608,
-1.0, -1.5, -1.666667, 0.],
[-0.166667, -0.666667, -2.772588, -5.493062,
0.333333, -0.5, -0.666667, -0.166667]
]
scale_factors = [2, 3, 4, 5]
], np.float32)
expected_boxes = [[10., 10., 20., 15.],
[0.2, 0.1, 0.5, 0.4]]
expected_keypoints = [[[15., 12.], [10., 15.]],
[[0.5, 0.3], [0.2, 0.4]]]
num_keypoints = len(expected_keypoints[0])
anchors = box_list.BoxList(tf.constant(anchors))
coder = keypoint_box_coder.KeypointBoxCoder(
num_keypoints, scale_factors=scale_factors)
boxes = coder.decode(rel_codes, anchors)
with self.test_session() as sess:
boxes_out, keypoints_out = sess.run(
[boxes.get(), boxes.get_field(fields.BoxListFields.keypoints)])
self.assertAllClose(boxes_out, expected_boxes)
self.assertAllClose(keypoints_out, expected_keypoints)
def graph_fn(rel_codes, anchors):
scale_factors = [2, 3, 4, 5]
anchors = box_list.BoxList(anchors)
coder = keypoint_box_coder.KeypointBoxCoder(
num_keypoints, scale_factors=scale_factors)
boxes = coder.decode(rel_codes, anchors)
return boxes.get(), boxes.get_field(fields.BoxListFields.keypoints)
boxes_out, keypoints_out = self.execute(graph_fn, [rel_codes, anchors])
self.assertAllClose(keypoints_out, expected_keypoints, rtol=1e-04,
atol=1e-04)
self.assertAllClose(boxes_out, expected_boxes, rtol=1e-04,
atol=1e-04)
def test_very_small_width_nan_after_encoding(self):
boxes = [[10., 10., 10.0000001, 20.]]
keypoints = [[[10., 10.], [10.0000001, 20.]]]
anchors = [[15., 12., 30., 18.]]
boxes = np.array([[10., 10., 10.0000001, 20.]], np.float32)
keypoints = np.array([[[10., 10.], [10.0000001, 20.]]], np.float32)
anchors = np.array([[15., 12., 30., 18.]], np.float32)
expected_rel_codes = [[-0.833333, 0., -21.128731, 0.510826,
-0.833333, -0.833333, -0.833333, 0.833333]]
boxes = box_list.BoxList(tf.constant(boxes))
boxes.add_field(fields.BoxListFields.keypoints, tf.constant(keypoints))
anchors = box_list.BoxList(tf.constant(anchors))
coder = keypoint_box_coder.KeypointBoxCoder(2)
rel_codes = coder.encode(boxes, anchors)
with self.test_session() as sess:
rel_codes_out, = sess.run([rel_codes])
self.assertAllClose(rel_codes_out, expected_rel_codes)
def graph_fn(boxes, keypoints, anchors):
boxes = box_list.BoxList(boxes)
boxes.add_field(fields.BoxListFields.keypoints, keypoints)
anchors = box_list.BoxList(anchors)
coder = keypoint_box_coder.KeypointBoxCoder(2)
rel_codes = coder.encode(boxes, anchors)
return rel_codes
rel_codes_out = self.execute(graph_fn, [boxes, keypoints, anchors])
self.assertAllClose(rel_codes_out, expected_rel_codes, rtol=1e-04,
atol=1e-04)
if __name__ == '__main__':
......
@@ -14,40 +14,47 @@
# ==============================================================================
"""Tests for object_detection.box_coder.mean_stddev_boxcoder."""
import numpy as np
import tensorflow as tf
from object_detection.box_coders import mean_stddev_box_coder
from object_detection.core import box_list
from object_detection.utils import test_case
class MeanStddevBoxCoderTest(tf.test.TestCase):
class MeanStddevBoxCoderTest(test_case.TestCase):
def testGetCorrectRelativeCodesAfterEncoding(self):
box_corners = [[0.0, 0.0, 0.5, 0.5], [0.0, 0.0, 0.5, 0.5]]
boxes = box_list.BoxList(tf.constant(box_corners))
boxes = np.array([[0.0, 0.0, 0.5, 0.5], [0.0, 0.0, 0.5, 0.5]], np.float32)
anchors = np.array([[0.0, 0.0, 0.5, 0.5], [0.5, 0.5, 1.0, 0.8]], np.float32)
expected_rel_codes = [[0.0, 0.0, 0.0, 0.0], [-5.0, -5.0, -5.0, -3.0]]
prior_means = tf.constant([[0.0, 0.0, 0.5, 0.5], [0.5, 0.5, 1.0, 0.8]])
priors = box_list.BoxList(prior_means)
coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
rel_codes = coder.encode(boxes, priors)
with self.test_session() as sess:
rel_codes_out = sess.run(rel_codes)
self.assertAllClose(rel_codes_out, expected_rel_codes)
def graph_fn(boxes, anchors):
anchors = box_list.BoxList(anchors)
boxes = box_list.BoxList(boxes)
coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
rel_codes = coder.encode(boxes, anchors)
return rel_codes
rel_codes_out = self.execute(graph_fn, [boxes, anchors])
self.assertAllClose(rel_codes_out, expected_rel_codes, rtol=1e-04,
atol=1e-04)
def testGetCorrectBoxesAfterDecoding(self):
rel_codes = tf.constant([[0.0, 0.0, 0.0, 0.0], [-5.0, -5.0, -5.0, -3.0]])
rel_codes = np.array([[0.0, 0.0, 0.0, 0.0], [-5.0, -5.0, -5.0, -3.0]],
np.float32)
expected_box_corners = [[0.0, 0.0, 0.5, 0.5], [0.0, 0.0, 0.5, 0.5]]
prior_means = tf.constant([[0.0, 0.0, 0.5, 0.5], [0.5, 0.5, 1.0, 0.8]])
priors = box_list.BoxList(prior_means)
coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
decoded_boxes = coder.decode(rel_codes, priors)
decoded_box_corners = decoded_boxes.get()
with self.test_session() as sess:
decoded_out = sess.run(decoded_box_corners)
self.assertAllClose(decoded_out, expected_box_corners)
anchors = np.array([[0.0, 0.0, 0.5, 0.5], [0.5, 0.5, 1.0, 0.8]], np.float32)
def graph_fn(rel_codes, anchors):
anchors = box_list.BoxList(anchors)
coder = mean_stddev_box_coder.MeanStddevBoxCoder(stddev=0.1)
decoded_boxes = coder.decode(rel_codes, anchors).get()
return decoded_boxes
decoded_boxes_out = self.execute(graph_fn, [rel_codes, anchors])
self.assertAllClose(decoded_boxes_out, expected_box_corners, rtol=1e-04,
atol=1e-04)
if __name__ == '__main__':
......
@@ -14,83 +14,100 @@
# ==============================================================================
"""Tests for object_detection.box_coder.square_box_coder."""
import numpy as np
import tensorflow as tf
from object_detection.box_coders import square_box_coder
from object_detection.core import box_list
from object_detection.utils import test_case
class SquareBoxCoderTest(tf.test.TestCase):
class SquareBoxCoderTest(test_case.TestCase):
def test_correct_relative_codes_with_default_scale(self):
boxes = [[10.0, 10.0, 20.0, 15.0], [0.2, 0.1, 0.5, 0.4]]
anchors = [[15.0, 12.0, 30.0, 18.0], [0.1, 0.0, 0.7, 0.9]]
scale_factors = None
boxes = np.array([[10.0, 10.0, 20.0, 15.0], [0.2, 0.1, 0.5, 0.4]],
np.float32)
anchors = np.array([[15.0, 12.0, 30.0, 18.0], [0.1, 0.0, 0.7, 0.9]],
np.float32)
expected_rel_codes = [[-0.790569, -0.263523, -0.293893],
[-0.068041, -0.272166, -0.89588]]
boxes = box_list.BoxList(tf.constant(boxes))
anchors = box_list.BoxList(tf.constant(anchors))
coder = square_box_coder.SquareBoxCoder(scale_factors=scale_factors)
rel_codes = coder.encode(boxes, anchors)
with self.test_session() as sess:
(rel_codes_out,) = sess.run([rel_codes])
self.assertAllClose(rel_codes_out, expected_rel_codes)
def graph_fn(boxes, anchors):
scale_factors = None
boxes = box_list.BoxList(boxes)
anchors = box_list.BoxList(anchors)
coder = square_box_coder.SquareBoxCoder(scale_factors=scale_factors)
rel_codes = coder.encode(boxes, anchors)
return rel_codes
rel_codes_out = self.execute(graph_fn, [boxes, anchors])
self.assertAllClose(rel_codes_out, expected_rel_codes, rtol=1e-04,
atol=1e-04)
def test_correct_relative_codes_with_non_default_scale(self):
boxes = [[10.0, 10.0, 20.0, 15.0], [0.2, 0.1, 0.5, 0.4]]
anchors = [[15.0, 12.0, 30.0, 18.0], [0.1, 0.0, 0.7, 0.9]]
scale_factors = [2, 3, 4]
boxes = np.array([[10.0, 10.0, 20.0, 15.0], [0.2, 0.1, 0.5, 0.4]],
np.float32)
anchors = np.array([[15.0, 12.0, 30.0, 18.0], [0.1, 0.0, 0.7, 0.9]],
np.float32)
expected_rel_codes = [[-1.581139, -0.790569, -1.175573],
[-0.136083, -0.816497, -3.583519]]
boxes = box_list.BoxList(tf.constant(boxes))
anchors = box_list.BoxList(tf.constant(anchors))
coder = square_box_coder.SquareBoxCoder(scale_factors=scale_factors)
rel_codes = coder.encode(boxes, anchors)
with self.test_session() as sess:
(rel_codes_out,) = sess.run([rel_codes])
self.assertAllClose(rel_codes_out, expected_rel_codes)
def graph_fn(boxes, anchors):
scale_factors = [2, 3, 4]
boxes = box_list.BoxList(boxes)
anchors = box_list.BoxList(anchors)
coder = square_box_coder.SquareBoxCoder(scale_factors=scale_factors)
rel_codes = coder.encode(boxes, anchors)
return rel_codes
rel_codes_out = self.execute(graph_fn, [boxes, anchors])
self.assertAllClose(rel_codes_out, expected_rel_codes, rtol=1e-03,
atol=1e-03)
def test_correct_relative_codes_with_small_width(self):
boxes = [[10.0, 10.0, 10.0000001, 20.0]]
anchors = [[15.0, 12.0, 30.0, 18.0]]
scale_factors = None
boxes = np.array([[10.0, 10.0, 10.0000001, 20.0]], np.float32)
anchors = np.array([[15.0, 12.0, 30.0, 18.0]], np.float32)
expected_rel_codes = [[-1.317616, 0., -20.670586]]
boxes = box_list.BoxList(tf.constant(boxes))
anchors = box_list.BoxList(tf.constant(anchors))
coder = square_box_coder.SquareBoxCoder(scale_factors=scale_factors)
rel_codes = coder.encode(boxes, anchors)
with self.test_session() as sess:
(rel_codes_out,) = sess.run([rel_codes])
self.assertAllClose(rel_codes_out, expected_rel_codes)
def graph_fn(boxes, anchors):
scale_factors = None
boxes = box_list.BoxList(boxes)
anchors = box_list.BoxList(anchors)
coder = square_box_coder.SquareBoxCoder(scale_factors=scale_factors)
rel_codes = coder.encode(boxes, anchors)
return rel_codes
rel_codes_out = self.execute(graph_fn, [boxes, anchors])
self.assertAllClose(rel_codes_out, expected_rel_codes, rtol=1e-04,
atol=1e-04)
def test_correct_boxes_with_default_scale(self):
anchors = [[15.0, 12.0, 30.0, 18.0], [0.1, 0.0, 0.7, 0.9]]
rel_codes = [[-0.5, -0.416666, -0.405465],
[-0.083333, -0.222222, -0.693147]]
scale_factors = None
anchors = np.array([[15.0, 12.0, 30.0, 18.0], [0.1, 0.0, 0.7, 0.9]],
np.float32)
rel_codes = np.array([[-0.5, -0.416666, -0.405465],
[-0.083333, -0.222222, -0.693147]], np.float32)
expected_boxes = [[14.594306, 7.884875, 20.918861, 14.209432],
[0.155051, 0.102989, 0.522474, 0.470412]]
anchors = box_list.BoxList(tf.constant(anchors))
coder = square_box_coder.SquareBoxCoder(scale_factors=scale_factors)
boxes = coder.decode(rel_codes, anchors)
with self.test_session() as sess:
(boxes_out,) = sess.run([boxes.get()])
self.assertAllClose(boxes_out, expected_boxes)
def graph_fn(rel_codes, anchors):
scale_factors = None
anchors = box_list.BoxList(anchors)
coder = square_box_coder.SquareBoxCoder(scale_factors=scale_factors)
boxes = coder.decode(rel_codes, anchors).get()
return boxes
boxes_out = self.execute(graph_fn, [rel_codes, anchors])
self.assertAllClose(boxes_out, expected_boxes, rtol=1e-04,
atol=1e-04)
def test_correct_boxes_with_non_default_scale(self):
anchors = [[15.0, 12.0, 30.0, 18.0], [0.1, 0.0, 0.7, 0.9]]
rel_codes = [[-1., -1.25, -1.62186], [-0.166667, -0.666667, -2.772588]]
scale_factors = [2, 3, 4]
anchors = np.array([[15.0, 12.0, 30.0, 18.0], [0.1, 0.0, 0.7, 0.9]],
np.float32)
rel_codes = np.array(
[[-1., -1.25, -1.62186], [-0.166667, -0.666667, -2.772588]], np.float32)
expected_boxes = [[14.594306, 7.884875, 20.918861, 14.209432],
[0.155051, 0.102989, 0.522474, 0.470412]]
anchors = box_list.BoxList(tf.constant(anchors))
coder = square_box_coder.SquareBoxCoder(scale_factors=scale_factors)
boxes = coder.decode(rel_codes, anchors)
with self.test_session() as sess:
(boxes_out,) = sess.run([boxes.get()])
self.assertAllClose(boxes_out, expected_boxes)
def graph_fn(rel_codes, anchors):
scale_factors = [2, 3, 4]
anchors = box_list.BoxList(anchors)
coder = square_box_coder.SquareBoxCoder(scale_factors=scale_factors)
boxes = coder.decode(rel_codes, anchors).get()
return boxes
boxes_out = self.execute(graph_fn, [rel_codes, anchors])
self.assertAllClose(boxes_out, expected_boxes, rtol=1e-04,
atol=1e-04)
if __name__ == '__main__':
......
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
@@ -15,6 +16,10 @@
"""A function to build an object detection anchor generator from config."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from six.moves import zip
from object_detection.anchor_generators import flexible_grid_anchor_generator
from object_detection.anchor_generators import grid_anchor_generator
from object_detection.anchor_generators import multiple_grid_anchor_generator
......@@ -58,12 +63,14 @@ def build(anchor_generator_config):
ssd_anchor_generator_config = anchor_generator_config.ssd_anchor_generator
anchor_strides = None
if ssd_anchor_generator_config.height_stride:
anchor_strides = list(
zip(ssd_anchor_generator_config.height_stride,
ssd_anchor_generator_config.width_stride))
anchor_offsets = None
if ssd_anchor_generator_config.height_offset:
anchor_offsets = list(
zip(ssd_anchor_generator_config.height_offset,
ssd_anchor_generator_config.width_offset))
return multiple_grid_anchor_generator.create_ssd_anchors(
num_layers=ssd_anchor_generator_config.num_layers,
min_scale=ssd_anchor_generator_config.min_scale,
......
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -15,8 +16,14 @@
"""Tests for anchor_generator_builder."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import math
from six.moves import range
from six.moves import zip
import tensorflow as tf
from google.protobuf import text_format
......
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -487,8 +488,8 @@ class MaskRCNNBoxPredictorBuilderTest(tf.test.TestCase):
self.assertEqual(box_predictor.num_classes, 90)
self.assertTrue(box_predictor._is_training)
self.assertEqual(box_head._box_code_size, 4)
self.assertIn(
mask_rcnn_box_predictor.MASK_PREDICTIONS, third_stage_heads)
self.assertEqual(
third_stage_heads[mask_rcnn_box_predictor.MASK_PREDICTIONS]
._mask_prediction_conv_depth, 512)
......@@ -527,8 +528,8 @@ class MaskRCNNBoxPredictorBuilderTest(tf.test.TestCase):
self.assertEqual(box_predictor.num_classes, 90)
self.assertTrue(box_predictor._is_training)
self.assertEqual(box_head._box_code_size, 4)
self.assertIn(
mask_rcnn_box_predictor.MASK_PREDICTIONS, third_stage_heads)
self.assertEqual(
third_stage_heads[mask_rcnn_box_predictor.MASK_PREDICTIONS]
._mask_prediction_conv_depth, 512)
......
# Lint as: python2, python3
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -15,8 +16,12 @@
"""Tests for calibration_builder."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
from scipy import interpolate
from six.moves import zip
import tensorflow as tf
from object_detection.builders import calibration_builder
from object_detection.protos import calibration_pb2
......
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -21,10 +22,15 @@ Note: If users wishes to also use their own InputReaders with the Object
Detection configuration framework, they should define their own builder function
that wraps the build function.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import functools
import tensorflow as tf
from object_detection.data_decoders import tf_example_decoder
from tensorflow.contrib import data as tf_data
from object_detection.builders import decoder_builder
from object_detection.protos import input_reader_pb2
......@@ -45,14 +51,20 @@ def make_initializable_iterator(dataset):
return iterator
def read_dataset(file_read_func, input_files, config,
filename_shard_fn=None):
"""Reads a dataset, and handles repetition and shuffling.
Args:
file_read_func: Function to use in tf_data.parallel_interleave, to
read every individual file into a tf.data.Dataset.
input_files: A list of file paths to read.
config: A input_reader_builder.InputReader object.
    filename_shard_fn: optional, a function used to shard filenames across
      replicas. This function takes as input a TF dataset of filenames and is
      expected to return its sharded version. It is useful when the dataset is
      being loaded on one of possibly many replicas and we want to shard the
      files evenly across them.
Returns:
A tf.data.Dataset of (undecoded) tf-records based on config.
......@@ -77,9 +89,12 @@ def read_dataset(file_read_func, input_files, config):
elif num_readers > 1:
tf.logging.warning('`shuffle` is false, but the input data stream is '
'still slightly shuffled since `num_readers` > 1.')
if filename_shard_fn:
filename_dataset = filename_shard_fn(filename_dataset)
filename_dataset = filename_dataset.repeat(config.num_epochs or None)
records_dataset = filename_dataset.apply(
tf_data.parallel_interleave(
file_read_func,
cycle_length=num_readers,
block_length=config.read_block_length,
......@@ -89,7 +104,21 @@ def read_dataset(file_read_func, input_files, config):
return records_dataset
def shard_function_for_context(input_context):
"""Returns a function that shards filenames based on the input context."""
if input_context is None:
return None
def shard_fn(dataset):
return dataset.shard(
input_context.num_input_pipelines, input_context.input_pipeline_id)
return shard_fn
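# A hedged sketch (not part of this module) of how the returned shard
# function behaves; the fake context class and filenames below are
# illustrative stand-ins for a real tf.distribute.InputContext and real files.
def _shard_function_for_context_demo():
  class _FakeInputContext(object):  # mimics tf.distribute.InputContext
    num_input_pipelines = 2
    input_pipeline_id = 0

  shard_fn = shard_function_for_context(_FakeInputContext())
  filenames = tf.data.Dataset.from_tensor_slices(
      ['a.tfrecord', 'b.tfrecord', 'c.tfrecord', 'd.tfrecord'])
  # With 2 pipelines and pipeline_id 0, the shard keeps every other file:
  # 'a.tfrecord' and 'c.tfrecord'. Pipeline 1 would see the other two.
  return shard_fn(filenames)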
def build(input_reader_config, batch_size=None, transform_input_data_fn=None,
input_context=None):
"""Builds a tf.data.Dataset.
Builds a tf.data.Dataset by applying the `transform_input_data_fn` on all
......@@ -100,6 +129,9 @@ def build(input_reader_config, batch_size=None, transform_input_data_fn=None):
batch_size: Batch size. If batch size is None, no batching is performed.
transform_input_data_fn: Function to apply transformation to all records,
or None if no extra decoding is required.
    input_context: optional, a tf.distribute.InputContext object used to
      shard filenames and compute the per-replica batch_size when this
      function is called once per replica.
Returns:
A tf.data.Dataset based on the input_reader_config.
......@@ -112,23 +144,14 @@ def build(input_reader_config, batch_size=None, transform_input_data_fn=None):
raise ValueError('input_reader_config not of type '
'input_reader_pb2.InputReader.')
decoder = decoder_builder.build(input_reader_config)
if input_reader_config.WhichOneof('input_reader') == 'tf_record_input_reader':
config = input_reader_config.tf_record_input_reader
if not config.input_path:
raise ValueError('At least one input path must be specified in '
'`input_reader_config`.')
def process_fn(value):
"""Sets up tf graph that decodes, transforms and pads input data."""
processed_tensors = decoder.decode(value)
......@@ -136,9 +159,13 @@ def build(input_reader_config, batch_size=None, transform_input_data_fn=None):
processed_tensors = transform_input_data_fn(processed_tensors)
return processed_tensors
shard_fn = shard_function_for_context(input_context)
if input_context is not None:
batch_size = input_context.get_per_replica_batch_size(batch_size)
dataset = read_dataset(
functools.partial(tf.data.TFRecordDataset, buffer_size=8 * 1000 * 1000),
config.input_path[:], input_reader_config, filename_shard_fn=shard_fn)
if input_reader_config.sample_1_of_n_examples > 1:
dataset = dataset.shard(input_reader_config.sample_1_of_n_examples, 0)
# TODO(rathodv): make batch size a required argument once the old binaries
......@@ -155,7 +182,7 @@ def build(input_reader_config, batch_size=None, transform_input_data_fn=None):
dataset = data_map_fn(process_fn, num_parallel_calls=num_parallel_calls)
if batch_size:
dataset = dataset.apply(
tf_data.batch_and_drop_remainder(batch_size))
dataset = dataset.prefetch(input_reader_config.num_prefetch_batches)
return dataset
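# A hedged usage sketch for the new `input_context` plumbing; the tfrecord
# path and global batch size below are assumptions for illustration only.
def _sharded_input_fn_demo(input_context):
  from google.protobuf import text_format  # local import for the sketch
  reader_proto = input_reader_pb2.InputReader()
  text_format.Merge(
      "tf_record_input_reader { input_path: '/tmp/example.tfrecord' }",
      reader_proto)
  # Filenames get sharded across replicas via `input_context`, and the global
  # batch size of 64 becomes 64 // input_context.num_replicas_in_sync.
  return build(reader_proto, batch_size=64, input_context=input_context)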
......
# Lint as: python2, python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""DataDecoder builder.
Creates DataDecoders from InputReader configs.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from object_detection.data_decoders import tf_example_decoder
from object_detection.protos import input_reader_pb2
def build(input_reader_config):
"""Builds a DataDecoder based only on the open source config proto.
Args:
input_reader_config: An input_reader_pb2.InputReader object.
Returns:
A DataDecoder based on the input_reader_config.
Raises:
ValueError: On invalid input reader proto.
"""
if not isinstance(input_reader_config, input_reader_pb2.InputReader):
raise ValueError('input_reader_config not of type '
'input_reader_pb2.InputReader.')
if input_reader_config.WhichOneof('input_reader') == 'tf_record_input_reader':
label_map_proto_file = None
if input_reader_config.HasField('label_map_path'):
label_map_proto_file = input_reader_config.label_map_path
decoder = tf_example_decoder.TfExampleDecoder(
load_instance_masks=input_reader_config.load_instance_masks,
load_multiclass_scores=input_reader_config.load_multiclass_scores,
load_context_features=input_reader_config.load_context_features,
instance_mask_type=input_reader_config.mask_type,
label_map_proto_file=label_map_proto_file,
use_display_name=input_reader_config.use_display_name,
num_additional_channels=input_reader_config.num_additional_channels,
num_keypoints=input_reader_config.num_keypoints)
return decoder
raise ValueError('Unsupported input_reader_config.')
# Lint as: python2, python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for decoder_builder."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import tensorflow as tf
from google.protobuf import text_format
from object_detection.builders import decoder_builder
from object_detection.core import standard_fields as fields
from object_detection.protos import input_reader_pb2
from object_detection.utils import dataset_util
class DecoderBuilderTest(tf.test.TestCase):
def _make_serialized_tf_example(self, has_additional_channels=False):
image_tensor = np.random.randint(255, size=(4, 5, 3)).astype(np.uint8)
additional_channels_tensor = np.random.randint(
255, size=(4, 5, 1)).astype(np.uint8)
flat_mask = (4 * 5) * [1.0]
with self.test_session():
encoded_jpeg = tf.image.encode_jpeg(tf.constant(image_tensor)).eval()
encoded_additional_channels_jpeg = tf.image.encode_jpeg(
tf.constant(additional_channels_tensor)).eval()
features = {
'image/source_id': dataset_util.bytes_feature('0'.encode()),
'image/encoded': dataset_util.bytes_feature(encoded_jpeg),
'image/format': dataset_util.bytes_feature('jpeg'.encode('utf8')),
'image/height': dataset_util.int64_feature(4),
'image/width': dataset_util.int64_feature(5),
'image/object/bbox/xmin': dataset_util.float_list_feature([0.0]),
'image/object/bbox/xmax': dataset_util.float_list_feature([1.0]),
'image/object/bbox/ymin': dataset_util.float_list_feature([0.0]),
'image/object/bbox/ymax': dataset_util.float_list_feature([1.0]),
'image/object/class/label': dataset_util.int64_list_feature([2]),
'image/object/mask': dataset_util.float_list_feature(flat_mask),
}
if has_additional_channels:
additional_channels_key = 'image/additional_channels/encoded'
features[additional_channels_key] = dataset_util.bytes_list_feature(
[encoded_additional_channels_jpeg] * 2)
example = tf.train.Example(features=tf.train.Features(feature=features))
return example.SerializeToString()
def test_build_tf_record_input_reader(self):
input_reader_text_proto = 'tf_record_input_reader {}'
input_reader_proto = input_reader_pb2.InputReader()
text_format.Parse(input_reader_text_proto, input_reader_proto)
decoder = decoder_builder.build(input_reader_proto)
tensor_dict = decoder.decode(self._make_serialized_tf_example())
with tf.train.MonitoredSession() as sess:
output_dict = sess.run(tensor_dict)
self.assertNotIn(
fields.InputDataFields.groundtruth_instance_masks, output_dict)
self.assertEqual((4, 5, 3), output_dict[fields.InputDataFields.image].shape)
self.assertAllEqual([2],
output_dict[fields.InputDataFields.groundtruth_classes])
self.assertEqual(
(1, 4), output_dict[fields.InputDataFields.groundtruth_boxes].shape)
self.assertAllEqual(
[0.0, 0.0, 1.0, 1.0],
output_dict[fields.InputDataFields.groundtruth_boxes][0])
def test_build_tf_record_input_reader_and_load_instance_masks(self):
input_reader_text_proto = """
load_instance_masks: true
tf_record_input_reader {}
"""
input_reader_proto = input_reader_pb2.InputReader()
text_format.Parse(input_reader_text_proto, input_reader_proto)
decoder = decoder_builder.build(input_reader_proto)
tensor_dict = decoder.decode(self._make_serialized_tf_example())
with tf.train.MonitoredSession() as sess:
output_dict = sess.run(tensor_dict)
self.assertAllEqual(
(1, 4, 5),
output_dict[fields.InputDataFields.groundtruth_instance_masks].shape)
if __name__ == '__main__':
tf.test.main()
......@@ -16,6 +16,15 @@
import tensorflow as tf
# pylint: disable=g-import-not-at-top
try:
from tensorflow.contrib import layers as contrib_layers
from tensorflow.contrib import quantize as contrib_quantize
except ImportError:
# TF 2.0 doesn't ship with contrib.
pass
# pylint: enable=g-import-not-at-top
def build(graph_rewriter_config, is_training):
"""Returns a function that modifies default graph based on options.
......@@ -32,14 +41,15 @@ def build(graph_rewriter_config, is_training):
# Quantize the graph by inserting quantize ops for weights and activations
if is_training:
contrib_quantize.experimental_create_training_graph(
input_graph=tf.get_default_graph(),
quant_delay=graph_rewriter_config.quantization.delay
)
else:
contrib_quantize.experimental_create_eval_graph(
input_graph=tf.get_default_graph()
)
contrib_layers.summarize_collection('quant_vars')
return graph_rewrite_fn
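# Illustrative usage of the rewrite function (the proto values below are
# assumed, not taken from this change):
#
#   rewriter_proto = graph_rewriter_pb2.GraphRewriter()
#   rewriter_proto.quantization.delay = 2000  # assumed quantization delay
#   graph_rewrite_fn = build(rewriter_proto, is_training=True)
#   graph_rewrite_fn()  # inserts fake-quant ops into the default graph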
......@@ -18,14 +18,23 @@ import tensorflow as tf
from object_detection.builders import graph_rewriter_builder
from object_detection.protos import graph_rewriter_pb2
# pylint: disable=g-import-not-at-top
try:
from tensorflow.contrib import layers as contrib_layers
from tensorflow.contrib import quantize as contrib_quantize
except ImportError:
# TF 2.0 doesn't ship with contrib.
pass
# pylint: enable=g-import-not-at-top
class QuantizationBuilderTest(tf.test.TestCase):
def testQuantizationBuilderSetsUpCorrectTrainArguments(self):
with mock.patch.object(
contrib_quantize,
'experimental_create_training_graph') as mock_quant_fn:
with mock.patch.object(contrib_layers,
'summarize_collection') as mock_summarize_col:
graph_rewriter_proto = graph_rewriter_pb2.GraphRewriter()
graph_rewriter_proto.quantization.delay = 10
......@@ -40,9 +49,9 @@ class QuantizationBuilderTest(tf.test.TestCase):
mock_summarize_col.assert_called_with('quant_vars')
def testQuantizationBuilderSetsUpCorrectEvalArguments(self):
with mock.patch.object(contrib_quantize,
'experimental_create_eval_graph') as mock_quant_fn:
with mock.patch.object(contrib_layers,
'summarize_collection') as mock_summarize_col:
graph_rewriter_proto = graph_rewriter_pb2.GraphRewriter()
graph_rewriter_proto.quantization.delay = 10
......
......@@ -20,7 +20,14 @@ from object_detection.core import freezable_batch_norm
from object_detection.protos import hyperparams_pb2
from object_detection.utils import context_manager
# pylint: disable=g-import-not-at-top
try:
from tensorflow.contrib import slim
from tensorflow.contrib import layers as contrib_layers
except ImportError:
# TF 2.0 doesn't ship with contrib.
pass
# pylint: enable=g-import-not-at-top
class KerasLayerHyperparams(object):
......@@ -216,7 +223,7 @@ def build(hyperparams_config, is_training):
batch_norm_params = _build_batch_norm_params(
hyperparams_config.batch_norm, is_training)
if hyperparams_config.HasField('group_norm'):
normalizer_fn = contrib_layers.group_norm
affected_ops = [slim.conv2d, slim.separable_conv2d, slim.conv2d_transpose]
if hyperparams_config.HasField('op') and (
hyperparams_config.op == hyperparams_pb2.Hyperparams.FC):
......@@ -256,6 +263,8 @@ def _build_activation_fn(activation_fn):
return tf.nn.relu
if activation_fn == hyperparams_pb2.Hyperparams.RELU_6:
return tf.nn.relu6
if activation_fn == hyperparams_pb2.Hyperparams.SWISH:
return tf.nn.swish
raise ValueError('Unknown activation function: {}'.format(activation_fn))
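# For reference, the newly supported swish activation computes
# x * sigmoid(x); a quick sanity check (illustrative only):
#
#   x = tf.constant([-1.0, 0.0, 1.0])
#   tf.debugging.assert_near(tf.nn.swish(x), x * tf.sigmoid(x))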
......@@ -301,6 +310,8 @@ def _build_keras_regularizer(regularizer):
# weight by a factor of 2
return tf.keras.regularizers.l2(
float(regularizer.l2_regularizer.weight * 0.5))
if regularizer_oneof is None:
return None
raise ValueError('Unknown regularizer function: {}'.format(regularizer_oneof))
......@@ -369,6 +380,8 @@ def _build_initializer(initializer, build_for_keras=False):
factor=initializer.variance_scaling_initializer.factor,
mode=mode,
uniform=initializer.variance_scaling_initializer.uniform)
if initializer_oneof is None:
return None
raise ValueError('Unknown initializer function: {}'.format(
initializer_oneof))
......
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -24,7 +25,13 @@ from object_detection.builders import hyperparams_builder
from object_detection.core import freezable_batch_norm
from object_detection.protos import hyperparams_pb2
# pylint: disable=g-import-not-at-top
try:
from tensorflow.contrib import slim
except ImportError:
# TF 2.0 doesn't ship with contrib.
pass
# pylint: enable=g-import-not-at-top
def _get_scope_key(op):
......@@ -49,7 +56,7 @@ class HyperparamsBuilderTest(tf.test.TestCase):
scope_fn = hyperparams_builder.build(conv_hyperparams_proto,
is_training=True)
scope = scope_fn()
self.assertIn(_get_scope_key(slim.conv2d), scope)
def test_default_arg_scope_has_separable_conv2d_op(self):
conv_hyperparams_text_proto = """
......@@ -67,7 +74,7 @@ class HyperparamsBuilderTest(tf.test.TestCase):
scope_fn = hyperparams_builder.build(conv_hyperparams_proto,
is_training=True)
scope = scope_fn()
self.assertIn(_get_scope_key(slim.separable_conv2d), scope)
def test_default_arg_scope_has_conv2d_transpose_op(self):
conv_hyperparams_text_proto = """
......@@ -85,7 +92,7 @@ class HyperparamsBuilderTest(tf.test.TestCase):
scope_fn = hyperparams_builder.build(conv_hyperparams_proto,
is_training=True)
scope = scope_fn()
self.assertIn(_get_scope_key(slim.conv2d_transpose), scope)
def test_explicit_fc_op_arg_scope_has_fully_connected_op(self):
conv_hyperparams_text_proto = """
......@@ -104,7 +111,7 @@ class HyperparamsBuilderTest(tf.test.TestCase):
scope_fn = hyperparams_builder.build(conv_hyperparams_proto,
is_training=True)
scope = scope_fn()
self.assertIn(_get_scope_key(slim.fully_connected), scope)
def test_separable_conv2d_and_conv2d_and_transpose_have_same_parameters(self):
conv_hyperparams_text_proto = """
......@@ -143,7 +150,7 @@ class HyperparamsBuilderTest(tf.test.TestCase):
scope_fn = hyperparams_builder.build(conv_hyperparams_proto,
is_training=True)
scope = scope_fn()
conv_scope_arguments = list(scope.values())[0]
regularizer = conv_scope_arguments['weights_regularizer']
weights = np.array([1., -1, 4., 2.])
with self.test_session() as sess:
......@@ -284,8 +291,8 @@ class HyperparamsBuilderTest(tf.test.TestCase):
self.assertTrue(batch_norm_params['scale'])
batch_norm_layer = keras_config.build_batch_norm()
self.assertIsInstance(batch_norm_layer,
freezable_batch_norm.FreezableBatchNorm)
def test_return_non_default_batch_norm_params_keras_override(
self):
......@@ -420,8 +427,8 @@ class HyperparamsBuilderTest(tf.test.TestCase):
# The batch norm builder should build an identity Lambda layer
identity_layer = keras_config.build_batch_norm()
self.assertIsInstance(identity_layer,
tf.keras.layers.Lambda)
def test_use_none_activation(self):
conv_hyperparams_text_proto = """
......@@ -463,7 +470,7 @@ class HyperparamsBuilderTest(tf.test.TestCase):
self.assertEqual(
keras_config.params(include_activation=True)['activation'], None)
activation_layer = keras_config.build_activation_layer()
self.assertIsInstance(activation_layer, tf.keras.layers.Lambda)
self.assertEqual(activation_layer.function, tf.identity)
def test_use_relu_activation(self):
......@@ -506,7 +513,7 @@ class HyperparamsBuilderTest(tf.test.TestCase):
self.assertEqual(
keras_config.params(include_activation=True)['activation'], tf.nn.relu)
activation_layer = keras_config.build_activation_layer()
self.assertIsInstance(activation_layer, tf.keras.layers.Lambda)
self.assertEqual(activation_layer.function, tf.nn.relu)
def test_use_relu_6_activation(self):
......@@ -549,9 +556,52 @@ class HyperparamsBuilderTest(tf.test.TestCase):
self.assertEqual(
keras_config.params(include_activation=True)['activation'], tf.nn.relu6)
activation_layer = keras_config.build_activation_layer()
self.assertIsInstance(activation_layer, tf.keras.layers.Lambda)
self.assertEqual(activation_layer.function, tf.nn.relu6)
def test_use_swish_activation(self):
conv_hyperparams_text_proto = """
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
activation: SWISH
"""
conv_hyperparams_proto = hyperparams_pb2.Hyperparams()
text_format.Merge(conv_hyperparams_text_proto, conv_hyperparams_proto)
scope_fn = hyperparams_builder.build(conv_hyperparams_proto,
is_training=True)
scope = scope_fn()
conv_scope_arguments = scope[_get_scope_key(slim.conv2d)]
self.assertEqual(conv_scope_arguments['activation_fn'], tf.nn.swish)
def test_use_swish_activation_keras(self):
conv_hyperparams_text_proto = """
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
activation: SWISH
"""
conv_hyperparams_proto = hyperparams_pb2.Hyperparams()
text_format.Merge(conv_hyperparams_text_proto, conv_hyperparams_proto)
keras_config = hyperparams_builder.KerasLayerHyperparams(
conv_hyperparams_proto)
self.assertEqual(keras_config.params()['activation'], None)
self.assertEqual(
keras_config.params(include_activation=True)['activation'], tf.nn.swish)
activation_layer = keras_config.build_activation_layer()
self.assertIsInstance(activation_layer, tf.keras.layers.Lambda)
self.assertEqual(activation_layer.function, tf.nn.swish)
def test_override_activation_keras(self):
conv_hyperparams_text_proto = """
regularizer {
......
......@@ -133,9 +133,22 @@ def build(image_resizer_config):
'Invalid image resizer condition option for '
'ConditionalShapeResizer: \'%s\'.'
% conditional_shape_resize_config.condition)
if not conditional_shape_resize_config.convert_to_grayscale:
return image_resizer_fn
elif image_resizer_oneof == 'pad_to_multiple_resizer':
pad_to_multiple_resizer_config = (
image_resizer_config.pad_to_multiple_resizer)
    if pad_to_multiple_resizer_config.multiple <= 0:
      raise ValueError('`multiple` for pad_to_multiple_resizer should be > 0.')
else:
image_resizer_fn = functools.partial(
preprocessor.resize_pad_to_multiple,
multiple=pad_to_multiple_resizer_config.multiple)
if not pad_to_multiple_resizer_config.convert_to_grayscale:
return image_resizer_fn
else:
raise ValueError(
'Invalid image resizer option: \'%s\'.' % image_resizer_oneof)
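# Sketch of the pad-to-multiple arithmetic (the helper below is illustrative,
# not part of the API): each spatial dimension is rounded up to the next
# multiple, so with multiple=32 a 60x30 image pads to 64x32.
def _next_multiple(dim, multiple):
  return ((dim + multiple - 1) // multiple) * multiple

assert _next_multiple(60, 32) == 64 and _next_multiple(30, 32) == 32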
......@@ -149,16 +162,16 @@ def build(image_resizer_config):
width] containing instance masks.
Returns:
Note that the position of the resized_image_shape changes based on whether
masks are present.
resized_image: A 3D tensor of shape [new_height, new_width, 1],
where the image has been resized (with bilinear interpolation) so that
min(new_height, new_width) == min_dimension or
max(new_height, new_width) == max_dimension.
resized_masks: If masks is not None, also outputs masks. A 3D tensor of
shape [num_instances, new_height, new_width].
resized_image_shape: A 1D tensor of shape [3] containing shape of the
resized image.
"""
# image_resizer_fn returns [resized_image, resized_image_shape] if
# mask==None, otherwise it returns
......
......@@ -211,6 +211,31 @@ class ImageResizerBuilderTest(tf.test.TestCase):
with self.assertRaises(ValueError):
image_resizer_builder.build(invalid_image_resizer_text_proto)
def test_build_pad_to_multiple_resizer(self):
"""Test building a pad_to_multiple_resizer from proto."""
image_resizer_text_proto = """
pad_to_multiple_resizer {
multiple: 32
}
"""
input_shape = (60, 30, 3)
expected_output_shape = (64, 32, 3)
output_shape = self._shape_of_resized_random_image_given_text_proto(
input_shape, image_resizer_text_proto)
self.assertEqual(output_shape, expected_output_shape)
def test_build_pad_to_multiple_resizer_invalid_multiple(self):
"""Test that building a pad_to_multiple_resizer errors with invalid multiple."""
image_resizer_text_proto = """
pad_to_multiple_resizer {
multiple: -10
}
"""
with self.assertRaises(ValueError):
image_resizer_builder.build(image_resizer_text_proto)
if __name__ == '__main__':
tf.test.main()
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -23,12 +24,24 @@ Detection configuration framework, they should define their own builder function
that wraps the build function.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
from object_detection.data_decoders import tf_example_decoder
from object_detection.protos import input_reader_pb2
# pylint: disable=g-import-not-at-top
try:
from tensorflow.contrib import slim as contrib_slim
except ImportError:
# TF 2.0 doesn't ship with contrib.
pass
# pylint: enable=g-import-not-at-top
parallel_reader = contrib_slim.parallel_reader
def build(input_reader_config):
......@@ -70,7 +83,8 @@ def build(input_reader_config):
decoder = tf_example_decoder.TfExampleDecoder(
load_instance_masks=input_reader_config.load_instance_masks,
instance_mask_type=input_reader_config.mask_type,
label_map_proto_file=label_map_proto_file,
load_context_features=input_reader_config.load_context_features)
return decoder.decode(string_tensor)
raise ValueError('Unsupported input_reader_config.')
......@@ -54,6 +54,48 @@ class InputReaderBuilderTest(tf.test.TestCase):
return path
def create_tf_record_with_context(self):
path = os.path.join(self.get_temp_dir(), 'tfrecord')
writer = tf.python_io.TFRecordWriter(path)
image_tensor = np.random.randint(255, size=(4, 5, 3)).astype(np.uint8)
flat_mask = (4 * 5) * [1.0]
context_features = (10 * 3) * [1.0]
with self.test_session():
encoded_jpeg = tf.image.encode_jpeg(tf.constant(image_tensor)).eval()
example = tf.train.Example(
features=tf.train.Features(
feature={
'image/encoded':
dataset_util.bytes_feature(encoded_jpeg),
'image/format':
dataset_util.bytes_feature('jpeg'.encode('utf8')),
'image/height':
dataset_util.int64_feature(4),
'image/width':
dataset_util.int64_feature(5),
'image/object/bbox/xmin':
dataset_util.float_list_feature([0.0]),
'image/object/bbox/xmax':
dataset_util.float_list_feature([1.0]),
'image/object/bbox/ymin':
dataset_util.float_list_feature([0.0]),
'image/object/bbox/ymax':
dataset_util.float_list_feature([1.0]),
'image/object/class/label':
dataset_util.int64_list_feature([2]),
'image/object/mask':
dataset_util.float_list_feature(flat_mask),
'image/context_features':
dataset_util.float_list_feature(context_features),
'image/context_feature_length':
dataset_util.int64_list_feature([10]),
}))
writer.write(example.SerializeToString())
writer.close()
return path
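    # Note: the 30 context floats above, paired with
    # context_feature_length = 10, decode to a (3, 10) tensor -- three context
    # features of length 10 -- as asserted in
    # test_build_tf_record_input_reader_with_context below.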
def test_build_tf_record_input_reader(self):
tf_record_path = self.create_tf_record()
......@@ -71,18 +113,53 @@ class InputReaderBuilderTest(tf.test.TestCase):
with tf.train.MonitoredSession() as sess:
output_dict = sess.run(tensor_dict)
self.assertNotIn(fields.InputDataFields.groundtruth_instance_masks,
output_dict)
self.assertEqual((4, 5, 3), output_dict[fields.InputDataFields.image].shape)
self.assertEqual([2],
output_dict[fields.InputDataFields.groundtruth_classes])
self.assertEqual(
(1, 4), output_dict[fields.InputDataFields.groundtruth_boxes].shape)
self.assertAllEqual(
[0.0, 0.0, 1.0, 1.0],
output_dict[fields.InputDataFields.groundtruth_boxes][0])
def test_build_tf_record_input_reader_with_context(self):
tf_record_path = self.create_tf_record_with_context()
input_reader_text_proto = """
shuffle: false
num_readers: 1
tf_record_input_reader {{
input_path: '{0}'
}}
""".format(tf_record_path)
input_reader_proto = input_reader_pb2.InputReader()
text_format.Merge(input_reader_text_proto, input_reader_proto)
input_reader_proto.load_context_features = True
tensor_dict = input_reader_builder.build(input_reader_proto)
with tf.train.MonitoredSession() as sess:
output_dict = sess.run(tensor_dict)
self.assertNotIn(fields.InputDataFields.groundtruth_instance_masks,
output_dict)
self.assertEqual((4, 5, 3), output_dict[fields.InputDataFields.image].shape)
self.assertEqual([2],
output_dict[fields.InputDataFields.groundtruth_classes])
self.assertEqual(
(1, 4), output_dict[fields.InputDataFields.groundtruth_boxes].shape)
self.assertAllEqual(
[0.0, 0.0, 1.0, 1.0],
output_dict[fields.InputDataFields.groundtruth_boxes][0])
self.assertAllEqual(
(3, 10), output_dict[fields.InputDataFields.context_features].shape)
self.assertAllEqual(
(10), output_dict[fields.InputDataFields.context_feature_length])
def test_build_tf_record_input_reader_and_load_instance_masks(self):
tf_record_path = self.create_tf_record()
......@@ -101,11 +178,10 @@ class InputReaderBuilderTest(tf.test.TestCase):
with tf.train.MonitoredSession() as sess:
output_dict = sess.run(tensor_dict)
self.assertEqual((4, 5, 3), output_dict[fields.InputDataFields.image].shape)
self.assertEqual([2],
output_dict[fields.InputDataFields.groundtruth_classes])
self.assertEqual(
(1, 4), output_dict[fields.InputDataFields.groundtruth_boxes].shape)
self.assertAllEqual(
[0.0, 0.0, 1.0, 1.0],
......
......@@ -201,6 +201,9 @@ def _build_localization_loss(loss_config):
if loss_type == 'weighted_iou':
return losses.WeightedIOULocalizationLoss()
if loss_type == 'l1_localization_loss':
return losses.L1LocalizationLoss()
raise ValueError('Empty loss config.')
......@@ -249,4 +252,9 @@ def _build_classification_loss(loss_config):
alpha=config.alpha,
bootstrap_type=('hard' if config.hard_bootstrap else 'soft'))
if loss_type == 'penalty_reduced_logistic_focal_loss':
config = loss_config.penalty_reduced_logistic_focal_loss
return losses.PenaltyReducedLogisticFocalLoss(
alpha=config.alpha, beta=config.beta)
raise ValueError('Empty loss config.')
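# Illustrative Loss text proto (field values assumed) selecting the two newly
# wired losses above -- the penalty-reduced logistic focal classification
# loss and the L1 localization loss:
#
#   classification_loss {
#     penalty_reduced_logistic_focal_loss { alpha: 2.0 beta: 4.0 }
#   }
#   localization_loss { l1_localization_loss { } }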
......@@ -40,8 +40,8 @@ class LocalizationLossBuilderTest(tf.test.TestCase):
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
_, localization_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertIsInstance(localization_loss,
losses.WeightedL2LocalizationLoss)
def test_build_weighted_smooth_l1_localization_loss_default_delta(self):
losses_text_proto = """
......@@ -57,8 +57,8 @@ class LocalizationLossBuilderTest(tf.test.TestCase):
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
_, localization_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertIsInstance(localization_loss,
losses.WeightedSmoothL1LocalizationLoss)
self.assertAlmostEqual(localization_loss._delta, 1.0)
def test_build_weighted_smooth_l1_localization_loss_non_default_delta(self):
......@@ -76,8 +76,8 @@ class LocalizationLossBuilderTest(tf.test.TestCase):
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
_, localization_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertIsInstance(localization_loss,
losses.WeightedSmoothL1LocalizationLoss)
self.assertAlmostEqual(localization_loss._delta, 0.1)
def test_build_weighted_iou_localization_loss(self):
......@@ -94,8 +94,8 @@ class LocalizationLossBuilderTest(tf.test.TestCase):
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
_, localization_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertIsInstance(localization_loss,
losses.WeightedIOULocalizationLoss)
def test_anchorwise_output(self):
losses_text_proto = """
......@@ -111,8 +111,8 @@ class LocalizationLossBuilderTest(tf.test.TestCase):
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
_, localization_loss, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertIsInstance(localization_loss,
losses.WeightedSmoothL1LocalizationLoss)
predictions = tf.constant([[[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]])
targets = tf.constant([[[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]])
weights = tf.constant([[1.0, 1.0]])
......@@ -132,6 +132,7 @@ class LocalizationLossBuilderTest(tf.test.TestCase):
losses_builder._build_localization_loss(losses_proto)
class ClassificationLossBuilderTest(tf.test.TestCase):
def test_build_weighted_sigmoid_classification_loss(self):
......@@ -148,8 +149,8 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
classification_loss, _, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertIsInstance(classification_loss,
losses.WeightedSigmoidClassificationLoss)
def test_build_weighted_sigmoid_focal_classification_loss(self):
losses_text_proto = """
......@@ -165,8 +166,8 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
classification_loss, _, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertIsInstance(classification_loss,
losses.SigmoidFocalClassificationLoss)
self.assertAlmostEqual(classification_loss._alpha, None)
self.assertAlmostEqual(classification_loss._gamma, 2.0)
......@@ -186,8 +187,8 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
classification_loss, _, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertIsInstance(classification_loss,
losses.SigmoidFocalClassificationLoss)
self.assertAlmostEqual(classification_loss._alpha, 0.25)
self.assertAlmostEqual(classification_loss._gamma, 3.0)
......@@ -205,8 +206,8 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
classification_loss, _, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertIsInstance(classification_loss,
losses.WeightedSoftmaxClassificationLoss)
def test_build_weighted_logits_softmax_classification_loss(self):
losses_text_proto = """
......@@ -222,9 +223,9 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
classification_loss, _, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertIsInstance(
classification_loss,
losses.WeightedSoftmaxClassificationAgainstLogitsLoss)
def test_build_weighted_softmax_classification_loss_with_logit_scale(self):
losses_text_proto = """
......@@ -241,8 +242,8 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
classification_loss, _, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertIsInstance(classification_loss,
losses.WeightedSoftmaxClassificationLoss)
def test_build_bootstrapped_sigmoid_classification_loss(self):
losses_text_proto = """
......@@ -259,8 +260,8 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
classification_loss, _, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertIsInstance(classification_loss,
losses.BootstrappedSigmoidClassificationLoss)
def test_anchorwise_output(self):
losses_text_proto = """
......@@ -277,8 +278,8 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
classification_loss, _, _, _, _, _, _ = losses_builder.build(losses_proto)
self.assertIsInstance(classification_loss,
losses.WeightedSigmoidClassificationLoss)
predictions = tf.constant([[[0.0, 1.0, 0.0], [0.0, 0.5, 0.5]]])
targets = tf.constant([[[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]])
weights = tf.constant([[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]])
......@@ -298,6 +299,7 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
losses_builder.build(losses_proto)
class HardExampleMinerBuilderTest(tf.test.TestCase):
def test_do_not_build_hard_example_miner_by_default(self):
......@@ -333,7 +335,7 @@ class HardExampleMinerBuilderTest(tf.test.TestCase):
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
_, _, _, _, hard_example_miner, _, _ = losses_builder.build(losses_proto)
self.assertIsInstance(hard_example_miner, losses.HardExampleMiner)
self.assertEqual(hard_example_miner._loss_type, 'cls')
def test_build_hard_example_miner_for_localization_loss(self):
......@@ -353,7 +355,7 @@ class HardExampleMinerBuilderTest(tf.test.TestCase):
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
_, _, _, _, hard_example_miner, _, _ = losses_builder.build(losses_proto)
self.assertIsInstance(hard_example_miner, losses.HardExampleMiner)
self.assertEqual(hard_example_miner._loss_type, 'loc')
def test_build_hard_example_miner_with_non_default_values(self):
......@@ -377,7 +379,7 @@ class HardExampleMinerBuilderTest(tf.test.TestCase):
losses_proto = losses_pb2.Loss()
text_format.Merge(losses_text_proto, losses_proto)
_, _, _, _, hard_example_miner, _, _ = losses_builder.build(losses_proto)
self.assertIsInstance(hard_example_miner, losses.HardExampleMiner)
self.assertEqual(hard_example_miner._num_hard_examples, 32)
self.assertAlmostEqual(hard_example_miner._iou_threshold, 0.5)
self.assertEqual(hard_example_miner._max_negatives_per_positive, 10)
......@@ -406,11 +408,11 @@ class LossBuilderTest(tf.test.TestCase):
(classification_loss, localization_loss, classification_weight,
localization_weight, hard_example_miner, _,
_) = losses_builder.build(losses_proto)
self.assertIsInstance(hard_example_miner, losses.HardExampleMiner)
self.assertIsInstance(classification_loss,
losses.WeightedSoftmaxClassificationLoss)
self.assertIsInstance(localization_loss,
losses.WeightedL2LocalizationLoss)
self.assertAlmostEqual(classification_weight, 0.8)
self.assertAlmostEqual(localization_weight, 0.2)
......@@ -434,12 +436,10 @@ class LossBuilderTest(tf.test.TestCase):
(classification_loss, localization_loss, classification_weight,
localization_weight, hard_example_miner, _,
_) = losses_builder.build(losses_proto)
self.assertIsInstance(hard_example_miner, losses.HardExampleMiner)
self.assertIsInstance(classification_loss,
losses.WeightedSoftmaxClassificationLoss)
self.assertIsInstance(localization_loss, losses.WeightedL2LocalizationLoss)
self.assertAlmostEqual(classification_weight, 0.8)
self.assertAlmostEqual(localization_weight, 0.2)
......@@ -464,12 +464,10 @@ class LossBuilderTest(tf.test.TestCase):
(classification_loss, localization_loss, classification_weight,
localization_weight, hard_example_miner, _,
_) = losses_builder.build(losses_proto)
self.assertIsInstance(hard_example_miner, losses.HardExampleMiner)
self.assertIsInstance(classification_loss,
losses.WeightedSoftmaxClassificationLoss)
self.assertIsInstance(localization_loss, losses.WeightedL2LocalizationLoss)
self.assertAlmostEqual(classification_weight, 0.8)
self.assertAlmostEqual(localization_weight, 0.2)
......@@ -505,8 +503,8 @@ class FasterRcnnClassificationLossBuilderTest(tf.test.TestCase):
text_format.Merge(losses_text_proto, losses_proto)
classification_loss = losses_builder.build_faster_rcnn_classification_loss(
losses_proto)
self.assertIsInstance(classification_loss,
losses.WeightedSigmoidClassificationLoss)
def test_build_softmax_loss(self):
losses_text_proto = """
......@@ -517,8 +515,8 @@ class FasterRcnnClassificationLossBuilderTest(tf.test.TestCase):
text_format.Merge(losses_text_proto, losses_proto)
classification_loss = losses_builder.build_faster_rcnn_classification_loss(
losses_proto)
self.assertIsInstance(classification_loss,
losses.WeightedSoftmaxClassificationLoss)
def test_build_logits_softmax_loss(self):
losses_text_proto = """
......@@ -542,9 +540,8 @@ class FasterRcnnClassificationLossBuilderTest(tf.test.TestCase):
text_format.Merge(losses_text_proto, losses_proto)
classification_loss = losses_builder.build_faster_rcnn_classification_loss(
losses_proto)
self.assertIsInstance(classification_loss,
losses.SigmoidFocalClassificationLoss)
def test_build_softmax_loss_by_default(self):
losses_text_proto = """
......@@ -553,8 +550,8 @@ class FasterRcnnClassificationLossBuilderTest(tf.test.TestCase):
text_format.Merge(losses_text_proto, losses_proto)
classification_loss = losses_builder.build_faster_rcnn_classification_loss(
losses_proto)
self.assertIsInstance(classification_loss,
losses.WeightedSoftmaxClassificationLoss)
if __name__ == '__main__':
......
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -12,25 +13,34 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for object_detection.models.model_builder."""
from absl.testing import parameterized
import tensorflow as tf
from google.protobuf import text_format
from object_detection.builders import model_builder
from object_detection.meta_architectures import faster_rcnn_meta_arch
from object_detection.meta_architectures import rfcn_meta_arch
from object_detection.meta_architectures import ssd_meta_arch
from object_detection.protos import hyperparams_pb2
from object_detection.protos import losses_pb2
from object_detection.protos import model_pb2
from object_detection.utils import test_case
class ModelBuilderTest(test_case.TestCase, parameterized.TestCase):
def default_ssd_feature_extractor(self):
raise NotImplementedError
def default_faster_rcnn_feature_extractor(self):
raise NotImplementedError
def ssd_feature_extractors(self):
raise NotImplementedError
def faster_rcnn_feature_extractors(self):
raise NotImplementedError
def create_model(self, model_config, is_training=True):
"""Builds a DetectionModel based on the model config.
......@@ -50,7 +60,6 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
model_text_proto = """
ssd {
feature_extractor {
conv_hyperparams {
regularizer {
l2_regularizer {
......@@ -113,6 +122,8 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
}"""
model_proto = model_pb2.DetectionModel()
text_format.Merge(model_text_proto, model_proto)
model_proto.ssd.feature_extractor.type = (self.
default_ssd_feature_extractor())
return model_proto
def create_default_faster_rcnn_model_proto(self):
......@@ -127,9 +138,6 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
max_dimension: 1024
}
}
first_stage_anchor_generator {
grid_anchor_generator {
scales: [0.25, 0.5, 1.0, 2.0]
......@@ -188,17 +196,14 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
}"""
model_proto = model_pb2.DetectionModel()
text_format.Merge(model_text_proto, model_proto)
(model_proto.faster_rcnn.feature_extractor.type
) = self.default_faster_rcnn_feature_extractor()
return model_proto
def test_create_ssd_models_from_config(self):
model_proto = self.create_default_ssd_model_proto()
for extractor_type, extractor_class in self.ssd_feature_extractors().items(
):
model_proto.ssd.feature_extractor.type = extractor_type
model = model_builder.build(model_proto, is_training=True)
self.assertIsInstance(model, ssd_meta_arch.SSDMetaArch)
......@@ -206,12 +211,9 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
def test_create_ssd_fpn_model_from_config(self):
model_proto = self.create_default_ssd_model_proto()
model_proto.ssd.feature_extractor.fpn.min_level = 3
model_proto.ssd.feature_extractor.fpn.max_level = 7
model = model_builder.build(model_proto, is_training=True)
self.assertEqual(model._feature_extractor._fpn_min_level, 3)
self.assertEqual(model._feature_extractor._fpn_max_level, 7)
......@@ -238,8 +240,9 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
'enable_mask_prediction': False
},
)
def test_create_faster_rcnn_models_from_config(self,
use_matmul_crop_and_resize,
enable_mask_prediction):
model_proto = self.create_default_faster_rcnn_model_proto()
faster_rcnn_config = model_proto.faster_rcnn
faster_rcnn_config.use_matmul_crop_and_resize = use_matmul_crop_and_resize
......@@ -250,7 +253,7 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
mask_predictor_config.predict_instance_masks = True
for extractor_type, extractor_class in (
self.faster_rcnn_feature_extractors().items()):
faster_rcnn_config.feature_extractor.type = extractor_type
model = model_builder.build(model_proto, is_training=True)
self.assertIsInstance(model, faster_rcnn_meta_arch.FasterRCNNMetaArch)
......@@ -270,52 +273,59 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
model_proto.faster_rcnn.second_stage_box_predictor.rfcn_box_predictor)
rfcn_predictor_config.conv_hyperparams.op = hyperparams_pb2.Hyperparams.CONV
for extractor_type, extractor_class in (
self.faster_rcnn_feature_extractors().items()):
model_proto.faster_rcnn.feature_extractor.type = extractor_type
model = model_builder.build(model_proto, is_training=True)
self.assertIsInstance(model, rfcn_meta_arch.RFCNMetaArch)
self.assertIsInstance(model._feature_extractor, extractor_class)
@parameterized.parameters(True, False)
def test_create_faster_rcnn_from_config_with_crop_feature(
self, output_final_box_features):
model_proto = self.create_default_faster_rcnn_model_proto()
model_proto.faster_rcnn.output_final_box_features = (
output_final_box_features)
_ = model_builder.build(model_proto, is_training=True)
def test_invalid_model_config_proto(self):
model_proto = ''
with self.assertRaisesRegex(
ValueError, 'model_config not of type model_pb2.DetectionModel.'):
model_builder.build(model_proto, is_training=True)
def test_unknown_meta_architecture(self):
model_proto = model_pb2.DetectionModel()
with self.assertRaisesRegex(ValueError, 'Unknown meta architecture'):
model_builder.build(model_proto, is_training=True)
def test_unknown_ssd_feature_extractor(self):
model_proto = self.create_default_ssd_model_proto()
model_proto.ssd.feature_extractor.type = 'unknown_feature_extractor'
with self.assertRaises(ValueError):
model_builder.build(model_proto, is_training=True)
def test_unknown_faster_rcnn_feature_extractor(self):
model_proto = self.create_default_faster_rcnn_model_proto()
model_proto.faster_rcnn.feature_extractor.type = 'unknown_feature_extractor'
with self.assertRaises(ValueError):
model_builder.build(model_proto, is_training=True)
def test_invalid_first_stage_nms_iou_threshold(self):
model_proto = self.create_default_faster_rcnn_model_proto()
model_proto.faster_rcnn.first_stage_nms_iou_threshold = 1.1
with self.assertRaisesRegex(ValueError,
r'iou_threshold not in \[0, 1\.0\]'):
model_builder.build(model_proto, is_training=True)
model_proto.faster_rcnn.first_stage_nms_iou_threshold = -0.1
with self.assertRaisesRegex(ValueError,
r'iou_threshold not in \[0, 1\.0\]'):
model_builder.build(model_proto, is_training=True)
def test_invalid_second_stage_batch_size(self):
model_proto = self.create_default_faster_rcnn_model_proto()
model_proto.faster_rcnn.first_stage_max_proposals = 1
model_proto.faster_rcnn.second_stage_batch_size = 2
with self.assertRaisesRegex(
ValueError, 'second_stage_batch_size should be no greater '
'than first_stage_max_proposals.'):
model_builder.build(model_proto, is_training=True)
......@@ -323,8 +333,8 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
def test_invalid_faster_rcnn_batchnorm_update(self):
model_proto = self.create_default_faster_rcnn_model_proto()
model_proto.faster_rcnn.inplace_batchnorm_update = True
with self.assertRaisesRegex(ValueError,
'inplace batchnorm updates not supported'):
model_builder.build(model_proto, is_training=True)
def test_create_experimental_model(self):
......@@ -340,7 +350,3 @@ class ModelBuilderTest(tf.test.TestCase, parameterized.TestCase):
text_format.Merge(model_text_proto, model_proto)
self.assertEqual(model_builder.build(model_proto, is_training=True), 42)
if __name__ == '__main__':
tf.test.main()
# Lint as: python2, python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for model_builder under TensorFlow 1.X."""
from absl.testing import parameterized
import tensorflow as tf
from object_detection.builders import model_builder
from object_detection.builders import model_builder_test
from object_detection.meta_architectures import ssd_meta_arch
from object_detection.protos import losses_pb2
class ModelBuilderTF1Test(model_builder_test.ModelBuilderTest):
def default_ssd_feature_extractor(self):
return 'ssd_resnet50_v1_fpn'
def default_faster_rcnn_feature_extractor(self):
return 'faster_rcnn_resnet101'
def ssd_feature_extractors(self):
return model_builder.SSD_FEATURE_EXTRACTOR_CLASS_MAP
def faster_rcnn_feature_extractors(self):
return model_builder.FASTER_RCNN_FEATURE_EXTRACTOR_CLASS_MAP
if __name__ == '__main__':
tf.test.main()
......@@ -18,6 +18,7 @@
import tensorflow as tf
from tensorflow.contrib import opt as tf_opt
from object_detection.utils import learning_schedules
......@@ -64,14 +65,14 @@ def build_optimizers_tf_v1(optimizer_config, global_step=None):
learning_rate = _create_learning_rate(config.learning_rate,
global_step=global_step)
summary_vars.append(learning_rate)
optimizer = tf.train.AdamOptimizer(learning_rate, epsilon=config.epsilon)
if optimizer is None:
raise ValueError('Optimizer %s not supported.' % optimizer_type)
if optimizer_config.use_moving_average:
optimizer = tf_opt.MovingAverageOptimizer(
optimizer, average_decay=optimizer_config.moving_average_decay)
return optimizer, summary_vars
......@@ -120,7 +121,7 @@ def build_optimizers_tf_v2(optimizer_config, global_step=None):
learning_rate = _create_learning_rate(config.learning_rate,
global_step=global_step)
summary_vars.append(learning_rate)
optimizer = tf.keras.optimizers.Adam(learning_rate, epsilon=config.epsilon)
if optimizer is None:
raise ValueError('Optimizer %s not supported.' % optimizer_type)
......
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -15,6 +16,11 @@
"""Tests for optimizer_builder."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import six
import tensorflow as tf
from google.protobuf import text_format
......@@ -22,6 +28,14 @@ from google.protobuf import text_format
from object_detection.builders import optimizer_builder
from object_detection.protos import optimizer_pb2
# pylint: disable=g-import-not-at-top
try:
from tensorflow.contrib import opt as contrib_opt
except ImportError:
# TF 2.0 doesn't ship with contrib.
pass
# pylint: enable=g-import-not-at-top
class LearningRateBuilderTest(tf.test.TestCase):
......@@ -35,7 +49,8 @@ class LearningRateBuilderTest(tf.test.TestCase):
text_format.Merge(learning_rate_text_proto, learning_rate_proto)
learning_rate = optimizer_builder._create_learning_rate(
learning_rate_proto)
self.assertTrue(
six.ensure_str(learning_rate.op.name).endswith('learning_rate'))
with self.test_session():
learning_rate_out = learning_rate.eval()
self.assertAlmostEqual(learning_rate_out, 0.004)
......@@ -53,8 +68,9 @@ class LearningRateBuilderTest(tf.test.TestCase):
text_format.Merge(learning_rate_text_proto, learning_rate_proto)
learning_rate = optimizer_builder._create_learning_rate(
learning_rate_proto)
self.assertTrue(
six.ensure_str(learning_rate.op.name).endswith('learning_rate'))
self.assertIsInstance(learning_rate, tf.Tensor)
def testBuildManualStepLearningRate(self):
learning_rate_text_proto = """
......@@ -75,7 +91,7 @@ class LearningRateBuilderTest(tf.test.TestCase):
text_format.Merge(learning_rate_text_proto, learning_rate_proto)
learning_rate = optimizer_builder._create_learning_rate(
learning_rate_proto)
self.assertIsInstance(learning_rate, tf.Tensor)
def testBuildCosineDecayLearningRate(self):
learning_rate_text_proto = """
......@@ -91,7 +107,7 @@ class LearningRateBuilderTest(tf.test.TestCase):
text_format.Merge(learning_rate_text_proto, learning_rate_proto)
learning_rate = optimizer_builder._create_learning_rate(
learning_rate_proto)
self.assertIsInstance(learning_rate, tf.Tensor)
def testRaiseErrorOnEmptyLearningRate(self):
learning_rate_text_proto = """
......@@ -123,7 +139,7 @@ class OptimizerBuilderTest(tf.test.TestCase):
optimizer_proto = optimizer_pb2.Optimizer()
text_format.Merge(optimizer_text_proto, optimizer_proto)
optimizer, _ = optimizer_builder.build(optimizer_proto)
self.assertIsInstance(optimizer, tf.train.RMSPropOptimizer)
def testBuildMomentumOptimizer(self):
optimizer_text_proto = """
......@@ -140,11 +156,12 @@ class OptimizerBuilderTest(tf.test.TestCase):
optimizer_proto = optimizer_pb2.Optimizer()
text_format.Merge(optimizer_text_proto, optimizer_proto)
optimizer, _ = optimizer_builder.build(optimizer_proto)
self.assertIsInstance(optimizer, tf.train.MomentumOptimizer)
def testBuildAdamOptimizer(self):
optimizer_text_proto = """
adam_optimizer: {
epsilon: 1e-6
learning_rate: {
constant_learning_rate {
learning_rate: 0.002
......@@ -156,7 +173,7 @@ class OptimizerBuilderTest(tf.test.TestCase):
optimizer_proto = optimizer_pb2.Optimizer()
text_format.Merge(optimizer_text_proto, optimizer_proto)
optimizer, _ = optimizer_builder.build(optimizer_proto)
self.assertIsInstance(optimizer, tf.train.AdamOptimizer)
def testBuildMovingAverageOptimizer(self):
optimizer_text_proto = """
......@@ -172,8 +189,7 @@ class OptimizerBuilderTest(tf.test.TestCase):
optimizer_proto = optimizer_pb2.Optimizer()
text_format.Merge(optimizer_text_proto, optimizer_proto)
optimizer, _ = optimizer_builder.build(optimizer_proto)
self.assertIsInstance(optimizer, contrib_opt.MovingAverageOptimizer)
def testBuildMovingAverageOptimizerWithNonDefaultDecay(self):
optimizer_text_proto = """
......@@ -190,8 +206,7 @@ class OptimizerBuilderTest(tf.test.TestCase):
optimizer_proto = optimizer_pb2.Optimizer()
text_format.Merge(optimizer_text_proto, optimizer_proto)
optimizer, _ = optimizer_builder.build(optimizer_proto)
self.assertIsInstance(optimizer, contrib_opt.MovingAverageOptimizer)
# TODO(rathodv): Find a way to not depend on the private members.
self.assertAlmostEqual(optimizer._ema._decay, 0.2)
......
......@@ -102,7 +102,7 @@ def _build_non_max_suppressor(nms_config):
soft_nms_sigma=nms_config.soft_nms_sigma,
use_partitioned_nms=nms_config.use_partitioned_nms,
use_combined_nms=nms_config.use_combined_nms,
change_coordinate_frame=nms_config.change_coordinate_frame)
return non_max_suppressor_fn
......@@ -110,7 +110,7 @@ def _build_non_max_suppressor(nms_config):
def _score_converter_fn_with_logit_scale(tf_score_converter_fn, logit_scale):
"""Create a function to scale logits then apply a Tensorflow function."""
def score_converter_fn(logits):
scaled_logits = tf.multiply(logits, 1.0 / logit_scale, name='scale_logits')
return tf_score_converter_fn(scaled_logits, name='convert_scores')
score_converter_fn.__name__ = '%s_with_logit_scale' % (
tf_score_converter_fn.__name__)
......
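# Editor's note: an illustrative sketch (not part of this change) of the logit
# scaling above. Multiplying by the reciprocal is numerically equivalent to the
# previous tf.divide, and scaling by 1/logit_scale with logit_scale > 1 softens
# the converted scores. The helper name below is hypothetical.
def _logit_scale_example():
  logits = tf.constant([2.0, -2.0])
  scaled = tf.multiply(logits, 1.0 / 2.0)  # same result as tf.divide(logits, 2.0)
  return tf.sigmoid(scaled)  # ~[0.73, 0.27] instead of ~[0.88, 0.12] unscaled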
......@@ -150,7 +150,7 @@ def build(preprocessor_step_config):
return (preprocessor.random_horizontal_flip,
{
'keypoint_flip_permutation': tuple(
config.keypoint_flip_permutation) or None,
})
if step_type == 'random_vertical_flip':
......@@ -158,7 +158,7 @@ def build(preprocessor_step_config):
return (preprocessor.random_vertical_flip,
{
'keypoint_flip_permutation': tuple(
config.keypoint_flip_permutation) or None,
})
if step_type == 'random_rotation90':
......@@ -400,4 +400,13 @@ def build(preprocessor_step_config):
kwargs['random_coef'] = [op.random_coef for op in config.operations]
return (preprocessor.ssd_random_crop_pad_fixed_aspect_ratio, kwargs)
if step_type == 'random_square_crop_by_scale':
config = preprocessor_step_config.random_square_crop_by_scale
return preprocessor.random_square_crop_by_scale, {
'scale_min': config.scale_min,
'scale_max': config.scale_max,
'max_border': config.max_border,
'num_scales': config.num_scales
}
raise ValueError('Unknown preprocessing step.')
......@@ -723,6 +723,25 @@ class PreprocessorBuilderTest(tf.test.TestCase):
self.assertEqual(function, preprocessor.convert_class_logits_to_softmax)
self.assertEqual(args, {'temperature': 2})
def test_random_crop_by_scale(self):
preprocessor_text_proto = """
random_square_crop_by_scale {
scale_min: 0.25
scale_max: 2.0
num_scales: 8
}
"""
preprocessor_proto = preprocessor_pb2.PreprocessingStep()
text_format.Merge(preprocessor_text_proto, preprocessor_proto)
function, args = preprocessor_builder.build(preprocessor_proto)
self.assertEqual(function, preprocessor.random_square_crop_by_scale)
self.assertEqual(args, {
'scale_min': 0.25,
'scale_max': 2.0,
'num_scales': 8,
'max_border': 128
})
if __name__ == '__main__':
tf.test.main()
......@@ -33,7 +33,6 @@ when number of examples set to True in indicator is less than batch_size.
import tensorflow as tf
from object_detection.core import minibatch_sampler
class BalancedPositiveNegativeSampler(minibatch_sampler.MinibatchSampler):
......@@ -158,19 +157,17 @@ class BalancedPositiveNegativeSampler(minibatch_sampler.MinibatchSampler):
# Shuffle indicator and label. Need to store the permutation to restore the
# order post sampling.
permutation = tf.random_shuffle(tf.range(input_length))
indicator = tf.gather(indicator, permutation, axis=0)
labels = tf.gather(labels, permutation, axis=0)
# index (starting from 1) when indicator is True, 0 when False
indicator_idx = tf.where(
indicator, tf.range(1, input_length + 1),
tf.zeros(input_length, tf.int32))
# Use -1 for negative and +1 for positive labels
signed_label = tf.where(
labels, tf.ones(input_length, tf.int32),
tf.scalar_mul(-1, tf.ones(input_length, tf.int32)))
# negative of index for negative label, positive index for positive label,
# 0 when indicator is False.
......@@ -198,11 +195,10 @@ class BalancedPositiveNegativeSampler(minibatch_sampler.MinibatchSampler):
axis=0), tf.bool)
# project back the order based on stored permutations
idx_indicator = tf.scatter_nd(
tf.expand_dims(permutation, -1), sampled_idx_indicator,
shape=(input_length,))
return idx_indicator
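# Editor's note (illustrative): tf.scatter_nd above inverts the earlier
# tf.random_shuffle permutation. With indices p[:, None], output[p[i]] is set
# to sampled_idx_indicator[i], mapping the sampled mask back to the original
# (unshuffled) order. E.g. for p = [2, 0, 1] and mask = [True, False, True],
# the result is [False, True, True]. This replaces the previous one-hot
# matmul reprojection with a cheaper O(N) scatter.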
def subsample(self, indicator, batch_size, labels, scope=None):
"""Returns subsampled minibatch.
......
......@@ -24,24 +24,27 @@ from object_detection.utils import test_case
class BalancedPositiveNegativeSamplerTest(test_case.TestCase):
def test_subsample_all_examples(self):
if self.has_tpu(): return
numpy_labels = np.random.permutation(300)
indicator = np.array(np.ones(300) == 1, np.bool)
numpy_labels = (numpy_labels - 200) > 0
labels = np.array(numpy_labels, np.bool)
def graph_fn(indicator, labels):
sampler = (
balanced_positive_negative_sampler.BalancedPositiveNegativeSampler())
return sampler.subsample(indicator, 64, labels)
is_sampled = self.execute_cpu(graph_fn, [indicator, labels])
self.assertEqual(sum(is_sampled), 64)
self.assertEqual(sum(np.logical_and(numpy_labels, is_sampled)), 32)
self.assertEqual(sum(np.logical_and(
np.logical_not(numpy_labels), is_sampled)), 32)
def test_subsample_all_examples_static(self):
if not self.has_tpu(): return
numpy_labels = np.random.permutation(300)
indicator = np.array(np.ones(300) == 1, np.bool)
numpy_labels = (numpy_labels - 200) > 0
......@@ -54,35 +57,37 @@ class BalancedPositiveNegativeSamplerTest(test_case.TestCase):
is_static=True))
return sampler.subsample(indicator, 64, labels)
is_sampled = self.execute_tpu(graph_fn, [indicator, labels])
self.assertEqual(sum(is_sampled), 64)
self.assertEqual(sum(np.logical_and(numpy_labels, is_sampled)), 32)
self.assertEqual(sum(np.logical_and(
np.logical_not(numpy_labels), is_sampled)), 32)
def test_subsample_selection(self):
if self.has_tpu(): return
# Test random sampling when only some examples can be sampled:
# 100 samples, 20 positives, 10 positives cannot be sampled.
numpy_labels = np.arange(100)
numpy_indicator = numpy_labels < 90
indicator = np.array(numpy_indicator, np.bool)
numpy_labels = (numpy_labels - 80) >= 0
labels = np.array(numpy_labels, np.bool)
def graph_fn(indicator, labels):
sampler = (
balanced_positive_negative_sampler.BalancedPositiveNegativeSampler())
return sampler.subsample(indicator, 64, labels)
is_sampled = self.execute_cpu(graph_fn, [indicator, labels])
self.assertEqual(sum(is_sampled), 64)
self.assertEqual(sum(np.logical_and(numpy_labels, is_sampled)), 10)
self.assertEqual(sum(np.logical_and(
np.logical_not(numpy_labels), is_sampled)), 54)
self.assertAllEqual(is_sampled, np.logical_and(is_sampled, numpy_indicator))
def test_subsample_selection_static(self):
if not self.has_tpu(): return
# Test random sampling when only some examples can be sampled:
# 100 samples, 20 positives, 10 positives cannot be sampled.
numpy_labels = np.arange(100)
......@@ -98,37 +103,41 @@ class BalancedPositiveNegativeSamplerTest(test_case.TestCase):
is_static=True))
return sampler.subsample(indicator, 64, labels)
is_sampled = self.execute_tpu(graph_fn, [indicator, labels])
self.assertEqual(sum(is_sampled), 64)
self.assertEqual(sum(np.logical_and(numpy_labels, is_sampled)), 10)
self.assertEqual(sum(np.logical_and(
np.logical_not(numpy_labels), is_sampled)), 54)
self.assertAllEqual(is_sampled, np.logical_and(is_sampled, numpy_indicator))
def test_subsample_selection_larger_batch_size(self):
if self.has_tpu(): return
# Test random sampling when the total number of examples that can be
# sampled is less than the batch size:
# 100 samples, 50 positives, 40 positives cannot be sampled, batch size 64.
# It should still return 64 samples, with 4 of them that couldn't have been
# sampled.
numpy_labels = np.arange(100)
numpy_indicator = numpy_labels < 60
indicator = np.array(numpy_indicator, np.bool)
numpy_labels = (numpy_labels - 50) >= 0
labels = np.array(numpy_labels, np.bool)
def graph_fn(indicator, labels):
sampler = (
balanced_positive_negative_sampler.BalancedPositiveNegativeSampler())
return sampler.subsample(indicator, 64, labels)
is_sampled = self.execute_cpu(graph_fn, [indicator, labels])
self.assertEqual(sum(is_sampled), 60)
self.assertGreaterEqual(sum(np.logical_and(numpy_labels, is_sampled)), 10)
self.assertGreaterEqual(
sum(np.logical_and(np.logical_not(numpy_labels), is_sampled)), 50)
self.assertEqual(sum(np.logical_and(is_sampled, numpy_indicator)), 60)
def test_subsample_selection_larger_batch_size_static(self):
if not self.has_tpu(): return
# Test random sampling when the total number of examples that can be
# sampled is less than the batch size:
# 100 samples, 50 positives, 40 positives cannot be sampled, batch size 64.
......@@ -147,34 +156,33 @@ class BalancedPositiveNegativeSamplerTest(test_case.TestCase):
is_static=True))
return sampler.subsample(indicator, 64, labels)
is_sampled = self.execute_tpu(graph_fn, [indicator, labels])
self.assertEqual(sum(is_sampled), 64)
self.assertGreaterEqual(sum(np.logical_and(numpy_labels, is_sampled)), 10)
self.assertGreaterEqual(
sum(np.logical_and(np.logical_not(numpy_labels), is_sampled)), 50)
self.assertEqual(sum(np.logical_and(is_sampled, numpy_indicator)), 60)
def test_subsample_selection_no_batch_size(self):
if self.has_tpu(): return
# Test random sampling when only some examples can be sampled:
# 1000 samples, 6 positives (5 can be sampled).
numpy_labels = np.arange(1000)
numpy_indicator = numpy_labels < 999
numpy_labels = (numpy_labels - 994) >= 0
def graph_fn(indicator, labels):
sampler = (balanced_positive_negative_sampler.
BalancedPositiveNegativeSampler(0.01))
is_sampled = sampler.subsample(indicator, None, labels)
return is_sampled
is_sampled_out = self.execute_cpu(graph_fn, [numpy_indicator, numpy_labels])
self.assertEqual(sum(is_sampled_out), 500)
self.assertEqual(sum(np.logical_and(numpy_labels, is_sampled_out)), 5)
self.assertEqual(sum(np.logical_and(
np.logical_not(numpy_labels), is_sampled_out)), 495)
self.assertAllEqual(is_sampled_out, np.logical_and(is_sampled_out,
numpy_indicator))
def test_subsample_selection_no_batch_size_static(self):
labels = tf.constant([[True, False, False]])
......
......@@ -24,6 +24,10 @@ from six.moves import range
import tensorflow as tf
from object_detection.core import prefetcher
from object_detection.utils import tf_version
if not tf_version.is_tf1():
raise ValueError('`batcher.py` is only supported in Tensorflow 1.X')
rt_shape_str = '_runtime_shapes'
......
......@@ -22,10 +22,11 @@ from __future__ import print_function
import numpy as np
from six.moves import range
import tensorflow as tf
from tensorflow.contrib import slim as contrib_slim
from object_detection.core import batcher
slim = contrib_slim
class BatcherTest(tf.test.TestCase):
......
......@@ -14,11 +14,11 @@
# ==============================================================================
"""Tests for object_detection.core.box_coder."""
import tensorflow as tf
from object_detection.core import box_coder
from object_detection.core import box_list
from object_detection.utils import test_case
class MockBoxCoder(box_coder.BoxCoder):
......@@ -34,27 +34,28 @@ class MockBoxCoder(box_coder.BoxCoder):
return box_list.BoxList(rel_codes / 2.0)
class BoxCoderTest(test_case.TestCase):
def test_batch_decode(self):
expected_boxes = [[[0.0, 0.1, 0.5, 0.6], [0.5, 0.6, 0.7, 0.8]],
[[0.1, 0.2, 0.3, 0.4], [0.7, 0.8, 0.9, 1.0]]]
def graph_fn():
mock_anchor_corners = tf.constant(
[[0, 0.1, 0.2, 0.3], [0.2, 0.4, 0.4, 0.6]], tf.float32)
mock_anchors = box_list.BoxList(mock_anchor_corners)
mock_box_coder = MockBoxCoder()
encoded_boxes_list = [mock_box_coder.encode(
box_list.BoxList(tf.constant(boxes)), mock_anchors)
for boxes in expected_boxes]
encoded_boxes = tf.stack(encoded_boxes_list)
decoded_boxes = box_coder.batch_decode(
encoded_boxes, mock_box_coder, mock_anchors)
return decoded_boxes
decoded_boxes_result = self.execute(graph_fn, [])
self.assertAllClose(expected_boxes, decoded_boxes_result)
if __name__ == '__main__':
......
......@@ -105,6 +105,31 @@ def scale(boxlist, y_scale, x_scale, scope=None):
return _copy_extra_fields(scaled_boxlist, boxlist)
def scale_height_width(boxlist, y_scale, x_scale, scope=None):
"""Scale the height and width of boxes, leaving centers unchanged.
Args:
boxlist: BoxList holding N boxes
y_scale: (float) scalar tensor
x_scale: (float) scalar tensor
scope: name scope.
Returns:
boxlist: BoxList holding N boxes
"""
with tf.name_scope(scope, 'ScaleHeightWidth'):
y_scale = tf.cast(y_scale, tf.float32)
x_scale = tf.cast(x_scale, tf.float32)
yc, xc, height_orig, width_orig = boxlist.get_center_coordinates_and_sizes()
y_min = yc - 0.5 * y_scale * height_orig
y_max = yc + 0.5 * y_scale * height_orig
x_min = xc - 0.5 * x_scale * width_orig
x_max = xc + 0.5 * x_scale * width_orig
scaled_boxlist = box_list.BoxList(
tf.stack([y_min, x_min, y_max, x_max], 1))
return _copy_extra_fields(scaled_boxlist, boxlist)
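# Editor's note: an illustrative sketch (not part of this change); the helper
# name is hypothetical. Scaling only height and width keeps box centers fixed:
# [0.2, 0.2, 0.4, 0.6] has center (0.3, 0.4), height 0.2 and width 0.4, so
# y_scale=2.0, x_scale=1.0 yields [0.1, 0.2, 0.5, 0.6].
def _scale_height_width_example():
  boxes = box_list.BoxList(tf.constant([[0.2, 0.2, 0.4, 0.6]], tf.float32))
  return scale_height_width(boxes, y_scale=2.0, x_scale=1.0).get()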
def clip_to_window(boxlist, window, filter_nonoverlapping=True, scope=None):
"""Clip bounding boxes to a window.
......
......@@ -14,53 +14,56 @@
# ==============================================================================
"""Tests for object_detection.core.box_list."""
import numpy as np
import tensorflow as tf
from object_detection.core import box_list
from object_detection.utils import test_case
class BoxListTest(test_case.TestCase):
"""Tests for BoxList class."""
def test_num_boxes(self):
def graph_fn():
data = tf.constant([[0, 0, 1, 1], [1, 1, 2, 3], [3, 4, 5, 5]], tf.float32)
boxes = box_list.BoxList(data)
return boxes.num_boxes()
num_boxes_out = self.execute(graph_fn, [])
self.assertEqual(num_boxes_out, 3)
def test_get_correct_center_coordinates_and_sizes(self):
boxes = np.array([[10.0, 10.0, 20.0, 15.0], [0.2, 0.1, 0.5, 0.4]],
np.float32)
def graph_fn(boxes):
boxes = box_list.BoxList(boxes)
centers_sizes = boxes.get_center_coordinates_and_sizes()
return centers_sizes
centers_sizes_out = self.execute(graph_fn, [boxes])
expected_centers_sizes = [[15, 0.35], [12.5, 0.25], [10, 0.3], [5, 0.3]]
self.assertAllClose(centers_sizes_out, expected_centers_sizes)
def test_create_box_list_with_dynamic_shape(self):
def graph_fn():
data = tf.constant([[0, 0, 1, 1], [1, 1, 2, 3], [3, 4, 5, 5]], tf.float32)
indices = tf.reshape(tf.where(tf.greater([1, 0, 1], 0)), [-1])
data = tf.gather(data, indices)
assert data.get_shape().as_list() == [None, 4]
boxes = box_list.BoxList(data)
return boxes.num_boxes()
num_boxes = self.execute(graph_fn, [])
self.assertEqual(num_boxes, 2)
def test_transpose_coordinates(self):
boxes = np.array([[10.0, 10.0, 20.0, 15.0], [0.2, 0.1, 0.5, 0.4]],
np.float32)
def graph_fn(boxes):
boxes = box_list.BoxList(boxes)
boxes.transpose_coordinates()
return boxes.get()
transposed_boxes = self.execute(graph_fn, [boxes])
expected_corners = [[10.0, 10.0, 15.0, 20.0], [0.1, 0.2, 0.4, 0.5]]
self.assertAllClose(transposed_boxes, expected_corners)
def test_box_list_invalid_inputs(self):
data0 = tf.constant([[[0, 0, 1, 1], [3, 4, 5, 5]]], tf.float32)
......@@ -77,49 +80,33 @@ class BoxListTest(tf.test.TestCase):
def test_num_boxes_static(self):
box_corners = [[10.0, 10.0, 20.0, 15.0], [0.2, 0.1, 0.5, 0.4]]
boxes = box_list.BoxList(tf.constant(box_corners))
self.assertEqual(boxes.num_boxes_static(), 2)
self.assertEqual(type(boxes.num_boxes_static()), int)
def test_as_tensor_dict(self):
boxes = tf.constant([[0.1, 0.1, 0.4, 0.4], [0.1, 0.1, 0.5, 0.5]],
tf.float32)
boxlist = box_list.BoxList(boxes)
classes = tf.constant([0, 1])
boxlist.add_field('classes', classes)
scores = tf.constant([0.75, 0.2])
boxlist.add_field('scores', scores)
tensor_dict = boxlist.as_tensor_dict()
self.assertDictEqual(tensor_dict, {'scores': scores, 'classes': classes,
'boxes': boxes})
def test_as_tensor_dict_with_features(self):
boxes = tf.constant([[0.1, 0.1, 0.4, 0.4], [0.1, 0.1, 0.5, 0.5]],
tf.float32)
boxlist = box_list.BoxList(boxes)
classes = tf.constant([0, 1])
boxlist.add_field('classes', classes)
scores = tf.constant([0.75, 0.2])
boxlist.add_field('scores', scores)
tensor_dict = boxlist.as_tensor_dict(['scores', 'classes'])
self.assertDictEqual(tensor_dict, {'scores': scores, 'classes': classes})
def test_as_tensor_dict_missing_field(self):
boxlist = box_list.BoxList(
......
......@@ -24,82 +24,74 @@ class ClassAgnosticNonMaxSuppressionTest(test_case.TestCase,
parameterized.TestCase):
def test_class_agnostic_nms_select_with_shared_boxes(self):
def graph_fn():
boxes = tf.constant(
[[[0, 0, 1, 1]], [[0, 0.1, 1, 1.1]], [[0, -0.1, 1, 0.9]],
[[0, 10, 1, 11]], [[0, 10.1, 1, 11.1]], [[0, 100, 1, 101]],
[[0, 1000, 1, 1002]], [[0, 1000, 1, 1002.1]]], tf.float32)
scores = tf.constant([[.9, 0.01], [.75, 0.05], [.6, 0.01], [.95, 0],
[.5, 0.01], [.3, 0.01], [.01, .85], [.01, .5]])
score_thresh = 0.1
iou_thresh = .5
max_classes_per_detection = 1
max_output_size = 4
nms, _ = post_processing.class_agnostic_non_max_suppression(
boxes, scores, score_thresh, iou_thresh, max_classes_per_detection,
max_output_size)
return (nms.get(), nms.get_field(fields.BoxListFields.scores),
nms.get_field(fields.BoxListFields.classes))
exp_nms_corners = [[0, 10, 1, 11], [0, 0, 1, 1], [0, 1000, 1, 1002],
[0, 100, 1, 101]]
exp_nms_scores = [.95, .9, .85, .3]
exp_nms_classes = [0, 0, 1, 0]
(nms_corners_output, nms_scores_output,
nms_classes_output) = self.execute_cpu(graph_fn, [])
self.assertAllClose(nms_corners_output, exp_nms_corners)
self.assertAllClose(nms_scores_output, exp_nms_scores)
self.assertAllClose(nms_classes_output, exp_nms_classes)
def test_class_agnostic_nms_select_with_per_class_boxes(self):
def graph_fn():
boxes = tf.constant(
[[[4, 5, 9, 10], [0, 0, 1, 1]],
[[0, 0.1, 1, 1.1], [4, 5, 9, 10]],
[[0, -0.1, 1, 0.9], [4, 5, 9, 10]],
[[0, 10, 1, 11], [4, 5, 9, 10]],
[[0, 10.1, 1, 11.1], [4, 5, 9, 10]],
[[0, 100, 1, 101], [4, 5, 9, 10]],
[[4, 5, 9, 10], [0, 1000, 1, 1002]],
[[4, 5, 9, 10], [0, 1000, 1, 1002.1]]], tf.float32)
scores = tf.constant([[.01, 0.9],
[.75, 0.05],
[.6, 0.01],
[.95, 0],
[.5, 0.01],
[.3, 0.01],
[.01, .85],
[.01, .5]])
score_thresh = 0.1
iou_thresh = .5
max_classes_per_detection = 1
max_output_size = 4
nms, _ = post_processing.class_agnostic_non_max_suppression(
boxes, scores, score_thresh, iou_thresh, max_classes_per_detection,
max_output_size)
return (nms.get(), nms.get_field(fields.BoxListFields.scores),
nms.get_field(fields.BoxListFields.classes))
(nms_corners_output, nms_scores_output,
nms_classes_output) = self.execute_cpu(graph_fn, [])
exp_nms_corners = [[0, 10, 1, 11],
[0, 0, 1, 1],
[0, 1000, 1, 1002],
[0, 100, 1, 101]]
exp_nms_scores = [.95, .9, .85, .3]
exp_nms_classes = [0, 1, 1, 0]
self.assertAllClose(nms_corners_output, exp_nms_corners)
self.assertAllClose(nms_scores_output, exp_nms_scores)
self.assertAllClose(nms_classes_output, exp_nms_classes)
# Two cases will be tested here: using / not using static shapes.
# Named the two test cases for easier control during testing, with a flag of
......@@ -109,46 +101,43 @@ class ClassAgnosticNonMaxSuppressionTest(test_case.TestCase,
@parameterized.named_parameters(('', False), ('_use_static_shapes', True))
def test_batch_classagnostic_nms_with_batch_size_1(self,
use_static_shapes=False):
def graph_fn():
boxes = tf.constant(
[[[[0, 0, 1, 1]], [[0, 0.1, 1, 1.1]], [[0, -0.1, 1, 0.9]],
[[0, 10, 1, 11]], [[0, 10.1, 1, 11.1]], [[0, 100, 1, 101]],
[[0, 1000, 1, 1002]], [[0, 1000, 1, 1002.1]]]], tf.float32)
scores = tf.constant([[[.9, 0.01], [.75, 0.05], [.6, 0.01], [.95, 0],
[.5, 0.01], [.3, 0.01], [.01, .85], [.01, .5]]])
score_thresh = 0.1
iou_thresh = .5
max_output_size = 4
max_classes_per_detection = 1
use_class_agnostic_nms = True
(nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_masks,
nmsed_additional_fields,
num_detections) = post_processing.batch_multiclass_non_max_suppression(
boxes,
scores,
score_thresh,
iou_thresh,
max_size_per_class=max_output_size,
max_total_size=max_output_size,
use_class_agnostic_nms=use_class_agnostic_nms,
use_static_shapes=use_static_shapes,
max_classes_per_detection=max_classes_per_detection)
self.assertIsNone(nmsed_masks)
self.assertIsNone(nmsed_additional_fields)
return (nmsed_boxes, nmsed_scores, nmsed_classes, num_detections)
exp_nms_corners = [[[0, 10, 1, 11], [0, 0, 1, 1], [0, 1000, 1, 1002],
[0, 100, 1, 101]]]
exp_nms_scores = [[.95, .9, .85, .3]]
exp_nms_classes = [[0, 0, 1, 0]]
(nmsed_boxes, nmsed_scores, nmsed_classes,
num_detections) = self.execute_cpu(graph_fn, [])
self.assertAllClose(nmsed_boxes, exp_nms_corners)
self.assertAllClose(nmsed_scores, exp_nms_scores)
self.assertAllClose(nmsed_classes, exp_nms_classes)
self.assertEqual(num_detections, [4])
if __name__ == '__main__':
......
......@@ -22,6 +22,7 @@ import numpy as np
from six.moves import zip
import tensorflow as tf
from object_detection.core import freezable_batch_norm
......@@ -36,6 +37,10 @@ class FreezableBatchNormTest(tf.test.TestCase):
model.add(norm)
return model, norm
def _copy_weights(self, source_weights, target_weights):
for source, target in zip(source_weights, target_weights):
target.assign(source)
def _train_freezable_batch_norm(self, training_mean, training_var):
model, _ = self._build_model()
model.compile(loss='mse', optimizer='sgd')
......@@ -53,136 +58,138 @@ class FreezableBatchNormTest(tf.test.TestCase):
testing_mean, testing_var, training_arg, training_mean, training_var):
out_tensor = norm(tf.convert_to_tensor(test_data, dtype=tf.float32),
training=training_arg)
out = out_tensor
out -= norm.beta
out /= norm.gamma
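# In inference mode the layer normalizes with the moving statistics learned
# on the training data (approximately training_mean / training_var), so the
# lines below undo that normalization and re-normalize with the test data's
# own statistics. Either way, `out` should end up approximately standard
# normal, which the checks below assert within a loose tolerance.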
if not should_be_training:
out *= training_var
out += (training_mean - testing_mean)
out /= testing_var
np.testing.assert_allclose(out.numpy().mean(), 0.0, atol=1.5e-1)
np.testing.assert_allclose(out.numpy().std(), 1.0, atol=1.5e-1)
def test_batchnorm_freezing_training_none(self):
training_mean = 5.0
training_var = 10.0
testing_mean = -10.0
testing_var = 5.0
# Initially train the batch norm, and save the weights
trained_weights = self._train_freezable_batch_norm(training_mean,
training_var)
# Load the batch norm weights, freezing training to True.
# Apply the batch norm layer to testing data and ensure it is normalized
# according to the batch statistics.
model, norm = self._build_model(training=True)
self._copy_weights(trained_weights, model.weights)
# centered on testing_mean, variance testing_var
test_data = np.random.normal(
loc=testing_mean,
scale=testing_var,
size=(1000, 10))
# Test with training=True passed to the call method:
training_arg = True
should_be_training = True
self._test_batchnorm_layer(norm, should_be_training, test_data,
testing_mean, testing_var, training_arg,
training_mean, training_var)
# Reset the weights, because they may have been updated by
# running with training=True
self._copy_weights(trained_weights, model.weights)
# Test with training=False passed to the call method:
training_arg = False
should_be_training = False
self._test_batchnorm_layer(norm, should_be_training, test_data,
testing_mean, testing_var, training_arg,
training_mean, training_var)
# Test the layer in various Keras learning phase scopes:
training_arg = None
should_be_training = False
self._test_batchnorm_layer(norm, should_be_training, test_data,
testing_mean, testing_var, training_arg,
training_mean, training_var)
tf.keras.backend.set_learning_phase(True)
should_be_training = True
self._test_batchnorm_layer(norm, should_be_training, test_data,
testing_mean, testing_var, training_arg,
training_mean, training_var)
# Reset the weights, because they may have been updated by
# running with training=True
self._copy_weights(trained_weights, model.weights)
tf.keras.backend.set_learning_phase(False)
should_be_training = False
self._test_batchnorm_layer(norm, should_be_training, test_data,
testing_mean, testing_var, training_arg,
training_mean, training_var)
def test_batchnorm_freezing_training_false(self):
training_mean = 5.0
training_var = 10.0
testing_mean = -10.0
testing_var = 5.0
# Initially train the batch norm, and save the weights
trained_weights = self._train_freezable_batch_norm(training_mean,
training_var)
# Load the batch norm back up, freezing training to False.
# Apply the batch norm layer to testing data and ensure it is normalized
# according to the training data's statistics.
model, norm = self._build_model(training=False)
self._copy_weights(trained_weights, model.weights)
# centered on testing_mean, variance testing_var
test_data = np.random.normal(
loc=testing_mean,
scale=testing_var,
size=(1000, 10))
# Make sure that the layer is never training
# Test with training=True passed to the call method:
training_arg = True
should_be_training = False
self._test_batchnorm_layer(norm, should_be_training, test_data,
testing_mean, testing_var, training_arg,
training_mean, training_var)
# Test with training=False passed to the call method:
training_arg = False
should_be_training = False
self._test_batchnorm_layer(norm, should_be_training, test_data,
testing_mean, testing_var, training_arg,
training_mean, training_var)
# Test the layer in various Keras learning phase scopes:
training_arg = None
should_be_training = False
self._test_batchnorm_layer(norm, should_be_training, test_data,
testing_mean, testing_var, training_arg,
training_mean, training_var)
tf.keras.backend.set_learning_phase(True)
should_be_training = False
self._test_batchnorm_layer(norm, should_be_training, test_data,
testing_mean, testing_var, training_arg,
training_mean, training_var)
tf.keras.backend.set_learning_phase(False)
should_be_training = False
self._test_batchnorm_layer(norm, should_be_training, test_data,
testing_mean, testing_var, training_arg,
training_mean, training_var)
if __name__ == '__main__':
......
......@@ -125,6 +125,24 @@ def change_coordinate_frame(keypoints, window, scope=None):
return new_keypoints
def keypoints_to_enclosing_bounding_boxes(keypoints):
"""Creates enclosing bounding boxes from keypoints.
Args:
keypoints: a [num_instances, num_keypoints, 2] float32 tensor with keypoints
in [y, x] format.
Returns:
A [num_instances, 4] float32 tensor that tightly covers all the keypoints
for each instance.
"""
ymin = tf.math.reduce_min(keypoints[:, :, 0], axis=1)
xmin = tf.math.reduce_min(keypoints[:, :, 1], axis=1)
ymax = tf.math.reduce_max(keypoints[:, :, 0], axis=1)
xmax = tf.math.reduce_max(keypoints[:, :, 1], axis=1)
return tf.stack([ymin, xmin, ymax, xmax], axis=1)
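# Editor's note: an illustrative sketch (not part of this change); the helper
# name is hypothetical. For one instance with keypoints (0.2, 0.3) and
# (0.4, 0.1) in [y, x] format, the enclosing box is [0.2, 0.1, 0.4, 0.3].
def _enclosing_boxes_example():
  keypoints = tf.constant([[[0.2, 0.3], [0.4, 0.1]]], tf.float32)
  return keypoints_to_enclosing_bounding_boxes(keypoints)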
def to_normalized_coordinates(keypoints, height, width,
check_range=True, scope=None):
"""Converts absolute keypoint coordinates to normalized coordinates in [0, 1].
......@@ -280,3 +298,69 @@ def rot90(keypoints, scope=None):
new_keypoints = tf.concat([v, u], 2)
new_keypoints = tf.transpose(new_keypoints, [1, 0, 2])
return new_keypoints
def keypoint_weights_from_visibilities(keypoint_visibilities,
per_keypoint_weights=None):
"""Returns a keypoint weights tensor.
During training, it is often beneficial to consider only those keypoints that
are labeled. This function returns a weights tensor that combines default
per-keypoint weights, as well as the visibilities of individual keypoints.
The returned tensor satisfies:
keypoint_weights[i, k] = per_keypoint_weights[k] * keypoint_visibilities[i, k]
where per_keypoint_weights[k] is set to 1 if not provided.
Args:
keypoint_visibilities: A [num_instances, num_keypoints] boolean tensor
indicating whether a keypoint is labeled (and perhaps even visible).
per_keypoint_weights: A list or 1-d tensor of length `num_keypoints` with
per-keypoint weights. If None, will use 1 for each visible keypoint
weight.
Returns:
A [num_instances, num_keypoints] float32 tensor with keypoint weights. Those
keypoints deemed visible will have the provided per-keypoint weight, and
all others will be set to zero.
"""
if per_keypoint_weights is None:
num_keypoints = keypoint_visibilities.shape.as_list()[1]
per_keypoint_weight_mult = tf.ones((1, num_keypoints,), dtype=tf.float32)
else:
per_keypoint_weight_mult = tf.expand_dims(per_keypoint_weights, axis=0)
return per_keypoint_weight_mult * tf.cast(keypoint_visibilities, tf.float32)
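# Editor's note: an illustrative sketch (not part of this change); the helper
# name is hypothetical. With per-keypoint weights [1.0, 0.5], only visible
# keypoints keep their weight:
# [[True, False], [True, True]] -> [[1.0, 0.0], [1.0, 0.5]]
def _keypoint_weights_example():
  visibilities = tf.constant([[True, False], [True, True]])
  return keypoint_weights_from_visibilities(visibilities, [1.0, 0.5])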
def set_keypoint_visibilities(keypoints, initial_keypoint_visibilities=None):
"""Sets keypoint visibilities based on valid/invalid keypoints.
Some keypoint operations set invisible keypoints (e.g. cropped keypoints) to
NaN, without affecting any keypoint "visibility" variables. This function is
used to update (or create) keypoint visibilities to agree with visible /
invisible keypoint coordinates.
Args:
keypoints: a float32 tensor of shape [num_instances, num_keypoints, 2].
initial_keypoint_visibilities: a boolean tensor of shape
[num_instances, num_keypoints]. If provided, will maintain the visibility
designation of a keypoint, so long as the corresponding coordinates are
not NaN. If not provided, will create keypoint visibilities directly from
the values in `keypoints` (i.e. NaN coordinates map to False, otherwise
they map to True).
Returns:
keypoint_visibilities: a bool tensor of shape [num_instances, num_keypoints]
indicating whether a keypoint is visible or not.
"""
if initial_keypoint_visibilities is not None:
keypoint_visibilities = tf.cast(initial_keypoint_visibilities, tf.bool)
else:
keypoint_visibilities = tf.ones_like(keypoints[:, :, 0], dtype=tf.bool)
keypoints_with_nan = tf.math.reduce_any(tf.math.is_nan(keypoints), axis=2)
keypoint_visibilities = tf.where(
keypoints_with_nan,
tf.zeros_like(keypoint_visibilities, dtype=tf.bool),
keypoint_visibilities)
return keypoint_visibilities
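# Editor's note: an illustrative sketch (not part of this change); the helper
# name is hypothetical. NaN coordinates mark a keypoint invisible, while valid
# coordinates keep the initial designation: [[[0.1, 0.2], [nan, 0.3]]] yields
# visibilities [[True, False]].
def _set_visibilities_example():
  keypoints = tf.constant([[[0.1, 0.2], [float('nan'), 0.3]]], tf.float32)
  return set_keypoint_visibilities(keypoints)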
......@@ -33,12 +33,13 @@ from __future__ import print_function
import abc
import six
import tensorflow as tf
from tensorflow.contrib import slim as contrib_slim
from object_detection.core import box_list
from object_detection.core import box_list_ops
from object_detection.utils import ops
slim = tf.contrib.slim
slim = contrib_slim
class Loss(six.with_metaclass(abc.ABCMeta, object)):
......
......@@ -962,18 +962,18 @@ def batch_multiclass_non_max_suppression(boxes,
if use_class_agnostic_nms:
raise ValueError('class-agnostic NMS is not supported by combined_nms.')
if clip_window is not None:
tf.logging.warning(
'clip_window is not supported by combined_nms unless it is'
' [0. 0. 1. 1.] for each image.')
if additional_fields is not None:
tf.logging.warning(
'additional_fields is not supported by combined_nms.')
if parallel_iterations != 32:
tf.logging.warning(
'Number of batch items to be processed in parallel is'
' not configurable by combined_nms.')
if max_classes_per_detection > 1:
tf.logging.warning(
'max_classes_per_detection is not configurable by combined_nms.')
with tf.name_scope(scope, 'CombinedNonMaxSuppression'):
......@@ -1013,7 +1013,7 @@ def batch_multiclass_non_max_suppression(boxes,
else:
ordered_additional_fields = collections.OrderedDict(
sorted(additional_fields.items(), key=lambda item: item[0]))
del additional_fields
with tf.name_scope(scope, 'BatchMultiClassNonMaxSuppression'):
boxes_shape = boxes.shape
batch_size = shape_utils.get_dim_as_int(boxes_shape[0])
......
......@@ -16,6 +16,10 @@
"""Provides functions to prefetch tensors to feed into models."""
import tensorflow as tf
from object_detection.utils import tf_version
if not tf_version.is_tf1():
raise ValueError('`prefetcher.py` is only supported in Tensorflow 1.X')
def prefetch(tensor_dict, capacity):
"""Creates a prefetch queue for tensors.
......
......@@ -21,12 +21,15 @@ from __future__ import print_function
from six.moves import range
import tensorflow as tf
# pylint: disable=g-bad-import-order
from object_detection.core import prefetcher
from tensorflow.contrib import slim as contrib_slim
slim = contrib_slim
# pylint: enable=g-bad-import-order
class PrefetcherTest(tf.test.TestCase):
"""Test class for prefetcher."""
def test_prefetch_tensors_with_fully_defined_shapes(self):
with self.test_session() as sess:
......
# Lint as: python2, python3
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......
# Lint as: python2, python3
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......
This diff has been collapsed.
This diff has been collapsed.
This diff has been collapsed.
This diff has been collapsed.