Training custom coco-like dataset ppyolo_2x with non-square data (#1302) · Issue · PaddlePaddle / PaddleDetection

Training custom coco-like dataset ppyolo_2x with non-square data

Created by: LukeAI

I have a custom coco-like dataset with 1500 training images and 699 validation images with 3 different classes that I wish to train ppyolo_2x on.

I want to start with the weights pretrained on coco.

The images in the custom dataset are all 3-channel, width=256, height=768. I want to train at these dimensions. Jitter augmentation is fine, but distorting the images to 3x608x608 is not.

I have done the following:

copy ppyolo_2x.yml => ppyolo_2x_custom.yml

Change:
  max_iters: 500000 to max_iters: 5000
  snapshot_iter: 10000 to snapshot_iter: 500
  num_classes: 80 to num_classes: 3

I changed these parameters to the maximum size of an object I expect to detect:

IouLoss:
  loss_weight: 2.5
  max_height: 700
  max_width: 230

IouAwareLoss:
  loss_weight: 1.0
  max_height: 700
  max_width: 230

My training set is very small, so I have changed the learning-rate params:

ORIGINAL

LearningRate:
  base_lr: 0.01
  schedulers:
  - !PiecewiseDecay
    gamma: 0.1
    milestones:
    - 400000
    - 450000
  - !LinearWarmup
    start_factor: 0.
    steps: 4000

MODIFIED

LearningRate:
  base_lr: 0.001
  schedulers:
  - !PiecewiseDecay
    gamma: 0.1
    milestones:
    - 2500
    - 3000
  - !LinearWarmup
    start_factor: 0.
    steps: 400
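
To sanity-check the modified schedule, the learning rate at each iteration can be sketched in plain Python. This assumes the usual semantics: LinearWarmup scales the rate linearly from start_factor to 1 over `steps` iterations, and PiecewiseDecay multiplies the rate by gamma at each milestone. The function name `learning_rate` is hypothetical, and this is not the PaddleDetection implementation.

```python
def learning_rate(it, base_lr=0.001, gamma=0.1, milestones=(2500, 3000),
                  warmup_steps=400, start_factor=0.0):
    """Learning rate at iteration `it` under the assumed schedule semantics."""
    # Piecewise decay: multiply by gamma once for every milestone already reached.
    passed = sum(1 for m in milestones if it >= m)
    lr = base_lr * gamma ** passed
    # Linear warmup: ramp the factor from start_factor up to 1 over warmup_steps.
    if it < warmup_steps:
        lr *= start_factor + (1.0 - start_factor) * it / warmup_steps
    return lr


# Print the rate at a few iterations of interest for the MODIFIED config.
for it in (0, 200, 400, 2499, 2500, 3000, 4999):
    print(it, learning_rate(it))
```

Under these assumptions the rate reaches 0.001 at iteration 400, drops to 0.0001 at iteration 2500 and to 0.00001 at iteration 3000, so the last 2000 of the 5000 iterations run at 1% of base_lr.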

Changed: _READER_: 'ppyolo_custom_reader.yml'

Next I copy ppyolo_reader.yml => ppyolo_custom_reader.yml

Change ppyolo_custom_reader.yml to point to my custom dataset:

TrainReader:

  dataset:
    !COCODataSet
      image_dir: train
      anno_path: train_annotations.json
      dataset_dir: dataset/custom_OD
      with_background: false

EvalReader and TestReader:

  dataset:
    !COCODataSet
      image_dir: test
      anno_path: test_annotations.json
      dataset_dir: dataset/custom_OD
      with_background: false

I would like to know if this all seems correct and if there are other parameters I should be paying attention to.

What is the difference between TestReader and EvalReader?

TestReader has this param. What is the order of dimensions here: (channels, width, height) or (channels, height, width)? If I simply delete this parameter, will the dimensions of the raw image be used instead?

    image_shape: [3, 608, 608]

TrainReader has this section. It seems to stretch the image as an augmentation. How should I modify it, given that I want to maintain roughly the same 256 x 768 aspect ratio throughout training?

  batch_transforms:
  - !RandomShape
    sizes: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608]
    random_inter: True
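
For illustration, the kind of multi-scale sizes I have in mind can be enumerated with a short Python sketch: (height, width) pairs that keep the 256 x 768 ratio exactly and stay divisible by 32, like every entry in the sizes list above. The stride of 32, the width range, and the function name `aspect_preserving_shapes` are assumptions. The stock sizes list holds single integers, so feeding pairs like these to the reader would probably need a custom transform; that part is not covered here.

```python
def aspect_preserving_shapes(base_w=256, base_h=768, stride=32,
                             min_w=160, max_w=352):
    """Return (height, width) pairs with the base aspect ratio, both divisible by stride."""
    shapes = []
    first_w = -(-min_w // stride) * stride  # round min_w up to a multiple of stride
    for w in range(first_w, max_w + 1, stride):
        # Skip widths for which the matching height is not an integer.
        if (w * base_h) % base_w != 0:
            continue
        h = w * base_h // base_w
        if h % stride == 0:
            shapes.append((h, w))
    return shapes


for h, w in aspect_preserving_shapes():
    print(f"height={h} width={w}")
```

Because 768 is exactly 3 x 256, every width that is a multiple of 32 gives a height that is also a multiple of 32, so the native 768 x 256 shape is one of the candidates.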

EvalReader and TestReader both have a section like:

    - !ResizeImage
      target_size: 608
      interp: 2

Will this stretch images into a 608x608 square for inference? Can I stop this by simply disabling it?
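
For reference, the arithmetic behind the two options (stretching to a square versus an aspect-preserving resize with padding) can be sketched as follows. This is plain Python, not PaddleDetection code. `stretch_factors` and `letterbox_params` are hypothetical helper names, and whether ResizeImage behaves like either of them is exactly the open question above.

```python
def stretch_factors(src_w, src_h, target=608):
    """Per-axis scale factors if the image is stretched to target x target."""
    return target / src_w, target / src_h


def letterbox_params(src_w, src_h, target=608):
    """Scale, resized (width, height) and padding for an aspect-preserving fit."""
    # One common scale for both axes, chosen so the longer side just fits.
    scale = min(target / src_w, target / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    # Remaining pixels would be filled with padding to reach target x target.
    return scale, (new_w, new_h), (target - new_w, target - new_h)


print(stretch_factors(256, 768))
print(letterbox_params(256, 768))
```

For a 256 x 768 image, stretching scales the width by 2.375 and the height by about 0.79, which is the distortion I want to avoid. The aspect-preserving fit gives a 203 x 608 image with 405 columns of padding.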

Is num_max_boxes: 50 the maximum number of objects expected in a frame? Will setting this lower improve speed or accuracy?

Is there some way I can recalculate anchors for my dataset? Is this recommended?
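
One common approach to recalculating anchors for YOLO-style detectors is k-means over the box widths and heights, using 1 - IoU as the distance. Below is a minimal pure-Python sketch of that idea, assuming COCO-style `bbox` entries of the form [x, y, width, height]. The function names, k=9 and the seed are assumptions, and this is not a PaddleDetection tool.

```python
import json
import random


def load_box_sizes(anno_path):
    """Collect (width, height) of every non-degenerate box in a COCO-style file."""
    with open(anno_path) as f:
        coco = json.load(f)
    sizes = [(a["bbox"][2], a["bbox"][3]) for a in coco["annotations"]]
    return [(w, h) for w, h in sizes if w > 0 and h > 0]


def iou_wh(box, anchor):
    """IoU of two boxes aligned at the same top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(sizes, k=9, iters=100, seed=0):
    """Cluster (width, height) pairs; requires len(sizes) >= k."""
    rng = random.Random(seed)
    anchors = rng.sample(sizes, k)
    for _ in range(iters):
        # Assign every box to the anchor it overlaps most.
        clusters = [[] for _ in range(k)]
        for s in sizes:
            best = max(range(k), key=lambda i: iou_wh(s, anchors[i]))
            clusters[best].append(s)
        # Move each anchor to the mean size of its cluster (keep it if empty).
        new_anchors = []
        for i, c in enumerate(clusters):
            if c:
                new_anchors.append((sum(w for w, _ in c) / len(c),
                                    sum(h for _, h in c) / len(c)))
            else:
                new_anchors.append(anchors[i])
        if new_anchors == anchors:
            break
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])
```

The resulting sizes are in the pixel units of the annotations, so they would still need rescaling if the network input size differs from the raw image size.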

Finally, I start training, but nothing happens. Where have I gone wrong? Is this my config, or could my annotations be incorrect?

python -u tools/train.py -c configs/ppyolo/ppyolo_2x_custom.yml --eval

Output:

(PP) luke@luke-computer:~/PaddleDetection$ python -u tools/train.py -c configs/ppyolo/ppyolo_2x_custom.yml --eval
2020-08-27 13:47:43,262-INFO: If regularizer of a Parameter has been set by 'fluid.ParamAttr' or 'fluid.WeightNormParamAttr' already. The Regularization[L2Decay, regularization_coeff=0.000500] in Optimizer will not take effect, and it will only be applied to other Parameters!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
2020-08-27 13:47:45,634-INFO: places would be ommited when DataLoader is not iterable
W0827 13:47:45.671777 20383 device_context.cc:252] Please NOTE: device: 0, CUDA Capability: 75, Driver API Version: 10.0, Runtime API Version: 10.0
W0827 13:47:45.674274 20383 device_context.cc:260] device: 0, cuDNN Version: 7.6.
2020-08-27 13:47:47,351-INFO: Downloading ResNet50_vd_ssld_pretrained.tar from https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_ssld_pretrained.tar
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 92837/92837 [00:23<00:00, 3963.24KB/s]
2020-08-27 13:48:15,006-INFO: Decompressing /home/luke/.cache/paddle/weights/ResNet50_vd_ssld_pretrained.tar...
2020-08-27 13:48:15,629-WARNING: /home/luke/.cache/paddle/weights/ResNet50_vd_ssld_pretrained.pdparams not found, try to load model file saved with [ save_params, save_persistables, save_vars ]
/home/luke/.pyenv/versions/PP/lib/python3.7/site-packages/paddle/fluid/io.py:1998: UserWarning: This list is not set, Because of Paramerter not found in program. There are: fc_0.b_0 fc_0.w_0
  format(" ".join(unused_para_list)))
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
2020-08-27 13:48:17,914-INFO: places would be ommited when DataLoader is not iterable
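
Since the log stops right after the second annotation index is created, the annotation files can be checked independently of the config. The following is a minimal sketch, assuming the standard COCO layout (`images`, `categories` and `annotations`, with `bbox` as [x, y, width, height]). `check_coco` is a hypothetical helper, not part of PaddleDetection or pycocotools.

```python
import json
import sys


def check_coco(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    images = {img["id"]: img for img in coco.get("images", [])}
    cat_ids = {c["id"] for c in coco.get("categories", [])}
    annotated = set()
    for ann in coco.get("annotations", []):
        img = images.get(ann["image_id"])
        if img is None:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
            continue
        annotated.add(ann["image_id"])
        if ann["category_id"] not in cat_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        x, y, w, h = ann["bbox"]
        if w <= 0 or h <= 0:
            problems.append(f"annotation {ann['id']}: non-positive box size {w} x {h}")
        elif x < 0 or y < 0 or x + w > img["width"] or y + h > img["height"]:
            problems.append(f"annotation {ann['id']}: box extends outside the image")
    # Images without any box are not necessarily wrong, but worth knowing about.
    for image_id in images:
        if image_id not in annotated:
            problems.append(f"image {image_id}: no annotations")
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        for line in check_coco(json.load(f)):
            print(line)
```

Saved under any file name and run with an annotation path such as dataset/custom_OD/train_annotations.json as its only argument, it prints one line per finding. An empty output only rules out these particular issues.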
