fix vehicle yaml bugs and add vehicle docs

3ff2f9a6 · dongshuilong · 38f0f5f4
6 changed files
--- a/docs/images/vehicle/CompCars.png
+++ b/docs/images/vehicle/CompCars.png
--- a/docs/images/vehicle/cars.JPG
+++ b/docs/images/vehicle/cars.JPG
--- a/docs/zh_CN/application/vehicle_fine_grained_classfication.md
+++ b/docs/zh_CN/application/vehicle_fine_grained_classfication.md
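+# Vehicle Fine-grained Classification
+
+Fine-grained classification subdivides images that belong to one basic category into subcategories, for example distinguishing among species of birds, kinds of flowers, or types of ore. As the name suggests, vehicle fine-grained classification classifies vehicles into their subcategories.
+
+Compared with vehicle ReID, its training process differs in the following ways:
+
+- The dataset
+- The loss settings
+
+For everything else, see [Vehicle ReID](./vehicle_reid.md).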
+
+## Dataset
+
+This demo uses [CompCars](http://mmlab.ie.cuhk.edu.hk/datasets/comp_cars/index.html) as the training dataset.
+
+<img src="../../images/vehicle/CompCars.png" style="zoom:50%;" />
+
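+The images come mainly from the web and from surveillance data. The web data covers 163 car makes and 1,716 car models, with **136,726** images of entire cars and **27,618** images of car parts. The web data includes bounding boxes, viewpoints, and 5 attributes (maximum speed, displacement, number of doors, number of seats, and car type). The surveillance data contains **50,000** front-view images.
+
+Note that labels for this dataset have to be generated according to your own needs. In this demo, vehicles of the same model produced in different years are treated as one class, which gives 431 classes in total.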
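The label merge described above can be sketched as follows. This is a minimal illustration, not the script used for this demo: it assumes image paths of the form `make_id/model_id/year/file.jpg`, and the function name `build_model_labels` is hypothetical.

```python
def build_model_labels(image_paths):
    """Assign one class index per (make, model), ignoring the production year.

    Assumes each path looks like "make_id/model_id/year/file.jpg".
    """
    class_ids = {}  # (make_id, model_id) -> class index
    labels = {}     # image path -> class index
    for path in sorted(image_paths):
        make_id, model_id, _year, _name = path.split("/")
        key = (make_id, model_id)
        # The first time a (make, model) pair is seen it gets the next free index.
        class_ids.setdefault(key, len(class_ids))
        labels[path] = class_ids[key]
    return labels, class_ids
```

With this mapping, two images of the same model from different years share a label, so the number of classes equals the number of distinct (make, model) pairs in the label file.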
+
+## Loss settings
+
+Unlike vehicle ReID, this classification task uses [Triplet Loss](../../../ppcls/loss/triplet.py) + [ArcLoss](../../../ppcls/arch/gears/arcmargin.py), weighted 1:1.
+
+Full configuration file: [ResNet50.yaml](../../../ppcls/configs/Vehicle/ResNet50.yaml)
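To make the 1:1 weighting concrete, here is a minimal sketch of a triplet loss on a single (anchor, positive, negative) triple and of the weighted sum. It is not the code in `ppcls/loss/triplet.py`, which may mine triplets within a batch; the margin value and both function names are assumptions for illustration.

```python
import math

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge on the gap between anchor-positive and anchor-negative distances."""
    d_ap = math.dist(anchor, positive)  # Euclidean distance to the same-class sample
    d_an = math.dist(anchor, negative)  # Euclidean distance to the other-class sample
    return max(d_ap - d_an + margin, 0.0)

def combined_loss(triplet, arc, w_triplet=1.0, w_arc=1.0):
    """Weighted sum of the two loss terms; 1.0 and 1.0 give the 1:1 ratio."""
    return w_triplet * triplet + w_arc * arc
```

The triplet term is zero once the negative is farther from the anchor than the positive by at least the margin, so only violating triples contribute to the sum.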
--- a/docs/zh_CN/application/vehicle_reid.md
+++ b/docs/zh_CN/application/vehicle_reid.md
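+# Vehicle ReID
+
+ReID, short for re-identification, refers to techniques that use algorithms to find a queried target in an image gallery, so it is a sub-problem of image retrieval. Vehicle ReID takes a vehicle image and finds other images of the same vehicle, whether they were captured by the same camera or by different cameras. Extracting robust features is especially important in this process. This document therefore covers how the feature extraction network for vehicle ReID is trained, in the following parts:
+
+- Dataset and preprocessing
+- Backbone settings
+- Loss function settings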
+
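+## Dataset and preprocessing
+
+### VERI-Wild dataset
+
+<img src="../../images/vehicle/cars.JPG" style="zoom:50%;" />
+
+This dataset was captured by a large CCTV surveillance system in unconstrained scenarios over one month (30*24 hours). The system consists of 174 cameras distributed over a large area of more than 200 square kilometers. The raw set contains 12 million vehicle images; after data cleaning and annotation, 416,314 images of 40,671 distinct vehicles were collected. [See the paper for details](https://github.com/PKU-IMRE/VERI-Wild)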
+
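+### Data preprocessing
+
+Because the images in the original dataset are already vehicle crops produced by a detector, no crop operation like the one used when training on `ImageNet` is needed. The data augmentation steps, in order, are:
+
+- `Resize` the image to 224
+- Random horizontal flip
+- [AugMix](https://arxiv.org/abs/1912.02781v1)
+- Normalize: scale to the range 0 to 1
+- [RandomErasing](https://arxiv.org/pdf/1708.04896v2.pdf)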
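The last step in the list can be sketched in plain Python. This is a simplified single-channel illustration of the RandomErasing idea, not the operator used by the config: the area range `sl`..`sh`, the aspect-ratio bound `r1`, and both function names are assumptions, and the image is assumed to be already scaled to 0 to 1.

```python
import math
import random

def sample_erase_box(h, w, rng, sl=0.02, sh=0.4, r1=0.3, max_tries=10):
    """Sample (top, left, box_h, box_w) inside an h x w image, or None if no box fits."""
    for _ in range(max_tries):
        area = rng.uniform(sl, sh) * h * w   # target area of the erased box
        ratio = rng.uniform(r1, 1.0 / r1)    # target height / width ratio
        box_h = int(round(math.sqrt(area * ratio)))
        box_w = int(round(math.sqrt(area / ratio)))
        if 0 < box_h < h and 0 < box_w < w:
            top = rng.randint(0, h - box_h)
            left = rng.randint(0, w - box_w)
            return top, left, box_h, box_w
    return None

def random_erase(img, rng, p=0.5):
    """Return a copy of img (rows of floats in [0, 1]) with one box replaced by noise."""
    out = [row[:] for row in img]
    if rng.random() >= p:
        return out  # skip erasing with probability 1 - p
    box = sample_erase_box(len(img), len(img[0]), rng)
    if box is None:
        return out
    top, left, box_h, box_w = box
    for y in range(top, top + box_h):
        for x in range(left, left + box_w):
            out[y][x] = rng.random()
    return out
```

Erasing runs after normalization in the list above, which is why the noise here is drawn from the same 0 to 1 range as the pixel values.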
+
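+## Backbone settings
+
+`ResNet50` is used as the backbone, with the following modifications:
+
+- The last stage (stage 4) does no downsampling, so its feature map has the same size as that of stage 3: both are 14x14.
+- An embedding layer, implemented as a 1x1 convolution, is added at the end, with a feature dimension of 512.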
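The 14x14 figure follows from the stride arithmetic of a standard ResNet50 on a 224x224 input. The sketch below only tracks spatial sizes and assumes the usual stem (stride-2 convolution followed by a stride-2 max pool); the function name is hypothetical.

```python
def resnet_stage_sizes(input_size=224, last_stage_stride=2):
    """Return the feature-map side length after each of the four residual stages."""
    size = input_size // 2  # stem convolution, stride 2
    size = size // 2        # max pool, stride 2
    sizes = []
    for stride in (1, 2, 2, last_stage_stride):
        size = size // stride
        sizes.append(size)
    return sizes

standard = resnet_stage_sizes()                     # stage 4 downsamples
modified = resnet_stage_sizes(last_stage_stride=1)  # stage 4 keeps stage 3's size
```

With the default stride, stage 4 ends at 7x7; setting the last stage's stride to 1 keeps it at 14x14, which leaves four times as many spatial positions for the embedding layer to work on.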
+
+Specific code
+
+## Loss settings
+
+Vehicle ReID uses [SupConLoss](https://arxiv.org/abs/2004.11362) + [ArcLoss](https://arxiv.org/abs/1801.07698), weighted 1:1.
+
+For the code, see [SupConLoss code](../../../ppcls/loss/supconloss.py) and [ArcLoss code](../../../ppcls/arch/gears/arcmargin.py).
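As a rough illustration of the additive angular margin behind ArcLoss, the sketch below adds a margin to the target-class angle before scaling, then applies softmax cross-entropy. It is a simplified stand-alone version, not the code in `arcmargin.py`: the margin and scale values are illustrative, and the handling of angles that would exceed pi is omitted.

```python
import math

def arc_margin_logits(cosines, target, margin=0.5, scale=30.0):
    """Scale cosine similarities, adding an angular margin to the target class only."""
    logits = []
    for j, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # guard acos against rounding error
        if j == target:
            c = math.cos(math.acos(c) + margin)
        logits.append(scale * c)
    return logits

def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]
```

Because the margin lowers only the target logit, the loss for the same embedding is higher than with plain scaled cosines, which pushes features of one identity closer together in angle.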
+
+
+
+All hyperparameters and the full configuration: [ResNet50_ReID.yaml](../../../ppcls/configs/Vehicle/ResNet50_ReID.yaml).
--- a/ppcls/configs/Vehicle/ResNet50.yaml
+++ b/ppcls/configs/Vehicle/ResNet50.yaml
@@ -18,6 +18,8 @@ Global:
 # model architecture
 Arch:
  name: "RecModel"
+  infer_output_key: "features"
+  infer_add_softmax: False
  Backbone: 
    name: "ResNet50_last_stage_stride1"
    pretrained: True
@@ -66,10 +68,10 @@ DataLoader:
  Train:
    dataset:
        name: "CompCars"
-        image_root: "/work/dataset/CompCars/image/"
-        label_root: "/work/dataset/CompCars/label/"
+        image_root: "./dataset/CompCars/image/"
+        label_root: "./dataset/CompCars/label/"
        bbox_crop: True
-        cls_label_path: "/work/dataset/CompCars/train_test_split/classification/train_label.txt"
+        cls_label_path: "./dataset/CompCars/train_test_split/classification/train_label.txt"
        transform_ops:
          - ResizeImage:
              size: 224
@@ -103,9 +105,9 @@ DataLoader:
    # TOTO: modify to the latest trainer
    dataset: 
        name: "CompCars"
-        image_root: "/work/dataset/CompCars/image/"
-        label_root: "/work/dataset/CompCars/label/"
-        cls_label_path: "/work/dataset/CompCars/train_test_split/classification/test_label.txt"
+        image_root: "./dataset/CompCars/image/"
+        label_root: "./dataset/CompCars/label/"
+        cls_label_path: "./dataset/CompCars/train_test_split/classification/test_label.txt"
        bbox_crop: True
        transform_ops:
          - ResizeImage:

--- a/ppcls/configs/Vehicle/ResNet50_ReID.yaml
+++ b/ppcls/configs/Vehicle/ResNet50_ReID.yaml
@@ -19,6 +19,8 @@ Global:
 # model architecture
 Arch:
  name: "RecModel"
+  infer_output_key: "features"
+  infer_add_softmax: False
  Backbone: 
    name: "ResNet50_last_stage_stride1"
    pretrained: True
@@ -66,8 +68,8 @@ DataLoader:
  Train:
    dataset:
        name: "VeriWild"
-        image_root: "/work/dataset/VeRI-Wild/images/"
-        cls_label_path: "/work/dataset/VeRI-Wild/train_test_split/train_list_start0.txt"
+        image_root: "./dataset/VeRI-Wild/images/"
+        cls_label_path: "./dataset/VeRI-Wild/train_test_split/train_list_start0.txt"
        transform_ops:
          - ResizeImage:
              size: 224
@@ -101,8 +103,8 @@ DataLoader:
    # TOTO: modify to the latest trainer
      dataset: 
        name: "VeriWild"
-        image_root: "/work/dataset/VeRI-Wild/images"
-        cls_label_path: "/work/dataset/VeRI-Wild/train_test_split/test_3000_id_query.txt"
+        image_root: "./dataset/VeRI-Wild/images"
+        cls_label_path: "./dataset/VeRI-Wild/train_test_split/test_3000_id_query.txt"
        transform_ops:
          - ResizeImage:
              size: 224
@@ -124,8 +126,8 @@ DataLoader:
    # TOTO: modify to the latest trainer
      dataset: 
        name: "VeriWild"
-        image_root: "/work/dataset/VeRI-Wild/images"
-        cls_label_path: "/work/dataset/VeRI-Wild/train_test_split/test_3000_id.txt"
+        image_root: "./dataset/VeRI-Wild/images"
+        cls_label_path: "./dataset/VeRI-Wild/train_test_split/test_3000_id.txt"
        transform_ops:
          - ResizeImage:
              size: 224
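The switch from `/work/dataset/...` to `./dataset/...` in both configs makes the dataset location depend on the directory the training process is started from. A small sketch of that resolution, assuming the paths are opened as-is relative to the working directory; `resolve_dataset_path` and the example directory are hypothetical.

```python
import posixpath

def resolve_dataset_path(path, run_dir):
    """Resolve a config path as a process started in run_dir would see it."""
    # An absolute path ignores run_dir; a relative one is joined onto it.
    return posixpath.normpath(posixpath.join(run_dir, path))

new_style = resolve_dataset_path("./dataset/CompCars/image/", "/home/user/PaddleClas")
old_style = resolve_dataset_path("/work/dataset/CompCars/image/", "/home/user/PaddleClas")
```

Under this assumption, the new paths work on any machine as long as the `dataset` directory (or a symlink to it) sits in the directory the command is run from.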