| Task Type | Dataset | Pre-trained Models | Start Command | V100 GPU Cards | Running Time |
| :-------- | :------ | :----------------- | :------------ | :------------: | :----------: |
| Text Understanding | SST-2 | UNIMO base | `sh ./script/classification/SST-2/run.sh` | 8 | 9h |
| | SST-2 | UNIMO large | `sh ./script/classification/SST-2_large/run.sh` | 8 | 14h |
| | CoLA | UNIMO base | `sh ./script/classification/CoLA/run.sh` | 4 | 2h |
| | CoLA | UNIMO large | `sh ./script/classification/CoLA_large/run.sh` | 4 | 4h |
| | MNLI-AX | UNIMO base | `sh ./script/classification/MNLI-AX/run.sh` | 8 | 1d20h |
| | MNLI-AX | UNIMO large | `sh ./script/classification/MNLI-AX_large/run.sh` | 8 | 2d13h |
| | STS-B | UNIMO-mnli base | `sh ./script/regression/STS-B/run.sh` | 8 | 2h |
| | STS-B | UNIMO-mnli large | `sh ./script/regression/STS-B_large/run.sh` | 8 | 4h |
| Text Generation | CNN/DailyMail | UNIMO base | `sh ./script/seq2seq/cnndm/run.sh` | 4 | 1d8h |
| | CNN/DailyMail | UNIMO large | `sh ./script/seq2seq/cnndm_large/run.sh` | 4 | 3d18h |
| | Gigaword | UNIMO base | `sh ./script/seq2seq/gigaword/run.sh` | 4 | 1d3h |
| | Gigaword | UNIMO large | `sh ./script/seq2seq/gigaword_large/run.sh` | 4 | 2d3h |
| | CoQA | UNIMO base | `sh ./script/seq2seq/coqa/run.sh` | 4 | 7h |
| | CoQA | UNIMO large | `sh ./script/seq2seq/coqa_large/run.sh` | 4 | 22h |
| | Squad_QG | UNIMO base | `sh ./script/seq2seq/squad_qg/run.sh` | 4 | 4h |
| | Squad_QG | UNIMO large | `sh ./script/seq2seq/squad_qg_large/run.sh` | 4 | 8h |
| Multi-Modal Understanding | Flickr30k | UNIMO base | `sh ./script/retrieval/Flickr30k/run.sh` | 16 | 3d |
| | Flickr30k | UNIMO large | `sh ./script/retrieval/Flickr30k_large/run.sh` | 16 | 3d |
| | SNLI-VE | UNIMO base | `sh ./script/visual_entailment/SNLI-VE/run.sh` | 16 | 16h |
| | SNLI-VE | UNIMO large | `sh ./script/visual_entailment/SNLI-VE_large/run.sh` | 16 | 2d |
| | VQA | UNIMO base | - | - | - |
| | VQA | UNIMO large | - | - | - |
| Multi-Modal Generation | COCO Caption | UNIMO base | `sh ./script/img2txt/coco/run.sh` | 16 | 3d |
| | COCO Caption | UNIMO large | `sh ./script/img2txt/coco_large/run.sh` | 16 | 4d |
---
## Text Understanding Tasks
### (1) Sentiment Classification
#### Download SST-2 dataset:
```
cd /path/to/data
wget --no-check-certificate -q https://unimo.bj.bcebos.com/data/SST-2.tar.gz
tar -zxf SST-2.tar.gz
```
#### Run the following command to train and evaluate on the SST-2 dataset:
For base model:
```
bash ./script/classification/SST-2/run.sh
```
For large model:
```
bash ./script/classification/SST-2_large/run.sh
```
#### Evaluation Results:
| Model | Acc |
| :---- | :-: |
| UNIMO-base | 95.1 |
| UNIMO-large | 96.8 |
### (2) Natural Language Inference
#### Download MNLI-AX dataset:
```
cd /path/to/data
wget --no-check-certificate -q https://unimo.bj.bcebos.com/data/MNLI-AX.tar.gz
tar -zxf MNLI-AX.tar.gz
```
#### Run the following command to train and evaluate on the MNLI-AX dataset:
For base model:
```
bash ./script/classification/MNLI-AX/run.sh
```
For large model:
```
bash ./script/classification/MNLI-AX_large/run.sh
```
#### Evaluation Results:
| Model | Acc (m/mm) |
| :---- | :--------: |
| UNIMO-base | 86.8/86.7 |
| UNIMO-large | 89.8/89.5 |
### (3) Similarity Tasks
#### Download STS-B dataset:
```
cd /path/to/data
wget --no-check-certificate -q https://unimo.bj.bcebos.com/data/STS-B.tar.gz
tar -zxf STS-B.tar.gz
```
#### Run the following command to train and evaluate on the STS-B dataset:
For base model:
```
bash ./script/regression/STS-B/run.sh
```
For large model:
```
bash ./script/regression/STS-B_large/run.sh
```
#### Evaluation Results:
| Model | Pearson correlation |
| :---- | :-----------------: |
| UNIMO-base | 91.0 |
| UNIMO-large | 92.6 |
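
STS-B is scored with the Pearson correlation between the predicted similarity scores and the gold annotations. The `run.sh` script handles evaluation itself; purely as an illustration of the metric, a minimal sketch with SciPy (the score lists below are toy values, not dataset outputs) looks like this:

```
# Hedged sketch: Pearson correlation as reported for STS-B (scaled by 100).
# Toy values only; the bundled run.sh performs the official evaluation.
from scipy.stats import pearsonr

predictions = [4.6, 2.1, 0.3]   # model-predicted similarity scores
gold_scores = [5.0, 2.4, 0.0]   # human-annotated scores

r, _ = pearsonr(predictions, gold_scores)
print(f"Pearson correlation: {r * 100:.1f}")
```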
### (4) Linguistic Acceptability Judgments
#### Download CoLA dataset:
```
cd /path/to/data
wget --no-check-certificate -q https://unimo.bj.bcebos.com/data/CoLA.tar.gz
tar -zxf CoLA.tar.gz
```
#### Run the following command to train and evaluate on the CoLA dataset:
For base model:
```
bash ./script/classification/CoLA/run.sh
```
For large model:
```
bash ./script/classification/CoLA_large/run.sh
```
#### Evaluation Results:
| Model | Matthews correlation |
| :---- | :------------------: |
| UNIMO-base | 65.4 |
| UNIMO-large | 68.5 |
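
CoLA is evaluated with the Matthews correlation coefficient over binary acceptability labels. As a rough illustration only (using scikit-learn rather than the bundled evaluation, with toy labels):

```
# Hedged sketch: Matthews correlation coefficient for CoLA-style binary labels.
# Toy labels only; run.sh computes the official score.
from sklearn.metrics import matthews_corrcoef

gold = [1, 1, 0, 1, 0, 0]   # gold acceptability labels
pred = [1, 0, 0, 1, 0, 1]   # model predictions

mcc = matthews_corrcoef(gold, pred)
print(f"Matthews correlation: {mcc * 100:.1f}")
```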
## Text Generation Tasks
### (1) Document Summarization
#### Download CNN/DailyMail dataset:
```
cd /path/to/data
wget --no-check-certificate -q https://unimo.bj.bcebos.com/data/cnndm.tar.gz
tar -zxf cnndm.tar.gz
```
#### Download evaluation script:
```
cd src/eval/tasks
wget --no-check-certificate -q https://unimo.bj.bcebos.com/eval_script/cnndm.tar.gz
tar -zxf cnndm.tar.gz
```
#### Run the following command to train and evaluate on the CNN/DailyMail dataset:
For base model:
```
bash ./script/seq2seq/cnndm/run.sh
```
For large model:
```
bash ./script/seq2seq/cnndm_large/run.sh
```
#### Evaluation Results:
| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
| :---- | :-----: | :-----: | :-----: |
| UNIMO-base | 42.42 | 20.12 | 39.61 |
| UNIMO-large | 43.51 | 20.65 | 40.63 |
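
The ROUGE scores above come from the bundled evaluation script downloaded into `src/eval/tasks/cnndm`. If you only want to sanity-check generated summaries outside that pipeline, a rough (not equivalent) check can be done with the third-party `rouge-score` package; this is an assumption for illustration, not what the repository uses:

```
# Hedged sketch using the third-party rouge-score package (pip install rouge-score).
# The bundled script under src/eval/tasks/cnndm remains the authoritative evaluation.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
reference = "the cat sat on the mat"          # gold summary (toy example)
summary = "a cat was sitting on the mat"      # generated summary (toy example)

scores = scorer.score(reference, summary)     # dict of Score(precision, recall, fmeasure)
print({name: round(s.fmeasure * 100, 2) for name, s in scores.items()})
```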
### (2) Sentence Compression
#### Download Gigaword dataset:
```
cd /path/to/data
wget --no-check-certificate -q https://unimo.bj.bcebos.com/data/gigaword.tar.gz
tar -zxf gigaword.tar.gz
```
#### Download evaluation script:
```
cd src/eval/tasks
wget --no-check-certificate -q https://unimo.bj.bcebos.com/eval_script/gigaword.tar.gz
tar -zxf gigaword.tar.gz
```
#### Run the following command to train and evaluate on the Gigaword dataset:
For base model:
```
bash ./script/seq2seq/gigaword/run.sh
```
For large model:
```
bash ./script/seq2seq/gigaword_large/run.sh
```
#### Evaluation Results:
| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
| :---- | :-----: | :-----: | :-----: |
| UNIMO-base | 38.80 | 19.99 | 36.27 |
| UNIMO-large | 39.71 | 20.37 | 36.88 |
### (3) Question Generation
#### Download SQuAD QG dataset:
```
cd /path/to/data
wget --no-check-certificate -q https://unimo.bj.bcebos.com/data/squad_qg.tar.gz
tar -zxf squad_qg.tar.gz
```
#### Download evaluation script:
```
cd src/eval/tasks
wget --no-check-certificate -q https://unimo.bj.bcebos.com/eval_script/squad_qg.tar.gz
tar -zxf squad_qg.tar.gz
```
#### Run the following command to train and evaluate on the SQuAD QG dataset:
For base model:
```
bash ./script/seq2seq/squad_qg/run.sh
```
For large model:
```
bash ./script/seq2seq/squad_qg_large/run.sh
```
#### Evaluation Results:
| Model | BLEU-4 | METEOR | ROUGE-L |
| :---- | :----: | :----: | :-----: |
| UNIMO-base | 22.78 | 25.24 | 51.34 |
| UNIMO-large | 24.59 | 26.39 | 52.47 |
### (4) Conversational Question Answering
#### Download CoQA dataset:
```
cd /path/to/data
wget --no-check-certificate -q https://unimo.bj.bcebos.com/data/coqa.tar.gz
tar -zxf coqa.tar.gz
```
#### Download evaluation script:
```
cd src/eval/tasks
wget --no-check-certificate -q https://unimo.bj.bcebos.com/eval_script/coqa.tar.gz
tar -zxf coqa.tar.gz
```
#### Run the following command to train and evaluate on the CoQA dataset:
For base model:
```
bash ./script/seq2seq/coqa/run.sh
```
For large model:
```
bash ./script/seq2seq/coqa_large/run.sh
```
#### Evaluation Results:
| Model | Acc |
| :---- | :-: |
| UNIMO-base | 80.2 |
| UNIMO-large | 84.9 |
## Multi-Modal Understanding Tasks
### (1) Image-Text Retrieval
#### Download Flickr30k dataset:
##### Note: Visual features are extracted by [bottom-up-attention](https://github.com/peteanderson80/bottom-up-attention)
```
cd /path/to/data
wget --no-check-certificate -q https://unimo.bj.bcebos.com/data/Flickr30k.tar.gz # occupies about 37G disk space
tar -zxf Flickr30k.tar.gz
```
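
As noted above, the visual features in this tarball were extracted with the upstream bottom-up-attention detector; the release is already preprocessed for `run.sh`, so no decoding step is required here. Only if you ever regenerate features yourself with the upstream release: its TSVs typically carry base64-encoded region boxes and 2048-d features, and a hedged sketch of reading that *upstream* layout (field names and the file name below follow the bottom-up-attention repository, not necessarily this tarball) is:

```
# Hedged sketch: decoding the upstream bottom-up-attention TSV layout
# (image_id, image_w, image_h, num_boxes, boxes, features; arrays are base64-encoded float32).
# Illustrative only; Flickr30k.tar.gz is already preprocessed for the training scripts.
import base64
import csv
import sys

import numpy as np

csv.field_size_limit(sys.maxsize)
FIELDS = ["image_id", "image_w", "image_h", "num_boxes", "boxes", "features"]

with open("karpathy_test_resnet101_faster_rcnn_genome.tsv") as f:  # hypothetical file name
    for item in csv.DictReader(f, delimiter="\t", fieldnames=FIELDS):
        num_boxes = int(item["num_boxes"])
        boxes = np.frombuffer(base64.b64decode(item["boxes"]), dtype=np.float32).reshape(num_boxes, 4)
        feats = np.frombuffer(base64.b64decode(item["features"]), dtype=np.float32).reshape(num_boxes, -1)
        print(item["image_id"], boxes.shape, feats.shape)  # e.g. (36, 4) and (36, 2048)
        break
```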
#### Run the following command to train and evaluate on the Flickr30k dataset:
For base model:
```
bash ./script/retrieval/Flickr30k/run.sh
```
For large model:
```
bash ./script/retrieval/Flickr30k_large/run.sh
```
#### Evaluation Results:
Results of the Image Retrieval task on the Flickr30k dataset:

| Model | R@1 | R@5 | R@10 |
| :---- | :---: | :---: | :---: |
| UNIMO-base | 74.66 | 93.40 | 96.08 |
| UNIMO-large | 78.04 | 94.24 | 97.12 |

Results of the Text Retrieval task on the Flickr30k dataset:

| Model | R@1 | R@5 | R@10 |
| :---- | :---: | :---: | :---: |
| UNIMO-base | 89.70 | 98.40 | 99.10 |
| UNIMO-large | 89.40 | 98.90 | 99.80 |
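
R@K above is the fraction of queries whose ground-truth match appears among the top-K retrieved candidates. A small NumPy sketch of the metric over a hypothetical similarity matrix, independent of the repository's own evaluation code and assuming a one-to-one query/candidate pairing (Flickr30k actually has multiple captions per image):

```
# Hedged sketch: Recall@K over a similarity matrix where sims[i, j] is the score
# between query i and candidate j, and candidate i is query i's ground-truth match.
# Toy numbers only; run.sh computes the official figures.
import numpy as np

def recall_at_k(sims: np.ndarray, k: int) -> float:
    ranks = np.argsort(-sims, axis=1)          # candidates sorted by descending score
    gold = np.arange(sims.shape[0])[:, None]   # ground-truth index per query
    hits = (ranks[:, :k] == gold).any(axis=1)  # is the gold item within the top-k?
    return float(hits.mean())

sims = np.random.default_rng(0).random((5, 5))
print([recall_at_k(sims, k) for k in (1, 5, 10)])
```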
### (2) Visual Entailment
#### Download SNLI-VE dataset:
##### Note: Visual features are extracted by [bottom-up-attention](https://github.com/peteanderson80/bottom-up-attention)
```
cd /path/to/data
wget --no-check-certificate -q https://unimo.bj.bcebos.com/data/SNLI-VE.tar.gz
tar -zxf SNLI-VE.tar.gz
```
#### Run the following command to train and evaluate on the SNLI-VE dataset:
For base model:
```
bash ./script/visual_entailment/SNLI-VE/run.sh
```
For large model:
```
bash ./script/visual_entailment/SNLI-VE_large/run.sh
```
#### Evaluation Results:
Results of the Visual Entailment task on the SNLI-VE dataset:

| Model | dev | test |
| :---- | :-: | :--: |
| UNIMO-base | 80.00 | 79.10 |
| UNIMO-large | 81.11 | 80.63 |
## Multi-Modal Generation Tasks
### (1) Image Caption Generation
#### Download COCO Caption dataset:
##### Note: Visual features are extracted by [bottom-up-attention](https://github.com/peteanderson80/bottom-up-attention)
```
cd /path/to/data
wget --no-check-certificate -q https://unimo.bj.bcebos.com/data/coco.tar.gz
tar -zxf coco.tar.gz
```
#### Download evaluation script:
```
cd src/eval/tasks
wget --no-check-certificate -q https://unimo.bj.bcebos.com/eval_script/coco.tar.gz
tar -zxf coco.tar.gz
```
#### Run the following command to train and evaluate on the COCO Caption dataset:
For base model:
```
bash ./script/img2txt/coco/run.sh
```
For large model:
```
bash ./script/img2txt/coco_large/run.sh
```
#### Evaluation Results:
| Model | BLEU-4 | CIDEr |
| :---- | :----: | :---: |
| UNIMO-base | 38.8 | 124.4 |
| UNIMO-large | 39.6 | 127.7 |
---
Citation
---
If you find our paper and code useful, please cite the following paper:
```
@article{li2020unimo,
  title={UNIMO: Towards Unified-Modal Understanding and Generation via Cross-Modal Contrastive Learning},
  author={Li, Wei and Gao, Can and Niu, Guocheng and Xiao, Xinyan and Liu, Hao and Liu, Jiachen and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2012.15409},
  year={2020}
}
```
Contact information
---
For help or issues using `UNIMO`, please submit a GitHub issue.
For personal communication related to `UNIMO`, please contact Wei Li (liwei85@baidu.com), Guocheng Niu (niuguocheng@baidu.com), or Can Gao (gaocan01@baidu.com).