README.md 3.1 KB
Newer Older
Y
Yibing Liu 已提交
1 2 3 4
# __Deep Attention Matching Network__

This is the source code of Deep Attention Matching network (DAM), that is proposed for multi-turn response selection in the retrieval-based chatbot.

5
DAM is a neural matching network that entirely based on attention mechanism. The motivation of DAM is to capture those semantic dependencies, among dialogue elements at different level of granularities, in multi-turn conversation as matching evidences, in order to better match response candidate with its multi-turn context. DAM appears on ACL-2018, please find our paper at [http://aclweb.org/anthology/P18-1103](http://aclweb.org/anthology/P18-1103).
Y
Yibing Liu 已提交
6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30


## __Network__

DAM is inspired by Transformer in Machine Translation (Vaswani et al., 2017), and we extend the key attention mechanism of Transformer in two perspectives and introduce those two kinds of attention in one uniform neural network.

- **self-attention** To gradually capture semantic representations in different granularities by stacking attention from word-level embeddings. Those multi-grained semantic representations would facilitate exploring segmental dependencies between context and response.

- **cross-attention** Attention across context and response can generally capture the relevance in dependency between segment pairs, which could provide complementary information to textual relevance for matching response with multi-turn context.

<p align="center">
<img src="images/Figure1.png"/> <br />
Overview of Deep Attention Matching Network
</p>

## __Results__

We test DAM on two large-scale multi-turn response selection tasks, i.e., the Ubuntu Corpus v1 and Douban Conversation Corpus, experimental results are bellow:

<p align="center">
<img src="images/Figure2.png"/> <br />
</p>

## __Usage__

31 32 33 34 35 36 37 38 39
Take the experiment on the Ubuntu Corpus v1 for Example.

1) Go to the `ubuntu` directory

```
cd ubuntu
```
2) Download the well-preprocessed data for training  

Y
Yibing Liu 已提交
40
```
41
sh download_data.sh
Y
Yibing Liu 已提交
42
```
43
3) Execute the model training and evaluation by
Y
Yibing Liu 已提交
44 45

```
46
sh train.sh
Y
Yibing Liu 已提交
47
```
48
for more detailed explanation about the arguments, please run
Y
Yibing Liu 已提交
49 50

```
51
python ../train_and_evaluate.py --help
Y
Yibing Liu 已提交
52 53
```

Y
Yibing Liu 已提交
54 55 56 57 58 59
By default, the training is executed on one single GPU, which can be switched to multiple-GPU mode easily by simply resetting the visible devices in `train.sh`, e.g.,

```
export CUDA_VISIBLE_DEVICES=0,1,2,3
```

60 61 62 63 64 65 66 67 68
4) Run test by

```
sh test.sh
```
and run the test for different saved models by using different argument `--model_path`.

Similary, one can carry out the experiment on the Douban Conversation Corpus by going to the directory `douban` and following the same procedure.

Y
Yibing Liu 已提交
69 70 71
## __Dependencies__

- Python >= 2.7.3
72
- PaddlePaddle latest develop branch
Y
Yibing Liu 已提交
73 74 75 76 77 78 79 80 81 82 83 84 85 86 87

## __Citation__

The following article describe the DAM in detail. We recommend citing this article as default.

```
@inproceedings{ ,
  title={Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network},
  author={Xiangyang Zhou, Lu Li, Daxiang Dong, Yi Liu, Ying Chen, Wayne Xin Zhao, Dianhai Yu and Hua Wu},
  booktitle={Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  volume={1},
  pages={  --  },
  year={2018}
}
```