# Convolutional Sequence to Sequence Learning
This model implements the architecture proposed in the following paper:

Jonas Gehring, Michael Auli, David Grangier, et al. Convolutional Sequence to Sequence Learning. In Proceedings of the 34th International Conference on Machine Learning (ICML), 2017.

# Training a Model
- Adjust the arguments in the following script as needed, then run:

	```bash
	python train.py \
	  --train_data_path ./data/train_data \
	  --test_data_path ./data/test_data \
	  --src_dict_path ./data/src_dict \
	  --trg_dict_path ./data/trg_dict \
	  --enc_blocks "[(256, 3)] * 5" \
	  --dec_blocks "[(256, 3)] * 3" \
	  --emb_size 256 \
	  --pos_size 200 \
	  --drop_rate 0.1 \
	  --use_gpu False \
	  --trainer_count 1 \
	  --batch_size 32 \
	  --num_passes 20 \
	  >train.log 2>&1
	```
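The `--enc_blocks` and `--dec_blocks` arguments are Python-style expressions describing the stacked convolution blocks; each tuple appears to give a block's hidden size and kernel width, and `* 5` repeats the block five times. A minimal sketch of how such a spec string can be expanded, assuming the training script evaluates it as a Python expression (the helper name `parse_blocks` is hypothetical):

```python
def parse_blocks(spec):
    # Expand a block spec such as "[(256, 3)] * 5" into a list of
    # (hidden_size, kernel_width) tuples. Plain eval is used because
    # the spec contains the list-repetition operator "*", which
    # ast.literal_eval does not accept. Only pass trusted CLI input.
    blocks = eval(spec)
    assert all(len(b) == 2 for b in blocks), "each block is (size, width)"
    return blocks

print(parse_blocks("[(256, 3)] * 5"))  # five identical (256, 3) blocks
```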

# Inference with a Trained Model
- Run inference with a trained model as follows:

	```bash
	python infer.py \
	  --infer_data_path ./data/infer_data \
	  --src_dict_path ./data/src_dict \
	  --trg_dict_path ./data/trg_dict \
	  --enc_blocks "[(256, 3)] * 5" \
	  --dec_blocks "[(256, 3)] * 3" \
	  --emb_size 256 \
	  --pos_size 200 \
	  --drop_rate 0.1 \
	  --use_gpu False \
	  --trainer_count 1 \
	  --max_len 100 \
	  --beam_size 1 \
	  --model_path ./params.pass-0.tar.gz \
	  1>infer_result 2>infer.log
	```
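`--beam_size` controls how many candidate prefixes the search keeps at each decoding step; with `--beam_size 1` the search degenerates to greedy decoding. A minimal illustration of the pruning step only (the function name and the example scores are hypothetical, not taken from `infer.py`):

```python
import math

def prune_beam(candidates, beam_size):
    # candidates: list of (log_prob, token_sequence) pairs.
    # Keep the beam_size highest-scoring candidates, as one
    # beam-search step would after expanding the beam.
    return sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_size]

cands = [(math.log(0.6), [3]), (math.log(0.3), [7]), (math.log(0.1), [9])]
print(prune_beam(cands, 1))  # beam_size 1 keeps only the best prefix, [3]
```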

# Notes

Currently, the beam search re-runs the forward pass of the whole network for every predicted word, which wastes computation. This will be fixed in a later update.
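The cost of this behavior can be seen in a schematic greedy decoder (all names are hypothetical, not the actual `infer.py` code): because each step re-runs the network over the entire prefix, generating a sequence of length n does work proportional to 1 + 2 + … + n, i.e. quadratic in n, where caching decoder states would make each step constant.

```python
def decode_naive(step_forward, start_token, end_token, max_len):
    # step_forward(prefix) -> next token, re-running the network over
    # the ENTIRE prefix each call; this is the wasteful pattern the
    # note above describes.
    prefix = [start_token]
    for _ in range(max_len):
        nxt = step_forward(prefix)
        prefix.append(nxt)
        if nxt == end_token:
            break
    return prefix

calls = []

def dummy_step(prefix):
    # Stand-in for the network: record the prefix length as a proxy
    # for per-step work, emit token 1 until the prefix reaches 4
    # tokens, then emit the end token 0.
    calls.append(len(prefix))
    return 0 if len(prefix) >= 4 else 1

out = decode_naive(dummy_step, start_token=1, end_token=0, max_len=100)
print(out, sum(calls))  # total "work" 1+2+3+4 = 10 for a 5-token output
```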