This dataset accompanies our paper ARNOR: Attention Regularization based Noise Reduction for Distant Supervision Relation Classification. The test set is intended for sentence-level evaluation.
The original data comes from the dataset in the paper CoType: Joint Extraction of Typed Entities and Relations with Knowledge Bases. It is a distant supervision dataset built from the NYT (New York Times) corpus, and its test set is annotated by humans. However, the number of positive instances in that test set is small, so we revised it and annotated additional test data.
In a data file, each line is a JSON string. The content looks like this:

```
{
    "sentText": "The source sentence text",
    "relationMentions": [
        {
            "em1Text": "The first entity in the relation",
            "em2Text": "The second entity in the relation",
            "label": "Relation label",
            "is_noise": false    # only occurs in the test set
        },
        ...
    ],
    "entityMentions": [
        {
            "text": "Entity words",
            "label": "Entity type",
            ...
        },
        ...
    ],
    ...
}
```
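A minimal sketch of loading one of these files (the file name `test.json` is taken from the version 1.0.0 file list below; the filtering logic is our illustration, not part of the release):

```python
import json

# Each data file is line-delimited JSON: one sentence record per line.
with open("test.json") as f:
    for line in f:
        record = json.loads(line)
        sentence = record["sentText"]
        # In the test set, human-annotated noisy mentions can be
        # filtered out via the "is_noise" flag.
        for m in record["relationMentions"]:
            if not m.get("is_noise", False):
                print("%s | %s | %s | %s" % (
                    m["em1Text"], m["em2Text"], m["label"], sentence))
```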
Data version 1.0.0
=====
This version of the dataset is the original one used in our paper. It includes four files: train.json, test.json, dev_part.json, and test_part.json, where dev_part.json and test_part.json are derived from test.json. The dataset can be downloaded here: https://baidu-nlp.bj.bcebos.com/arnor_dataset-1.0.0.tar.gz
This model won first place in SemEval 2019 Task 9 Subtask A: Suggestion Mining from Online Reviews and Forums.
See more information about SemEval 2019: [http://alt.qcri.org/semeval2019/](http://alt.qcri.org/semeval2019/)
## 1. Introduction
This paper describes our system for Task 9 of SemEval-2019, which focuses on suggestion mining: classifying given sentences into suggestion and non-suggestion classes, in a domain-specific and a cross-domain training setting respectively. We propose a multi-perspective architecture that learns representations with several classical models, including Convolutional Neural Networks (CNN), Gated Recurrent Units (GRU), and Feed-Forward Attention (FFA). To leverage the semantics distributed in large amounts of unsupervised data, we also adopt the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model as an encoder to produce sentence and word representations. The proposed architecture is applied to both subtasks and achieves F1-scores of 0.7812 for Subtask A and 0.8579 for Subtask B; we won first and second place in the two subtasks respectively in the final competition.
## 2. Quick Start
### Installation
This project depends on Python 2.7 and PaddlePaddle Fluid 1.3.2; please follow the [quick start](http://www.paddlepaddle.org/#quick-start) guide to install them.
### Data Preparation
- Download the competition's data
```
# Download the competition's data
cd ./data && git clone https://github.com/Semeval2019Task9/Subtask-A.git
```

Because the dataset is small, training results may fluctuate; please try re-training several times.
## 3. Advanced
### Task Introduction
[Semeval2019-Task9](https://www.aclweb.org/anthology/S19-2151) presents the pilot SemEval task on suggestion mining. The task consists of subtasks A and B, with labeled data drawn from a suggestion forum and from hotel reviews respectively. Examples:
|Source |Sentence |Label|
|------| ------|------|
|Hotel reviews |Be sure to specify a room at the back of the hotel. |suggestion|
|Hotel reviews |The point is, don’t advertise the service if there are caveats that go with it.|non-suggestion|
|Suggestion forum| Why not let us have several pages that we can put tiles on and name whatever we want to |suggestion|
|Suggestion forum| It fails with a uninformative message indicating deployment failed.|non-suggestion|
### Model Introduction
The model's framework is shown in Figure 1:
<p align="center">
<img src="data/mpm.png"/><br/>
<b>Figure 1: An overall framework and pipeline of our system for suggestion mining</b>
</p>
As shown in Figure 1, our model architecture consists of two modules: a universal encoding module that serves as either a sentence or a word encoder, and a task-specific module used for suggestion classification. To fully exploit the information produced by the encoder, we stack a series of different task-specific modules on top of it, each corresponding to a different perspective. Intuitively, the sentence encoding could be used directly for classification. To go beyond that, since language is in essence time-series information, GRU cells provide a time perspective, modeling the sequence state to learn structure useful for suggestion mining. Similarly, a CNN provides a spatial perspective and can mimic an n-gram model. Moreover, we introduce a convenient attention mechanism, FFA (Raffel and Ellis, 2015), which automatically learns a combination of the most important features. Finally, we ensemble these models with a voting strategy to produce the system's final prediction.
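For concreteness, FFA scores each hidden state with a small feed-forward network, softmax-normalizes the scores, and pools the states into a single vector. A minimal NumPy sketch of this pooling step (parameter names and shapes are our illustration; the actual system implements it on top of the PaddlePaddle encoder):

```python
import numpy as np

def ffa_pool(H, W, b, v):
    """Feed-forward attention pooling (Raffel and Ellis, 2015).

    Scores each hidden state with a small MLP, e_t = v . tanh(W h_t + b),
    softmax-normalizes the scores into alpha_t, and returns the weighted
    sum c = sum_t alpha_t * h_t as the sentence representation.

    H: (T, d) encoder states; W: (d, d), b: (d,), v: (d,) parameters.
    """
    e = np.dot(np.tanh(np.dot(H, W) + b), v)  # e_t: one score per time step
    e = e - e.max()                           # shift for numerical stability
    alpha = np.exp(e) / np.exp(e).sum()       # softmax over time steps
    return np.dot(alpha, H)                   # (d,) pooled sentence vector

# Toy usage: 5 time steps, hidden size 8.
rng = np.random.RandomState(0)
H = rng.randn(5, 8)
c = ffa_pool(H, rng.randn(8, 8), np.zeros(8), rng.randn(8))
print(c.shape)  # (8,)
```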
### Result
| Models | CV F1-score | Test F1-score |
| ----- | ----- | ------ |
| BERT-Large-Logistic | 0.8522 (±0.0213) | 0.7697 |
| BERT-Large-Conv | 0.8520 (±0.0231) | 0.7800 |
| BERT-Large-FFA | 0.8516 (±0.0307) | 0.7722 |
| BERT-Large-GRU | 0.8503 (±0.0275) | 0.7725 |
| Ensemble | – | 0.7812 |
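The exact voting scheme is not detailed here, so the following is only a plain majority-vote sketch over the four models' binary predictions; the tie-breaking choice (falling back to BERT-Large-Conv, the strongest single model on the test set) is our assumption, not the paper's stated rule:

```python
from collections import Counter

def majority_vote(predictions, fallback_idx=1):
    """Combine per-model binary labels for one sentence by majority vote.

    predictions: list of 0/1 labels, one per model, in table order
    (Logistic, Conv, FFA, GRU). fallback_idx picks the model whose label
    breaks ties; index 1 (BERT-Large-Conv) is our assumption.
    """
    counts = Counter(predictions)
    if counts[0] == counts[1]:
        return predictions[fallback_idx]
    return counts.most_common(1)[0][0]

print(majority_vote([1, 1, 0, 1]))  # -> 1
print(majority_vote([0, 1, 0, 1]))  # -> 1 (tie, falls back to model 1)
```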
## 4. Others
If you use this library in your research project, please cite the paper "OleNet at SemEval-2019 Task 9: BERT based Multi-Perspective Models for Suggestion Mining".
### Citation
```
@inproceedings{BaiduMPM,
title={OleNet at SemEval-2019 Task 9: BERT based Multi-Perspective Models for Suggestion Mining},
author={Jiaxiang Liu and Shuohuan Wang and Yu Sun},
booktitle={Proceedings of the 13th International Workshop on Semantic Evaluation (SemEval-2019)},