Unverified commit 1352e3d3, authored by 骑马小猫, committed by GitHub

add paddlenlp community models (#5660)

* update project

* update icon and keyword
Parent 747a474a
# Model List
## CLTL/MedRoBERTa.nl
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|CLTL/MedRoBERTa.nl| | 633.14MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/vocab.txt) |
You can also download the model weights with the `paddlenlp` CLI tool, as follows:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models CLTL/MedRoBERTa.nl
```
If you run into any download problems, please open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP).
# Model List
## CLTL/MedRoBERTa.nl
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|CLTL/MedRoBERTa.nl| | 633.14MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/vocab.txt) |
Alternatively, you can download all of the model files with the following steps:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models CLTL/MedRoBERTa.nl
```
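After downloading, the weights can be loaded directly from the local directory. A minimal sketch, assuming the command above placed the files under `./pretrained_models/CLTL/MedRoBERTa.nl`:
```python
from paddlenlp.transformers import AutoModel

# Load from the local cache directory created by the download command above
# (the exact path is an assumption based on the --cache-dir argument)
model = AutoModel.from_pretrained("./pretrained_models/CLTL/MedRoBERTa.nl")
```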
If you have any problems, you can open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "CLTL/MedRoBERTa.nl"
description: "MedRoBERTa.nl"
description_en: "MedRoBERTa.nl"
icon: ""
from_repo: "https://huggingface.co/CLTL/MedRoBERTa.nl"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: ""
Publisher: "CLTL"
License: "mit"
Language: "Dutch"
Paper:
IfTraining: 0
IfOnlineDemo: 0
Datasets: conll2003
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info:
  description: 'roberta-large-ner-english: model fine-tuned from roberta-large for NER task'
  description_en: 'roberta-large-ner-english: model fine-tuned from roberta-large for NER task'
  from_repo: https://huggingface.co/Jean-Baptiste/roberta-large-ner-english
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: Jean-Baptiste/roberta-large-ner-english
Paper: null
Publisher: Jean-Baptiste
Task:
- sub_tag: Token分类
  sub_tag_en: Token Classification
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "743f4950",
"metadata": {},
"source": [
"# roberta-large-ner-english: model fine-tuned from roberta-large for NER task\n"
]
},
{
"cell_type": "markdown",
"id": "0d517a6d",
"metadata": {},
"source": [
"## Introduction\n"
]
},
{
"cell_type": "markdown",
"id": "bbb5e934",
"metadata": {},
"source": [
"roberta-large-ner-english is an english NER model that was fine-tuned from roberta-large on conll2003 dataset.\n",
"Model was validated on emails/chat data and outperformed other models on this type of data specifically.\n",
"In particular the model seems to work better on entity that don't start with an upper case.\n"
]
},
{
"cell_type": "markdown",
"id": "a13117c3",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b9e58955",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "db077413",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"Jean-Baptiste/roberta-large-ner-english\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "86ae5e96",
"metadata": {},
"source": [
"For those who could be interested, here is a short article on how I used the results of this model to train a LSTM model for signature detection in emails:\n",
"https://medium.com/@jean-baptiste.polle/lstm-model-for-email-signature-detection-8e990384fefa\n",
"\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/Jean-Baptiste/roberta-large-ner-english](https://huggingface.co/Jean-Baptiste/roberta-large-ner-english),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "b0541e6a",
"metadata": {},
"source": [
"# roberta-large-ner-english: model fine-tuned from roberta-large for NER task\n"
]
},
{
"cell_type": "markdown",
"id": "c85540d7",
"metadata": {},
"source": [
"## Introduction\n"
]
},
{
"cell_type": "markdown",
"id": "c2e2ebde",
"metadata": {},
"source": [
"roberta-large-ner-english is an english NER model that was fine-tuned from roberta-large on conll2003 dataset.\n",
"Model was validated on emails/chat data and outperformed other models on this type of data specifically.\n",
"In particular the model seems to work better on entity that don't start with an upper case.\n"
]
},
{
"cell_type": "markdown",
"id": "4f6d5dbe",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a159cf92",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "daa60299",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"Jean-Baptiste/roberta-large-ner-english\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
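  {
   "cell_type": "markdown",
   "id": "b1f2c3d4",
   "metadata": {},
   "source": [
    "The snippet above feeds random token ids into the encoder. As a minimal sketch (the sentence is illustrative, and producing actual entity labels would additionally require a token-classification head on top of the encoder), you can tokenize real text instead:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c2a3b4d5",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"Jean-Baptiste/roberta-large-ner-english\")\n",
    "model = AutoModel.from_pretrained(\"Jean-Baptiste/roberta-large-ner-english\")\n",
    "\n",
    "# Encode a real (illustrative) sentence instead of random ids\n",
    "encoded = tokenizer(\"apple is looking at buying a startup based in london\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "print(model(input_ids))"
   ]
  },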
{
"cell_type": "markdown",
"id": "2a66154e",
"metadata": {},
"source": [
"For those who could be interested, here is a short article on how I used the results of this model to train a LSTM model for signature detection in emails:\n",
"https://medium.com/@jean-baptiste.polle/lstm-model-for-email-signature-detection-8e990384fefa\n",
"\n",
"> The model introduction and model weights originate from https://huggingface.co/Jean-Baptiste/roberta-large-ner-english and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Chinese
License: apache-2.0
Model_Info:
  description: Mengzi-BERT base fin model (Chinese)
  description_en: Mengzi-BERT base fin model (Chinese)
  from_repo: https://huggingface.co/Langboat/mengzi-bert-base-fin
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: Langboat/mengzi-bert-base-fin
Paper:
- title: 'Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese'
  url: http://arxiv.org/abs/2110.06696v2
Publisher: Langboat
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "18d5c43e",
"metadata": {},
"source": [
"# Mengzi-BERT base fin model (Chinese)\n",
"Continue trained mengzi-bert-base with 20G financial news and research reports. Masked language modeling(MLM), part-of-speech(POS) tagging and sentence order prediction(SOP) are used as training task.\n"
]
},
{
"cell_type": "markdown",
"id": "9aa78f76",
"metadata": {},
"source": [
"[Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese](https://arxiv.org/abs/2110.06696)\n"
]
},
{
"cell_type": "markdown",
"id": "12bbac99",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3b18fe48",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1bb0e345",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"Langboat/mengzi-bert-base-fin\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "a8d785f4",
"metadata": {},
"source": [
"```\n",
"@misc{zhang2021mengzi,\n",
"title={Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese},\n",
"author={Zhuosheng Zhang and Hanqing Zhang and Keming Chen and Yuhang Guo and Jingyun Hua and Yulong Wang and Ming Zhou},\n",
"year={2021},\n",
"eprint={2110.06696},\n",
"archivePrefix={arXiv},\n",
"primaryClass={cs.CL}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "ceb1547c",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/Langboat/mengzi-bert-base-fin](https://huggingface.co/Langboat/mengzi-bert-base-fin),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "752656a4",
"metadata": {},
"source": [
"# Mengzi-BERT base fin model (Chinese)\n",
"Continue trained mengzi-bert-base with 20G financial news and research reports. Masked language modeling(MLM), part-of-speech(POS) tagging and sentence order prediction(SOP) are used as training task.\n"
]
},
{
"cell_type": "markdown",
"id": "26c65092",
"metadata": {},
"source": [
"[Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese](https://arxiv.org/abs/2110.06696)\n"
]
},
{
"cell_type": "markdown",
"id": "ea5404c7",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ebeb5daa",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d2c66056",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"Langboat/mengzi-bert-base-fin\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
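  {
   "cell_type": "markdown",
   "id": "d3b4c5e6",
   "metadata": {},
   "source": [
    "As a minimal sketch (the financial headline below is purely illustrative), real Chinese text can be tokenized instead of the random ids used above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e4c5d6f7",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"Langboat/mengzi-bert-base-fin\")\n",
    "model = AutoModel.from_pretrained(\"Langboat/mengzi-bert-base-fin\")\n",
    "\n",
    "# Illustrative financial headline\n",
    "encoded = tokenizer(\"央行今日开展一千亿元逆回购操作\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "print(model(input_ids))"
   ]
  },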
{
"cell_type": "markdown",
"id": "a39809dc",
"metadata": {},
"source": [
"```\n",
"@misc{zhang2021mengzi,\n",
"title={Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese},\n",
"author={Zhuosheng Zhang and Hanqing Zhang and Keming Chen and Yuhang Guo and Jingyun Hua and Yulong Wang and Ming Zhou},\n",
"year={2021},\n",
"eprint={2110.06696},\n",
"archivePrefix={arXiv},\n",
"primaryClass={cs.CL}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "f25bda96",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/Langboat/mengzi-bert-base-fin and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# Model List
## PlanTL-GOB-ES/roberta-base-biomedical-clinical-es
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|PlanTL-GOB-ES/roberta-base-biomedical-clinical-es| | 633.14MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/vocab.txt) |
You can also download the model weights with the `paddlenlp` CLI tool, as follows:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models PlanTL-GOB-ES/roberta-base-biomedical-clinical-es
```
If you run into any download problems, please open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP).
# Model List
## PlanTL-GOB-ES/roberta-base-biomedical-clinical-es
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|PlanTL-GOB-ES/roberta-base-biomedical-clinical-es| | 633.14MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/vocab.txt) |
Alternatively, you can download all of the model files with the following steps:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models PlanTL-GOB-ES/roberta-base-biomedical-clinical-es
```
If you have any problems, you can open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "PlanTL-GOB-ES/roberta-base-biomedical-clinical-es"
description: "Biomedical-clinical language model for Spanish"
description_en: "Biomedical-clinical language model for Spanish"
icon: ""
from_repo: "https://huggingface.co/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: ""
Publisher: "PlanTL-GOB-ES"
License: "apache-2.0"
Language: "Spanish"
Paper:
- title: 'Biomedical and Clinical Language Models for Spanish: On the Benefits of Domain-Specific Pretraining in a Mid-Resource Scenario'
url: 'http://arxiv.org/abs/2109.03570v2'
- title: 'Spanish Biomedical Crawled Corpus: A Large, Diverse Dataset for Spanish Biomedical Language Models'
url: 'http://arxiv.org/abs/2109.07765v1'
IfTraining: 0
IfOnlineDemo: 0
# Model List
## PlanTL-GOB-ES/roberta-base-biomedical-es
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|PlanTL-GOB-ES/roberta-base-biomedical-es| | 633.14MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/vocab.json) |
You can also download the model weights with the `paddlenlp` CLI tool, as follows:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models PlanTL-GOB-ES/roberta-base-biomedical-es
```
If you run into any download problems, please open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP).
# Model List
## PlanTL-GOB-ES/roberta-base-biomedical-es
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|PlanTL-GOB-ES/roberta-base-biomedical-es| | 633.14MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/vocab.json) |
Alternatively, you can download all of the model files with the following steps:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models PlanTL-GOB-ES/roberta-base-biomedical-es
```
If you have any problems, you can open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "PlanTL-GOB-ES/roberta-base-biomedical-es"
description: "Biomedical language model for Spanish"
description_en: "Biomedical language model for Spanish"
icon: ""
from_repo: "https://huggingface.co/PlanTL-GOB-ES/roberta-base-biomedical-es"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: ""
Publisher: "PlanTL-GOB-ES"
License: "apache-2.0"
Language: "Spanish"
Paper:
- title: 'Biomedical and Clinical Language Models for Spanish: On the Benefits of Domain-Specific Pretraining in a Mid-Resource Scenario'
url: 'http://arxiv.org/abs/2109.03570v2'
- title: 'Spanish Biomedical Crawled Corpus: A Large, Diverse Dataset for Spanish Biomedical Language Models'
url: 'http://arxiv.org/abs/2109.07765v1'
IfTraining: 0
IfOnlineDemo: 0
# Model List
## PlanTL-GOB-ES/roberta-base-ca
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|PlanTL-GOB-ES/roberta-base-ca| | 633.14MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/vocab.json) |
You can also download the model weights with the `paddlenlp` CLI tool, as follows:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models PlanTL-GOB-ES/roberta-base-ca
```
If you run into any download problems, please open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP).
# Model List
## PlanTL-GOB-ES/roberta-base-ca
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|PlanTL-GOB-ES/roberta-base-ca| | 633.14MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/vocab.json) |
Alternatively, you can download all of the model files with the following steps:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models PlanTL-GOB-ES/roberta-base-ca
```
If you have any problems, you can open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "PlanTL-GOB-ES/roberta-base-ca"
description: "BERTa: RoBERTa-based Catalan language model"
description_en: "BERTa: RoBERTa-based Catalan language model"
icon: ""
from_repo: "https://huggingface.co/PlanTL-GOB-ES/roberta-base-ca"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: ""
Publisher: "PlanTL-GOB-ES"
License: "apache-2.0"
Language: "Catalan"
Paper:
IfTraining: 0
IfOnlineDemo: 0
Datasets: xnli
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Spanish
License: mit
Model_Info:
  description: bert-base-spanish-wwm-cased-xnli
  description_en: bert-base-spanish-wwm-cased-xnli
  from_repo: https://huggingface.co/Recognai/bert-base-spanish-wwm-cased-xnli
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: Recognai/bert-base-spanish-wwm-cased-xnli
Paper: null
Publisher: Recognai
Task:
- sub_tag: 零样本分类
  sub_tag_en: Zero-Shot Classification
  tag: 自然语言处理
  tag_en: Natural Language Processing
- sub_tag: 文本分类
  sub_tag_en: Text Classification
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "0b1e9532",
"metadata": {},
"source": [
"# bert-base-spanish-wwm-cased-xnli\n"
]
},
{
"cell_type": "markdown",
"id": "2b09a9af",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "e348457b",
"metadata": {},
"source": [
"This model is a fine-tuned version of the [spanish BERT model](https://huggingface.co/dccuchile/bert-base-spanish-wwm-cased) with the Spanish portion of the XNLI dataset. \n"
]
},
{
"cell_type": "markdown",
"id": "6643a3b7",
"metadata": {},
"source": [
"### How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8475d429",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ced3e559",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"Recognai/bert-base-spanish-wwm-cased-xnli\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "47419faf",
"metadata": {},
"source": [
"## Eval results\n"
]
},
{
"cell_type": "markdown",
"id": "9b87e64b",
"metadata": {},
"source": [
"Accuracy for the test set:\n"
]
},
{
"cell_type": "markdown",
"id": "7be74f6f",
"metadata": {},
"source": [
"| | XNLI-es |\n",
"|-----------------------------|---------|\n",
"|bert-base-spanish-wwm-cased-xnli | 79.9% |\n",
"> 此模型介绍及权重来源于[https://huggingface.co/Recognai/bert-base-spanish-wwm-cased-xnli](https://huggingface.co/Recognai/bert-base-spanish-wwm-cased-xnli),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "7a8a1587",
"metadata": {},
"source": [
"# bert-base-spanish-wwm-cased-xnli\n"
]
},
{
"cell_type": "markdown",
"id": "210c8e3a",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "fe16ef03",
"metadata": {},
"source": [
"This model is a fine-tuned version of the spanish BERT model with the Spanish portion of the XNLI dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "b23d27b0",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "37e5b840",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "117b1e15",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"Recognai/bert-base-spanish-wwm-cased-xnli\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
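  {
   "cell_type": "markdown",
   "id": "f5d6e7a8",
   "metadata": {},
   "source": [
    "As a minimal sketch, the tokenizer can encode a premise/hypothesis pair the way XNLI-style inputs are built (the Spanish sentences are illustrative; turning the encoder output into entailment/neutral/contradiction scores would additionally require the fine-tuned classification head):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a6e7f8b9",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"Recognai/bert-base-spanish-wwm-cased-xnli\")\n",
    "model = AutoModel.from_pretrained(\"Recognai/bert-base-spanish-wwm-cased-xnli\")\n",
    "\n",
    "premise = \"El equipo ganó el partido anoche.\"\n",
    "hypothesis = \"El equipo perdió el partido.\"\n",
    "encoded = tokenizer(premise, text_pair=hypothesis)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
    "print(model(input_ids, token_type_ids=token_type_ids))"
   ]
  },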
{
"cell_type": "markdown",
"id": "65669489",
"metadata": {},
"source": [
"## Eval results\n",
"\n",
"Accuracy for the test set:\n",
"\n",
"| | XNLI-es |\n",
"|-----------------------------|---------|\n",
"|bert-base-spanish-wwm-cased-xnli | 79.9% |\n",
"\n",
"> The model introduction and model weights originate from https://huggingface.co/Recognai/bert-base-spanish-wwm-cased-xnli and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# Model List
## allenai/macaw-3b
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|allenai/macaw-3b| | 10.99G | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/allenai/macaw-3b/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/allenai/macaw-3b/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/allenai/macaw-3b/tokenizer_config.json) |
You can also download the model weights with the `paddlenlp` CLI tool, as follows:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models allenai/macaw-3b
```
If you run into any download problems, please open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP).
# Model List
## allenai/macaw-3b
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|allenai/macaw-3b| | 10.99G | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/allenai/macaw-3b/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/allenai/macaw-3b/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/allenai/macaw-3b/tokenizer_config.json) |
Alternatively, you can download all of the model files with the following steps:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models allenai/macaw-3b
```
If you have any problems, you can open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "allenai/macaw-3b"
description: "macaw-3b"
description_en: "macaw-3b"
icon: ""
from_repo: "https://huggingface.co/allenai/macaw-3b"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text2Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "allenai"
License: "apache-2.0"
Language: "English"
Paper:
IfTraining: 0
IfOnlineDemo: 0
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: macaw-large
  description_en: macaw-large
  from_repo: https://huggingface.co/allenai/macaw-large
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: allenai/macaw-large
Paper: null
Publisher: allenai
Task:
- sub_tag: 文本生成
  sub_tag_en: Text2Text Generation
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "d50965ae",
"metadata": {},
"source": [
"# macaw-large\n",
"\n",
"## Model description\n",
"\n",
"Macaw (<b>M</b>ulti-<b>a</b>ngle <b>c</b>(q)uestion <b>a</b>ns<b>w</b>ering) is a ready-to-use model capable of\n",
"general question answering,\n",
"showing robustness outside the domains it was trained on. It has been trained in \"multi-angle\" fashion,\n",
"which means it can handle a flexible set of input and output \"slots\"\n",
"(question, answer, multiple-choice options, context, and explanation) .\n",
"\n",
"Macaw was built on top of [T5](https://github.com/google-research/text-to-text-transfer-transformer) and comes in\n",
"three sizes: macaw-11b, macaw-3b,\n",
"and macaw-large, as well as an answer-focused version featured on\n",
"various leaderboards macaw-answer-11b.\n",
"\n",
"See https://github.com/allenai/macaw for more details."
]
},
{
"cell_type": "markdown",
"id": "1c0bce56",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cb7a2c88",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b0fd69ae",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"allenai/macaw-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "955d0705",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/allenai/macaw-large](https://huggingface.co/allenai/macaw-large),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "f5a296e3",
"metadata": {},
"source": [
"# macaw-large\n",
"\n",
"## Model description\n",
"\n",
"Macaw (<b>M</b>ulti-<b>a</b>ngle <b>c</b>(q)uestion <b>a</b>ns<b>w</b>ering) is a ready-to-use model capable of\n",
"general question answering,\n",
"showing robustness outside the domains it was trained on. It has been trained in \"multi-angle\" fashion,\n",
"which means it can handle a flexible set of input and output \"slots\"\n",
"(question, answer, multiple-choice options, context, and explanation) .\n",
"\n",
"Macaw was built on top of [T5](https://github.com/google-research/text-to-text-transfer-transformer) and comes in\n",
"three sizes: macaw-11b, macaw-3b,\n",
"and macaw-large, as well as an answer-focused version featured on\n",
"various leaderboards macaw-answer-11b.\n",
"\n",
"See https://github.com/allenai/macaw for more details."
]
},
{
"cell_type": "markdown",
"id": "27cf8ebc",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "027c735c",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8f52c07a",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"allenai/macaw-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
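  {
   "cell_type": "markdown",
   "id": "b7f8a9c0",
   "metadata": {},
   "source": [
    "Macaw reads its inputs in the multi-angle \"slot\" format documented in the repository linked above. A minimal sketch of building and tokenizing such a prompt (the slot string follows https://github.com/allenai/macaw; generating the answer text itself would require a T5-style generation head, which is not shown here):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c8a9b0d1",
   "metadata": {},
   "outputs": [],
   "source": [
    "from paddlenlp.transformers import AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"allenai/macaw-large\")\n",
    "\n",
    "# Request the answer slot, supply the question slot\n",
    "prompt = \"$answer$ ; $question$ = What is the color of a cloudy sky?\"\n",
    "encoded = tokenizer(prompt)\n",
    "print(encoded[\"input_ids\"])"
   ]
  },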
{
"cell_type": "markdown",
"id": "ce759903",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/allenai/macaw-large](https://huggingface.co/allenai/macaw-large) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: SPECTER
  description_en: SPECTER
  from_repo: https://huggingface.co/allenai/specter
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: allenai/specter
Paper:
- title: 'SPECTER: Document-level Representation Learning using Citation-informed Transformers'
  url: http://arxiv.org/abs/2004.07180v4
Publisher: allenai
Task:
- sub_tag: 特征抽取
  sub_tag_en: Feature Extraction
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a5b54f39",
"metadata": {},
"source": [
"## SPECTER\n",
"\n",
"SPECTER is a pre-trained language model to generate document-level embedding of documents. It is pre-trained on a a powerful signal of document-level relatedness: the citation graph. Unlike existing pretrained language models, SPECTER can be easily applied to downstream applications without task-specific fine-tuning.\n",
"\n",
"Paper: [SPECTER: Document-level Representation Learning using Citation-informed Transformers](https://arxiv.org/pdf/2004.07180.pdf)\n",
"\n",
"Original Repo: [Github](https://github.com/allenai/specter)\n",
"\n",
"Evaluation Benchmark: [SciDocs](https://github.com/allenai/scidocs)\n",
"\n",
"Authors: *Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld*"
]
},
{
"cell_type": "markdown",
"id": "e279b43d",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3dcf4e0b",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7348a84e",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"allenai/specter\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "89c70552",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/allenai/specter](https://huggingface.co/allenai/specter),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "a09f5723",
"metadata": {},
"source": [
"## SPECTER\n",
"\n",
"SPECTER is a pre-trained language model to generate document-level embedding of documents. It is pre-trained on a a powerful signal of document-level relatedness: the citation graph. Unlike existing pretrained language models, SPECTER can be easily applied to downstream applications without task-specific fine-tuning.\n",
"\n",
"Paper: [SPECTER: Document-level Representation Learning using Citation-informed Transformers](https://arxiv.org/pdf/2004.07180.pdf)\n",
"\n",
"Original Repo: [Github](https://github.com/allenai/specter)\n",
"\n",
"Evaluation Benchmark: [SciDocs](https://github.com/allenai/scidocs)\n",
"\n",
"Authors: *Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld*"
]
},
{
"cell_type": "markdown",
"id": "b62bbb59",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2dff923a",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e60739cc",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"allenai/specter\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
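  {
   "cell_type": "markdown",
   "id": "d9b0c1e2",
   "metadata": {},
   "source": [
    "Following the upstream model card, a paper-level embedding can be sketched by concatenating the title and abstract with the tokenizer's separator token and taking the first ([CLS]) hidden state; the title and abstract below are illustrative:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e0c1d2f3",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"allenai/specter\")\n",
    "model = AutoModel.from_pretrained(\"allenai/specter\")\n",
    "\n",
    "title = \"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding\"\n",
    "abstract = \"We introduce a new language representation model called BERT.\"\n",
    "encoded = tokenizer(title + tokenizer.sep_token + abstract)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "sequence_output, pooled_output = model(input_ids)\n",
    "doc_embedding = sequence_output[:, 0, :]  # [CLS] vector as the document embedding\n",
    "print(doc_embedding.shape)"
   ]
  },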
{
"cell_type": "markdown",
"id": "cd668864",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/allenai/specter](https://huggingface.co/allenai/specter) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: ''
  description_en: ''
  from_repo: https://huggingface.co/alvaroalon2/biobert_chemical_ner
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: alvaroalon2/biobert_chemical_ner
Paper: null
Publisher: alvaroalon2
Task:
- sub_tag: Token分类
  sub_tag_en: Token Classification
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "0b8f2339",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"BioBERT model fine-tuned in NER task with BC5CDR-chemicals and BC4CHEMD corpus.\n",
"\n",
"This was fine-tuned in order to use it in a BioNER/BioNEN system which is available at: https://github.com/librairy/bio-ner"
]
},
{
"cell_type": "markdown",
"id": "934c3f34",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a8516341",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "70114f31",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"alvaroalon2/biobert_chemical_ner\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "fb7b2eb8",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> 此模型介绍及权重来源于:[https://huggingface.co/alvaroalon2/biobert_chemical_ner](https://huggingface.co/alvaroalon2/biobert_chemical_ner),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "f769316b",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"BioBERT model fine-tuned in NER task with BC5CDR-chemicals and BC4CHEMD corpus.\n",
"\n",
"This was fine-tuned in order to use it in a BioNER/BioNEN system which is available at: https://github.com/librairy/bio-ner"
]
},
{
"cell_type": "markdown",
"id": "3a77ed26",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "202a3ef9",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fc11d032",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"alvaroalon2/biobert_chemical_ner\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
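  {
   "cell_type": "markdown",
   "id": "f1d2e3a4",
   "metadata": {},
   "source": [
    "As a minimal sketch (the sentence is illustrative), real text can be tokenized instead of random ids; mapping the per-token hidden states to chemical entity labels would additionally require the token-classification head:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a2e3f4b5",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"alvaroalon2/biobert_chemical_ner\")\n",
    "model = AutoModel.from_pretrained(\"alvaroalon2/biobert_chemical_ner\")\n",
    "\n",
    "encoded = tokenizer(\"Aspirin and ibuprofen are widely used analgesics.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "print(model(input_ids))"
   ]
  },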
{
"cell_type": "markdown",
"id": "762dee96",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/alvaroalon2/biobert_chemical_ner](https://huggingface.co/alvaroalon2/biobert_chemical_ner) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ncbi_disease
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: ''
  description_en: ''
  from_repo: https://huggingface.co/alvaroalon2/biobert_diseases_ner
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: alvaroalon2/biobert_diseases_ner
Paper: null
Publisher: alvaroalon2
Task:
- sub_tag: Token分类
  sub_tag_en: Token Classification
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "578bdb21",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"BioBERT model fine-tuned in NER task with BC5CDR-diseases and NCBI-diseases corpus\n",
"\n",
"This was fine-tuned in order to use it in a BioNER/BioNEN system which is available at: https://github.com/librairy/bio-ner"
]
},
{
"cell_type": "markdown",
"id": "d18b8736",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9b304ea9",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "49b790e5",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"alvaroalon2/biobert_diseases_ner\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "ab48464f",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/alvaroalon2/biobert_diseases_ner](https://huggingface.co/alvaroalon2/biobert_diseases_ner),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "98591560",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"BioBERT model fine-tuned in NER task with BC5CDR-diseases and NCBI-diseases corpus\n",
"\n",
"This was fine-tuned in order to use it in a BioNER/BioNEN system which is available at: https://github.com/librairy/bio-ner"
]
},
{
"cell_type": "markdown",
"id": "da577da0",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0ee7d4df",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c6dfd3c0",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"alvaroalon2/biobert_diseases_ner\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "7a58f3ef",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/alvaroalon2/biobert_diseases_ner](https://huggingface.co/alvaroalon2/biobert_diseases_ner) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: ''
  description_en: ''
  from_repo: https://huggingface.co/alvaroalon2/biobert_genetic_ner
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: alvaroalon2/biobert_genetic_ner
Paper: null
Publisher: alvaroalon2
Task:
- sub_tag: Token分类
  sub_tag_en: Token Classification
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "795618b9",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"BioBERT model fine-tuned in NER task with JNLPBA and BC2GM corpus for genetic class entities.\n",
"\n",
"This was fine-tuned in order to use it in a BioNER/BioNEN system which is available at: https://github.com/librairy/bio-ner"
]
},
{
"cell_type": "markdown",
"id": "bf1bde1a",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "90bf4208",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c3f9ddc9",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"alvaroalon2/biobert_genetic_ner\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "45bef570",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/alvaroalon2/biobert_genetic_ner](https://huggingface.co/alvaroalon2/biobert_genetic_ner),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "eeb5731b",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"BioBERT model fine-tuned in NER task with JNLPBA and BC2GM corpus for genetic class entities.\n",
"\n",
"This was fine-tuned in order to use it in a BioNER/BioNEN system which is available at: https://github.com/librairy/bio-ner"
]
},
{
"cell_type": "markdown",
"id": "3501c0f5",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "da1caa55",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a8a173da",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"alvaroalon2/biobert_genetic_ner\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "0c74ebfe",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/alvaroalon2/biobert_genetic_ner](https://huggingface.co/alvaroalon2/biobert_genetic_ner) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info:
  description: Passage Reranking Multilingual BERT 🔃 🌍
  description_en: Passage Reranking Multilingual BERT 🔃 🌍
  from_repo: https://huggingface.co/amberoad/bert-multilingual-passage-reranking-msmarco
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: amberoad/bert-multilingual-passage-reranking-msmarco
Paper:
- title: Passage Re-ranking with BERT
  url: http://arxiv.org/abs/1901.04085v5
Publisher: amberoad
Task:
- sub_tag: 文本分类
  sub_tag_en: Text Classification
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "83244d63",
"metadata": {},
"source": [
"# Passage Reranking Multilingual BERT 🔃 🌍\n"
]
},
{
"cell_type": "markdown",
"id": "4c8c922a",
"metadata": {},
"source": [
"## Model description\n",
"**Input:** Supports over 100 Languages. See [List of supported languages](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages) for all available.\n"
]
},
{
"cell_type": "markdown",
"id": "8b40d5de",
"metadata": {},
"source": [
"**Purpose:** This module takes a search query [1] and a passage [2] and calculates if the passage matches the query.\n",
"It can be used as an improvement for Elasticsearch Results and boosts the relevancy by up to 100%.\n"
]
},
{
"cell_type": "markdown",
"id": "c9d89366",
"metadata": {},
"source": [
"**Architecture:** On top of BERT there is a Densly Connected NN which takes the 768 Dimensional [CLS] Token as input and provides the output ([Arxiv](https://arxiv.org/abs/1901.04085)).\n"
]
},
{
"cell_type": "markdown",
"id": "29745195",
"metadata": {},
"source": [
"**Output:** Just a single value between between -10 and 10. Better matching query,passage pairs tend to have a higher a score.\n"
]
},
{
"cell_type": "markdown",
"id": "010a4d92",
"metadata": {},
"source": [
"## Intended uses & limitations\n",
"Both query[1] and passage[2] have to fit in 512 Tokens.\n",
"As you normally want to rerank the first dozens of search results keep in mind the inference time of approximately 300 ms/query.\n"
]
},
{
"cell_type": "markdown",
"id": "a9f2dea7",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8d023555",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4c83eef3",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"amberoad/bert-multilingual-passage-reranking-msmarco\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
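  {
   "cell_type": "markdown",
   "id": "b3f4a5c6",
   "metadata": {},
   "source": [
    "As a minimal sketch, a query/passage pair can be encoded as a sentence pair (the texts are illustrative). Note that the relevance score itself comes from the dense layer on the [CLS] vector described above, which the bare encoder loaded here does not include:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c4a5b6d7",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"amberoad/bert-multilingual-passage-reranking-msmarco\")\n",
    "model = AutoModel.from_pretrained(\"amberoad/bert-multilingual-passage-reranking-msmarco\")\n",
    "\n",
    "query = \"how do solar panels work\"\n",
    "passage = \"Solar panels convert sunlight into electricity using photovoltaic cells.\"\n",
    "encoded = tokenizer(query, text_pair=passage)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
    "print(model(input_ids, token_type_ids=token_type_ids))"
   ]
  },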
{
"cell_type": "markdown",
"id": "2611b122",
"metadata": {},
"source": [
"## Training data\n"
]
},
{
"cell_type": "markdown",
"id": "ba62fbe0",
"metadata": {},
"source": [
"This model is trained using the [**Microsoft MS Marco Dataset**](https://microsoft.github.io/msmarco/ \"Microsoft MS Marco\"). This training dataset contains approximately 400M tuples of a query, relevant and non-relevant passages. All datasets used for training and evaluating are listed in this [table](https://github.com/microsoft/MSMARCO-Passage-Ranking#data-information-and-formating). The used dataset for training is called *Train Triples Large*, while the evaluation was made on *Top 1000 Dev*. There are 6,900 queries in total in the development dataset, where each query is mapped to top 1,000 passage retrieved using BM25 from MS MARCO corpus.\n"
]
},
{
"cell_type": "markdown",
"id": "afc188f2",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/amberoad/bert-multilingual-passage-reranking-msmarco](https://huggingface.co/amberoad/bert-multilingual-passage-reranking-msmarco),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "22c47298",
"metadata": {},
"source": [
"# Passage Reranking Multilingual BERT 🔃 🌍\n"
]
},
{
"cell_type": "markdown",
"id": "0bb73e0f",
"metadata": {},
"source": [
"## Model description\n",
"**Input:** Supports over 100 Languages. See [List of supported languages](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages) for all available.\n"
]
},
{
"cell_type": "markdown",
"id": "fedf5cb8",
"metadata": {},
"source": [
"**Purpose:** This module takes a search query [1] and a passage [2] and calculates if the passage matches the query.\n",
"It can be used as an improvement for Elasticsearch Results and boosts the relevancy by up to 100%.\n"
]
},
{
"cell_type": "markdown",
"id": "146e3be4",
"metadata": {},
"source": [
"**Architecture:** On top of BERT there is a Densly Connected NN which takes the 768 Dimensional [CLS] Token as input and provides the output ([Arxiv](https://arxiv.org/abs/1901.04085)).\n"
]
},
{
"cell_type": "markdown",
"id": "772c5c82",
"metadata": {},
"source": [
"**Output:** Just a single value between between -10 and 10. Better matching query,passage pairs tend to have a higher a score.\n"
]
},
{
"cell_type": "markdown",
"id": "e5974e46",
"metadata": {},
"source": [
"## Intended uses & limitations\n",
"Both query[1] and passage[2] have to fit in 512 Tokens.\n",
"As you normally want to rerank the first dozens of search results keep in mind the inference time of approximately 300 ms/query.\n"
]
},
{
"cell_type": "markdown",
"id": "7d878609",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e0941f1f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3bc201bf",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"amberoad/bert-multilingual-passage-reranking-msmarco\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
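  {
   "cell_type": "markdown",
   "id": "1a2b3c4d",
   "metadata": {},
   "source": [
    "The cell above only runs the backbone on random token ids. The sketch below (the query and passage are made up for illustration, and it assumes the converted community weights expose the BERT backbone via `AutoModel`/`AutoTokenizer`) shows how a query-passage pair is encoded as `[CLS] query [SEP] passage [SEP]` and how to pull out the `[CLS]` vector that the reranking head described above scores.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1a2b3c4e",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "# Hypothetical query/passage pair, for illustration only.\n",
    "query = \"How many people live in Berlin?\"\n",
    "passage = \"Berlin is the capital of Germany and has about 3.6 million inhabitants.\"\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"amberoad/bert-multilingual-passage-reranking-msmarco\")\n",
    "model = AutoModel.from_pretrained(\"amberoad/bert-multilingual-passage-reranking-msmarco\")\n",
    "model.eval()\n",
    "\n",
    "# Encode the pair as [CLS] query [SEP] passage [SEP].\n",
    "encoded = tokenizer(query, passage)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
    "\n",
    "sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
    "# The original reranker scores this [CLS] vector with a dense head.\n",
    "cls_embedding = sequence_output[:, 0]\n",
    "print(cls_embedding.shape)"
   ]
  },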
{
"cell_type": "markdown",
"id": "674ccc3a",
"metadata": {},
"source": [
"## Training data\n"
]
},
{
"cell_type": "markdown",
"id": "4404adda",
"metadata": {},
"source": [
"This model is trained using the [**Microsoft MS Marco Dataset**](https://microsoft.github.io/msmarco/ \"Microsoft MS Marco\"). This training dataset contains approximately 400M tuples of a query, relevant and non-relevant passages. All datasets used for training and evaluating are listed in this [table](https://github.com/microsoft/MSMARCO-Passage-Ranking#data-information-and-formating). The used dataset for training is called *Train Triples Large*, while the evaluation was made on *Top 1000 Dev*. There are 6,900 queries in total in the development dataset, where each query is mapped to top 1,000 passage retrieved using BM25 from MS MARCO corpus.\n"
]
},
{
"cell_type": "markdown",
"id": "79af5e42",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/amberoad/bert-multilingual-passage-reranking-msmarco](https://huggingface.co/amberoad/bert-multilingual-passage-reranking-msmarco) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# 模型列表
## asi/gpt-fr-cased-base
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|asi/gpt-fr-cased-base| | 4.12G | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/vocab.json) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models asi/gpt-fr-cased-base
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
## asi/gpt-fr-cased-base
| model | description | model_size | download |
| --- | --- | --- | --- |
|asi/gpt-fr-cased-base| | 4.12G | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/vocab.json) |
You can also download all of the model files with the `paddlenlp` CLI tool, as follows:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models asi/gpt-fr-cased-base
```
If you have any problems with the download, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "asi/gpt-fr-cased-base"
description: "Model description"
description_en: "Model description"
icon: ""
from_repo: "https://huggingface.co/asi/gpt-fr-cased-base"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "asi"
License: "apache-2.0"
Language: "French"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
# 模型列表
## asi/gpt-fr-cased-small
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|asi/gpt-fr-cased-small| | 620.45MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/vocab.json) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models asi/gpt-fr-cased-small
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
## asi/gpt-fr-cased-small
| model | description | model_size | download |
| --- | --- | --- | --- |
|asi/gpt-fr-cased-small| | 620.45MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/vocab.json) |
You can also download all of the model files with the `paddlenlp` CLI tool, as follows:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models asi/gpt-fr-cased-small
```
If you have any problems with the download, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "asi/gpt-fr-cased-small"
description: "Model description"
description_en: "Model description"
icon: ""
from_repo: "https://huggingface.co/asi/gpt-fr-cased-small"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "asi"
License: "apache-2.0"
Language: "French"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: German
License: mit
Model_Info:
  description: GerPT2
  description_en: GerPT2
  from_repo: https://huggingface.co/benjamin/gerpt2-large
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: benjamin/gerpt2-large
Paper: null
Publisher: benjamin
Task:
- sub_tag: 文本生成
  sub_tag_en: Text Generation
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "e42aa4df",
"metadata": {},
"source": [
"# GerPT2\n"
]
},
{
"cell_type": "markdown",
"id": "08fd6403",
"metadata": {},
"source": [
"See the GPT2 model card for considerations on limitations and bias. See the GPT2 documentation for details on GPT2.\n"
]
},
{
"cell_type": "markdown",
"id": "8295e28d",
"metadata": {},
"source": [
"## Comparison to dbmdz/german-gpt2\n"
]
},
{
"cell_type": "markdown",
"id": "c0f50f67",
"metadata": {},
"source": [
"I evaluated both GerPT2-large and the other German GPT2, dbmdz/german-gpt2 on the [CC-100](http://data.statmt.org/cc-100/) dataset and on the German Wikipedia:\n"
]
},
{
"cell_type": "markdown",
"id": "6ecdc149",
"metadata": {},
"source": [
"| | CC-100 (PPL) | Wikipedia (PPL) |\n",
"|-------------------|--------------|-----------------|\n",
"| dbmdz/german-gpt2 | 49.47 | 62.92 |\n",
"| GerPT2 | 24.78 | 35.33 |\n",
"| GerPT2-large | __16.08__ | __23.26__ |\n",
"| | | |\n"
]
},
{
"cell_type": "markdown",
"id": "3cddd6a8",
"metadata": {},
"source": [
"See the script `evaluate.py` in the [GerPT2 Github repository](https://github.com/bminixhofer/gerpt2) for the code.\n"
]
},
{
"cell_type": "markdown",
"id": "d838da15",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "476bf523",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8f509fec",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"benjamin/gerpt2-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
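  {
   "cell_type": "markdown",
   "id": "2b3c4d5e",
   "metadata": {},
   "source": [
    "The cell above only returns hidden states for random token ids. Below is a minimal German text-generation sketch: the prompt is made up, `GPTLMHeadModel` is assumed to load the converted weights with its language-modelling head, `generate()` is assumed to return a `(ids, scores)` tuple, and the exact decoding arguments may differ between PaddleNLP versions.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2b3c4d5f",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, GPTLMHeadModel\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"benjamin/gerpt2-large\")\n",
    "model = GPTLMHeadModel.from_pretrained(\"benjamin/gerpt2-large\")\n",
    "model.eval()\n",
    "\n",
    "# Made-up German prompt, for illustration only.\n",
    "prompt = \"Heute ist ein schöner Tag und\"\n",
    "input_ids = paddle.to_tensor([tokenizer(prompt)[\"input_ids\"]])\n",
    "\n",
    "# Sampling-based decoding; the arguments are indicative, not tuned.\n",
    "output_ids, _ = model.generate(\n",
    "    input_ids=input_ids,\n",
    "    max_length=30,\n",
    "    decode_strategy=\"sampling\",\n",
    "    top_k=50,\n",
    ")\n",
    "\n",
    "# generate() returns only the newly generated ids, so prepend the prompt.\n",
    "print(prompt + tokenizer.convert_ids_to_string(output_ids[0].tolist()))"
   ]
  },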
{
"cell_type": "markdown",
"id": "d135a538",
"metadata": {},
"source": [
"```\n",
"@misc{Minixhofer_GerPT2_German_large_2020,\n",
"author = {Minixhofer, Benjamin},\n",
"doi = {10.5281/zenodo.5509984},\n",
"month = {12},\n",
"title = {{GerPT2: German large and small versions of GPT2}},\n",
"url = {https://github.com/bminixhofer/gerpt2},\n",
"year = {2020}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "63e09ad7",
"metadata": {},
"source": [
"## Acknowledgements\n"
]
},
{
"cell_type": "markdown",
"id": "d9dc51e1",
"metadata": {},
"source": [
"Thanks to [Hugging Face](https://huggingface.co) for awesome tools and infrastructure.\n",
"Huge thanks to [Artus Krohn-Grimberghe](https://twitter.com/artuskg) at [LYTiQ](https://www.lytiq.de/) for making this possible by sponsoring the resources used for training.\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/benjamin/gerpt2-large](https://huggingface.co/benjamin/gerpt2-large),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "85c2e1a7",
"metadata": {},
"source": [
"# GerPT2\n"
]
},
{
"cell_type": "markdown",
"id": "595fe7cb",
"metadata": {},
"source": [
"See the GPT2 model card for considerations on limitations and bias. See the GPT2 documentation for details on GPT2.\n"
]
},
{
"cell_type": "markdown",
"id": "5b4f950b",
"metadata": {},
"source": [
"## Comparison to dbmdz/german-gpt2\n"
]
},
{
"cell_type": "markdown",
"id": "95be6eb8",
"metadata": {},
"source": [
"I evaluated both GerPT2-large and the other German GPT2, dbmdz/german-gpt2 on the [CC-100](http://data.statmt.org/cc-100/) dataset and on the German Wikipedia:\n"
]
},
{
"cell_type": "markdown",
"id": "8acd14be",
"metadata": {},
"source": [
"| | CC-100 (PPL) | Wikipedia (PPL) |\n",
"|-------------------|--------------|-----------------|\n",
"| dbmdz/german-gpt2 | 49.47 | 62.92 |\n",
"| GerPT2 | 24.78 | 35.33 |\n",
"| GerPT2-large | __16.08__ | __23.26__ |\n",
"| | | |\n"
]
},
{
"cell_type": "markdown",
"id": "6fa10d79",
"metadata": {},
"source": [
"See the script `evaluate.py` in the [GerPT2 Github repository](https://github.com/bminixhofer/gerpt2) for the code.\n"
]
},
{
"cell_type": "markdown",
"id": "a8514e1e",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4bc62c63",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "63f78302",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"benjamin/gerpt2-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
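  {
   "cell_type": "markdown",
   "id": "3c4d5e6f",
   "metadata": {},
   "source": [
    "The cell above only returns hidden states for random token ids. Below is a minimal German text-generation sketch: the prompt is made up, `GPTLMHeadModel` is assumed to load the converted weights with its language-modelling head, `generate()` is assumed to return a `(ids, scores)` tuple, and the exact decoding arguments may differ between PaddleNLP versions.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3c4d5e60",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, GPTLMHeadModel\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"benjamin/gerpt2-large\")\n",
    "model = GPTLMHeadModel.from_pretrained(\"benjamin/gerpt2-large\")\n",
    "model.eval()\n",
    "\n",
    "# Made-up German prompt, for illustration only.\n",
    "prompt = \"Heute ist ein schöner Tag und\"\n",
    "input_ids = paddle.to_tensor([tokenizer(prompt)[\"input_ids\"]])\n",
    "\n",
    "# Sampling-based decoding; the arguments are indicative, not tuned.\n",
    "output_ids, _ = model.generate(\n",
    "    input_ids=input_ids,\n",
    "    max_length=30,\n",
    "    decode_strategy=\"sampling\",\n",
    "    top_k=50,\n",
    ")\n",
    "\n",
    "# generate() returns only the newly generated ids, so prepend the prompt.\n",
    "print(prompt + tokenizer.convert_ids_to_string(output_ids[0].tolist()))"
   ]
  },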
{
"cell_type": "markdown",
"id": "563152f3",
"metadata": {},
"source": [
"```\n",
"@misc{Minixhofer_GerPT2_German_large_2020,\n",
"author = {Minixhofer, Benjamin},\n",
"doi = {10.5281/zenodo.5509984},\n",
"month = {12},\n",
"title = {{GerPT2: German large and small versions of GPT2}},\n",
"url = {https://github.com/bminixhofer/gerpt2},\n",
"year = {2020}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "b0d67d21",
"metadata": {},
"source": [
"## Acknowledgements\n"
]
},
{
"cell_type": "markdown",
"id": "474c1c61",
"metadata": {},
"source": [
"Thanks to [Hugging Face](https://huggingface.co) for awesome tools and infrastructure.\n",
"Huge thanks to [Artus Krohn-Grimberghe](https://twitter.com/artuskg) at [LYTiQ](https://www.lytiq.de/) for making this possible by sponsoring the resources used for training.\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/benjamin/gerpt2-large](https://huggingface.co/benjamin/gerpt2-large) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# 模型列表
## benjamin/gerpt2
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|benjamin/gerpt2| | 621.95MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/vocab.json) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models benjamin/gerpt2
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
## benjamin/gerpt2
| model | description | model_size | download |
| --- | --- | --- | --- |
|benjamin/gerpt2| | 621.95MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/vocab.json) |
You can also download all of the model files with the `paddlenlp` CLI tool, as follows:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models benjamin/gerpt2
```
If you have any problems with the download, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "benjamin/gerpt2"
description: "GerPT2"
description_en: "GerPT2"
icon: ""
from_repo: "https://huggingface.co/benjamin/gerpt2"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "benjamin"
License: "mit"
Language: "German"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Korean
License: apache-2.0
Model_Info:
  description: 'KcBERT: Korean comments BERT'
  description_en: 'KcBERT: Korean comments BERT'
  from_repo: https://huggingface.co/beomi/kcbert-base
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: beomi/kcbert-base
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: beomi
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "8a51a2c8",
"metadata": {},
"source": [
"# KcBERT: Korean comments BERT\n"
]
},
{
"cell_type": "markdown",
"id": "29c7e5a4",
"metadata": {},
"source": [
"Kaggle에 학습을 위해 정제한(아래 `clean`처리를 거친) Dataset을 공개하였습니다!\n"
]
},
{
"cell_type": "markdown",
"id": "95a25c77",
"metadata": {},
"source": [
"직접 다운받으셔서 다양한 Task에 학습을 진행해보세요 :)\n"
]
},
{
"cell_type": "markdown",
"id": "edd96db1",
"metadata": {},
"source": [
"공개된 한국어 BERT는 대부분 한국어 위키, 뉴스 기사, 책 등 잘 정제된 데이터를 기반으로 학습한 모델입니다. 한편, 실제로 NSMC와 같은 댓글형 데이터셋은 정제되지 않았고 구어체 특징에 신조어가 많으며, 오탈자 등 공식적인 글쓰기에서 나타나지 않는 표현들이 빈번하게 등장합니다.\n"
]
},
{
"cell_type": "markdown",
"id": "a2df738b",
"metadata": {},
"source": [
"KcBERT는 위와 같은 특성의 데이터셋에 적용하기 위해, 네이버 뉴스에서 댓글과 대댓글을 수집해, 토크나이저와 BERT모델을 처음부터 학습한 Pretrained BERT 모델입니다.\n"
]
},
{
"cell_type": "markdown",
"id": "a0eb4ad8",
"metadata": {},
"source": [
"KcBERT는 Huggingface의 Transformers 라이브러리를 통해 간편히 불러와 사용할 수 있습니다. (별도의 파일 다운로드가 필요하지 않습니다.)\n"
]
},
{
"cell_type": "markdown",
"id": "d1c07267",
"metadata": {},
"source": [
"## KcBERT Performance\n"
]
},
{
"cell_type": "markdown",
"id": "52872aa3",
"metadata": {},
"source": [
"- Finetune 코드는 https://github.com/Beomi/KcBERT-finetune 에서 찾아보실 수 있습니다.\n"
]
},
{
"cell_type": "markdown",
"id": "fa15ccaf",
"metadata": {},
"source": [
"| | Size<br/>(용량) | **NSMC**<br/>(acc) | **Naver NER**<br/>(F1) | **PAWS**<br/>(acc) | **KorNLI**<br/>(acc) | **KorSTS**<br/>(spearman) | **Question Pair**<br/>(acc) | **KorQuaD (Dev)**<br/>(EM/F1) |\n",
"| :-------------------- | :---: | :----------------: | :--------------------: | :----------------: | :------------------: | :-----------------------: | :-------------------------: | :---------------------------: |\n",
"| KcBERT-Base | 417M | 89.62 | 84.34 | 66.95 | 74.85 | 75.57 | 93.93 | 60.25 / 84.39 |\n",
"| KcBERT-Large | 1.2G | **90.68** | 85.53 | 70.15 | 76.99 | 77.49 | 94.06 | 62.16 / 86.64 |\n",
"| KoBERT | 351M | 89.63 | 86.11 | 80.65 | 79.00 | 79.64 | 93.93 | 52.81 / 80.27 |\n",
"| XLM-Roberta-Base | 1.03G | 89.49 | 86.26 | 82.95 | 79.92 | 79.09 | 93.53 | 64.70 / 88.94 |\n",
"| HanBERT | 614M | 90.16 | **87.31** | 82.40 | **80.89** | 83.33 | 94.19 | 78.74 / 92.02 |\n",
"| KoELECTRA-Base | 423M | **90.21** | 86.87 | 81.90 | 80.85 | 83.21 | 94.20 | 61.10 / 89.59 |\n",
"| KoELECTRA-Base-v2 | 423M | 89.70 | 87.02 | **83.90** | 80.61 | **84.30** | **94.72** | **84.34 / 92.58** |\n",
"| DistilKoBERT | 108M | 88.41 | 84.13 | 62.55 | 70.55 | 73.21 | 92.48 | 54.12 / 77.80 |\n"
]
},
{
"cell_type": "markdown",
"id": "5193845f",
"metadata": {},
"source": [
"\\*HanBERT의 Size는 Bert Model과 Tokenizer DB를 합친 것입니다.\n"
]
},
{
"cell_type": "markdown",
"id": "93aecc1a",
"metadata": {},
"source": [
"\\***config의 세팅을 그대로 하여 돌린 결과이며, hyperparameter tuning을 추가적으로 할 시 더 좋은 성능이 나올 수 있습니다.**\n"
]
},
{
"cell_type": "markdown",
"id": "6f889bbd",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "465d2dee",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f884ed37",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"beomi/kcbert-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
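  {
   "cell_type": "markdown",
   "id": "4d5e6f70",
   "metadata": {},
   "source": [
    "As a slightly more concrete sketch than the random-id call above, the cell below tokenizes a made-up Korean comment and takes the pooled `[CLS]` output, which is the representation a downstream classifier (e.g. NSMC sentiment) would typically be fine-tuned on. It assumes the converted weights are loadable through `AutoModel`/`AutoTokenizer`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4d5e6f71",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"beomi/kcbert-base\")\n",
    "model = AutoModel.from_pretrained(\"beomi/kcbert-base\")\n",
    "model.eval()\n",
    "\n",
    "# Made-up Korean comment, for illustration only.\n",
    "text = \"이 영화 정말 재밌어요!\"\n",
    "encoded = tokenizer(text)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
    "\n",
    "sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
    "# pooled_output is the [CLS] representation used for sentence-level fine-tuning.\n",
    "print(pooled_output.shape)"
   ]
  },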
{
"cell_type": "markdown",
"id": "a92e65b7",
"metadata": {},
"source": [
"```\n",
"@inproceedings{lee2020kcbert,\n",
"title={KcBERT: Korean Comments BERT},\n",
"author={Lee, Junbum},\n",
"booktitle={Proceedings of the 32nd Annual Conference on Human and Cognitive Language Technology},\n",
"pages={437--440},\n",
"year={2020}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "21364621",
"metadata": {},
"source": [
"- 논문집 다운로드 링크: http://hclt.kr/dwn/?v=bG5iOmNvbmZlcmVuY2U7aWR4OjMy (*혹은 http://hclt.kr/symp/?lnb=conference )\n"
]
},
{
"cell_type": "markdown",
"id": "45cdafe0",
"metadata": {},
"source": [
"## Acknowledgement\n"
]
},
{
"cell_type": "markdown",
"id": "a741fcf0",
"metadata": {},
"source": [
"KcBERT Model을 학습하는 GCP/TPU 환경은 TFRC 프로그램의 지원을 받았습니다.\n"
]
},
{
"cell_type": "markdown",
"id": "1c9655e9",
"metadata": {},
"source": [
"모델 학습 과정에서 많은 조언을 주신 [Monologg](https://github.com/monologg/) 님 감사합니다 :)\n"
]
},
{
"cell_type": "markdown",
"id": "85cb1e08",
"metadata": {},
"source": [
"## Reference\n"
]
},
{
"cell_type": "markdown",
"id": "227d89d2",
"metadata": {},
"source": [
"### Github Repos\n"
]
},
{
"cell_type": "markdown",
"id": "5e8f4de7",
"metadata": {},
"source": [
"- [BERT by Google](https://github.com/google-research/bert)\n",
"- [KoBERT by SKT](https://github.com/SKTBrain/KoBERT)\n",
"- [KoELECTRA by Monologg](https://github.com/monologg/KoELECTRA/)\n"
]
},
{
"cell_type": "markdown",
"id": "730bfede",
"metadata": {},
"source": [
"- [Transformers by Huggingface](https://github.com/huggingface/transformers)\n",
"- [Tokenizers by Hugginface](https://github.com/huggingface/tokenizers)\n"
]
},
{
"cell_type": "markdown",
"id": "66dbd496",
"metadata": {},
"source": [
"### Papers\n"
]
},
{
"cell_type": "markdown",
"id": "84fe619a",
"metadata": {},
"source": [
"- [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805)\n"
]
},
{
"cell_type": "markdown",
"id": "63bb3dd3",
"metadata": {},
"source": [
"### Blogs\n"
]
},
{
"cell_type": "markdown",
"id": "a5aa5385",
"metadata": {},
"source": [
"- [Monologg님의 KoELECTRA 학습기](https://monologg.kr/categories/NLP/ELECTRA/)\n"
]
},
{
"cell_type": "markdown",
"id": "bcbd3600",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/beomi/kcbert-base](https://huggingface.co/beomi/kcbert-base),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "21e8b000",
"metadata": {},
"source": [
"# KcBERT: Korean comments BERT\n"
]
},
{
"cell_type": "markdown",
"id": "336ee0b8",
"metadata": {},
"source": [
"Kaggle에 학습을 위해 정제한(아래 `clean`처리를 거친) Dataset을 공개하였습니다!\n"
]
},
{
"cell_type": "markdown",
"id": "691c1f27",
"metadata": {},
"source": [
"직접 다운받으셔서 다양한 Task에 학습을 진행해보세요 :)\n"
]
},
{
"cell_type": "markdown",
"id": "36ec915c",
"metadata": {},
"source": [
"공개된 한국어 BERT는 대부분 한국어 위키, 뉴스 기사, 책 등 잘 정제된 데이터를 기반으로 학습한 모델입니다. 한편, 실제로 NSMC와 같은 댓글형 데이터셋은 정제되지 않았고 구어체 특징에 신조어가 많으며, 오탈자 등 공식적인 글쓰기에서 나타나지 않는 표현들이 빈번하게 등장합니다.\n"
]
},
{
"cell_type": "markdown",
"id": "b5b8d7d7",
"metadata": {},
"source": [
"KcBERT는 위와 같은 특성의 데이터셋에 적용하기 위해, 네이버 뉴스에서 댓글과 대댓글을 수집해, 토크나이저와 BERT모델을 처음부터 학습한 Pretrained BERT 모델입니다.\n"
]
},
{
"cell_type": "markdown",
"id": "b0095da8",
"metadata": {},
"source": [
"KcBERT는 Huggingface의 Transformers 라이브러리를 통해 간편히 불러와 사용할 수 있습니다. (별도의 파일 다운로드가 필요하지 않습니다.)\n"
]
},
{
"cell_type": "markdown",
"id": "4bf51d97",
"metadata": {},
"source": [
"## KcBERT Performance\n"
]
},
{
"cell_type": "markdown",
"id": "9679c8b9",
"metadata": {},
"source": [
"- Finetune 코드는 https://github.com/Beomi/KcBERT-finetune 에서 찾아보실 수 있습니다.\n"
]
},
{
"cell_type": "markdown",
"id": "486782a2",
"metadata": {},
"source": [
"| | Size<br/>(용량) | **NSMC**<br/>(acc) | **Naver NER**<br/>(F1) | **PAWS**<br/>(acc) | **KorNLI**<br/>(acc) | **KorSTS**<br/>(spearman) | **Question Pair**<br/>(acc) | **KorQuaD (Dev)**<br/>(EM/F1) |\n",
"| :-------------------- | :---: | :----------------: | :--------------------: | :----------------: | :------------------: | :-----------------------: | :-------------------------: | :---------------------------: |\n",
"| KcBERT-Base | 417M | 89.62 | 84.34 | 66.95 | 74.85 | 75.57 | 93.93 | 60.25 / 84.39 |\n",
"| KcBERT-Large | 1.2G | **90.68** | 85.53 | 70.15 | 76.99 | 77.49 | 94.06 | 62.16 / 86.64 |\n",
"| KoBERT | 351M | 89.63 | 86.11 | 80.65 | 79.00 | 79.64 | 93.93 | 52.81 / 80.27 |\n",
"| XLM-Roberta-Base | 1.03G | 89.49 | 86.26 | 82.95 | 79.92 | 79.09 | 93.53 | 64.70 / 88.94 |\n",
"| HanBERT | 614M | 90.16 | **87.31** | 82.40 | **80.89** | 83.33 | 94.19 | 78.74 / 92.02 |\n",
"| KoELECTRA-Base | 423M | **90.21** | 86.87 | 81.90 | 80.85 | 83.21 | 94.20 | 61.10 / 89.59 |\n",
"| KoELECTRA-Base-v2 | 423M | 89.70 | 87.02 | **83.90** | 80.61 | **84.30** | **94.72** | **84.34 / 92.58** |\n",
"| DistilKoBERT | 108M | 88.41 | 84.13 | 62.55 | 70.55 | 73.21 | 92.48 | 54.12 / 77.80 |\n"
]
},
{
"cell_type": "markdown",
"id": "e86103a2",
"metadata": {},
"source": [
"\\*HanBERT의 Size는 Bert Model과 Tokenizer DB를 합친 것입니다.\n"
]
},
{
"cell_type": "markdown",
"id": "1078bc5d",
"metadata": {},
"source": [
"\\***config의 세팅을 그대로 하여 돌린 결과이며, hyperparameter tuning을 추가적으로 할 시 더 좋은 성능이 나올 수 있습니다.**\n"
]
},
{
"cell_type": "markdown",
"id": "8ac2ee11",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e171068a",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "38c7ad79",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"beomi/kcbert-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
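  {
   "cell_type": "markdown",
   "id": "5e6f7081",
   "metadata": {},
   "source": [
    "As a slightly more concrete sketch than the random-id call above, the cell below tokenizes a made-up Korean comment and takes the pooled `[CLS]` output, which is the representation a downstream classifier (e.g. NSMC sentiment) would typically be fine-tuned on. It assumes the converted weights are loadable through `AutoModel`/`AutoTokenizer`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5e6f7082",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"beomi/kcbert-base\")\n",
    "model = AutoModel.from_pretrained(\"beomi/kcbert-base\")\n",
    "model.eval()\n",
    "\n",
    "# Made-up Korean comment, for illustration only.\n",
    "text = \"이 영화 정말 재밌어요!\"\n",
    "encoded = tokenizer(text)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
    "\n",
    "sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
    "# pooled_output is the [CLS] representation used for sentence-level fine-tuning.\n",
    "print(pooled_output.shape)"
   ]
  },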
{
"cell_type": "markdown",
"id": "a794d15a",
"metadata": {},
"source": [
"```\n",
"@inproceedings{lee2020kcbert,\n",
"title={KcBERT: Korean Comments BERT},\n",
"author={Lee, Junbum},\n",
"booktitle={Proceedings of the 32nd Annual Conference on Human and Cognitive Language Technology},\n",
"pages={437--440},\n",
"year={2020}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "c0183cbe",
"metadata": {},
"source": [
"- 논문집 다운로드 링크: http://hclt.kr/dwn/?v=bG5iOmNvbmZlcmVuY2U7aWR4OjMy (*혹은 http://hclt.kr/symp/?lnb=conference )\n"
]
},
{
"cell_type": "markdown",
"id": "ba768b26",
"metadata": {},
"source": [
"## Acknowledgement\n"
]
},
{
"cell_type": "markdown",
"id": "ea148064",
"metadata": {},
"source": [
"KcBERT Model을 학습하는 GCP/TPU 환경은 TFRC 프로그램의 지원을 받았습니다.\n"
]
},
{
"cell_type": "markdown",
"id": "78732669",
"metadata": {},
"source": [
"모델 학습 과정에서 많은 조언을 주신 [Monologg](https://github.com/monologg/) 님 감사합니다 :)\n"
]
},
{
"cell_type": "markdown",
"id": "5ffa9ed9",
"metadata": {},
"source": [
"## Reference\n"
]
},
{
"cell_type": "markdown",
"id": "ea69da89",
"metadata": {},
"source": [
"### Github Repos\n"
]
},
{
"cell_type": "markdown",
"id": "d72d564c",
"metadata": {},
"source": [
"- [BERT by Google](https://github.com/google-research/bert)\n",
"- [KoBERT by SKT](https://github.com/SKTBrain/KoBERT)\n",
"- [KoELECTRA by Monologg](https://github.com/monologg/KoELECTRA/)\n"
]
},
{
"cell_type": "markdown",
"id": "38503607",
"metadata": {},
"source": [
"- [Transformers by Huggingface](https://github.com/huggingface/transformers)\n",
"- [Tokenizers by Hugginface](https://github.com/huggingface/tokenizers)\n"
]
},
{
"cell_type": "markdown",
"id": "a71a565f",
"metadata": {},
"source": [
"### Papers\n"
]
},
{
"cell_type": "markdown",
"id": "9aa4d324",
"metadata": {},
"source": [
"- [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805)\n"
]
},
{
"cell_type": "markdown",
"id": "6b1ba932",
"metadata": {},
"source": [
"### Blogs\n"
]
},
{
"cell_type": "markdown",
"id": "5c9e32e1",
"metadata": {},
"source": [
"- [Monologg님의 KoELECTRA 학습기](https://monologg.kr/categories/NLP/ELECTRA/)\n"
]
},
{
"cell_type": "markdown",
"id": "0b551dcf",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/beomi/kcbert-base](https://huggingface.co/beomi/kcbert-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# 模型列表
## bhadresh-savani/roberta-base-emotion
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|bhadresh-savani/roberta-base-emotion| | 475.53MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/vocab.txt) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models bhadresh-savani/roberta-base-emotion
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
## bhadresh-savani/roberta-base-emotion
| model | description | model_size | download |
| --- | --- | --- | --- |
|bhadresh-savani/roberta-base-emotion| | 475.53MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/vocab.txt) |
You can also download all of the model files with the `paddlenlp` CLI tool, as follows:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models bhadresh-savani/roberta-base-emotion
```
If you have any problems with the download, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "bhadresh-savani/roberta-base-emotion"
description: "robert-base-emotion"
description_en: "robert-base-emotion"
icon: ""
from_repo: "https://huggingface.co/bhadresh-savani/roberta-base-emotion"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Classification"
sub_tag: "文本分类"
Example:
Datasets: "emotion,emotion"
Publisher: "bhadresh-savani"
License: "apache-2.0"
Language: "English"
Paper:
- title: 'RoBERTa: A Robustly Optimized BERT Pretraining Approach'
url: 'http://arxiv.org/abs/1907.11692v1'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
# 模型列表
## cahya/bert-base-indonesian-522M
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|cahya/bert-base-indonesian-522M| | 518.25MB | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/bert-base-indonesian-522M/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/cahya/bert-base-indonesian-522M/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/bert-base-indonesian-522M/tokenizer_config.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/cahya/bert-base-indonesian-522M/vocab.txt) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models cahya/bert-base-indonesian-522M
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
## cahya/bert-base-indonesian-522M
| model | description | model_size | download |
| --- | --- | --- | --- |
|cahya/bert-base-indonesian-522M| | 518.25MB | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/bert-base-indonesian-522M/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/cahya/bert-base-indonesian-522M/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/bert-base-indonesian-522M/tokenizer_config.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/cahya/bert-base-indonesian-522M/vocab.txt) |
You can also download all of the model files with the `paddlenlp` CLI tool, as follows:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models cahya/bert-base-indonesian-522M
```
If you have any problems with the download, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "cahya/bert-base-indonesian-522M"
description: "Indonesian BERT base model (uncased)"
description_en: "Indonesian BERT base model (uncased)"
icon: ""
from_repo: "https://huggingface.co/cahya/bert-base-indonesian-522M"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: "wikipedia"
Publisher: "cahya"
License: "mit"
Language: "Indonesian"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
# 模型列表
## cahya/gpt2-small-indonesian-522M
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|cahya/gpt2-small-indonesian-522M| | 621.95MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/vocab.json) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models cahya/gpt2-small-indonesian-522M
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
## cahya/gpt2-small-indonesian-522M
| model | description | model_size | download |
| --- | --- | --- | --- |
|cahya/gpt2-small-indonesian-522M| | 621.95MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/vocab.json) |
You can also download all of the model files with the `paddlenlp` CLI tool, as follows:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models cahya/gpt2-small-indonesian-522M
```
If you have any problems with the download, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "cahya/gpt2-small-indonesian-522M"
description: "Indonesian GPT2 small model"
description_en: "Indonesian GPT2 small model"
icon: ""
from_repo: "https://huggingface.co/cahya/gpt2-small-indonesian-522M"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "cahya"
License: "mit"
Language: "Indonesian"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
# 模型列表
## ceshine/t5-paraphrase-paws-msrp-opinosis
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|ceshine/t5-paraphrase-paws-msrp-opinosis| | 1.11G | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-paws-msrp-opinosis/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-paws-msrp-opinosis/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-paws-msrp-opinosis/tokenizer_config.json) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models ceshine/t5-paraphrase-paws-msrp-opinosis
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
## ceshine/t5-paraphrase-paws-msrp-opinosis
| model | description | model_size | download |
| --- | --- | --- | --- |
|ceshine/t5-paraphrase-paws-msrp-opinosis| | 1.11G | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-paws-msrp-opinosis/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-paws-msrp-opinosis/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-paws-msrp-opinosis/tokenizer_config.json) |
You can also download all of the model files with the `paddlenlp` CLI tool, as follows:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models ceshine/t5-paraphrase-paws-msrp-opinosis
```
If you have any problems with the download, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "ceshine/t5-paraphrase-paws-msrp-opinosis"
description: "T5-base Parapharasing model fine-tuned on PAWS, MSRP, and Opinosis"
description_en: "T5-base Parapharasing model fine-tuned on PAWS, MSRP, and Opinosis"
icon: ""
from_repo: "https://huggingface.co/ceshine/t5-paraphrase-paws-msrp-opinosis"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text2Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "ceshine"
License: "apache-2.0"
Language: "English"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
# 模型列表
## ceshine/t5-paraphrase-quora-paws
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|ceshine/t5-paraphrase-quora-paws| | 1.11G | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-quora-paws/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-quora-paws/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-quora-paws/tokenizer_config.json) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models ceshine/t5-paraphrase-quora-paws
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
## ceshine/t5-paraphrase-quora-paws
| model | description | model_size | download |
| --- | --- | --- | --- |
|ceshine/t5-paraphrase-quora-paws| | 1.11G | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-quora-paws/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-quora-paws/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-quora-paws/tokenizer_config.json) |
You can also download all of the model files with the `paddlenlp` CLI tool, as follows:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models ceshine/t5-paraphrase-quora-paws
```
If you have any problems with the download, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "ceshine/t5-paraphrase-quora-paws"
description: "T5-base Parapharasing model fine-tuned on PAWS and Quora"
description_en: "T5-base Parapharasing model fine-tuned on PAWS and Quora"
icon: ""
from_repo: "https://huggingface.co/ceshine/t5-paraphrase-quora-paws"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text2Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "ceshine"
License: "apache-2.0"
Language: "English"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Russian,English
License: mit
Model_Info:
  description: pip install transformers sentencepiece
  description_en: pip install transformers sentencepiece
  from_repo: https://huggingface.co/cointegrated/rubert-tiny
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: cointegrated/rubert-tiny
Paper: null
Publisher: cointegrated
Task:
- sub_tag: 特征抽取
  sub_tag_en: Feature Extraction
  tag: 自然语言处理
  tag_en: Natural Language Processing
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
- sub_tag: 句子相似度
  sub_tag_en: Sentence Similarity
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "83973edc",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"This is a very small distilled version of the bert-base-multilingual-cased model for Russian and English (45 MB, 12M parameters). There is also an **updated version of this model**, rubert-tiny2, with a larger vocabulary and better quality on practically all Russian NLU tasks.\n"
]
},
{
"cell_type": "markdown",
"id": "59944441",
"metadata": {},
"source": [
"This model is useful if you want to fine-tune it for a relatively simple Russian task (e.g. NER or sentiment classification), and you care more about speed and size than about accuracy. It is approximately x10 smaller and faster than a base-sized BERT. Its `[CLS]` embeddings can be used as a sentence representation aligned between Russian and English.\n"
]
},
{
"cell_type": "markdown",
"id": "c0e2918f",
"metadata": {},
"source": [
"It was trained on the [Yandex Translate corpus](https://translate.yandex.ru/corpus), [OPUS-100](https://huggingface.co/datasets/opus100) and Tatoeba, using MLM loss distilled from bert-base-multilingual-cased, translation ranking loss, and `[CLS]` embeddings distilled from LaBSE, rubert-base-cased-sentence, Laser and USE.\n"
]
},
{
"cell_type": "markdown",
"id": "b0c0158e",
"metadata": {},
"source": [
"There is a more detailed [description in Russian](https://habr.com/ru/post/562064/).\n"
]
},
{
"cell_type": "markdown",
"id": "28ce4026",
"metadata": {},
"source": [
"Sentence embeddings can be produced as follows:\n"
]
},
{
"cell_type": "markdown",
"id": "d521437a",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "da5acdb0",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "df2d3cc6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cointegrated/rubert-tiny\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
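  {
   "cell_type": "markdown",
   "id": "6f708192",
   "metadata": {},
   "source": [
    "The cell above only feeds random token ids. Following the note that the `[CLS]` embedding can serve as a sentence representation, here is a minimal sketch with a made-up sentence; it assumes the converted weights are loadable through `AutoModel`/`AutoTokenizer`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6f708193",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"cointegrated/rubert-tiny\")\n",
    "model = AutoModel.from_pretrained(\"cointegrated/rubert-tiny\")\n",
    "model.eval()\n",
    "\n",
    "# Made-up sentence, for illustration only.\n",
    "text = \"привет мир\"\n",
    "encoded = tokenizer(text)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
    "\n",
    "sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
    "# Use the [CLS] token vector as the sentence embedding.\n",
    "sentence_embedding = sequence_output[:, 0]\n",
    "print(sentence_embedding.shape)"
   ]
  },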
{
"cell_type": "markdown",
"id": "065bda47",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cointegrated/rubert-tiny](https://huggingface.co/cointegrated/rubert-tiny),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "b59db37b",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"This is a very small distilled version of the bert-base-multilingual-cased model for Russian and English (45 MB, 12M parameters). There is also an **updated version of this model**, rubert-tiny2, with a larger vocabulary and better quality on practically all Russian NLU tasks.\n"
]
},
{
"cell_type": "markdown",
"id": "5e7c8c35",
"metadata": {},
"source": [
"This model is useful if you want to fine-tune it for a relatively simple Russian task (e.g. NER or sentiment classification), and you care more about speed and size than about accuracy. It is approximately x10 smaller and faster than a base-sized BERT. Its `[CLS]` embeddings can be used as a sentence representation aligned between Russian and English.\n"
]
},
{
"cell_type": "markdown",
"id": "bc3c5717",
"metadata": {},
"source": [
"It was trained on the [Yandex Translate corpus](https://translate.yandex.ru/corpus), [OPUS-100](https://huggingface.co/datasets/opus100) and Tatoeba, using MLM loss (distilled from bert-base-multilingual-cased\n",
"), translation ranking loss, and `[CLS]` embeddings distilled from LaBSE, rubert-base-cased-sentence, Laser and USE.\n"
]
},
{
"cell_type": "markdown",
"id": "2db0a3ee",
"metadata": {},
"source": [
"There is a more detailed [description in Russian](https://habr.com/ru/post/562064/).\n"
]
},
{
"cell_type": "markdown",
"id": "c3a52477",
"metadata": {},
"source": [
"Sentence embeddings can be produced as follows:\n"
]
},
{
"cell_type": "markdown",
"id": "add13de4",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c0a8f905",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "481d0ca6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cointegrated/rubert-tiny\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
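  {
   "cell_type": "markdown",
   "id": "708192a3",
   "metadata": {},
   "source": [
    "The cell above only feeds random token ids. Following the note that the `[CLS]` embedding can serve as a sentence representation, here is a minimal sketch with a made-up sentence; it assumes the converted weights are loadable through `AutoModel`/`AutoTokenizer`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "708192a4",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"cointegrated/rubert-tiny\")\n",
    "model = AutoModel.from_pretrained(\"cointegrated/rubert-tiny\")\n",
    "model.eval()\n",
    "\n",
    "# Made-up sentence, for illustration only.\n",
    "text = \"привет мир\"\n",
    "encoded = tokenizer(text)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
    "\n",
    "sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
    "# Use the [CLS] token vector as the sentence embedding.\n",
    "sentence_embedding = sequence_output[:, 0]\n",
    "print(sentence_embedding.shape)"
   ]
  },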
{
"cell_type": "markdown",
"id": "e6df17e3",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cointegrated/rubert-tiny](https://huggingface.co/cointegrated/rubert-tiny) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Russian
License: mit
Model_Info:
  description: pip install transformers sentencepiece
  description_en: pip install transformers sentencepiece
  from_repo: https://huggingface.co/cointegrated/rubert-tiny2
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: cointegrated/rubert-tiny2
Paper: null
Publisher: cointegrated
Task:
- sub_tag: 特征抽取
  sub_tag_en: Feature Extraction
  tag: 自然语言处理
  tag_en: Natural Language Processing
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
- sub_tag: 句子相似度
  sub_tag_en: Sentence Similarity
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "9eef057a",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"This is an updated version of cointegrated/rubert-tiny: a small Russian BERT-based encoder with high-quality sentence embeddings. This [post in Russian](https://habr.com/ru/post/669674/) gives more details.\n"
]
},
{
"cell_type": "markdown",
"id": "08d9a049",
"metadata": {},
"source": [
"The differences from the previous version include:\n",
"- a larger vocabulary: 83828 tokens instead of 29564;\n",
"- larger supported sequences: 2048 instead of 512;\n",
"- sentence embeddings approximate LaBSE closer than before;\n",
"- meaningful segment embeddings (tuned on the NLI task)\n",
"- the model is focused only on Russian.\n"
]
},
{
"cell_type": "markdown",
"id": "8a7ba50b",
"metadata": {},
"source": [
"The model should be used as is to produce sentence embeddings (e.g. for KNN classification of short texts) or fine-tuned for a downstream task.\n"
]
},
{
"cell_type": "markdown",
"id": "184e1cc6",
"metadata": {},
"source": [
"Sentence embeddings can be produced as follows:\n"
]
},
{
"cell_type": "markdown",
"id": "a9613056",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d60b7b64",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "716f2b63",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cointegrated/rubert-tiny2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
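  {
   "cell_type": "markdown",
   "id": "8192a3b4",
   "metadata": {},
   "source": [
    "The cell above only feeds random token ids. Since the model is meant to be used as is to produce sentence embeddings, here is a minimal sketch with a made-up sentence; it assumes the converted weights are loadable through `AutoModel`/`AutoTokenizer`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8192a3b5",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"cointegrated/rubert-tiny2\")\n",
    "model = AutoModel.from_pretrained(\"cointegrated/rubert-tiny2\")\n",
    "model.eval()\n",
    "\n",
    "# Made-up sentence, for illustration only.\n",
    "text = \"привет мир\"\n",
    "encoded = tokenizer(text)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
    "\n",
    "sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
    "# Use the [CLS] token vector as the sentence embedding.\n",
    "sentence_embedding = sequence_output[:, 0]\n",
    "print(sentence_embedding.shape)"
   ]
  },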
{
"cell_type": "markdown",
"id": "0ba8c599",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cointegrated/rubert-tiny2](https://huggingface.co/cointegrated/rubert-tiny2),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "db267b71",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"This is an updated version of cointegrated/rubert-tiny: a small Russian BERT-based encoder with high-quality sentence embeddings. This [post in Russian](https://habr.com/ru/post/669674/) gives more details.\n"
]
},
{
"cell_type": "markdown",
"id": "801acf5c",
"metadata": {},
"source": [
"The differences from the previous version include:\n",
"- a larger vocabulary: 83828 tokens instead of 29564;\n",
"- larger supported sequences: 2048 instead of 512;\n",
"- sentence embeddings approximate LaBSE closer than before;\n",
"- meaningful segment embeddings (tuned on the NLI task)\n",
"- the model is focused only on Russian.\n"
]
},
{
"cell_type": "markdown",
"id": "f2c7dbc1",
"metadata": {},
"source": [
"The model should be used as is to produce sentence embeddings (e.g. for KNN classification of short texts) or fine-tuned for a downstream task.\n"
]
},
{
"cell_type": "markdown",
"id": "9ff63df2",
"metadata": {},
"source": [
"Sentence embeddings can be produced as follows:\n"
]
},
{
"cell_type": "markdown",
"id": "2b073558",
"metadata": {},
"source": [
"## how to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c98c0cce",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "81978806",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cointegrated/rubert-tiny2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
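  {
   "cell_type": "markdown",
   "id": "ce000003",
   "metadata": {},
   "source": [
    "The random-ids call above only smoke-tests the encoder. Below is a minimal sketch of producing sentence embeddings from real text, assuming the paired tokenizer was converted together with the weights; it uses the CLS-token vector (mean pooling is a common alternative) and L2-normalizes it so that a dot product acts as cosine similarity.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ce000004",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "model_name = \"cointegrated/rubert-tiny2\"\n",
    "tokenizer = AutoTokenizer.from_pretrained(model_name)\n",
    "model = AutoModel.from_pretrained(model_name)\n",
    "model.eval()\n",
    "\n",
    "sentences = [\"привет мир\", \"hello world\"]\n",
    "embeddings = []\n",
    "for text in sentences:\n",
    "    encoded = tokenizer(text)\n",
    "    inputs = {k: paddle.to_tensor([v]) for k, v in encoded.items() if k in (\"input_ids\", \"token_type_ids\")}\n",
    "    sequence_output, pooled_output = model(**inputs)\n",
    "    cls = sequence_output[:, 0]  # CLS-token embedding, shape [1, hidden_size]\n",
    "    embeddings.append(cls / paddle.linalg.norm(cls, axis=-1, keepdim=True))\n",
    "\n",
    "# Cosine similarity between the two sentence embeddings.\n",
    "print(paddle.matmul(embeddings[0], embeddings[1], transpose_y=True))"
   ]
  },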
{
"cell_type": "markdown",
"id": "33dbe378",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cointegrated/rubert-tiny2](https://huggingface.co/cointegrated/rubert-tiny2) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info: Model_Info:
name: "cross-encoder/ms-marco-MiniLM-L-12-v2" description: Cross-Encoder for MS Marco
description: "Cross-Encoder for MS Marco" description_en: Cross-Encoder for MS Marco
description_en: "Cross-Encoder for MS Marco" from_repo: https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2" name: cross-encoder/ms-marco-MiniLM-L-12-v2
Paper: null
Publisher: cross-encoder
Task: Task:
- tag_en: "Natural Language Processing" - sub_tag: 文本分类
tag: "自然语言处理" sub_tag_en: Text Classification
sub_tag_en: "Text Classification" tag: 自然语言处理
sub_tag: "文本分类" tag_en: Natural Language Processing
Example:
Datasets: ""
Publisher: "cross-encoder"
License: "apache-2.0"
Language: ""
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
{
"cells": [
{
"cell_type": "markdown",
"id": "b14e9fee",
"metadata": {},
"source": [
"# Cross-Encoder for MS Marco\n"
]
},
{
"cell_type": "markdown",
"id": "770d5215",
"metadata": {},
"source": [
"This model was trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.\n"
]
},
{
"cell_type": "markdown",
"id": "0e8686b5",
"metadata": {},
"source": [
"The model can be used for Information Retrieval: Given a query, encode the query will all possible passages (e.g. retrieved with ElasticSearch). Then sort the passages in a decreasing order. See [SBERT.net Retrieve & Re-rank](https://www.sbert.net/examples/applications/retrieve_rerank/README.html) for more details. The training code is available here: [SBERT.net Training MS Marco](https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/ms_marco)\n"
]
},
{
"cell_type": "markdown",
"id": "c437c78a",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1f4581da",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "295c7df7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/ms-marco-MiniLM-L-12-v2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
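  {
   "cell_type": "markdown",
   "id": "ce000005",
   "metadata": {},
   "source": [
    "Beyond the generic forward pass above, the cell below sketches the intended reranking usage: score each (query, passage) pair and sort the passages by score. It assumes the converted community checkpoint still carries its ranking head and can be loaded via `AutoModelForSequenceClassification`; the query, passages and scoring loop are illustrative only, not part of the original card.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ce000006",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModelForSequenceClassification, AutoTokenizer\n",
    "\n",
    "model_name = \"cross-encoder/ms-marco-MiniLM-L-12-v2\"\n",
    "tokenizer = AutoTokenizer.from_pretrained(model_name)\n",
    "# Assumption: the converted checkpoint keeps its single-logit ranking head.\n",
    "model = AutoModelForSequenceClassification.from_pretrained(model_name)\n",
    "model.eval()\n",
    "\n",
    "query = \"How many people live in Berlin?\"\n",
    "passages = [\n",
    "    \"Berlin had a population of 3,520,031 registered inhabitants in 2016.\",\n",
    "    \"Berlin is well known for its museums.\",\n",
    "]\n",
    "\n",
    "scores = []\n",
    "for passage in passages:\n",
    "    encoded = tokenizer(query, text_pair=passage)\n",
    "    inputs = {k: paddle.to_tensor([v]) for k, v in encoded.items() if k in (\"input_ids\", \"token_type_ids\")}\n",
    "    logits = model(**inputs)  # shape [1, num_labels]\n",
    "    scores.append(logits[0, 0].item())  # higher score = more relevant\n",
    "\n",
    "# Rank passages by decreasing relevance to the query.\n",
    "for score, passage in sorted(zip(scores, passages), reverse=True):\n",
    "    print(f\"{score:.4f}  {passage}\")"
   ]
  },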
{
"cell_type": "markdown",
"id": "706017d9",
"metadata": {},
"source": [
"## Performance\n",
"In the following table, we provide various pre-trained Cross-Encoders together with their performance on the [TREC Deep Learning 2019](https://microsoft.github.io/TREC-2019-Deep-Learning/) and the [MS Marco Passage Reranking](https://github.com/microsoft/MSMARCO-Passage-Ranking/) dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "2aa6bf22",
"metadata": {},
"source": [
"| Model-Name | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev) | Docs / Sec |\n",
"| ------------- |:-------------| -----| --- |\n",
"| **Version 2 models** | | |\n",
"| cross-encoder/ms-marco-TinyBERT-L-2-v2 | 69.84 | 32.56 | 9000\n",
"| cross-encoder/ms-marco-MiniLM-L-2-v2 | 71.01 | 34.85 | 4100\n",
"| cross-encoder/ms-marco-MiniLM-L-4-v2 | 73.04 | 37.70 | 2500\n",
"| cross-encoder/ms-marco-MiniLM-L-6-v2 | 74.30 | 39.01 | 1800\n",
"| cross-encoder/ms-marco-MiniLM-L-12-v2 | 74.31 | 39.02 | 960\n",
"| **Version 1 models** | | |\n",
"| cross-encoder/ms-marco-TinyBERT-L-2 | 67.43 | 30.15 | 9000\n",
"| cross-encoder/ms-marco-TinyBERT-L-4 | 68.09 | 34.50 | 2900\n",
"| cross-encoder/ms-marco-TinyBERT-L-6 | 69.57 | 36.13 | 680\n",
"| cross-encoder/ms-marco-electra-base | 71.99 | 36.41 | 340\n",
"| **Other models** | | |\n",
"| nboost/pt-tinybert-msmarco | 63.63 | 28.80 | 2900\n",
"| nboost/pt-bert-base-uncased-msmarco | 70.94 | 34.75 | 340\n",
"| nboost/pt-bert-large-msmarco | 73.36 | 36.48 | 100\n",
"| Capreolus/electra-base-msmarco | 71.23 | 36.89 | 340\n",
"| amberoad/bert-multilingual-passage-reranking-msmarco | 68.40 | 35.54 | 330\n",
"| sebastian-hofstaetter/distilbert-cat-margin_mse-T2-msmarco | 72.82 | 37.88 | 720\n"
]
},
{
"cell_type": "markdown",
"id": "65eda465",
"metadata": {},
"source": [
"Note: Runtime was computed on a V100 GPU.\n",
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "366980e6",
"metadata": {},
"source": [
"# Cross-Encoder for MS Marco\n"
]
},
{
"cell_type": "markdown",
"id": "4c7d726e",
"metadata": {},
"source": [
"This model was trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.\n"
]
},
{
"cell_type": "markdown",
"id": "1535e90f",
"metadata": {},
"source": [
"The model can be used for Information Retrieval: Given a query, encode the query will all possible passages (e.g. retrieved with ElasticSearch). Then sort the passages in a decreasing order. See [SBERT.net Retrieve & Re-rank](https://www.sbert.net/examples/applications/retrieve_rerank/README.html) for more details. The training code is available here: [SBERT.net Training MS Marco](https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/ms_marco)\n"
]
},
{
"cell_type": "markdown",
"id": "3eda3140",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "74d5bcd7",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "59553cde",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/ms-marco-MiniLM-L-12-v2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "0b6883fa",
"metadata": {},
"source": [
"## Performance\n",
"In the following table, we provide various pre-trained Cross-Encoders together with their performance on the [TREC Deep Learning 2019](https://microsoft.github.io/TREC-2019-Deep-Learning/) and the [MS Marco Passage Reranking](https://github.com/microsoft/MSMARCO-Passage-Ranking/) dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "e04ad9db",
"metadata": {},
"source": [
"| Model-Name | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev) | Docs / Sec |\n",
"| ------------- |:-------------| -----| --- |\n",
"| **Version 2 models** | | |\n",
"| cross-encoder/ms-marco-TinyBERT-L-2-v2 | 69.84 | 32.56 | 9000\n",
"| cross-encoder/ms-marco-MiniLM-L-2-v2 | 71.01 | 34.85 | 4100\n",
"| cross-encoder/ms-marco-MiniLM-L-4-v2 | 73.04 | 37.70 | 2500\n",
"| cross-encoder/ms-marco-MiniLM-L-6-v2 | 74.30 | 39.01 | 1800\n",
"| cross-encoder/ms-marco-MiniLM-L-12-v2 | 74.31 | 39.02 | 960\n",
"| **Version 1 models** | | |\n",
"| cross-encoder/ms-marco-TinyBERT-L-2 | 67.43 | 30.15 | 9000\n",
"| cross-encoder/ms-marco-TinyBERT-L-4 | 68.09 | 34.50 | 2900\n",
"| cross-encoder/ms-marco-TinyBERT-L-6 | 69.57 | 36.13 | 680\n",
"| cross-encoder/ms-marco-electra-base | 71.99 | 36.41 | 340\n",
"| **Other models** | | |\n",
"| nboost/pt-tinybert-msmarco | 63.63 | 28.80 | 2900\n",
"| nboost/pt-bert-base-uncased-msmarco | 70.94 | 34.75 | 340\n",
"| nboost/pt-bert-large-msmarco | 73.36 | 36.48 | 100\n",
"| Capreolus/electra-base-msmarco | 71.23 | 36.89 | 340\n",
"| amberoad/bert-multilingual-passage-reranking-msmarco | 68.40 | 35.54 | 330\n",
"| sebastian-hofstaetter/distilbert-cat-margin_mse-T2-msmarco | 72.82 | 37.88 | 720\n"
]
},
{
"cell_type": "markdown",
"id": "18e7124d",
"metadata": {},
"source": [
"Note: Runtime was computed on a V100 GPU.\n",
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info: Model_Info:
name: "cross-encoder/ms-marco-TinyBERT-L-2" description: Cross-Encoder for MS Marco
description: "Cross-Encoder for MS Marco" description_en: Cross-Encoder for MS Marco
description_en: "Cross-Encoder for MS Marco" from_repo: https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2" name: cross-encoder/ms-marco-TinyBERT-L-2
Paper: null
Publisher: cross-encoder
Task: Task:
- tag_en: "Natural Language Processing" - sub_tag: 文本分类
tag: "自然语言处理" sub_tag_en: Text Classification
sub_tag_en: "Text Classification" tag: 自然语言处理
sub_tag: "文本分类" tag_en: Natural Language Processing
Example:
Datasets: ""
Publisher: "cross-encoder"
License: "apache-2.0"
Language: ""
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
{
"cells": [
{
"cell_type": "markdown",
"id": "32947f83",
"metadata": {},
"source": [
"# Cross-Encoder for MS Marco\n"
]
},
{
"cell_type": "markdown",
"id": "d34eaa08",
"metadata": {},
"source": [
"This model was trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.\n"
]
},
{
"cell_type": "markdown",
"id": "dcf2e434",
"metadata": {},
"source": [
"The model can be used for Information Retrieval: Given a query, encode the query will all possible passages (e.g. retrieved with ElasticSearch). Then sort the passages in a decreasing order. See [SBERT.net Retrieve & Re-rank](https://www.sbert.net/examples/applications/retrieve_rerank/README.html) for more details. The training code is available here: [SBERT.net Training MS Marco](https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/ms_marco)\n"
]
},
{
"cell_type": "markdown",
"id": "bb938635",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "463fcbb2",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e3ac7704",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/ms-marco-TinyBERT-L-2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "e185e8d7",
"metadata": {},
"source": [
"## Performance\n",
"In the following table, we provide various pre-trained Cross-Encoders together with their performance on the [TREC Deep Learning 2019](https://microsoft.github.io/TREC-2019-Deep-Learning/) and the [MS Marco Passage Reranking](https://github.com/microsoft/MSMARCO-Passage-Ranking/) dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "1b6ce4a0",
"metadata": {},
"source": [
"| Model-Name | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev) | Docs / Sec |\n",
"| ------------- |:-------------| -----| --- |\n",
"| **Version 2 models** | | |\n",
"| cross-encoder/ms-marco-TinyBERT-L-2-v2 | 69.84 | 32.56 | 9000\n",
"| cross-encoder/ms-marco-MiniLM-L-2-v2 | 71.01 | 34.85 | 4100\n",
"| cross-encoder/ms-marco-MiniLM-L-4-v2 | 73.04 | 37.70 | 2500\n",
"| cross-encoder/ms-marco-MiniLM-L-6-v2 | 74.30 | 39.01 | 1800\n",
"| cross-encoder/ms-marco-MiniLM-L-12-v2 | 74.31 | 39.02 | 960\n",
"| **Version 1 models** | | |\n",
"| cross-encoder/ms-marco-TinyBERT-L-2 | 67.43 | 30.15 | 9000\n",
"| cross-encoder/ms-marco-TinyBERT-L-4 | 68.09 | 34.50 | 2900\n",
"| cross-encoder/ms-marco-TinyBERT-L-6 | 69.57 | 36.13 | 680\n",
"| cross-encoder/ms-marco-electra-base | 71.99 | 36.41 | 340\n",
"| **Other models** | | |\n",
"| nboost/pt-tinybert-msmarco | 63.63 | 28.80 | 2900\n",
"| nboost/pt-bert-base-uncased-msmarco | 70.94 | 34.75 | 340\n",
"| nboost/pt-bert-large-msmarco | 73.36 | 36.48 | 100\n",
"| Capreolus/electra-base-msmarco | 71.23 | 36.89 | 340\n",
"| amberoad/bert-multilingual-passage-reranking-msmarco | 68.40 | 35.54 | 330\n",
"| sebastian-hofstaetter/distilbert-cat-margin_mse-T2-msmarco | 72.82 | 37.88 | 720\n"
]
},
{
"cell_type": "markdown",
"id": "478f9bd9",
"metadata": {},
"source": [
"Note: Runtime was computed on a V100 GPU.\n",
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2](https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "545c6ec0",
"metadata": {},
"source": [
"# Cross-Encoder for MS Marco\n"
]
},
{
"cell_type": "markdown",
"id": "cbd27361",
"metadata": {},
"source": [
"This model was trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.\n"
]
},
{
"cell_type": "markdown",
"id": "185acb77",
"metadata": {},
"source": [
"The model can be used for Information Retrieval: Given a query, encode the query will all possible passages (e.g. retrieved with ElasticSearch). Then sort the passages in a decreasing order. See [SBERT.net Retrieve & Re-rank](https://www.sbert.net/examples/applications/retrieve_rerank/README.html) for more details. The training code is available here: [SBERT.net Training MS Marco](https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/ms_marco)\n"
]
},
{
"cell_type": "markdown",
"id": "1fb83fc3",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2cf01d71",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d24e4eb7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/ms-marco-TinyBERT-L-2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "7eb19416",
"metadata": {},
"source": [
"## Performance\n",
"In the following table, we provide various pre-trained Cross-Encoders together with their performance on the [TREC Deep Learning 2019](https://microsoft.github.io/TREC-2019-Deep-Learning/) and the [MS Marco Passage Reranking](https://github.com/microsoft/MSMARCO-Passage-Ranking/) dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "e51901bb",
"metadata": {},
"source": [
"| Model-Name | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev) | Docs / Sec |\n",
"| ------------- |:-------------| -----| --- |\n",
"| **Version 2 models** | | |\n",
"| cross-encoder/ms-marco-TinyBERT-L-2-v2 | 69.84 | 32.56 | 9000\n",
"| cross-encoder/ms-marco-MiniLM-L-2-v2 | 71.01 | 34.85 | 4100\n",
"| cross-encoder/ms-marco-MiniLM-L-4-v2 | 73.04 | 37.70 | 2500\n",
"| cross-encoder/ms-marco-MiniLM-L-6-v2 | 74.30 | 39.01 | 1800\n",
"| cross-encoder/ms-marco-MiniLM-L-12-v2 | 74.31 | 39.02 | 960\n",
"| **Version 1 models** | | |\n",
"| cross-encoder/ms-marco-TinyBERT-L-2 | 67.43 | 30.15 | 9000\n",
"| cross-encoder/ms-marco-TinyBERT-L-4 | 68.09 | 34.50 | 2900\n",
"| cross-encoder/ms-marco-TinyBERT-L-6 | 69.57 | 36.13 | 680\n",
"| cross-encoder/ms-marco-electra-base | 71.99 | 36.41 | 340\n",
"| **Other models** | | |\n",
"| nboost/pt-tinybert-msmarco | 63.63 | 28.80 | 2900\n",
"| nboost/pt-bert-base-uncased-msmarco | 70.94 | 34.75 | 340\n",
"| nboost/pt-bert-large-msmarco | 73.36 | 36.48 | 100\n",
"| Capreolus/electra-base-msmarco | 71.23 | 36.89 | 340\n",
"| amberoad/bert-multilingual-passage-reranking-msmarco | 68.40 | 35.54 | 330\n",
"| sebastian-hofstaetter/distilbert-cat-margin_mse-T2-msmarco | 72.82 | 37.88 | 720\n"
]
},
{
"cell_type": "markdown",
"id": "f2318843",
"metadata": {},
"source": [
"Note: Runtime was computed on a V100 GPU.\n",
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2](https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: multi_nli,snli
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info: Model_Info:
name: "cross-encoder/nli-MiniLM2-L6-H768" description: Cross-Encoder for Natural Language Inference
description: "Cross-Encoder for Natural Language Inference" description_en: Cross-Encoder for Natural Language Inference
description_en: "Cross-Encoder for Natural Language Inference" from_repo: https://huggingface.co/cross-encoder/nli-MiniLM2-L6-H768
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/cross-encoder/nli-MiniLM2-L6-H768" name: cross-encoder/nli-MiniLM2-L6-H768
Paper: null
Publisher: cross-encoder
Task: Task:
- tag_en: "Natural Language Processing" - sub_tag: 零样本分类
tag: "自然语言处理" sub_tag_en: Zero-Shot Classification
sub_tag_en: "Zero-Shot Classification" tag: 自然语言处理
sub_tag: "零样本分类" tag_en: Natural Language Processing
- tag_en: "Natural Language Processing" - sub_tag: 文本分类
tag: "自然语言处理" sub_tag_en: Text Classification
sub_tag_en: "Text Classification" tag: 自然语言处理
sub_tag: "文本分类" tag_en: Natural Language Processing
Example:
Datasets: "multi_nli,snli"
Publisher: "cross-encoder"
License: "apache-2.0"
Language: "English"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
{
"cells": [
{
"cell_type": "markdown",
"id": "f11c50a6",
"metadata": {},
"source": [
"# Cross-Encoder for Natural Language Inference\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "e01fe90a",
"metadata": {},
"source": [
"## Training Data\n",
"The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral.\n"
]
},
{
"cell_type": "markdown",
"id": "ff850419",
"metadata": {},
"source": [
"## Performance\n",
"For evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).\n"
]
},
{
"cell_type": "markdown",
"id": "a0b92b0d",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "markdown",
"id": "d3857388",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d2c99a51",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aeda53c1",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/nli-MiniLM2-L6-H768\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "760a7b59",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/nli-MiniLM2-L6-H768](https://huggingface.co/cross-encoder/nli-MiniLM2-L6-H768),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "7d3f71fa",
"metadata": {},
"source": [
"# Cross-Encoder for Natural Language Inference\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "daf01f92",
"metadata": {},
"source": [
"## Training Data\n",
"The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral.\n"
]
},
{
"cell_type": "markdown",
"id": "805a7294",
"metadata": {},
"source": [
"## Performance\n",
"For evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).\n"
]
},
{
"cell_type": "markdown",
"id": "46a403e0",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "markdown",
"id": "abbbbd38",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a2522fb4",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1557ae2a",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/nli-MiniLM2-L6-H768\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "4259d72d",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/nli-MiniLM2-L6-H768](https://huggingface.co/cross-encoder/nli-MiniLM2-L6-H768) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: multi_nli,snli
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info: Model_Info:
name: "cross-encoder/nli-distilroberta-base" description: Cross-Encoder for Natural Language Inference
description: "Cross-Encoder for Natural Language Inference" description_en: Cross-Encoder for Natural Language Inference
description_en: "Cross-Encoder for Natural Language Inference" from_repo: https://huggingface.co/cross-encoder/nli-distilroberta-base
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/cross-encoder/nli-distilroberta-base" name: cross-encoder/nli-distilroberta-base
Paper: null
Publisher: cross-encoder
Task: Task:
- tag_en: "Natural Language Processing" - sub_tag: 零样本分类
tag: "自然语言处理" sub_tag_en: Zero-Shot Classification
sub_tag_en: "Zero-Shot Classification" tag: 自然语言处理
sub_tag: "零样本分类" tag_en: Natural Language Processing
- tag_en: "Natural Language Processing" - sub_tag: 文本分类
tag: "自然语言处理" sub_tag_en: Text Classification
sub_tag_en: "Text Classification" tag: 自然语言处理
sub_tag: "文本分类" tag_en: Natural Language Processing
Example:
Datasets: "multi_nli,snli"
Publisher: "cross-encoder"
License: "apache-2.0"
Language: "English"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
{
"cells": [
{
"cell_type": "markdown",
"id": "dfce17cd",
"metadata": {},
"source": [
"# Cross-Encoder for Natural Language Inference\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "ec682169",
"metadata": {},
"source": [
"## Training Data\n",
"The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral.\n"
]
},
{
"cell_type": "markdown",
"id": "ba993930",
"metadata": {},
"source": [
"## Performance\n",
"For evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).\n"
]
},
{
"cell_type": "markdown",
"id": "15de6eec",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "markdown",
"id": "6ab89b97",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f53af30f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f31b1839",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/nli-distilroberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "4254d407",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/nli-distilroberta-base](https://huggingface.co/cross-encoder/nli-distilroberta-base),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "a4ae7e65",
"metadata": {},
"source": [
"# Cross-Encoder for Natural Language Inference\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "f2d88a35",
"metadata": {},
"source": [
"## Training Data\n",
"The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral.\n"
]
},
{
"cell_type": "markdown",
"id": "d982bc91",
"metadata": {},
"source": [
"## Performance\n",
"For evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).\n"
]
},
{
"cell_type": "markdown",
"id": "1f3796c9",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "markdown",
"id": "14206f74",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7e5f7a2f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "05497be6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/nli-distilroberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "ea7e434c",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/nli-distilroberta-base](https://huggingface.co/cross-encoder/nli-distilroberta-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: multi_nli,snli
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info: Model_Info:
name: "cross-encoder/nli-roberta-base" description: Cross-Encoder for Natural Language Inference
description: "Cross-Encoder for Natural Language Inference" description_en: Cross-Encoder for Natural Language Inference
description_en: "Cross-Encoder for Natural Language Inference" from_repo: https://huggingface.co/cross-encoder/nli-roberta-base
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/cross-encoder/nli-roberta-base" name: cross-encoder/nli-roberta-base
Paper: null
Publisher: cross-encoder
Task: Task:
- tag_en: "Natural Language Processing" - sub_tag: 零样本分类
tag: "自然语言处理" sub_tag_en: Zero-Shot Classification
sub_tag_en: "Zero-Shot Classification" tag: 自然语言处理
sub_tag: "零样本分类" tag_en: Natural Language Processing
- tag_en: "Natural Language Processing" - sub_tag: 文本分类
tag: "自然语言处理" sub_tag_en: Text Classification
sub_tag_en: "Text Classification" tag: 自然语言处理
sub_tag: "文本分类" tag_en: Natural Language Processing
Example:
Datasets: "multi_nli,snli"
Publisher: "cross-encoder"
License: "apache-2.0"
Language: "English"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
{
"cells": [
{
"cell_type": "markdown",
"id": "4fd29af9",
"metadata": {},
"source": [
"# Cross-Encoder for Natural Language Inference\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "26cf9863",
"metadata": {},
"source": [
"## Training Data\n",
"The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral.\n"
]
},
{
"cell_type": "markdown",
"id": "913c77b3",
"metadata": {},
"source": [
"## Performance\n",
"For evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).\n"
]
},
{
"cell_type": "markdown",
"id": "1edcf5c1",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "markdown",
"id": "a3d044ef",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "549f470f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "358989b6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/nli-roberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "453c5b27",
"metadata": {},
"source": [
"此模型介绍及权重来源于[https://huggingface.co/cross-encoder/nli-roberta-base](https://huggingface.co/cross-encoder/nli-roberta-base),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "d174d9c5",
"metadata": {},
"source": [
"# Cross-Encoder for Natural Language Inference\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "6b47f4c6",
"metadata": {},
"source": [
"## Training Data\n",
"The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral.\n"
]
},
{
"cell_type": "markdown",
"id": "39bc9190",
"metadata": {},
"source": [
"## Performance\n",
"For evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).\n"
]
},
{
"cell_type": "markdown",
"id": "0d84928d",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "markdown",
"id": "3b2a033c",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4d9e33fd",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f84786a3",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/nli-roberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "dac6563c",
"metadata": {},
"source": [
"The model introduction and model weights originate from [https://huggingface.co/cross-encoder/nli-roberta-base](https://huggingface.co/cross-encoder/nli-roberta-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# 模型列表
## cross-encoder/qnli-distilroberta-base
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|cross-encoder/qnli-distilroberta-base| | 313.28MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/vocab.txt) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models cross-encoder/qnli-distilroberta-base
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
## cross-encoder/qnli-distilroberta-base
| model | description | model_size | download |
| --- | --- | --- | --- |
|cross-encoder/qnli-distilroberta-base| | 313.28MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/vocab.txt) |
Alternatively, you can download all of the model files with the `paddlenlp` CLI tool as follows:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models cross-encoder/qnli-distilroberta-base
```
If you have any problems with the download, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "cross-encoder/qnli-distilroberta-base"
description: "Cross-Encoder for Quora Duplicate Questions Detection"
description_en: "Cross-Encoder for Quora Duplicate Questions Detection"
icon: ""
from_repo: "https://huggingface.co/cross-encoder/qnli-distilroberta-base"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Classification"
sub_tag: "文本分类"
Example:
Datasets: ""
Publisher: "cross-encoder"
License: "apache-2.0"
Language: ""
Paper:
- title: 'GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding'
url: 'http://arxiv.org/abs/1804.07461v3'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info: Model_Info:
name: "cross-encoder/quora-distilroberta-base" description: Cross-Encoder for Quora Duplicate Questions Detection
description: "Cross-Encoder for Quora Duplicate Questions Detection" description_en: Cross-Encoder for Quora Duplicate Questions Detection
description_en: "Cross-Encoder for Quora Duplicate Questions Detection" from_repo: https://huggingface.co/cross-encoder/quora-distilroberta-base
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/cross-encoder/quora-distilroberta-base" name: cross-encoder/quora-distilroberta-base
Paper: null
Publisher: cross-encoder
Task: Task:
- tag_en: "Natural Language Processing" - sub_tag: 文本分类
tag: "自然语言处理" sub_tag_en: Text Classification
sub_tag_en: "Text Classification" tag: 自然语言处理
sub_tag: "文本分类" tag_en: Natural Language Processing
Example:
Datasets: ""
Publisher: "cross-encoder"
License: "apache-2.0"
Language: ""
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
{
"cells": [
{
"cell_type": "markdown",
"id": "9108ec88",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "dcc58a5d",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [Quora Duplicate Questions](https://www.quora.com/q/quoradata/First-Quora-Dataset-Release-Question-Pairs) dataset. The model will predict a score between 0 and 1 how likely the two given questions are duplicates.\n"
]
},
{
"cell_type": "markdown",
"id": "4f967914",
"metadata": {},
"source": [
"Note: The model is not suitable to estimate the similarity of questions, e.g. the two questions \"How to learn Java\" and \"How to learn Python\" will result in a rahter low score, as these are not duplicates.\n"
]
},
{
"cell_type": "markdown",
"id": "fe95bb7e",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "8efd69d2",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "92142a26",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "436ba799",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/quora-distilroberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "a5a90cce",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/quora-distilroberta-base](https://huggingface.co/cross-encoder/quora-distilroberta-base),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "104bbe82",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "71def254",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [Quora Duplicate Questions](https://www.quora.com/q/quoradata/First-Quora-Dataset-Release-Question-Pairs) dataset. The model will predict a score between 0 and 1 how likely the two given questions are duplicates.\n"
]
},
{
"cell_type": "markdown",
"id": "10f2b17c",
"metadata": {},
"source": [
"Note: The model is not suitable to estimate the similarity of questions, e.g. the two questions \"How to learn Java\" and \"How to learn Python\" will result in a rahter low score, as these are not duplicates.\n"
]
},
{
"cell_type": "markdown",
"id": "9e28c83a",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "bc8ce622",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3f66406a",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7ba92b4f",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/quora-distilroberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "93656328",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/quora-distilroberta-base](https://huggingface.co/cross-encoder/quora-distilroberta-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info: Model_Info:
name: "cross-encoder/quora-roberta-base" description: Cross-Encoder for Quora Duplicate Questions Detection
description: "Cross-Encoder for Quora Duplicate Questions Detection" description_en: Cross-Encoder for Quora Duplicate Questions Detection
description_en: "Cross-Encoder for Quora Duplicate Questions Detection" from_repo: https://huggingface.co/cross-encoder/quora-roberta-base
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/cross-encoder/quora-roberta-base" name: cross-encoder/quora-roberta-base
Paper: null
Publisher: cross-encoder
Task: Task:
- tag_en: "Natural Language Processing" - sub_tag: 文本分类
tag: "自然语言处理" sub_tag_en: Text Classification
sub_tag_en: "Text Classification" tag: 自然语言处理
sub_tag: "文本分类" tag_en: Natural Language Processing
Example:
Datasets: ""
Publisher: "cross-encoder"
License: "apache-2.0"
Language: ""
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
{
"cells": [
{
"cell_type": "markdown",
"id": "87e3266c",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "e743234a",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [Quora Duplicate Questions](https://www.quora.com/q/quoradata/First-Quora-Dataset-Release-Question-Pairs) dataset. The model will predict a score between 0 and 1 how likely the two given questions are duplicates.\n"
]
},
{
"cell_type": "markdown",
"id": "f755b608",
"metadata": {},
"source": [
"Note: The model is not suitable to estimate the similarity of questions, e.g. the two questions \"How to learn Java\" and \"How to learn Python\" will result in a rahter low score, as these are not duplicates.\n"
]
},
{
"cell_type": "markdown",
"id": "5a08f2c7",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "c4021393",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f704b5f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "161c640b",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/quora-roberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "93a5e3b7",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/quora-roberta-base](https://huggingface.co/cross-encoder/quora-roberta-base),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "74b2ba5f",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "36bf7390",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [Quora Duplicate Questions](https://www.quora.com/q/quoradata/First-Quora-Dataset-Release-Question-Pairs) dataset. The model will predict a score between 0 and 1 how likely the two given questions are duplicates.\n"
]
},
{
"cell_type": "markdown",
"id": "5aa29571",
"metadata": {},
"source": [
"Note: The model is not suitable to estimate the similarity of questions, e.g. the two questions \"How to learn Java\" and \"How to learn Python\" will result in a rahter low score, as these are not duplicates.\n"
]
},
{
"cell_type": "markdown",
"id": "1fe76310",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "e7067bef",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a9ea7b3d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b30bfcd4",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/quora-roberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "ecb795de",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/quora-roberta-base](https://huggingface.co/cross-encoder/quora-roberta-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info: Model_Info:
name: "cross-encoder/stsb-TinyBERT-L-4" description: Cross-Encoder for Quora Duplicate Questions Detection
description: "Cross-Encoder for Quora Duplicate Questions Detection" description_en: Cross-Encoder for Quora Duplicate Questions Detection
description_en: "Cross-Encoder for Quora Duplicate Questions Detection" from_repo: https://huggingface.co/cross-encoder/stsb-TinyBERT-L-4
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/cross-encoder/stsb-TinyBERT-L-4" name: cross-encoder/stsb-TinyBERT-L-4
Paper: null
Publisher: cross-encoder
Task: Task:
- tag_en: "Natural Language Processing" - sub_tag: 文本分类
tag: "自然语言处理" sub_tag_en: Text Classification
sub_tag_en: "Text Classification" tag: 自然语言处理
sub_tag: "文本分类" tag_en: Natural Language Processing
Example:
Datasets: ""
Publisher: "cross-encoder"
License: "apache-2.0"
Language: ""
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
{
"cells": [
{
"cell_type": "markdown",
"id": "a3deebdc",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "4fc17643",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [STS benchmark dataset](http://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark). The model will predict a score between 0 and 1 how for the semantic similarity of two sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "f66fb11e",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "fd12128b",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d0d04e39",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "d07e31aa",
"metadata": {
"collapsed": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/root/miniconda3/envs/paddle/lib/python3.7/site-packages/tqdm/auto.py:22: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
" from .autonotebook import tqdm as notebook_tqdm\n",
"\u001b[32m[2022-11-21 02:38:07,127] [ INFO]\u001b[0m - Downloading model_config.json from https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/stsb-TinyBERT-L-4/model_config.json\u001b[0m\n",
"100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 432/432 [00:00<00:00, 425kB/s]\n",
"\u001b[32m[2022-11-21 02:38:07,197] [ INFO]\u001b[0m - We are using <class 'paddlenlp.transformers.bert.modeling.BertModel'> to load 'cross-encoder/stsb-TinyBERT-L-4'.\u001b[0m\n",
"\u001b[32m[2022-11-21 02:38:07,198] [ INFO]\u001b[0m - Downloading https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/stsb-TinyBERT-L-4/model_state.pdparams and saved to /root/.paddlenlp/models/cross-encoder/stsb-TinyBERT-L-4\u001b[0m\n",
"\u001b[32m[2022-11-21 02:38:07,198] [ INFO]\u001b[0m - Downloading model_state.pdparams from https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/stsb-TinyBERT-L-4/model_state.pdparams\u001b[0m\n",
"100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 54.8M/54.8M [00:00<00:00, 64.7MB/s]\n",
"\u001b[32m[2022-11-21 02:38:08,199] [ INFO]\u001b[0m - Already cached /root/.paddlenlp/models/cross-encoder/stsb-TinyBERT-L-4/model_config.json\u001b[0m\n",
"W1121 02:38:08.202270 64563 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.2, Runtime API Version: 10.2\n",
"W1121 02:38:08.207437 64563 gpu_resources.cc:91] device: 0, cuDNN Version: 7.6.\n",
"\u001b[32m[2022-11-21 02:38:09,661] [ INFO]\u001b[0m - Weights from pretrained model not used in BertModel: ['classifier.weight', 'classifier.bias']\u001b[0m\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"(Tensor(shape=[1, 20, 312], dtype=float32, place=Place(gpu:0), stop_gradient=False,\n",
" [[[-0.73827386, -0.57349819, 0.47456041, ..., -0.07317579,\n",
" 0.23808761, -0.43587247],\n",
" [-0.71079123, -0.37019217, 0.44499084, ..., -0.07541266,\n",
" 0.22209664, -0.48883811],\n",
" [-0.61283624, 0.01138088, 0.46346331, ..., -0.15316986,\n",
" 0.38455290, -0.23527470],\n",
" ...,\n",
" [-0.19267607, -0.42171016, 0.40080610, ..., -0.04322027,\n",
" 0.16102640, -0.43728969],\n",
" [-0.76348048, 0.00028179, 0.50795513, ..., 0.02495949,\n",
" 0.32419923, -0.44668996],\n",
" [-0.72070849, -0.48510927, 0.47747549, ..., -0.01621611,\n",
" 0.31407145, -0.38287419]]]), Tensor(shape=[1, 312], dtype=float32, place=Place(gpu:0), stop_gradient=False,\n",
" [[ 0.38359359, 0.16227540, -0.58949089, -0.67293817, 0.70552814,\n",
" 0.74028063, -0.60770833, 0.50480992, 0.71489060, -0.73976040,\n",
" -0.11784898, 0.73014355, -0.65726435, 0.17490843, -0.44103470,\n",
" 0.62014306, 0.35533482, -0.44271812, -0.61711168, -0.70586687,\n",
" 0.69903672, 0.00862758, 0.69424403, 0.31887573, 0.38736165,\n",
" 0.02848060, -0.69896543, 0.69952166, 0.56477094, 0.68585342,\n",
" 0.66026199, 0.67826200, 0.67839348, 0.74852920, -0.04272985,\n",
" 0.76357287, 0.38685408, -0.69717598, 0.69945419, 0.44048944,\n",
" -0.66915488, 0.11735962, 0.37215349, 0.73054057, 0.71345085,\n",
" 0.66489315, 0.19956835, 0.71552449, 0.64762783, -0.46583632,\n",
" -0.09976894, -0.45265704, 0.54242563, 0.42835563, -0.60076892,\n",
" 0.69768012, -0.72207040, -0.52898210, 0.34657273, 0.05400079,\n",
" 0.57360554, -0.72731823, -0.71799070, -0.37212241, -0.70602018,\n",
" -0.71248102, 0.02778789, -0.73165607, 0.46581894, -0.72120243,\n",
" 0.60769719, -0.63354278, 0.75307459, 0.00700274, -0.00984141,\n",
" -0.58984685, 0.36321065, 0.60098255, -0.72467339, 0.18362086,\n",
" 0.10687865, -0.63730168, -0.62655306, -0.00187578, -0.51795095,\n",
" -0.64884937, 0.69950461, 0.72286713, 0.72522557, -0.45434299,\n",
" -0.43063730, -0.10669708, -0.51012146, 0.66286671, 0.69542134,\n",
" 0.21393165, -0.02928682, 0.67238331, 0.20404275, -0.63556075,\n",
" 0.55774790, 0.26141557, 0.70166790, -0.03091500, 0.65226245,\n",
" -0.69878876, 0.32701582, -0.68492270, 0.67152256, 0.66395414,\n",
" -0.68914133, -0.63889050, 0.71558940, 0.50034380, -0.12911484,\n",
" 0.70831281, 0.68631476, -0.41206849, 0.23268108, 0.67747647,\n",
" -0.29744238, 0.65135175, -0.70074749, 0.56074560, -0.63501489,\n",
" 0.74985635, -0.60603380, 0.66920304, -0.72418481, -0.59756589,\n",
" -0.70151484, -0.38735744, -0.66458094, -0.71190053, -0.69316322,\n",
" 0.43108079, -0.21692288, 0.70705998, -0.14984211, 0.75786442,\n",
" 0.69729054, -0.68925959, -0.46773866, 0.66707891, -0.07957093,\n",
" 0.73757517, 0.10062494, -0.73353016, 0.10992812, -0.48824292,\n",
" 0.62493157, 0.43311006, -0.15723324, -0.48392498, -0.65230477,\n",
" -0.41098344, -0.65238249, -0.41507134, -0.55544889, -0.32195652,\n",
" -0.74827588, -0.64071310, -0.49207535, -0.69750905, -0.57037342,\n",
" 0.35724813, 0.74778593, 0.49369636, -0.69870174, 0.24547403,\n",
" 0.73229605, 0.15653144, 0.41334581, 0.64413625, 0.53084993,\n",
" -0.64746642, -0.58720803, 0.63381183, 0.76515305, -0.68342912,\n",
" 0.65923864, -0.74662960, -0.72339952, 0.32203752, -0.63402468,\n",
" -0.71399093, -0.50430977, 0.26967043, -0.21176267, 0.65678287,\n",
" 0.09193933, 0.23962519, 0.59481263, -0.61463839, -0.28634411,\n",
" 0.69451737, 0.47513142, 0.30889973, -0.18030594, -0.50777411,\n",
" 0.71548641, -0.34869543, -0.01252351, 0.12018032, 0.69536412,\n",
" 0.53745425, 0.54889160, -0.10619923, 0.68386155, -0.68498713,\n",
" 0.23352134, 0.67296249, -0.12094481, -0.69636226, -0.06552890,\n",
" 0.00965041, -0.52394331, 0.72305930, -0.17239039, -0.73262835,\n",
" 0.50841606, 0.39529455, -0.70830429, 0.51234418, 0.68391299,\n",
" -0.72483873, -0.51841038, -0.58264560, -0.74197364, 0.46386808,\n",
" -0.23263671, 0.21232133, -0.69674802, 0.33948907, 0.75922930,\n",
" -0.43505231, -0.53149903, -0.65927148, 0.09607304, -0.68945718,\n",
" 0.66966355, 0.68096715, 0.66396469, 0.13001618, -0.68894261,\n",
" -0.66597682, 0.61407733, 0.69670630, 0.63995171, 0.33257753,\n",
" 0.66776848, 0.57427299, 0.32768273, 0.69438887, 0.41346189,\n",
" -0.71529591, -0.09860074, -0.72291893, 0.16860481, -0.67641008,\n",
" 0.70644248, -0.24303547, 0.28892463, 0.56054235, 0.55539572,\n",
" 0.70762485, -0.50166684, -0.70544142, -0.74241722, -0.74010289,\n",
" 0.70217764, -0.09219251, 0.47989756, -0.17431454, 0.76019192,\n",
" -0.09623899, -0.64994997, -0.03216666, 0.70323825, -0.66661566,\n",
" 0.71163839, -0.08982500, -0.35390857, 0.61377501, -0.49430367,\n",
" 0.49526611, 0.75078416, -0.05324765, -0.75398672, 0.70934319,\n",
" 0.21146417, -0.59094489, 0.39163795, -0.67382598, -0.63484156,\n",
" -0.27295890, 0.75101918, 0.70603085, 0.71781063, -0.57344818,\n",
" -0.22560060, -0.62196493, 0.68178481, 0.61596531, -0.12730023,\n",
" -0.69500911, 0.73689735, 0.12627751, -0.26101601, -0.24929181,\n",
" 0.68093145, 0.05896470]]))\n"
]
}
],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/stsb-TinyBERT-L-4\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "aeccdfe1",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/stsb-TinyBERT-L-4](https://huggingface.co/cross-encoder/stsb-TinyBERT-L-4),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "6e8592db",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "c3be9ab9",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [STS benchmark dataset](http://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark). The model will predict a score between 0 and 1 how for the semantic similarity of two sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "3f2d2712",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "0127bf3d",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e6968e7e",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "39e99053",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/stsb-TinyBERT-L-4\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
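{
"cell_type": "markdown",
"id": "9f21c3aa",
"metadata": {},
"source": [
"The cell above only verifies that the backbone loads. The cell below is a minimal sketch of how the 0-1 similarity score for a sentence pair could be computed, assuming the converted checkpoint exposes its single-logit regression head through `AutoModelForSequenceClassification`; the two example sentences are made up.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9f21c3ab",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoTokenizer, AutoModelForSequenceClassification\n",
"\n",
"# Hypothetical scoring sketch: assumes the converted config keeps the\n",
"# single-logit regression head of the original cross-encoder.\n",
"tokenizer = AutoTokenizer.from_pretrained(\"cross-encoder/stsb-TinyBERT-L-4\")\n",
"scorer = AutoModelForSequenceClassification.from_pretrained(\"cross-encoder/stsb-TinyBERT-L-4\")\n",
"scorer.eval()\n",
"\n",
"# A cross-encoder reads both sentences jointly in one forward pass.\n",
"encoded = tokenizer(\"A man is eating food.\", text_pair=\"A man is eating a piece of bread.\")\n",
"logits = scorer(\n",
"    input_ids=paddle.to_tensor([encoded[\"input_ids\"]]),\n",
"    token_type_ids=paddle.to_tensor([encoded[\"token_type_ids\"]]),\n",
")\n",
"# The single logit is the predicted similarity score (roughly in [0, 1]).\n",
"print(float(logits[0][0]))"
]
},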
{
"cell_type": "markdown",
"id": "35446f31",
"metadata": {},
"source": [
"You can use this model also without sentence_transformers and by just using ``AutoModel`` class\n",
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/stsb-TinyBERT-L-4](https://huggingface.co/cross-encoder/stsb-TinyBERT-L-4) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info:
  description: Cross-Encoder for Quora Duplicate Questions Detection
  description_en: Cross-Encoder for Quora Duplicate Questions Detection
  from_repo: https://huggingface.co/cross-encoder/stsb-distilroberta-base
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: cross-encoder/stsb-distilroberta-base
Paper: null
Publisher: cross-encoder
Task:
- sub_tag: 文本分类
  sub_tag_en: Text Classification
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "7c9e1c38",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "c62db00c",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [STS benchmark dataset](http://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark). The model will predict a score between 0 and 1 how for the semantic similarity of two sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "03f81dda",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "ac99e012",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "37931dd1",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ff0714d5",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/stsb-distilroberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "e783f36c",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/stsb-distilroberta-base](https://huggingface.co/cross-encoder/stsb-distilroberta-base),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "cc55c2df",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "6e6d61e4",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [STS benchmark dataset](http://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark). The model will predict a score between 0 and 1 how for the semantic similarity of two sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "73fa7630",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "33248e47",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "48f2d520",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f16202eb",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/stsb-distilroberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
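{
"cell_type": "markdown",
"id": "a1d4e8b2",
"metadata": {},
"source": [
"The cell above only verifies that the backbone loads. The cell below is a minimal sketch of how the 0-1 similarity score for a sentence pair could be computed, assuming the converted checkpoint exposes its single-logit regression head through `AutoModelForSequenceClassification`; the two example sentences are made up.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a1d4e8b3",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoTokenizer, AutoModelForSequenceClassification\n",
"\n",
"# Hypothetical scoring sketch: assumes the converted config keeps the\n",
"# single-logit regression head of the original cross-encoder.\n",
"tokenizer = AutoTokenizer.from_pretrained(\"cross-encoder/stsb-distilroberta-base\")\n",
"scorer = AutoModelForSequenceClassification.from_pretrained(\"cross-encoder/stsb-distilroberta-base\")\n",
"scorer.eval()\n",
"\n",
"# A cross-encoder reads both sentences jointly in one forward pass.\n",
"encoded = tokenizer(\"A man is eating food.\", text_pair=\"A man is eating a piece of bread.\")\n",
"logits = scorer(input_ids=paddle.to_tensor([encoded[\"input_ids\"]]))\n",
"# The single logit is the predicted similarity score (roughly in [0, 1]).\n",
"print(float(logits[0][0]))"
]
},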
{
"cell_type": "markdown",
"id": "8586b106",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/stsb-distilroberta-base](https://huggingface.co/cross-encoder/stsb-distilroberta-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info:
  description: Cross-Encoder for Quora Duplicate Questions Detection
  description_en: Cross-Encoder for Quora Duplicate Questions Detection
  from_repo: https://huggingface.co/cross-encoder/stsb-roberta-base
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: cross-encoder/stsb-roberta-base
Paper: null
Publisher: cross-encoder
Task:
- sub_tag: 文本分类
  sub_tag_en: Text Classification
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "0ce6be0e",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "6e5557d3",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [STS benchmark dataset](http://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark). The model will predict a score between 0 and 1 how for the semantic similarity of two sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "dac1f27b",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "c279cc30",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "64e1d35f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3c22da03",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/stsb-roberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "49af1fc0",
"metadata": {},
"source": [
"You can use this model also without sentence_transformers and by just using ``AutoModel`` class\n",
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/stsb-roberta-base](https://huggingface.co/cross-encoder/stsb-roberta-base),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "c3137a69",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "5406455e",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [STS benchmark dataset](http://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark). The model will predict a score between 0 and 1 how for the semantic similarity of two sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "565af020",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "bd866838",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "07301a77",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0b756d3a",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/stsb-roberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
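{
"cell_type": "markdown",
"id": "b2e5f9c3",
"metadata": {},
"source": [
"The cell above only verifies that the backbone loads. The cell below is a minimal sketch of how the 0-1 similarity score for a sentence pair could be computed, assuming the converted checkpoint exposes its single-logit regression head through `AutoModelForSequenceClassification`; the two example sentences are made up.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b2e5f9c4",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoTokenizer, AutoModelForSequenceClassification\n",
"\n",
"# Hypothetical scoring sketch: assumes the converted config keeps the\n",
"# single-logit regression head of the original cross-encoder.\n",
"tokenizer = AutoTokenizer.from_pretrained(\"cross-encoder/stsb-roberta-base\")\n",
"scorer = AutoModelForSequenceClassification.from_pretrained(\"cross-encoder/stsb-roberta-base\")\n",
"scorer.eval()\n",
"\n",
"# A cross-encoder reads both sentences jointly in one forward pass.\n",
"encoded = tokenizer(\"A man is eating food.\", text_pair=\"A man is eating a piece of bread.\")\n",
"logits = scorer(input_ids=paddle.to_tensor([encoded[\"input_ids\"]]))\n",
"# The single logit is the predicted similarity score (roughly in [0, 1]).\n",
"print(float(logits[0][0]))"
]
},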
{
"cell_type": "markdown",
"id": "6ba822d5",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/stsb-roberta-base](https://huggingface.co/cross-encoder/stsb-roberta-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info:
  description: Cross-Encoder for Quora Duplicate Questions Detection
  description_en: Cross-Encoder for Quora Duplicate Questions Detection
  from_repo: https://huggingface.co/cross-encoder/stsb-roberta-large
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: cross-encoder/stsb-roberta-large
Paper: null
Publisher: cross-encoder
Task:
- sub_tag: 文本分类
  sub_tag_en: Text Classification
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a8a5f540",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "e4d8f5f6",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [STS benchmark dataset](http://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark). The model will predict a score between 0 and 1 how for the semantic similarity of two sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "182943f7",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "764e0664",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "61787745",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f4671372",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/stsb-roberta-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "9e8e26d0",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/stsb-roberta-large](https://huggingface.co/cross-encoder/stsb-roberta-large),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "291a48fa",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "92f483ed",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [STS benchmark dataset](http://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark). The model will predict a score between 0 and 1 how for the semantic similarity of two sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "5dbde912",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "3e04e94a",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4209e47d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e2649c47",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/stsb-roberta-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
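{
"cell_type": "markdown",
"id": "c3f60ad4",
"metadata": {},
"source": [
"The cell above only verifies that the backbone loads. The cell below is a minimal sketch of how the 0-1 similarity score for a sentence pair could be computed, assuming the converted checkpoint exposes its single-logit regression head through `AutoModelForSequenceClassification`; the two example sentences are made up.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c3f60ad5",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoTokenizer, AutoModelForSequenceClassification\n",
"\n",
"# Hypothetical scoring sketch: assumes the converted config keeps the\n",
"# single-logit regression head of the original cross-encoder.\n",
"tokenizer = AutoTokenizer.from_pretrained(\"cross-encoder/stsb-roberta-large\")\n",
"scorer = AutoModelForSequenceClassification.from_pretrained(\"cross-encoder/stsb-roberta-large\")\n",
"scorer.eval()\n",
"\n",
"# A cross-encoder reads both sentences jointly in one forward pass.\n",
"encoded = tokenizer(\"A man is eating food.\", text_pair=\"A man is eating a piece of bread.\")\n",
"logits = scorer(input_ids=paddle.to_tensor([encoded[\"input_ids\"]]))\n",
"# The single logit is the predicted similarity score (roughly in [0, 1]).\n",
"print(float(logits[0][0]))"
]
},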
{
"cell_type": "markdown",
"id": "678e37ab",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/stsb-roberta-large](https://huggingface.co/cross-encoder/stsb-roberta-large) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# 模型列表
## csarron/roberta-base-squad-v1
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|csarron/roberta-base-squad-v1| | 475.51MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/vocab.txt) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models csarron/roberta-base-squad-v1
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
## csarron/roberta-base-squad-v1
| model | description | model_size | download |
| --- | --- | --- | --- |
|csarron/roberta-base-squad-v1| | 475.51MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/vocab.txt) |
Alternatively, you can download all of the model files with the `paddlenlp` CLI tool by following these steps:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models csarron/roberta-base-squad-v1
```
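After the download finishes, the weights can also be loaded straight from the local directory. The snippet below is a minimal sketch; the `./pretrained_models/csarron/roberta-base-squad-v1` path is an assumption about where the CLI places the files, so adjust it to the actual layout on disk.
```python
from paddlenlp.transformers import AutoModel, AutoTokenizer

# Hypothetical local path produced by the download command above.
local_dir = "./pretrained_models/csarron/roberta-base-squad-v1"
tokenizer = AutoTokenizer.from_pretrained(local_dir)
model = AutoModel.from_pretrained(local_dir)
```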
If you run into any problems with the download, you can open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "csarron/roberta-base-squad-v1"
description: "RoBERTa-base fine-tuned on SQuAD v1"
description_en: "RoBERTa-base fine-tuned on SQuAD v1"
icon: ""
from_repo: "https://huggingface.co/csarron/roberta-base-squad-v1"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Question Answering"
sub_tag: "回答问题"
Example:
Datasets: "squad"
Publisher: "csarron"
License: "mit"
Language: "English"
Paper:
- title: 'RoBERTa: A Robustly Optimized BERT Pretraining Approach'
url: 'http://arxiv.org/abs/1907.11692v1'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: German
License: mit
Model_Info:
  description: 🤗 + 📚 dbmdz German BERT models
  description_en: 🤗 + 📚 dbmdz German BERT models
  from_repo: https://huggingface.co/dbmdz/bert-base-german-cased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: dbmdz/bert-base-german-cased
Paper: null
Publisher: dbmdz
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "9fb28341",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz German BERT models\n",
"\n",
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources another German BERT models 🎉\n",
"\n",
"# German BERT\n",
"\n",
"## Stats\n",
"\n",
"In addition to the recently released [German BERT](https://deepset.ai/german-bert)\n",
"model by [deepset](https://deepset.ai/) we provide another German-language model.\n",
"\n",
"The source data for the model consists of a recent Wikipedia dump, EU Bookshop corpus,\n",
"Open Subtitles, CommonCrawl, ParaCrawl and News Crawl. This results in a dataset with\n",
"a size of 16GB and 2,350,234,427 tokens.\n",
"\n",
"For sentence splitting, we use [spacy](https://spacy.io/). Our preprocessing steps\n",
"(sentence piece model for vocab generation) follow those used for training\n",
"[SciBERT](https://github.com/allenai/scibert). The model is trained with an initial\n",
"sequence length of 512 subwords and was performed for 1.5M steps."
]
},
{
"cell_type": "markdown",
"id": "589fadf4",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "646e12d4",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5935d3e0",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-german-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "b05add24",
"metadata": {},
"source": [
"# Reference\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/dbmdz/bert-base-german-cased](https://huggingface.co/dbmdz/bert-base-german-cased),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "e875e0cc",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz German BERT models\n",
"\n",
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources another German BERT models 🎉\n",
"\n",
"# German BERT\n",
"\n",
"## Stats\n",
"\n",
"In addition to the recently released [German BERT](https://deepset.ai/german-bert)\n",
"model by [deepset](https://deepset.ai/) we provide another German-language model.\n",
"\n",
"The source data for the model consists of a recent Wikipedia dump, EU Bookshop corpus,\n",
"Open Subtitles, CommonCrawl, ParaCrawl and News Crawl. This results in a dataset with\n",
"a size of 16GB and 2,350,234,427 tokens.\n",
"\n",
"For sentence splitting, we use [spacy](https://spacy.io/). Our preprocessing steps\n",
"(sentence piece model for vocab generation) follow those used for training\n",
"[SciBERT](https://github.com/allenai/scibert). The model is trained with an initial\n",
"sequence length of 512 subwords and was performed for 1.5M steps."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8dcad967",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c7c65281",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-german-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
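{
"cell_type": "markdown",
"id": "d4071be5",
"metadata": {},
"source": [
"The random-id call above only verifies that the weights load. As a small sketch, a real German sentence (the sentence below is just an example) can be encoded with the matching tokenizer to obtain per-subword and pooled sentence features.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4071be6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"# Minimal sketch: encode one (made-up) German sentence instead of random ids.\n",
"tokenizer = AutoTokenizer.from_pretrained(\"dbmdz/bert-base-german-cased\")\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-german-cased\")\n",
"model.eval()\n",
"\n",
"encoded = tokenizer(\"Heute ist ein schöner Tag in München.\")\n",
"sequence_output, pooled_output = model(\n",
"    input_ids=paddle.to_tensor([encoded[\"input_ids\"]]),\n",
"    token_type_ids=paddle.to_tensor([encoded[\"token_type_ids\"]]),\n",
")\n",
"# One contextual vector per subword plus a pooled sentence-level vector.\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},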
{
"cell_type": "markdown",
"id": "1b52feb8",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "bc00304a",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/dbmdz/bert-base-german-cased](https://huggingface.co/dbmdz/bert-base-german-cased) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: German
License: mit
Model_Info:
  description: 🤗 + 📚 dbmdz German BERT models
  description_en: 🤗 + 📚 dbmdz German BERT models
  from_repo: https://huggingface.co/dbmdz/bert-base-german-uncased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: dbmdz/bert-base-german-uncased
Paper: null
Publisher: dbmdz
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "46b7bbb6",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz German BERT models\n",
"\n",
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources another German BERT models 🎉\n",
"\n",
"# German BERT\n",
"\n",
"## Stats\n",
"\n",
"In addition to the recently released [German BERT](https://deepset.ai/german-bert)\n",
"model by [deepset](https://deepset.ai/) we provide another German-language model.\n",
"\n",
"The source data for the model consists of a recent Wikipedia dump, EU Bookshop corpus,\n",
"Open Subtitles, CommonCrawl, ParaCrawl and News Crawl. This results in a dataset with\n",
"a size of 16GB and 2,350,234,427 tokens.\n",
"\n",
"For sentence splitting, we use [spacy](https://spacy.io/). Our preprocessing steps\n",
"(sentence piece model for vocab generation) follow those used for training\n",
"[SciBERT](https://github.com/allenai/scibert). The model is trained with an initial\n",
"sequence length of 512 subwords and was performed for 1.5M steps."
]
},
{
"cell_type": "markdown",
"id": "bc37d3e3",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2afff18c",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "967f058e",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-german-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "483dbced",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "04e50d8c",
"metadata": {},
"source": [
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/dbmdz/bert-base-german-uncased](https://huggingface.co/dbmdz/bert-base-german-uncased),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "5e0d446c",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz German BERT models\n",
"\n",
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources another German BERT models 🎉\n",
"\n",
"# German BERT\n",
"\n",
"## Stats\n",
"\n",
"In addition to the recently released [German BERT](https://deepset.ai/german-bert)\n",
"model by [deepset](https://deepset.ai/) we provide another German-language model.\n",
"\n",
"The source data for the model consists of a recent Wikipedia dump, EU Bookshop corpus,\n",
"Open Subtitles, CommonCrawl, ParaCrawl and News Crawl. This results in a dataset with\n",
"a size of 16GB and 2,350,234,427 tokens.\n",
"\n",
"For sentence splitting, we use [spacy](https://spacy.io/). Our preprocessing steps\n",
"(sentence piece model for vocab generation) follow those used for training\n",
"[SciBERT](https://github.com/allenai/scibert). The model is trained with an initial\n",
"sequence length of 512 subwords and was performed for 1.5M steps."
]
},
{
"cell_type": "markdown",
"id": "524680d5",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "39332440",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "19cf118e",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-german-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
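{
"cell_type": "markdown",
"id": "e5182cf6",
"metadata": {},
"source": [
"The random-id call above only verifies that the weights load. As a small sketch, a real German sentence (the sentence below is just an example) can be encoded with the matching tokenizer to obtain per-subword and pooled sentence features.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e5182cf7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"# Minimal sketch: encode one (made-up) German sentence instead of random ids.\n",
"tokenizer = AutoTokenizer.from_pretrained(\"dbmdz/bert-base-german-uncased\")\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-german-uncased\")\n",
"model.eval()\n",
"\n",
"encoded = tokenizer(\"Heute ist ein schöner Tag in Berlin.\")\n",
"sequence_output, pooled_output = model(\n",
"    input_ids=paddle.to_tensor([encoded[\"input_ids\"]]),\n",
"    token_type_ids=paddle.to_tensor([encoded[\"token_type_ids\"]]),\n",
")\n",
"# One contextual vector per subword plus a pooled sentence-level vector.\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},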
{
"cell_type": "markdown",
"id": "fb81d709",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "747fd5d3",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/dbmdz/bert-base-german-uncased](https://huggingface.co/dbmdz/bert-base-german-uncased) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Italian
License: mit
Model_Info:
  description: 🤗 + 📚 dbmdz BERT and ELECTRA models
  description_en: 🤗 + 📚 dbmdz BERT and ELECTRA models
  from_repo: https://huggingface.co/dbmdz/bert-base-italian-uncased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: dbmdz/bert-base-italian-uncased
Paper: null
Publisher: dbmdz
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "dea2fc9e",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz BERT and ELECTRA models\n"
]
},
{
"cell_type": "markdown",
"id": "00744cbd",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources Italian BERT and ELECTRA models 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "d7106b74",
"metadata": {},
"source": [
"# Italian BERT\n"
]
},
{
"cell_type": "markdown",
"id": "7ee0fd67",
"metadata": {},
"source": [
"The source data for the Italian BERT model consists of a recent Wikipedia dump and\n",
"various texts from the [OPUS corpora](http://opus.nlpl.eu/) collection. The final\n",
"training corpus has a size of 13GB and 2,050,057,573 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "a3961910",
"metadata": {},
"source": [
"For sentence splitting, we use NLTK (faster compared to spacy).\n",
"Our cased and uncased models are training with an initial sequence length of 512\n",
"subwords for ~2-3M steps.\n"
]
},
{
"cell_type": "markdown",
"id": "480e4fea",
"metadata": {},
"source": [
"For the XXL Italian models, we use the same training data from OPUS and extend\n",
"it with data from the Italian part of the [OSCAR corpus](https://traces1.inria.fr/oscar/).\n",
"Thus, the final training corpus has a size of 81GB and 13,138,379,147 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "d710804e",
"metadata": {},
"source": [
"Note: Unfortunately, a wrong vocab size was used when training the XXL models.\n",
"This explains the mismatch of the \"real\" vocab size of 31102, compared to the\n",
"vocab size specified in `config.json`. However, the model is working and all\n",
"evaluations were done under those circumstances.\n",
"See [this issue](https://github.com/dbmdz/berts/issues/7) for more information.\n"
]
},
{
"cell_type": "markdown",
"id": "2d9c79e5",
"metadata": {},
"source": [
"The Italian ELECTRA model was trained on the \"XXL\" corpus for 1M steps in total using a batch\n",
"size of 128. We pretty much following the ELECTRA training procedure as used for\n",
"[BERTurk](https://github.com/stefan-it/turkish-bert/tree/master/electra).\n"
]
},
{
"cell_type": "markdown",
"id": "3ee71cee",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9ffe9a93",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "82d327d4",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-italian-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "56d92161",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "ad146f63",
"metadata": {},
"source": [
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/dbmdz/bert-base-italian-uncased](https://huggingface.co/dbmdz/bert-base-italian-uncased),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "8601b7e0",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz BERT and ELECTRA models\n"
]
},
{
"cell_type": "markdown",
"id": "2e2ee06f",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources Italian BERT and ELECTRA models 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "a7b6e470",
"metadata": {},
"source": [
"# Italian BERT\n"
]
},
{
"cell_type": "markdown",
"id": "d1afb03c",
"metadata": {},
"source": [
"The source data for the Italian BERT model consists of a recent Wikipedia dump and\n",
"various texts from the [OPUS corpora](http://opus.nlpl.eu/) collection. The final\n",
"training corpus has a size of 13GB and 2,050,057,573 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "a900d41a",
"metadata": {},
"source": [
"For sentence splitting, we use NLTK (faster compared to spacy).\n",
"Our cased and uncased models are training with an initial sequence length of 512\n",
"subwords for ~2-3M steps.\n"
]
},
{
"cell_type": "markdown",
"id": "d4ea3425",
"metadata": {},
"source": [
"For the XXL Italian models, we use the same training data from OPUS and extend\n",
"it with data from the Italian part of the [OSCAR corpus](https://traces1.inria.fr/oscar/).\n",
"Thus, the final training corpus has a size of 81GB and 13,138,379,147 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "f1d5804d",
"metadata": {},
"source": [
"Note: Unfortunately, a wrong vocab size was used when training the XXL models.\n",
"This explains the mismatch of the \"real\" vocab size of 31102, compared to the\n",
"vocab size specified in `config.json`. However, the model is working and all\n",
"evaluations were done under those circumstances.\n",
"See [this issue](https://github.com/dbmdz/berts/issues/7) for more information.\n"
]
},
{
"cell_type": "markdown",
"id": "cc4f3d3d",
"metadata": {},
"source": [
"The Italian ELECTRA model was trained on the \"XXL\" corpus for 1M steps in total using a batch\n",
"size of 128. We pretty much following the ELECTRA training procedure as used for\n",
"[BERTurk](https://github.com/stefan-it/turkish-bert/tree/master/electra).\n"
]
},
{
"cell_type": "markdown",
"id": "76e431e8",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1b014af1",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ca7904c6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-italian-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
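{
"cell_type": "markdown",
"id": "f6293d07",
"metadata": {},
"source": [
"The random-id call above only verifies that the weights load. As a small sketch, a real Italian sentence (the sentence below is just an example) can be encoded with the matching tokenizer to obtain per-subword and pooled sentence features.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f6293d08",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"# Minimal sketch: encode one (made-up) Italian sentence instead of random ids.\n",
"tokenizer = AutoTokenizer.from_pretrained(\"dbmdz/bert-base-italian-uncased\")\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-italian-uncased\")\n",
"model.eval()\n",
"\n",
"encoded = tokenizer(\"Oggi è una bella giornata a Roma.\")\n",
"sequence_output, pooled_output = model(\n",
"    input_ids=paddle.to_tensor([encoded[\"input_ids\"]]),\n",
"    token_type_ids=paddle.to_tensor([encoded[\"token_type_ids\"]]),\n",
")\n",
"# One contextual vector per subword plus a pooled sentence-level vector.\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},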
{
"cell_type": "markdown",
"id": "261390e6",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "f5c0c815",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/dbmdz/bert-base-italian-uncased](https://huggingface.co/dbmdz/bert-base-italian-uncased) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Italian
License: mit
Model_Info:
  description: 🤗 + 📚 dbmdz BERT and ELECTRA models
  description_en: 🤗 + 📚 dbmdz BERT and ELECTRA models
  from_repo: https://huggingface.co/dbmdz/bert-base-italian-xxl-cased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: dbmdz/bert-base-italian-xxl-cased
Paper: null
Publisher: dbmdz
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "4e448d86",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz BERT and ELECTRA models\n"
]
},
{
"cell_type": "markdown",
"id": "9bcf089b",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources Italian BERT and ELECTRA models 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "fb9adbdd",
"metadata": {},
"source": [
"# Italian BERT\n"
]
},
{
"cell_type": "markdown",
"id": "e5a80c49",
"metadata": {},
"source": [
"The source data for the Italian BERT model consists of a recent Wikipedia dump and\n",
"various texts from the [OPUS corpora](http://opus.nlpl.eu/) collection. The final\n",
"training corpus has a size of 13GB and 2,050,057,573 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "3513aa96",
"metadata": {},
"source": [
"For sentence splitting, we use NLTK (faster compared to spacy).\n",
"Our cased and uncased models are training with an initial sequence length of 512\n",
"subwords for ~2-3M steps.\n"
]
},
{
"cell_type": "markdown",
"id": "ca0e58ee",
"metadata": {},
"source": [
"For the XXL Italian models, we use the same training data from OPUS and extend\n",
"it with data from the Italian part of the [OSCAR corpus](https://traces1.inria.fr/oscar/).\n",
"Thus, the final training corpus has a size of 81GB and 13,138,379,147 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "744e0851",
"metadata": {},
"source": [
"Note: Unfortunately, a wrong vocab size was used when training the XXL models.\n",
"This explains the mismatch of the \"real\" vocab size of 31102, compared to the\n",
"vocab size specified in `config.json`. However, the model is working and all\n",
"evaluations were done under those circumstances.\n",
"See [this issue](https://github.com/dbmdz/berts/issues/7) for more information.\n"
]
},
{
"cell_type": "markdown",
"id": "3bb28396",
"metadata": {},
"source": [
"The Italian ELECTRA model was trained on the \"XXL\" corpus for 1M steps in total using a batch\n",
"size of 128. We pretty much following the ELECTRA training procedure as used for\n",
"[BERTurk](https://github.com/stefan-it/turkish-bert/tree/master/electra).\n"
]
},
{
"cell_type": "markdown",
"id": "4c0c2ecb",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e059cf91",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "95a883f8",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-italian-xxl-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "6b2c856e",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "ffdf7223",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/dbmdz/bert-base-italian-xxl-cased](https://huggingface.co/dbmdz/bert-base-italian-xxl-cased),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "41ca2df0",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz BERT and ELECTRA models\n"
]
},
{
"cell_type": "markdown",
"id": "58e60a32",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources Italian BERT and ELECTRA models 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "c7b5f379",
"metadata": {},
"source": [
"# Italian BERT\n"
]
},
{
"cell_type": "markdown",
"id": "5bf65013",
"metadata": {},
"source": [
"The source data for the Italian BERT model consists of a recent Wikipedia dump and\n",
"various texts from the [OPUS corpora](http://opus.nlpl.eu/) collection. The final\n",
"training corpus has a size of 13GB and 2,050,057,573 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "a3aadc8d",
"metadata": {},
"source": [
"For sentence splitting, we use NLTK (faster compared to spacy).\n",
"Our cased and uncased models are training with an initial sequence length of 512\n",
"subwords for ~2-3M steps.\n"
]
},
{
"cell_type": "markdown",
"id": "4d670485",
"metadata": {},
"source": [
"For the XXL Italian models, we use the same training data from OPUS and extend\n",
"it with data from the Italian part of the [OSCAR corpus](https://traces1.inria.fr/oscar/).\n",
"Thus, the final training corpus has a size of 81GB and 13,138,379,147 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "a366dc7d",
"metadata": {},
"source": [
"Note: Unfortunately, a wrong vocab size was used when training the XXL models.\n",
"This explains the mismatch of the \"real\" vocab size of 31102, compared to the\n",
"vocab size specified in `config.json`. However, the model is working and all\n",
"evaluations were done under those circumstances.\n",
"See [this issue](https://github.com/dbmdz/berts/issues/7) for more information.\n"
]
},
{
"cell_type": "markdown",
"id": "eaee3adf",
"metadata": {},
"source": [
"The Italian ELECTRA model was trained on the \"XXL\" corpus for 1M steps in total using a batch\n",
"size of 128. We pretty much following the ELECTRA training procedure as used for\n",
"[BERTurk](https://github.com/stefan-it/turkish-bert/tree/master/electra).\n"
]
},
{
"cell_type": "markdown",
"id": "c5151f8a",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f9e48e19",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c02e6f47",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-italian-xxl-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
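{
"cell_type": "markdown",
"id": "073a4e18",
"metadata": {},
"source": [
"The random-id call above only verifies that the weights load. As a small sketch, a real Italian sentence (the sentence below is just an example) can be encoded with the matching tokenizer to obtain per-subword and pooled sentence features.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "073a4e19",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"# Minimal sketch: encode one (made-up) Italian sentence instead of random ids.\n",
"tokenizer = AutoTokenizer.from_pretrained(\"dbmdz/bert-base-italian-xxl-cased\")\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-italian-xxl-cased\")\n",
"model.eval()\n",
"\n",
"encoded = tokenizer(\"Oggi è una bella giornata a Milano.\")\n",
"sequence_output, pooled_output = model(\n",
"    input_ids=paddle.to_tensor([encoded[\"input_ids\"]]),\n",
"    token_type_ids=paddle.to_tensor([encoded[\"token_type_ids\"]]),\n",
")\n",
"# One contextual vector per subword plus a pooled sentence-level vector.\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},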
{
"cell_type": "markdown",
"id": "13e03e4b",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "66705724",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/dbmdz/bert-base-italian-xxl-cased](https://huggingface.co/dbmdz/bert-base-italian-xxl-cased) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Turkish
License: mit
Model_Info:
  description: 🤗 + 📚 dbmdz Turkish BERT model
  description_en: 🤗 + 📚 dbmdz Turkish BERT model
  from_repo: https://huggingface.co/dbmdz/bert-base-turkish-128k-cased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: dbmdz/bert-base-turkish-128k-cased
Paper: null
Publisher: dbmdz
Task: null
{
"cells": [
{
"cell_type": "markdown",
"id": "bda1db47",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz Turkish BERT model\n"
]
},
{
"cell_type": "markdown",
"id": "ba254a15",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources a cased model for Turkish 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "bf2818ba",
"metadata": {},
"source": [
"# 🇹🇷 BERTurk\n"
]
},
{
"cell_type": "markdown",
"id": "788f7baa",
"metadata": {},
"source": [
"BERTurk is a community-driven cased BERT model for Turkish.\n"
]
},
{
"cell_type": "markdown",
"id": "5e051a7d",
"metadata": {},
"source": [
"Some datasets used for pretraining and evaluation are contributed from the\n",
"awesome Turkish NLP community, as well as the decision for the model name: BERTurk.\n"
]
},
{
"cell_type": "markdown",
"id": "1edbcf52",
"metadata": {},
"source": [
"## Stats\n"
]
},
{
"cell_type": "markdown",
"id": "5b7c3ff4",
"metadata": {},
"source": [
"The current version of the model is trained on a filtered and sentence\n",
"segmented version of the Turkish [OSCAR corpus](https://traces1.inria.fr/oscar/),\n",
"a recent Wikipedia dump, various [OPUS corpora](http://opus.nlpl.eu/) and a\n",
"special corpus provided by [Kemal Oflazer](http://www.andrew.cmu.edu/user/ko/).\n"
]
},
{
"cell_type": "markdown",
"id": "9413f21a",
"metadata": {},
"source": [
"The final training corpus has a size of 35GB and 44,04,976,662 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "25593952",
"metadata": {},
"source": [
"For this model we use a vocab size of 128k.\n"
]
},
{
"cell_type": "markdown",
"id": "962cf00d",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4a4e8e3",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7157d7da",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-turkish-128k-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "e47155ee",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "081638b2",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/dbmdz/bert-base-turkish-128k-cased](https://huggingface.co/dbmdz/bert-base-turkish-128k-cased),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "911a1be9",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz Turkish BERT model\n"
]
},
{
"cell_type": "markdown",
"id": "4f09b0f1",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources a cased model for Turkish 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "fa2a78a0",
"metadata": {},
"source": [
"# 🇹🇷 BERTurk\n"
]
},
{
"cell_type": "markdown",
"id": "8b2f8c68",
"metadata": {},
"source": [
"BERTurk is a community-driven cased BERT model for Turkish.\n"
]
},
{
"cell_type": "markdown",
"id": "fe23e365",
"metadata": {},
"source": [
"Some datasets used for pretraining and evaluation are contributed from the\n",
"awesome Turkish NLP community, as well as the decision for the model name: BERTurk.\n"
]
},
{
"cell_type": "markdown",
"id": "2da0ce24",
"metadata": {},
"source": [
"## Stats\n"
]
},
{
"cell_type": "markdown",
"id": "d3f6af43",
"metadata": {},
"source": [
"The current version of the model is trained on a filtered and sentence\n",
"segmented version of the Turkish [OSCAR corpus](https://traces1.inria.fr/oscar/),\n",
"a recent Wikipedia dump, various [OPUS corpora](http://opus.nlpl.eu/) and a\n",
"special corpus provided by [Kemal Oflazer](http://www.andrew.cmu.edu/user/ko/).\n"
]
},
{
"cell_type": "markdown",
"id": "0d8d60c1",
"metadata": {},
"source": [
"The final training corpus has a size of 35GB and 44,04,976,662 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "ce42504f",
"metadata": {},
"source": [
"For this model we use a vocab size of 128k.\n"
]
},
{
"cell_type": "markdown",
"id": "4815bfdb",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7a084604",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8d041c78",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-turkish-128k-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
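 {
  "cell_type": "markdown",
  "id": "a1f0e8d2",
  "metadata": {},
  "source": [
   "The cell above only checks that the weights load by feeding random ids into the encoder. A minimal sketch of encoding real Turkish text follows; it assumes the matching tokenizer is also available through `AutoTokenizer`, and the example sentence is purely illustrative.\n"
  ]
 },
 {
  "cell_type": "code",
  "execution_count": null,
  "id": "a1f0e8d3",
  "metadata": {},
  "outputs": [],
  "source": [
   "import paddle\n",
   "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
   "\n",
   "# Assumption: the matching BERTurk tokenizer can be loaded with AutoTokenizer.\n",
   "tokenizer = AutoTokenizer.from_pretrained(\"dbmdz/bert-base-turkish-128k-cased\")\n",
   "model = AutoModel.from_pretrained(\"dbmdz/bert-base-turkish-128k-cased\")\n",
   "model.eval()\n",
   "\n",
   "# Encode an illustrative Turkish sentence and run a forward pass.\n",
   "inputs = tokenizer(\"Merhaba dünya, bu bir deneme cümlesidir.\")\n",
   "input_ids = paddle.to_tensor([inputs[\"input_ids\"]])\n",
   "outputs = model(input_ids)  # typically (sequence_output, pooled_output) for BERT-style encoders\n",
   "print(outputs)"
  ]
 },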
{
"cell_type": "markdown",
"id": "da82079c",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "b6632d46",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/dbmdz/bert-base-turkish-128k-cased](https://huggingface.co/dbmdz/bert-base-turkish-128k-cased) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Turkish
License: mit
Model_Info:
  description: 🤗 + 📚 dbmdz Turkish BERT model
  description_en: 🤗 + 📚 dbmdz Turkish BERT model
  from_repo: https://huggingface.co/dbmdz/bert-base-turkish-cased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: dbmdz/bert-base-turkish-cased
Paper: null
Publisher: dbmdz
Task: null
{
"cells": [
{
"cell_type": "markdown",
"id": "e9075248",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz Turkish BERT model\n"
]
},
{
"cell_type": "markdown",
"id": "0f68224a",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources a cased model for Turkish 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "a800751f",
"metadata": {},
"source": [
"# 🇹🇷 BERTurk\n"
]
},
{
"cell_type": "markdown",
"id": "0f418bcc",
"metadata": {},
"source": [
"BERTurk is a community-driven cased BERT model for Turkish.\n"
]
},
{
"cell_type": "markdown",
"id": "059528cb",
"metadata": {},
"source": [
"Some datasets used for pretraining and evaluation are contributed from the\n",
"awesome Turkish NLP community, as well as the decision for the model name: BERTurk.\n"
]
},
{
"cell_type": "markdown",
"id": "ec8d00db",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1cb273ef",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "45fd943c",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-turkish-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "0653e10b",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "14b8d869",
"metadata": {},
"source": [
"\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/dbmdz/bert-base-turkish-cased](https://huggingface.co/dbmdz/bert-base-turkish-cased),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "22b9df2e",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz Turkish BERT model\n"
]
},
{
"cell_type": "markdown",
"id": "509b461d",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources a cased model for Turkish 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "84ab205a",
"metadata": {},
"source": [
"# 🇹🇷 BERTurk\n"
]
},
{
"cell_type": "markdown",
"id": "aafa4b5d",
"metadata": {},
"source": [
"BERTurk is a community-driven cased BERT model for Turkish.\n"
]
},
{
"cell_type": "markdown",
"id": "16251feb",
"metadata": {},
"source": [
"Some datasets used for pretraining and evaluation are contributed from the\n",
"awesome Turkish NLP community, as well as the decision for the model name: BERTurk.\n"
]
},
{
"cell_type": "markdown",
"id": "1bdf0158",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aa2b4d91",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0f55d31d",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-turkish-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
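 {
  "cell_type": "markdown",
  "id": "b2d4f6a8",
  "metadata": {},
  "source": [
   "Since BERTurk is trained with a masked-language-modelling objective, a hedged fill-mask sketch follows. It assumes the MLM head for this checkpoint can be loaded through `AutoModelForMaskedLM`; if only the bare encoder is available, the cell above is the way to use the model. The Turkish example sentence is illustrative.\n"
  ]
 },
 {
  "cell_type": "code",
  "execution_count": null,
  "id": "b2d4f6a9",
  "metadata": {},
  "outputs": [],
  "source": [
   "import paddle\n",
   "from paddlenlp.transformers import AutoModelForMaskedLM, AutoTokenizer\n",
   "\n",
   "# Assumption: the masked-LM head weights load via AutoModelForMaskedLM for this checkpoint.\n",
   "tokenizer = AutoTokenizer.from_pretrained(\"dbmdz/bert-base-turkish-cased\")\n",
   "model = AutoModelForMaskedLM.from_pretrained(\"dbmdz/bert-base-turkish-cased\")\n",
   "model.eval()\n",
   "\n",
   "text = f\"Türkiye'nin başkenti {tokenizer.mask_token} şehridir.\"  # illustrative sentence\n",
   "inputs = tokenizer(text)\n",
   "input_ids = paddle.to_tensor([inputs[\"input_ids\"]])\n",
   "logits = model(input_ids)  # prediction scores, shape [1, seq_len, vocab_size]\n",
   "\n",
   "mask_pos = inputs[\"input_ids\"].index(tokenizer.mask_token_id)\n",
   "values, indices = paddle.topk(logits[0, mask_pos], k=5)\n",
   "print(tokenizer.convert_ids_to_tokens(indices.tolist()))"
  ]
 },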
{
"cell_type": "markdown",
"id": "9aae3e54",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "839b89b9",
"metadata": {},
"source": [
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/dbmdz/bert-base-turkish-cased](https://huggingface.co/dbmdz/bert-base-turkish-cased) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Turkish
License: mit
Model_Info:
  description: 🤗 + 📚 dbmdz Turkish BERT model
  description_en: 🤗 + 📚 dbmdz Turkish BERT model
  from_repo: https://huggingface.co/dbmdz/bert-base-turkish-uncased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: dbmdz/bert-base-turkish-uncased
Paper: null
Publisher: dbmdz
Task: null
{
"cells": [
{
"cell_type": "markdown",
"id": "f3dbf349",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz Turkish BERT model\n"
]
},
{
"cell_type": "markdown",
"id": "479d8e10",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources an uncased model for Turkish 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "fc31d938",
"metadata": {},
"source": [
"# 🇹🇷 BERTurk\n"
]
},
{
"cell_type": "markdown",
"id": "c05c98f4",
"metadata": {},
"source": [
"BERTurk is a community-driven uncased BERT model for Turkish.\n"
]
},
{
"cell_type": "markdown",
"id": "116bbd89",
"metadata": {},
"source": [
"Some datasets used for pretraining and evaluation are contributed from the\n",
"awesome Turkish NLP community, as well as the decision for the model name: BERTurk.\n"
]
},
{
"cell_type": "markdown",
"id": "eab29ae1",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "531e7c2f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c23a11a9",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-turkish-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "1a4d2556",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "4e10e25f",
"metadata": {},
"source": [
"\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/dbmdz/bert-base-turkish-uncased](https://huggingface.co/dbmdz/bert-base-turkish-uncased),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "f1968bb1",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz Turkish BERT model\n"
]
},
{
"cell_type": "markdown",
"id": "37119e6e",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources an uncased model for Turkish 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "e2428d3f",
"metadata": {},
"source": [
"# 🇹🇷 BERTurk\n"
]
},
{
"cell_type": "markdown",
"id": "455a98e2",
"metadata": {},
"source": [
"BERTurk is a community-driven uncased BERT model for Turkish.\n"
]
},
{
"cell_type": "markdown",
"id": "3c7b1272",
"metadata": {},
"source": [
"Some datasets used for pretraining and evaluation are contributed from the\n",
"awesome Turkish NLP community, as well as the decision for the model name: BERTurk.\n"
]
},
{
"cell_type": "markdown",
"id": "cdd8f852",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "81436ade",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bd223538",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-turkish-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
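 {
  "cell_type": "markdown",
  "id": "c3e5a7b9",
  "metadata": {},
  "source": [
   "For the uncased variant the tokenizer is expected to lowercase text before WordPiece splitting. The short sketch below simply makes that visible; the sentence is illustrative and the lowercasing behaviour is an assumption about the released tokenizer config.\n"
  ]
 },
 {
  "cell_type": "code",
  "execution_count": null,
  "id": "c3e5a7c0",
  "metadata": {},
  "outputs": [],
  "source": [
   "from paddlenlp.transformers import AutoTokenizer\n",
   "\n",
   "# Assumption: the uncased BERTurk tokenizer lowercases input before sub-word splitting.\n",
   "tokenizer = AutoTokenizer.from_pretrained(\"dbmdz/bert-base-turkish-uncased\")\n",
   "tokens = tokenizer.tokenize(\"Merhaba Dünya, Ankara güzel bir şehir.\")\n",
   "print(tokens)  # expected to contain only lowercased sub-word tokens"
  ]
 },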
{
"cell_type": "markdown",
"id": "7edb6fa7",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "95b108cb",
"metadata": {},
"source": [
"\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/dbmdz/bert-base-turkish-uncased](https://huggingface.co/dbmdz/bert-base-turkish-uncased) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: mit
Model_Info:
  description: Aeona | Chatbot
  description_en: Aeona | Chatbot
  from_repo: https://huggingface.co/deepparag/Aeona
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: deepparag/Aeona
Paper: null
Publisher: deepparag
Task:
- sub_tag: 文本生成
  sub_tag_en: Text Generation
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "9d9bb2aa",
"metadata": {},
"source": [
"# Aeona | Chatbot\n"
]
},
{
"cell_type": "markdown",
"id": "7361d804",
"metadata": {},
"source": [
"An generative AI made using [microsoft/DialoGPT-small](https://huggingface.co/microsoft/DialoGPT-small).\n"
]
},
{
"cell_type": "markdown",
"id": "008bcb8d",
"metadata": {},
"source": [
"Recommended to use along with an [AIML Chatbot](https://github.com/deepsarda/Aeona-Aiml) to reduce load, get better replies, add name and personality to your bot.\n",
"Using an AIML Chatbot will allow you to hardcode some replies also.\n"
]
},
{
"cell_type": "markdown",
"id": "b4bfb9cd",
"metadata": {},
"source": [
"## Evaluation\n",
"Below is a comparison of Aeona vs. other baselines on the mixed dataset given above using automatic evaluation metrics.\n"
]
},
{
"cell_type": "markdown",
"id": "4d478ffa",
"metadata": {},
"source": [
"| Model | Perplexity |\n",
"|---|---|\n",
"| Seq2seq Baseline [3] | 29.8 |\n",
"| Wolf et al. [5] | 16.3 |\n",
"| GPT-2 baseline | 99.5 |\n",
"| DialoGPT baseline | 56.6 |\n",
"| DialoGPT finetuned | 11.4 |\n",
"| PersonaGPT | 10.2 |\n",
"| **Aeona** | **7.9** |\n"
]
},
{
"cell_type": "markdown",
"id": "ebb927ce",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ea2a9d8e",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dc15795c",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"deepparag/Aeona\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "074bd20d",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/deepparag/Aeona](https://huggingface.co/deepparag/Aeona),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "f8079990",
"metadata": {},
"source": [
"# Aeona | Chatbot\n"
]
},
{
"cell_type": "markdown",
"id": "6a69f81a",
"metadata": {},
"source": [
"An generative AI made using [microsoft/DialoGPT-small](https://huggingface.co/microsoft/DialoGPT-small).\n"
]
},
{
"cell_type": "markdown",
"id": "a65479b8",
"metadata": {},
"source": [
"Recommended to use along with an [AIML Chatbot](https://github.com/deepsarda/Aeona-Aiml) to reduce load, get better replies, add name and personality to your bot.\n",
"Using an AIML Chatbot will allow you to hardcode some replies also.\n"
]
},
{
"cell_type": "markdown",
"id": "ea390590",
"metadata": {},
"source": [
"## Evaluation\n",
"Below is a comparison of Aeona vs. other baselines on the mixed dataset given above using automatic evaluation metrics.\n"
]
},
{
"cell_type": "markdown",
"id": "5b64325a",
"metadata": {},
"source": [
"| Model | Perplexity |\n",
"|---|---|\n",
"| Seq2seq Baseline [3] | 29.8 |\n",
"| Wolf et al. [5] | 16.3 |\n",
"| GPT-2 baseline | 99.5 |\n",
"| DialoGPT baseline | 56.6 |\n",
"| DialoGPT finetuned | 11.4 |\n",
"| PersonaGPT | 10.2 |\n",
"| **Aeona** | **7.9** |\n"
]
},
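 {
  "cell_type": "markdown",
  "id": "d4f6b8c1",
  "metadata": {},
  "source": [
   "Perplexity here is the exponential of the model's average per-token cross-entropy loss on held-out dialogue, so lower is better. The one-line sketch below only illustrates that relationship; the loss value is made up.\n"
  ]
 },
 {
  "cell_type": "code",
  "execution_count": null,
  "id": "d4f6b8c2",
  "metadata": {},
  "outputs": [],
  "source": [
   "import math\n",
   "\n",
   "# Perplexity = exp(average cross-entropy loss); the loss value below is illustrative only.\n",
   "avg_cross_entropy = 2.07\n",
   "print(f\"perplexity ≈ {math.exp(avg_cross_entropy):.1f}\")  # ≈ 7.9, the range reported for Aeona"
  ]
 },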
{
"cell_type": "markdown",
"id": "bf7f0d0e",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "16b58290",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cf18c96e",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"deepparag/Aeona\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "fae16f6e",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/deepparag/Aeona](https://huggingface.co/deepparag/Aeona) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# Model list
## deepparag/DumBot
| Model | Description | Size | Download |
| --- | --- | --- | --- |
|deepparag/DumBot| | 621.95MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/vocab.json) |
You can also download the model weights with the `paddlenlp` CLI tool, as follows:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download from the command line
```shell
paddlenlp download --cache-dir ./pretrained_models deepparag/DumBot
```
For any download issues, please open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP).
\ No newline at end of file
# model list
##
| model | description | model_size | download |
| --- | --- | --- | --- |
|deepparag/DumBot| | 621.95MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/vocab.json) |
or you can download all of model file with the following steps:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models deepparag/DumBot
```
If you have any problems with it, you can post issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "deepparag/DumBot"
description: "THIS AI IS OUTDATED. See [Aeona](https://huggingface.co/deepparag/Aeona)"
description_en: "THIS AI IS OUTDATED. See [Aeona](https://huggingface.co/deepparag/Aeona)"
icon: ""
from_repo: "https://huggingface.co/deepparag/DumBot"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "deepparag"
License: "mit"
Language: ""
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
Datasets: squad_v2
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info:
  description: Overview
  description_en: Overview
  from_repo: https://huggingface.co/deepset/roberta-base-squad2-distilled
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: deepset/roberta-base-squad2-distilled
Paper: null
Publisher: deepset
Task:
- sub_tag: 回答问题
  sub_tag_en: Question Answering
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "85b7cc2e",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"Language model: deepset/roberta-base-squad2-distilled\n",
"\n",
"Language: English\n",
"\n",
"Training data: SQuAD 2.0 training set Eval data: SQuAD 2.0 dev set Infrastructure: 4x V100 GPU\n",
"\n",
"Published: Dec 8th, 2021"
]
},
{
"cell_type": "markdown",
"id": "a455ff64",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d51fa907",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4590c7eb",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"deepset/roberta-base-squad2-distilled\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "ac6e34fd",
"metadata": {},
"source": [
"## Authors\n",
"- Timo Möller: `timo.moeller [at] deepset.ai`\n",
"- Julian Risch: `julian.risch [at] deepset.ai`\n",
"- Malte Pietsch: `malte.pietsch [at] deepset.ai`\n",
"- Michel Bartels: `michel.bartels [at] deepset.ai`\n",
"## About us\n",
"![deepset logo](https://workablehr.s3.amazonaws.com/uploads/account/logo/476306/logo)\n",
"We bring NLP to the industry via open source!\n",
"Our focus: Industry specific language models & large scale QA systems.\n"
]
},
{
"cell_type": "markdown",
"id": "3d22bf87",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/deepset/roberta-base-squad2-distilled](https://huggingface.co/deepset/roberta-base-squad2-distilled),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "b917c220",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"Language model: deepset/roberta-base-squad2-distilled\n",
"\n",
"Language: English\n",
"\n",
"Training data: SQuAD 2.0 training set Eval data: SQuAD 2.0 dev set Infrastructure: 4x V100 GPU\n",
"\n",
"Published: Dec 8th, 2021"
]
},
{
"cell_type": "markdown",
"id": "94e41c66",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8b2c9009",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b9472a8e",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"deepset/roberta-base-squad2-distilled\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
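 {
  "cell_type": "markdown",
  "id": "e5a7c9d2",
  "metadata": {},
  "source": [
   "The cell above only runs the bare encoder on random ids. A hedged extractive-QA sketch follows; it assumes the SQuAD2-style span-prediction head for this checkpoint can be loaded through `AutoModelForQuestionAnswering` and that the tokenizer accepts a question/context pair. The question and context are illustrative.\n"
  ]
 },
 {
  "cell_type": "code",
  "execution_count": null,
  "id": "e5a7c9d3",
  "metadata": {},
  "outputs": [],
  "source": [
   "import paddle\n",
   "from paddlenlp.transformers import AutoModelForQuestionAnswering, AutoTokenizer\n",
   "\n",
   "# Assumption: the QA head weights load via AutoModelForQuestionAnswering for this checkpoint.\n",
   "tokenizer = AutoTokenizer.from_pretrained(\"deepset/roberta-base-squad2-distilled\")\n",
   "model = AutoModelForQuestionAnswering.from_pretrained(\"deepset/roberta-base-squad2-distilled\")\n",
   "model.eval()\n",
   "\n",
   "question = \"Why is model conversion important?\"  # illustrative question/context pair\n",
   "context = \"Model conversion gives freedom to the user and lets people switch frameworks easily.\"\n",
   "inputs = tokenizer(question, text_pair=context)\n",
   "input_ids = paddle.to_tensor([inputs[\"input_ids\"]])\n",
   "\n",
   "start_logits, end_logits = model(input_ids)  # scores for answer start / end positions\n",
   "start = int(paddle.argmax(start_logits, axis=-1).item())\n",
   "end = int(paddle.argmax(end_logits, axis=-1).item())\n",
   "answer_ids = inputs[\"input_ids\"][start : end + 1]\n",
   "print(tokenizer.convert_ids_to_tokens(answer_ids))"
  ]
 },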
{
"cell_type": "markdown",
"id": "942ce61d",
"metadata": {},
"source": [
"## Authors\n",
"- Timo Möller: `timo.moeller [at] deepset.ai`\n",
"- Julian Risch: `julian.risch [at] deepset.ai`\n",
"- Malte Pietsch: `malte.pietsch [at] deepset.ai`\n",
"- Michel Bartels: `michel.bartels [at] deepset.ai`\n",
"## About us\n",
"![deepset logo](https://workablehr.s3.amazonaws.com/uploads/account/logo/476306/logo)\n",
"We bring NLP to the industry via open source!\n",
"Our focus: Industry specific language models & large scale QA systems.\n"
]
},
{
"cell_type": "markdown",
"id": "d65be46f",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/deepset/roberta-base-squad2-distilled](https://huggingface.co/deepset/roberta-base-squad2-distilled) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: conll2003
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info:
  description: bert-base-NER
  description_en: bert-base-NER
  from_repo: https://huggingface.co/dslim/bert-base-NER
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: dslim/bert-base-NER
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: dslim
Task:
- sub_tag: Token分类
  sub_tag_en: Token Classification
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "4dd2d9a8",
"metadata": {},
"source": [
"# bert-base-NER\n"
]
},
{
"cell_type": "markdown",
"id": "39c0b4be",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "0961b24f",
"metadata": {},
"source": [
"**bert-base-NER** is a fine-tuned BERT model that is ready to use for **Named Entity Recognition** and achieves **state-of-the-art performance** for the NER task. It has been trained to recognize four types of entities: location (LOC), organizations (ORG), person (PER) and Miscellaneous (MISC).\n"
]
},
{
"cell_type": "markdown",
"id": "58641459",
"metadata": {},
"source": [
"Specifically, this model is a *bert-base-cased* model that was fine-tuned on the English version of the standard [CoNLL-2003 Named Entity Recognition](https://www.aclweb.org/anthology/W03-0419.pdf) dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "9da0ddda",
"metadata": {},
"source": [
"If you'd like to use a larger BERT-large model fine-tuned on the same dataset, a **bert-large-NER** version is also available.\n"
]
},
{
"cell_type": "markdown",
"id": "4d5adc68",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "88f3ea49",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "76ef1f0e",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dslim/bert-base-NER\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "137e381c",
"metadata": {},
"source": [
"## Citation\n",
"```\n",
"@inproceedings{tjong-kim-sang-de-meulder-2003-introduction,\n",
"title = \"Introduction to the {C}o{NLL}-2003 Shared Task: Language-Independent Named Entity Recognition\",\n",
"author = \"Tjong Kim Sang, Erik F. and\n",
"De Meulder, Fien\",\n",
"booktitle = \"Proceedings of the Seventh Conference on Natural Language Learning at {HLT}-{NAACL} 2003\",\n",
"year = \"2003\",\n",
"url = \"https://www.aclweb.org/anthology/W03-0419\",\n",
"pages = \"142--147\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "3a632df2",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/dslim/bert-base-NER](https://huggingface.co/dslim/bert-base-NER),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "c5180cf2",
"metadata": {},
"source": [
"# bert-base-NER\n"
]
},
{
"cell_type": "markdown",
"id": "dbf08fd8",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "11690dda",
"metadata": {},
"source": [
"**bert-base-NER** is a fine-tuned BERT model that is ready to use for **Named Entity Recognition** and achieves **state-of-the-art performance** for the NER task. It has been trained to recognize four types of entities: location (LOC), organizations (ORG), person (PER) and Miscellaneous (MISC).\n"
]
},
{
"cell_type": "markdown",
"id": "738f98db",
"metadata": {},
"source": [
"Specifically, this model is a *bert-base-cased* model that was fine-tuned on the English version of the standard [CoNLL-2003 Named Entity Recognition](https://www.aclweb.org/anthology/W03-0419.pdf) dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "03c5db03",
"metadata": {},
"source": [
"If you'd like to use a larger BERT-large model fine-tuned on the same dataset, a **bert-large-NER** version is also available.\n"
]
},
{
"cell_type": "markdown",
"id": "da040b29",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "726ee6e9",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "73564a0c",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dslim/bert-base-NER\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "c08bc233",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{tjong-kim-sang-de-meulder-2003-introduction,\n",
"title = \"Introduction to the {C}o{NLL}-2003 Shared Task: Language-Independent Named Entity Recognition\",\n",
"author = \"Tjong Kim Sang, Erik F. and\n",
"De Meulder, Fien\",\n",
"booktitle = \"Proceedings of the Seventh Conference on Natural Language Learning at {HLT}-{NAACL} 2003\",\n",
"year = \"2003\",\n",
"url = \"https://www.aclweb.org/anthology/W03-0419\",\n",
"pages = \"142--147\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "a56e1055",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/dslim/bert-base-NER](https://huggingface.co/dslim/bert-base-NER) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: conll2003
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info:
  description: bert-large-NER
  description_en: bert-large-NER
  from_repo: https://huggingface.co/dslim/bert-large-NER
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: dslim/bert-large-NER
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: dslim
Task:
- sub_tag: Token分类
  sub_tag_en: Token Classification
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "70a24d70",
"metadata": {},
"source": [
"# bert-base-NER\n"
]
},
{
"cell_type": "markdown",
"id": "72df2cc8",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "d1aabc01",
"metadata": {},
"source": [
"**bert-large-NER** is a fine-tuned BERT model that is ready to use for **Named Entity Recognition** and achieves **state-of-the-art performance** for the NER task. It has been trained to recognize four types of entities: location (LOC), organizations (ORG), person (PER) and Miscellaneous (MISC).\n"
]
},
{
"cell_type": "markdown",
"id": "2d53a70b",
"metadata": {},
"source": [
"Specifically, this model is a *bert-large-cased* model that was fine-tuned on the English version of the standard [CoNLL-2003 Named Entity Recognition](https://www.aclweb.org/anthology/W03-0419.pdf) dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "60c2ceb7",
"metadata": {},
"source": [
"If you'd like to use a smaller BERT model fine-tuned on the same dataset, a **bert-base-NER** version is also available.\n"
]
},
{
"cell_type": "markdown",
"id": "42d984a9",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9ef70bc3",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1177b32e",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dslim/bert-large-NER\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "353c5156",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{tjong-kim-sang-de-meulder-2003-introduction,\n",
"title = \"Introduction to the {C}o{NLL}-2003 Shared Task: Language-Independent Named Entity Recognition\",\n",
"author = \"Tjong Kim Sang, Erik F. and\n",
"De Meulder, Fien\",\n",
"booktitle = \"Proceedings of the Seventh Conference on Natural Language Learning at {HLT}-{NAACL} 2003\",\n",
"year = \"2003\",\n",
"url = \"https://www.aclweb.org/anthology/W03-0419\",\n",
"pages = \"142--147\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "5705ae48",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/dslim/bert-large-NER](https://huggingface.co/dslim/bert-large-NER),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "574c41aa",
"metadata": {},
"source": [
"# bert-base-NER\n"
]
},
{
"cell_type": "markdown",
"id": "430be48c",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "4bdfd881",
"metadata": {},
"source": [
"**bert-large-NER** is a fine-tuned BERT model that is ready to use for **Named Entity Recognition** and achieves **state-of-the-art performance** for the NER task. It has been trained to recognize four types of entities: location (LOC), organizations (ORG), person (PER) and Miscellaneous (MISC).\n"
]
},
{
"cell_type": "markdown",
"id": "84ed4f8f",
"metadata": {},
"source": [
"Specifically, this model is a *bert-large-cased* model that was fine-tuned on the English version of the standard [CoNLL-2003 Named Entity Recognition](https://www.aclweb.org/anthology/W03-0419.pdf) dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "b4acb8a4",
"metadata": {},
"source": [
"If you'd like to use a smaller BERT model fine-tuned on the same dataset, a **bert-base-NER** version is also available.\n"
]
},
{
"cell_type": "markdown",
"id": "46ad4f1f",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b7e64545",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "380aa0dc",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dslim/bert-large-NER\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "220eb907",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{tjong-kim-sang-de-meulder-2003-introduction,\n",
"title = \"Introduction to the {C}o{NLL}-2003 Shared Task: Language-Independent Named Entity Recognition\",\n",
"author = \"Tjong Kim Sang, Erik F. and\n",
"De Meulder, Fien\",\n",
"booktitle = \"Proceedings of the Seventh Conference on Natural Language Learning at {HLT}-{NAACL} 2003\",\n",
"year = \"2003\",\n",
"url = \"https://www.aclweb.org/anthology/W03-0419\",\n",
"pages = \"142--147\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "dc0b9503",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/dslim/bert-large-NER](https://huggingface.co/dslim/bert-large-NER) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Romanian
License: mit
Model_Info:
  description: bert-base-romanian-cased-v1
  description_en: bert-base-romanian-cased-v1
  from_repo: https://huggingface.co/dumitrescustefan/bert-base-romanian-cased-v1
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: dumitrescustefan/bert-base-romanian-cased-v1
Paper: null
Publisher: dumitrescustefan
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "2a485f7a",
"metadata": {},
"source": [
"# bert-base-romanian-cased-v1\n"
]
},
{
"cell_type": "markdown",
"id": "5f911938",
"metadata": {},
"source": [
"The BERT **base**, **cased** model for Romanian, trained on a 15GB corpus.\n"
]
},
{
"cell_type": "markdown",
"id": "e2cccf2e",
"metadata": {},
"source": [
"### How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "64b86b26",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f84498aa",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dumitrescustefan/bert-base-romanian-cased-v1\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "40176abc",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{dumitrescu-etal-2020-birth,\n",
"title = \"The birth of {R}omanian {BERT}\",\n",
"author = \"Dumitrescu, Stefan and\n",
"Avram, Andrei-Marius and\n",
"Pyysalo, Sampo\",\n",
"booktitle = \"Findings of the Association for Computational Linguistics: EMNLP 2020\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://aclanthology.org/2020.findings-emnlp.387\",\n",
"doi = \"10.18653/v1/2020.findings-emnlp.387\",\n",
"pages = \"4324--4328\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "550b09f3",
"metadata": {},
"source": [
"#### Acknowledgements\n"
]
},
{
"cell_type": "markdown",
"id": "fda88b7c",
"metadata": {},
"source": [
"- We'd like to thank [Sampo Pyysalo](https://github.com/spyysalo) from TurkuNLP for helping us out with the compute needed to pretrain the v1.0 BERT models. He's awesome!\n",
"> 此模型介绍及权重来源于[https://huggingface.co/dumitrescustefan/bert-base-romanian-cased-v1](https://huggingface.co/dumitrescustefan/bert-base-romanian-cased-v1),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "28a330b8",
"metadata": {},
"source": [
"# bert-base-romanian-cased-v1\n"
]
},
{
"cell_type": "markdown",
"id": "36f0d74f",
"metadata": {},
"source": [
"The BERT **base**, **cased** model for Romanian, trained on a 15GB corpus."
]
},
{
"cell_type": "markdown",
"id": "0104e14e",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b4ca4271",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9f3ca553",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dumitrescustefan/bert-base-romanian-cased-v1\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "51754d3f",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{dumitrescu-etal-2020-birth,\n",
"title = \"The birth of {R}omanian {BERT}\",\n",
"author = \"Dumitrescu, Stefan and\n",
"Avram, Andrei-Marius and\n",
"Pyysalo, Sampo\",\n",
"booktitle = \"Findings of the Association for Computational Linguistics: EMNLP 2020\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://aclanthology.org/2020.findings-emnlp.387\",\n",
"doi = \"10.18653/v1/2020.findings-emnlp.387\",\n",
"pages = \"4324--4328\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "2143146f",
"metadata": {},
"source": [
"#### Acknowledgements\n"
]
},
{
"cell_type": "markdown",
"id": "d983ac22",
"metadata": {},
"source": [
"- We'd like to thank [Sampo Pyysalo](https://github.com/spyysalo) from TurkuNLP for helping us out with the compute needed to pretrain the v1.0 BERT models. He's awesome!\n",
"> The model introduction and model weights originate from [https://huggingface.co/dumitrescustefan/bert-base-romanian-cased-v1](https://huggingface.co/dumitrescustefan/bert-base-romanian-cased-v1) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Romanian
License: mit
Model_Info:
  description: bert-base-romanian-uncased-v1
  description_en: bert-base-romanian-uncased-v1
  from_repo: https://huggingface.co/dumitrescustefan/bert-base-romanian-uncased-v1
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: dumitrescustefan/bert-base-romanian-uncased-v1
Paper: null
Publisher: dumitrescustefan
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "922f44e2",
"metadata": {},
"source": [
"# bert-base-romanian-uncased-v1\n"
]
},
{
"cell_type": "markdown",
"id": "2f5259bd",
"metadata": {},
"source": [
"The BERT **base**, **uncased** model for Romanian, trained on a 15GB corpus, version ![v1.0](https://img.shields.io/badge/v1.0-21%20Apr%202020-ff6666)\n"
]
},
{
"cell_type": "markdown",
"id": "408f4468",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "acd14372",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cc5d539c",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dumitrescustefan/bert-base-romanian-uncased-v1\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "adbbab44",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{dumitrescu-etal-2020-birth,\n",
"title = \"The birth of {R}omanian {BERT}\",\n",
"author = \"Dumitrescu, Stefan and\n",
"Avram, Andrei-Marius and\n",
"Pyysalo, Sampo\",\n",
"booktitle = \"Findings of the Association for Computational Linguistics: EMNLP 2020\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://aclanthology.org/2020.findings-emnlp.387\",\n",
"doi = \"10.18653/v1/2020.findings-emnlp.387\",\n",
"pages = \"4324--4328\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "4276651e",
"metadata": {},
"source": [
"#### Acknowledgements\n"
]
},
{
"cell_type": "markdown",
"id": "84a91796",
"metadata": {},
"source": [
"- We'd like to thank [Sampo Pyysalo](https://github.com/spyysalo) from TurkuNLP for helping us out with the compute needed to pretrain the v1.0 BERT models. He's awesome!\n",
"> 此模型介绍及权重来源于[https://huggingface.co/dumitrescustefan/bert-base-romanian-uncased-v1](https://huggingface.co/dumitrescustefan/bert-base-romanian-uncased-v1),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "cf268d0f",
"metadata": {},
"source": [
"# bert-base-romanian-uncased-v1\n"
]
},
{
"cell_type": "markdown",
"id": "453405af",
"metadata": {},
"source": [
"The BERT **base**, **uncased** model for Romanian, trained on a 15GB corpus, version ![v1.0](https://img.shields.io/badge/v1.0-21%20Apr%202020-ff6666)\n"
]
},
{
"cell_type": "markdown",
"id": "4100824e",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cd182732",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1c16cd09",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dumitrescustefan/bert-base-romanian-uncased-v1\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "ba32e8ff",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{dumitrescu-etal-2020-birth,\n",
"title = \"The birth of {R}omanian {BERT}\",\n",
"author = \"Dumitrescu, Stefan and\n",
"Avram, Andrei-Marius and\n",
"Pyysalo, Sampo\",\n",
"booktitle = \"Findings of the Association for Computational Linguistics: EMNLP 2020\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://aclanthology.org/2020.findings-emnlp.387\",\n",
"doi = \"10.18653/v1/2020.findings-emnlp.387\",\n",
"pages = \"4324--4328\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "faa200c7",
"metadata": {},
"source": [
"#### Acknowledgements\n"
]
},
{
"cell_type": "markdown",
"id": "cb74a943",
"metadata": {},
"source": [
"- We'd like to thank [Sampo Pyysalo](https://github.com/spyysalo) from TurkuNLP for helping us out with the compute needed to pretrain the v1.0 BERT models. He's awesome!\n",
"> The model introduction and model weights originate from [https://huggingface.co/dumitrescustefan/bert-base-romanian-uncased-v1](https://huggingface.co/dumitrescustefan/bert-base-romanian-uncased-v1) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info:
  description: ClinicalBERT - Bio + Clinical BERT Model
  description_en: ClinicalBERT - Bio + Clinical BERT Model
  from_repo: https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: emilyalsentzer/Bio_ClinicalBERT
Paper:
- title: Publicly Available Clinical BERT Embeddings
  url: http://arxiv.org/abs/1904.03323v3
- title: 'BioBERT: a pre-trained biomedical language representation model for biomedical text mining'
  url: http://arxiv.org/abs/1901.08746v4
Publisher: emilyalsentzer
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "22b0e4db",
"metadata": {},
"source": [
"# ClinicalBERT - Bio + Clinical BERT Model\n"
]
},
{
"cell_type": "markdown",
"id": "f9d9ac37",
"metadata": {},
"source": [
"The [Publicly Available Clinical BERT Embeddings](https://arxiv.org/abs/1904.03323) paper contains four unique clinicalBERT models: initialized with BERT-Base (`cased_L-12_H-768_A-12`) or BioBERT (`BioBERT-Base v1.0 + PubMed 200K + PMC 270K`) & trained on either all MIMIC notes or only discharge summaries.\n"
]
},
{
"cell_type": "markdown",
"id": "24aaa0b1",
"metadata": {},
"source": [
"This model card describes the Bio+Clinical BERT model, which was initialized from [BioBERT](https://arxiv.org/abs/1901.08746) & trained on all MIMIC notes.\n"
]
},
{
"cell_type": "markdown",
"id": "1449fef2",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "be5241ea",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f4c3cf6f",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"emilyalsentzer/Bio_ClinicalBERT\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "451e4ff6",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT](https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "91c5f94f",
"metadata": {},
"source": [
"# ClinicalBERT - Bio + Clinical BERT Model\n"
]
},
{
"cell_type": "markdown",
"id": "ec471b16",
"metadata": {},
"source": [
"The [Publicly Available Clinical BERT Embeddings](https://arxiv.org/abs/1904.03323) paper contains four unique clinicalBERT models: initialized with BERT-Base (`cased_L-12_H-768_A-12`) or BioBERT (`BioBERT-Base v1.0 + PubMed 200K + PMC 270K`) & trained on either all MIMIC notes or only discharge summaries.\n"
]
},
{
"cell_type": "markdown",
"id": "9ab166b8",
"metadata": {},
"source": [
"This model card describes the Bio+Clinical BERT model, which was initialized from [BioBERT](https://arxiv.org/abs/1901.08746) & trained on all MIMIC notes.\n"
]
},
{
"cell_type": "markdown",
"id": "69f6ed08",
"metadata": {},
"source": [
"## How to use the model\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "62913fa8",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7b055241",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"emilyalsentzer/Bio_ClinicalBERT\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
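 {
  "cell_type": "markdown",
  "id": "b9d1f3a6",
  "metadata": {},
  "source": [
   "Since the model card lists Fill-Mask as the task, a hedged masked-token sketch follows. It assumes the MLM head for this checkpoint can be loaded through `AutoModelForMaskedLM`; the clinical-style sentence is illustrative only.\n"
  ]
 },
 {
  "cell_type": "code",
  "execution_count": null,
  "id": "b9d1f3a7",
  "metadata": {},
  "outputs": [],
  "source": [
   "import paddle\n",
   "from paddlenlp.transformers import AutoModelForMaskedLM, AutoTokenizer\n",
   "\n",
   "# Assumption: the masked-LM head weights load via AutoModelForMaskedLM for this checkpoint.\n",
   "tokenizer = AutoTokenizer.from_pretrained(\"emilyalsentzer/Bio_ClinicalBERT\")\n",
   "model = AutoModelForMaskedLM.from_pretrained(\"emilyalsentzer/Bio_ClinicalBERT\")\n",
   "model.eval()\n",
   "\n",
   "text = f\"The patient was admitted with chest {tokenizer.mask_token}.\"  # illustrative note fragment\n",
   "inputs = tokenizer(text)\n",
   "input_ids = paddle.to_tensor([inputs[\"input_ids\"]])\n",
   "logits = model(input_ids)  # prediction scores, shape [1, seq_len, vocab_size]\n",
   "\n",
   "mask_pos = inputs[\"input_ids\"].index(tokenizer.mask_token_id)\n",
   "values, indices = paddle.topk(logits[0, mask_pos], k=5)\n",
   "print(tokenizer.convert_ids_to_tokens(indices.tolist()))"
  ]
 },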
{
"cell_type": "markdown",
"id": "0716a06f",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT](https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info: Model_Info:
name: "emilyalsentzer/Bio_Discharge_Summary_BERT" description: ClinicalBERT - Bio + Discharge Summary BERT Model
description: "ClinicalBERT - Bio + Discharge Summary BERT Model" description_en: ClinicalBERT - Bio + Discharge Summary BERT Model
description_en: "ClinicalBERT - Bio + Discharge Summary BERT Model" from_repo: https://huggingface.co/emilyalsentzer/Bio_Discharge_Summary_BERT
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/emilyalsentzer/Bio_Discharge_Summary_BERT" name: emilyalsentzer/Bio_Discharge_Summary_BERT
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: ""
Publisher: "emilyalsentzer"
License: "mit"
Language: "English"
Paper: Paper:
- title: 'Publicly Available Clinical BERT Embeddings' - title: Publicly Available Clinical BERT Embeddings
url: 'http://arxiv.org/abs/1904.03323v3' url: http://arxiv.org/abs/1904.03323v3
- title: 'BioBERT: a pre-trained biomedical language representation model for biomedical text mining' - title: 'BioBERT: a pre-trained biomedical language representation model for biomedical
url: 'http://arxiv.org/abs/1901.08746v4' text mining'
IfTraining: 0 url: http://arxiv.org/abs/1901.08746v4
IfOnlineDemo: 0 Publisher: emilyalsentzer
\ No newline at end of file Task:
- sub_tag: 槽位填充
sub_tag_en: Fill-Mask
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "67503ba7",
"metadata": {},
"source": [
"# ClinicalBERT - Bio + Discharge Summary BERT Model\n"
]
},
{
"cell_type": "markdown",
"id": "e2d38260",
"metadata": {},
"source": [
"The [Publicly Available Clinical BERT Embeddings](https://arxiv.org/abs/1904.03323) paper contains four unique clinicalBERT models: initialized with BERT-Base (`cased_L-12_H-768_A-12`) or BioBERT (`BioBERT-Base v1.0 + PubMed 200K + PMC 270K`) & trained on either all MIMIC notes or only discharge summaries.\n"
]
},
{
"cell_type": "markdown",
"id": "1c92755e",
"metadata": {},
"source": [
"This model card describes the Bio+Discharge Summary BERT model, which was initialized from [BioBERT](https://arxiv.org/abs/1901.08746) & trained on only discharge summaries from MIMIC.\n"
]
},
{
"cell_type": "markdown",
"id": "068ba168",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "24e8b203",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4bcd1b84",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"emilyalsentzer/Bio_Discharge_Summary_BERT\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "0cebe09b",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/emilyalsentzer/Bio_Discharge_Summary_BERT](https://huggingface.co/emilyalsentzer/Bio_Discharge_Summary_BERT),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "a786c8f0",
"metadata": {},
"source": [
"# ClinicalBERT - Bio + Discharge Summary BERT Model\n"
]
},
{
"cell_type": "markdown",
"id": "4d8e4f1f",
"metadata": {},
"source": [
"The [Publicly Available Clinical BERT Embeddings](https://arxiv.org/abs/1904.03323) paper contains four unique clinicalBERT models: initialized with BERT-Base (`cased_L-12_H-768_A-12`) or BioBERT (`BioBERT-Base v1.0 + PubMed 200K + PMC 270K`) & trained on either all MIMIC notes or only discharge summaries.\n"
]
},
{
"cell_type": "markdown",
"id": "83bf8287",
"metadata": {},
"source": [
"This model card describes the Bio+Discharge Summary BERT model, which was initialized from [BioBERT](https://arxiv.org/abs/1901.08746) & trained on only discharge summaries from MIMIC.\n"
]
},
{
"cell_type": "markdown",
"id": "ee7d03ef",
"metadata": {},
"source": [
"## How to use the model\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3ef75bb2",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c04f99b3",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"emilyalsentzer/Bio_Discharge_Summary_BERT\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
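  {
   "cell_type": "markdown",
   "id": "5c1d7e8a",
   "metadata": {},
   "source": [
    "As above, the random ids only verify that the weights load. The sketch below encodes a short discharge-summary sentence with `AutoTokenizer` and runs the encoder on it; it assumes the matching tokenizer files are published with this community checkpoint and only exercises the encoder, so treat it as an illustration rather than a complete fill-mask pipeline.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6d2f9a1b",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "# load the tokenizer and encoder weights for this checkpoint\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"emilyalsentzer/Bio_Discharge_Summary_BERT\")\n",
    "model = AutoModel.from_pretrained(\"emilyalsentzer/Bio_Discharge_Summary_BERT\")\n",
    "model.eval()\n",
    "\n",
    "# encode a discharge-summary style sentence instead of random token ids\n",
    "encoded = tokenizer(\"Discharge diagnosis: community-acquired pneumonia, now resolved.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "\n",
    "with paddle.no_grad():\n",
    "    outputs = model(input_ids)\n",
    "print(outputs)"
   ]
  },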
{
"cell_type": "markdown",
"id": "e4459a1c",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/emilyalsentzer/Bio_Discharge_Summary_BERT](https://huggingface.co/emilyalsentzer/Bio_Discharge_Summary_BERT) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: c4
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info: Model_Info:
name: "google/t5-base-lm-adapt" description: Version 1.1 - LM-Adapted
description: "Version 1.1 - LM-Adapted" description_en: Version 1.1 - LM-Adapted
description_en: "Version 1.1 - LM-Adapted" from_repo: https://huggingface.co/google/t5-base-lm-adapt
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/google/t5-base-lm-adapt" name: google/t5-base-lm-adapt
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text2Text Generation"
sub_tag: "文本生成"
Example:
Datasets: "c4"
Publisher: "google"
License: "apache-2.0"
Language: "English"
Paper: Paper:
- title: 'GLU Variants Improve Transformer' - title: GLU Variants Improve Transformer
url: 'http://arxiv.org/abs/2002.05202v1' url: http://arxiv.org/abs/2002.05202v1
- title: 'Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer' - title: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
url: 'http://arxiv.org/abs/1910.10683v3' url: http://arxiv.org/abs/1910.10683v3
IfTraining: 0 Publisher: google
IfOnlineDemo: 0 Task:
\ No newline at end of file - sub_tag: 文本生成
sub_tag_en: Text2Text Generation
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "c59ed826",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1 - LM-Adapted\n",
"\n",
"\n",
"## Version 1.1 - LM-Adapted\n",
"\n",
"[T5 Version 1.1 - LM Adapted](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k) includes the following improvements compared to the original T5 model:\n",
"\n",
"- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"and is pretrained on both the denoising and language modeling objective.\n",
"\n",
"More specifically, this checkpoint is initialized from T5 Version 1.1 - Base\n",
"and then trained for an additional 100K steps on the LM objective discussed in the [T5 paper](https://arxiv.org/pdf/1910.10683.pdf).\n",
"This adaptation improves the ability of the model to be used for prompt tuning.\n",
"\n",
"**Note**: A popular fine-tuned version of the *T5 Version 1.1 - LM Adapted* model is BigScience's T0pp.\n",
"\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6db9a194",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b70ecb24",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-base-lm-adapt\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "46aee335",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/google/t5-base-lm-adapt](https://huggingface.co/google/t5-base-lm-adapt),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "35f226d7",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1 - LM-Adapted\n",
"\n",
"\n",
"## Version 1.1 - LM-Adapted\n",
"\n",
"[T5 Version 1.1 - LM Adapted](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k) includes the following improvements compared to the original T5 model:\n",
"\n",
"- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"and is pretrained on both the denoising and language modeling objective.\n",
"\n",
"More specifically, this checkpoint is initialized from T5 Version 1.1 - Base\n",
"and then trained for an additional 100K steps on the LM objective discussed in the [T5 paper](https://arxiv.org/pdf/1910.10683.pdf).\n",
"This adaptation improves the ability of the model to be used for prompt tuning.\n",
"\n",
"**Note**: A popular fine-tuned version of the *T5 Version 1.1 - LM Adapted* model is BigScience's T0pp.\n",
"\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b471855d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f74ec3ef",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-base-lm-adapt\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
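  {
   "cell_type": "markdown",
   "id": "7e3a5b9c",
   "metadata": {},
   "source": [
    "T5 is an encoder-decoder model, so the bare `AutoModel` forward above would normally also expect decoder inputs. Because this checkpoint was additionally trained on a language-modeling objective, a more natural demonstration is prefix continuation. The sketch below assumes the tokenizer files and the language-modeling head weights are included in the converted checkpoint; the generated ids are only an API illustration, and prompt tuning or a carefully designed prompt is still needed for good quality.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8f4b6c0d",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, T5ForConditionalGeneration\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"google/t5-base-lm-adapt\")\n",
    "model = T5ForConditionalGeneration.from_pretrained(\"google/t5-base-lm-adapt\")\n",
    "model.eval()\n",
    "\n",
    "# the LM-adapted checkpoint is trained to continue a prefix, so feed it a plain prompt\n",
    "encoded = tokenizer(\"Transfer learning in natural language processing is\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "\n",
    "# PaddleNLP's generate returns a (token_ids, scores) tuple; greedy decoding keeps the output deterministic\n",
    "ids, scores = model.generate(input_ids=input_ids, max_length=32, decode_strategy=\"greedy_search\")\n",
    "print(ids)"
   ]
  },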
{
"cell_type": "markdown",
"id": "e431d080",
"metadata": {},
"source": [
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/google/t5-base-lm-adapt](https://huggingface.co/google/t5-base-lm-adapt) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: c4
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info: Model_Info:
name: "google/t5-large-lm-adapt" description: Version 1.1 - LM-Adapted
description: "Version 1.1 - LM-Adapted" description_en: Version 1.1 - LM-Adapted
description_en: "Version 1.1 - LM-Adapted" from_repo: https://huggingface.co/google/t5-large-lm-adapt
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/google/t5-large-lm-adapt" name: google/t5-large-lm-adapt
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text2Text Generation"
sub_tag: "文本生成"
Example:
Datasets: "c4"
Publisher: "google"
License: "apache-2.0"
Language: "English"
Paper: Paper:
- title: 'GLU Variants Improve Transformer' - title: GLU Variants Improve Transformer
url: 'http://arxiv.org/abs/2002.05202v1' url: http://arxiv.org/abs/2002.05202v1
- title: 'Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer' - title: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
url: 'http://arxiv.org/abs/1910.10683v3' url: http://arxiv.org/abs/1910.10683v3
IfTraining: 0 Publisher: google
IfOnlineDemo: 0 Task:
\ No newline at end of file - sub_tag: 文本生成
sub_tag_en: Text2Text Generation
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "71352026",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1 - LM-Adapted\n",
"\n",
"\n",
"## Version 1.1 - LM-Adapted\n",
"\n",
"[T5 Version 1.1 - LM Adapted](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k) includes the following improvements compared to the original [T5 model](https://huggingface.co/t5-large):\n",
"\n",
"- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"and is pretrained on both the denoising and language modeling objective.\n",
"\n",
"More specifically, this checkpoint is initialized from T5 Version 1.1 - Large\n",
"and then trained for an additional 100K steps on the LM objective discussed in the [T5 paper](https://arxiv.org/pdf/1910.10683.pdf).\n",
"This adaptation improves the ability of the model to be used for prompt tuning.\n",
"\n",
"**Note**: A popular fine-tuned version of the *T5 Version 1.1 - LM Adapted* model is BigScience's T0pp.\n",
"\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)"
]
},
{
"cell_type": "markdown",
"id": "d41870db",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b9451fd3",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6392031c",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-large-lm-adapt\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "3b35551f",
"metadata": {},
"source": [
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/google/t5-large-lm-adapt](https://huggingface.co/google/t5-large-lm-adapt),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "70cb903f",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1 - LM-Adapted\n",
"\n",
"\n",
"## Version 1.1 - LM-Adapted\n",
"\n",
"[T5 Version 1.1 - LM Adapted](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k) includes the following improvements compared to the original [T5 model](https://huggingface.co/t5-large):\n",
"\n",
"- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"and is pretrained on both the denoising and language modeling objective.\n",
"\n",
"More specifically, this checkpoint is initialized from T5 Version 1.1 - Large\n",
"and then trained for an additional 100K steps on the LM objective discussed in the [T5 paper](https://arxiv.org/pdf/1910.10683.pdf).\n",
"This adaptation improves the ability of the model to be used for prompt tuning.\n",
"\n",
"**Note**: A popular fine-tuned version of the *T5 Version 1.1 - LM Adapted* model is BigScience's T0pp.\n",
"\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)"
]
},
{
"cell_type": "markdown",
"id": "cab0c1ea",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e37976b5",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0673661a",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-large-lm-adapt\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
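  {
   "cell_type": "markdown",
   "id": "9a5c7d1e",
   "metadata": {},
   "source": [
    "Since this is an encoder-decoder checkpoint adapted with a language-modeling objective, prefix continuation is a more typical use than the plain encoder forward above. The sketch below assumes the tokenizer and LM-head weights are available for this converted checkpoint and is an API illustration only; output quality still depends on prompt design or prompt tuning.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0b6d8e2f",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, T5ForConditionalGeneration\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"google/t5-large-lm-adapt\")\n",
    "model = T5ForConditionalGeneration.from_pretrained(\"google/t5-large-lm-adapt\")\n",
    "model.eval()\n",
    "\n",
    "# continue a plain-text prefix with greedy decoding\n",
    "encoded = tokenizer(\"The goal of prompt tuning is\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "\n",
    "ids, scores = model.generate(input_ids=input_ids, max_length=32, decode_strategy=\"greedy_search\")\n",
    "print(ids)"
   ]
  },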
{
"cell_type": "markdown",
"id": "7b24e77f",
"metadata": {},
"source": [
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/google/t5-large-lm-adapt](https://huggingface.co/google/t5-large-lm-adapt) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# Model List
## google/t5-large-ssm
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|google/t5-large-ssm| | 3.12G | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/google/t5-large-ssm/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/google/t5-large-ssm/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/google/t5-large-ssm/tokenizer_config.json) |
You can also download the model weights with the `paddlenlp` CLI tool, as follows:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the weights from the command line
```shell
paddlenlp download --cache-dir ./pretrained_models google/t5-large-ssm
```
For any download problems, please open an issue at [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP).
\ No newline at end of file
# model list
## google/t5-large-ssm
| model | description | model_size | download |
| --- | --- | --- | --- |
|google/t5-large-ssm| | 3.12G | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/google/t5-large-ssm/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/google/t5-large-ssm/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/google/t5-large-ssm/tokenizer_config.json) |
Alternatively, you can download all of the model files with the following steps:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models google/t5-large-ssm
```
If you have any problems, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
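After the download finishes, the weights can be loaded with PaddleNLP's `AutoModel`. The sketch below is only an illustration: passing the checkpoint name lets PaddleNLP locate or fetch the files automatically, and starting the decoder from token id `0` (T5's pad token) is an assumption based on the usual T5 convention rather than part of this model card.
```python
import paddle
from paddlenlp.transformers import AutoModel

model = AutoModel.from_pretrained("google/t5-large-ssm")
model.eval()

# T5 is an encoder-decoder model, so a decoder start token is needed as well
input_ids = paddle.randint(100, 200, shape=[1, 20])
decoder_input_ids = paddle.to_tensor([[0]])  # assumption: 0 is T5's pad / decoder-start token id
print(model(input_ids=input_ids, decoder_input_ids=decoder_input_ids))
```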
Model_Info:
name: "google/t5-large-ssm"
description: "Abstract"
description_en: "Abstract"
icon: ""
from_repo: "https://huggingface.co/google/t5-large-ssm"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text2Text Generation"
sub_tag: "文本生成"
Example:
Datasets: "c4,wikipedia"
Publisher: "google"
License: "apache-2.0"
Language: "English"
Paper:
- title: 'REALM: Retrieval-Augmented Language Model Pre-Training'
url: 'http://arxiv.org/abs/2002.08909v1'
- title: 'Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer'
url: 'http://arxiv.org/abs/1910.10683v3'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
Datasets: c4
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info: Model_Info:
name: "google/t5-small-lm-adapt" description: Version 1.1 - LM-Adapted
description: "Version 1.1 - LM-Adapted" description_en: Version 1.1 - LM-Adapted
description_en: "Version 1.1 - LM-Adapted" from_repo: https://huggingface.co/google/t5-small-lm-adapt
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/google/t5-small-lm-adapt" name: google/t5-small-lm-adapt
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text2Text Generation"
sub_tag: "文本生成"
Example:
Datasets: "c4"
Publisher: "google"
License: "apache-2.0"
Language: "English"
Paper: Paper:
- title: 'GLU Variants Improve Transformer' - title: GLU Variants Improve Transformer
url: 'http://arxiv.org/abs/2002.05202v1' url: http://arxiv.org/abs/2002.05202v1
- title: 'Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer' - title: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
url: 'http://arxiv.org/abs/1910.10683v3' url: http://arxiv.org/abs/1910.10683v3
IfTraining: 0 Publisher: google
IfOnlineDemo: 0 Task:
\ No newline at end of file - sub_tag: 文本生成
sub_tag_en: Text2Text Generation
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "9dca1445",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1 - LM-Adapted\n",
"\n",
"\n",
"## Version 1.1 - LM-Adapted\n",
"\n",
"[T5 Version 1.1 - LM Adapted](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k) includes the following improvements compared to the original [T5 model](https://huggingface.co/t5-small):\n",
"\n",
"- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"and is pretrained on both the denoising and language modeling objective.\n",
"\n",
"More specifically, this checkpoint is initialized from T5 Version 1.1 - Small\n",
"and then trained for an additional 100K steps on the LM objective discussed in the [T5 paper](https://arxiv.org/pdf/1910.10683.pdf).\n",
"This adaptation improves the ability of the model to be used for prompt tuning.\n",
"\n",
"**Note**: A popular fine-tuned version of the *T5 Version 1.1 - LM Adapted* model is BigScience's T0pp.\n",
"\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)\n"
]
},
{
"cell_type": "markdown",
"id": "4c63de98",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8030fcb4",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8f6f14dd",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-small-lm-adapt\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "b8dd698b",
"metadata": {},
"source": [
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/google/t5-small-lm-adapt](https://huggingface.co/google/t5-small-lm-adapt),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "42de6200",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1 - LM-Adapted\n",
"\n",
"\n",
"## Version 1.1 - LM-Adapted\n",
"\n",
"[T5 Version 1.1 - LM Adapted](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k) includes the following improvements compared to the original [T5 model](https://huggingface.co/t5-small):\n",
"\n",
"- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"and is pretrained on both the denoising and language modeling objective.\n",
"\n",
"More specifically, this checkpoint is initialized from [T5 Version 1.1 - Small](https://huggingface.co/google/https://huggingface.co/google/t5-v1_1-small)\n",
"and then trained for an additional 100K steps on the LM objective discussed in the [T5 paper](https://arxiv.org/pdf/1910.10683.pdf).\n",
"This adaptation improves the ability of the model to be used for prompt tuning.\n",
"\n",
"**Note**: A popular fine-tuned version of the *T5 Version 1.1 - LM Adapted* model is BigScience's T0pp.\n",
"\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)\n"
]
},
{
"cell_type": "markdown",
"id": "39071317",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "31a774de",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a1cd7a50",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-small-lm-adapt\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
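  {
   "cell_type": "markdown",
   "id": "1c7e9f3a",
   "metadata": {},
   "source": [
    "As with the other LM-adapted checkpoints, this encoder-decoder model is more naturally exercised with prefix continuation than with a bare encoder forward. The sketch below assumes the tokenizer and LM-head weights are available for this converted checkpoint; it only illustrates the generation API.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2d8f0a4b",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, T5ForConditionalGeneration\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"google/t5-small-lm-adapt\")\n",
    "model = T5ForConditionalGeneration.from_pretrained(\"google/t5-small-lm-adapt\")\n",
    "model.eval()\n",
    "\n",
    "# continue a plain-text prefix with greedy decoding\n",
    "encoded = tokenizer(\"A short summary of the T5 paper is\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "\n",
    "ids, scores = model.generate(input_ids=input_ids, max_length=32, decode_strategy=\"greedy_search\")\n",
    "print(ids)"
   ]
  },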
{
"cell_type": "markdown",
"id": "4d283d3d",
"metadata": {},
"source": [
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/google/t5-small-lm-adapt](https://huggingface.co/google/t5-small-lm-adapt) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: c4
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info: Model_Info:
name: "google/t5-v1_1-base" description: Version 1.1
description: "Version 1.1" description_en: Version 1.1
description_en: "Version 1.1" from_repo: https://huggingface.co/google/t5-v1_1-base
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/google/t5-v1_1-base" name: google/t5-v1_1-base
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text2Text Generation"
sub_tag: "文本生成"
Example:
Datasets: "c4"
Publisher: "google"
License: "apache-2.0"
Language: "English"
Paper: Paper:
- title: 'GLU Variants Improve Transformer' - title: GLU Variants Improve Transformer
url: 'http://arxiv.org/abs/2002.05202v1' url: http://arxiv.org/abs/2002.05202v1
- title: 'Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer' - title: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
url: 'http://arxiv.org/abs/1910.10683v3' url: http://arxiv.org/abs/1910.10683v3
IfTraining: 0 Publisher: google
IfOnlineDemo: 0 Task:
\ No newline at end of file - sub_tag: 文本生成
sub_tag_en: Text2Text Generation
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "83a0cdfd",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1\n",
"\n",
"\n",
"## Version 1.1\n",
"\n",
"[T5 Version 1.1](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md#t511) includes the following improvements compared to the original T5 model- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"**Note**: T5 Version 1.1 was only pre-trained on C4 excluding any supervised training. Therefore, this model has to be fine-tuned before it is useable on a downstream task.\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)"
]
},
{
"cell_type": "markdown",
"id": "6196f74c",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6c13cb82",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1d62f626",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-v1_1-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "016545f2",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/google/t5-v1_1-base](https://huggingface.co/google/t5-v1_1-base),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "2656a571",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1\n",
"\n",
"\n",
"## Version 1.1\n",
"\n",
"[T5 Version 1.1](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md#t511) includes the following improvements compared to the original T5 model- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"**Note**: T5 Version 1.1 was only pre-trained on C4 excluding any supervised training. Therefore, this model has to be fine-tuned before it is useable on a downstream task.\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)"
]
},
{
"cell_type": "markdown",
"id": "c0cd9f02",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9323615d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0b9994b3",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-v1_1-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
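  {
   "cell_type": "markdown",
   "id": "3e9a1b5c",
   "metadata": {},
   "source": [
    "Because T5 is an encoder-decoder model, a full forward pass needs decoder inputs in addition to the encoder inputs. The sketch below starts the decoder from the pad token, which is T5's conventional decoder start token, and assumes the tokenizer files for this checkpoint are available. Since T5 Version 1.1 is released without any supervised fine-tuning, the outputs are raw hidden states and are shown for illustration only.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4f0b2c6d",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"google/t5-v1_1-base\")\n",
    "model = AutoModel.from_pretrained(\"google/t5-v1_1-base\")\n",
    "model.eval()\n",
    "\n",
    "# encoder input: the text to condition on\n",
    "encoded = tokenizer(\"The quick brown fox jumps over the lazy dog.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "\n",
    "# decoder input: start from the pad token, T5's conventional decoder start token\n",
    "decoder_input_ids = paddle.to_tensor([[tokenizer.pad_token_id]])\n",
    "\n",
    "with paddle.no_grad():\n",
    "    outputs = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids)\n",
    "print(outputs)"
   ]
  },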
{
"cell_type": "markdown",
"id": "8daa264b",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/google/t5-v1_1-base](https://huggingface.co/google/t5-v1_1-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: c4
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info: Model_Info:
name: "google/t5-v1_1-large" description: Version 1.1
description: "Version 1.1" description_en: Version 1.1
description_en: "Version 1.1" from_repo: https://huggingface.co/google/t5-v1_1-large
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/google/t5-v1_1-large" name: google/t5-v1_1-large
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text2Text Generation"
sub_tag: "文本生成"
Example:
Datasets: "c4"
Publisher: "google"
License: "apache-2.0"
Language: "English"
Paper: Paper:
- title: 'GLU Variants Improve Transformer' - title: GLU Variants Improve Transformer
url: 'http://arxiv.org/abs/2002.05202v1' url: http://arxiv.org/abs/2002.05202v1
- title: 'Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer' - title: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
url: 'http://arxiv.org/abs/1910.10683v3' url: http://arxiv.org/abs/1910.10683v3
IfTraining: 0 Publisher: google
IfOnlineDemo: 0 Task:
\ No newline at end of file - sub_tag: 文本生成
sub_tag_en: Text2Text Generation
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "11d36429",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1\n",
"\n",
"\n",
"## Version 1.1\n",
"\n",
"[T5 Version 1.1](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md#t511) includes the following improvements compared to the original T5 model- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"**Note**: T5 Version 1.1 was only pre-trained on C4 excluding any supervised training. Therefore, this model has to be fine-tuned before it is useable on a downstream task.\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)"
]
},
{
"cell_type": "markdown",
"id": "c11ad8cf",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "480104f2",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e1323ff9",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-v1_1-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "4348828e",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/google/t5-v1_1-large](https://huggingface.co/google/t5-v1_1-large),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "5f0c769f",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1\n",
"\n",
"\n",
"## Version 1.1\n",
"\n",
"[T5 Version 1.1](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md#t511) includes the following improvements compared to the original T5 model- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"**Note**: T5 Version 1.1 was only pre-trained on C4 excluding any supervised training. Therefore, this model has to be fine-tuned before it is useable on a downstream task.\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)"
]
},
{
"cell_type": "markdown",
"id": "27e206b9",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3cf23148",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "467b7ff7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-v1_1-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
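  {
   "cell_type": "markdown",
   "id": "5a1c3d7e",
   "metadata": {},
   "source": [
    "As with the base variant, a full forward pass through this encoder-decoder checkpoint also needs decoder inputs. The sketch below starts the decoder from the pad token (T5's conventional decoder start token) and assumes the tokenizer files for this checkpoint are available; because the model is not fine-tuned, the outputs are raw hidden states shown for illustration only.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6b2d4e8f",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"google/t5-v1_1-large\")\n",
    "model = AutoModel.from_pretrained(\"google/t5-v1_1-large\")\n",
    "model.eval()\n",
    "\n",
    "# encoder input: the text to condition on\n",
    "encoded = tokenizer(\"PaddleNLP makes it easy to run transformer models.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "\n",
    "# decoder input: start from the pad token, T5's conventional decoder start token\n",
    "decoder_input_ids = paddle.to_tensor([[tokenizer.pad_token_id]])\n",
    "\n",
    "with paddle.no_grad():\n",
    "    outputs = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids)\n",
    "print(outputs)"
   ]
  },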
{
"cell_type": "markdown",
"id": "a5616bca",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/google/t5-v1_1-large](https://huggingface.co/google/t5-v1_1-large) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: c4
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: Version 1.1
  description_en: Version 1.1
  from_repo: https://huggingface.co/google/t5-v1_1-small
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: google/t5-v1_1-small
Paper:
- title: GLU Variants Improve Transformer
  url: http://arxiv.org/abs/2002.05202v1
- title: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
  url: http://arxiv.org/abs/1910.10683v3
Publisher: google
Task:
- sub_tag: 文本生成
  sub_tag_en: Text2Text Generation
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "51d7e9ca",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1\n",
"\n",
"\n",
"## Version 1.1\n",
"\n",
"[T5 Version 1.1](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md#t511) includes the following improvements compared to the original T5 model- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"**Note**: T5 Version 1.1 was only pre-trained on C4 excluding any supervised training. Therefore, this model has to be fine-tuned before it is useable on a downstream task.\n",
"Pretraining Dataset: [C4](https://huggingface.co/datasets/c4)\n",
"\n",
"Other Community Checkpoints: [here](https://huggingface.co/models?search=t5-v1_1)\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)\n"
]
},
{
"cell_type": "markdown",
"id": "b4b5fc59",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ae31cbc9",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "81d25d09",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-v1_1-small\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "6e0459f7",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/google/t5-v1_1-small](https://huggingface.co/google/t5-v1_1-small),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "95b64b6f",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1\n",
"\n",
"\n",
"## Version 1.1\n",
"\n",
"[T5 Version 1.1](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md#t511) includes the following improvements compared to the original T5 model- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"**Note**: T5 Version 1.1 was only pre-trained on C4 excluding any supervised training. Therefore, this model has to be fine-tuned before it is useable on a downstream task.\n",
"Pretraining Dataset: [C4](https://huggingface.co/datasets/c4)\n",
"\n",
"Other Community Checkpoints: [here](https://huggingface.co/models?search=t5-v1_1)\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)\n"
]
},
{
"cell_type": "markdown",
"id": "88ec53f3",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "082cae7f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "05f7f4d0",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-v1_1-small\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "c7a95cdf",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/google/t5-v1_1-small](https://huggingface.co/google/t5-v1_1-small) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Chinese
License: apache-2.0
Model_Info:
  description: Chinese BERT with Whole Word Masking
  description_en: Chinese BERT with Whole Word Masking
  from_repo: https://huggingface.co/hfl/chinese-bert-wwm-ext
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: hfl/chinese-bert-wwm-ext
Paper:
- title: Pre-Training with Whole Word Masking for Chinese BERT
  url: http://arxiv.org/abs/1906.08101v3
- title: Revisiting Pre-Trained Models for Chinese Natural Language Processing
  url: http://arxiv.org/abs/2004.13922v2
Publisher: hfl
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a5e1e8bd",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
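{
"cell_type": "markdown",
"id": "7b8c9dab",
"metadata": {},
"source": [
"As a toy illustration of the whole word masking idea above (not the actual pre-training code), the next cell masks every word piece of a randomly selected word instead of masking pieces independently. The word grouping is made up for this example; in the real setup it would come from word segmentation or WordPiece offsets."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ab9d8c7b",
"metadata": {},
"outputs": [],
"source": [
"import random\n",
"\n",
"def whole_word_mask(word_groups, mask_token=\"[MASK]\", prob=0.15):\n",
"    # word_groups: one list of word pieces per word, e.g. [[\"phil\", \"##am\", \"##mon\"], [\"likes\"]]\n",
"    masked = []\n",
"    for pieces in word_groups:\n",
"        if random.random() < prob:\n",
"            masked.extend([mask_token] * len(pieces))  # mask every piece of the chosen word\n",
"        else:\n",
"            masked.extend(pieces)\n",
"    return masked\n",
"\n",
"print(whole_word_mask([[\"phil\", \"##am\", \"##mon\"], [\"likes\"], [\"playing\"], [\"music\"]]))"
]
},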
{
"cell_type": "markdown",
"id": "be498a8f",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0199d11d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b71b0698",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/chinese-bert-wwm-ext\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "5d6bd99f",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "456616b7",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "9784d9b7",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "15ed9adf",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "3593ecc9",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/hfl/chinese-bert-wwm-ext](https://huggingface.co/hfl/chinese-bert-wwm-ext),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "faeb5f50",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "fbf98c0e",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f6b3ac7",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f380cab7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/chinese-bert-wwm-ext\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "a39bca7c",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "5cff4b49",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "a8781cbe",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "b7acc10f",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "86de1995",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/hfl/chinese-bert-wwm-ext](https://huggingface.co/hfl/chinese-bert-wwm-ext) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Chinese
License: apache-2.0
Model_Info:
  description: Chinese BERT with Whole Word Masking
  description_en: Chinese BERT with Whole Word Masking
  from_repo: https://huggingface.co/hfl/chinese-bert-wwm
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: hfl/chinese-bert-wwm
Paper:
- title: Pre-Training with Whole Word Masking for Chinese BERT
  url: http://arxiv.org/abs/1906.08101v3
- title: Revisiting Pre-Trained Models for Chinese Natural Language Processing
  url: http://arxiv.org/abs/2004.13922v2
Publisher: hfl
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a5e1e8bd",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "be498a8f",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0199d11d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b71b0698",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/chinese-bert-wwm\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "5d6bd99f",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "376186df",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "9784d9b7",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "478fe6be",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "3593ecc9",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/hfl/chinese-bert-wwm](https://huggingface.co/hfl/chinese-bert-wwm),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "faeb5f50",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "fbf98c0e",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f6b3ac7",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f380cab7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/chinese-bert-wwm\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "a39bca7c",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "0ebe185e",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "a8781cbe",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "85d2437a",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "86de1995",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/hfl/chinese-bert-wwm-ext](https://huggingface.co/hfl/chinese-bert-wwm-ext) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Chinese
License: apache-2.0
Model_Info:
  description: Please use 'Bert' related functions to load this model!
  description_en: Please use 'Bert' related functions to load this model!
  from_repo: https://huggingface.co/hfl/chinese-roberta-wwm-ext-large
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: hfl/chinese-roberta-wwm-ext-large
Paper:
- title: Pre-Training with Whole Word Masking for Chinese BERT
  url: http://arxiv.org/abs/1906.08101v3
- title: Revisiting Pre-Trained Models for Chinese Natural Language Processing
  url: http://arxiv.org/abs/2004.13922v2
Publisher: hfl
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a5e1e8bd",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"\n",
"### Please use 'Bert' related functions to load this model!\n",
"\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "be498a8f",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0199d11d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b71b0698",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/chinese-roberta-wwm-ext-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "5d6bd99f",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "9429c396",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "9784d9b7",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "eb3e56a1",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "3593ecc9",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/hfl/chinese-roberta-wwm-ext-large](https://huggingface.co/hfl/chinese-roberta-wwm-ext-large),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "faeb5f50",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"\n",
"### Please use 'Bert' related functions to load this model!\n",
"\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "fbf98c0e",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f6b3ac7",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f380cab7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/chinese-roberta-wwm-ext-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "a39bca7c",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "b01c1973",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "a8781cbe",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "7ad8a810",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "86de1995",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/hfl/chinese-roberta-wwm-ext-large](https://huggingface.co/hfl/chinese-roberta-wwm-ext-large) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Chinese
License: apache-2.0
Model_Info:
  description: Please use 'Bert' related functions to load this model!
  description_en: Please use 'Bert' related functions to load this model!
  from_repo: https://huggingface.co/hfl/chinese-roberta-wwm-ext
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: hfl/chinese-roberta-wwm-ext
Paper:
- title: Pre-Training with Whole Word Masking for Chinese BERT
  url: http://arxiv.org/abs/1906.08101v3
- title: Revisiting Pre-Trained Models for Chinese Natural Language Processing
  url: http://arxiv.org/abs/2004.13922v2
Publisher: hfl
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a5e1e8bd",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "be498a8f",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0199d11d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b71b0698",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/chinese-roberta-wwm-ext\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "5d6bd99f",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "737822b2",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "9784d9b7",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "22d0c28d",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "3593ecc9",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/hfl/chinese-roberta-wwm-ext](https://huggingface.co/hfl/chinese-roberta-wwm-ext),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "faeb5f50",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "fbf98c0e",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f6b3ac7",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f380cab7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/chinese-roberta-wwm-ext\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "a39bca7c",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "f495aec9",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "a8781cbe",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "8eebfbf4",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "86de1995",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/hfl/chinese-roberta-wwm-ext](https://huggingface.co/hfl/chinese-roberta-wwm-ext) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Chinese
License: apache-2.0
Model_Info:
  description: This is a re-trained 3-layer RoBERTa-wwm-ext model.
  description_en: This is a re-trained 3-layer RoBERTa-wwm-ext model.
  from_repo: https://huggingface.co/hfl/rbt3
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: hfl/rbt3
Paper:
- title: Pre-Training with Whole Word Masking for Chinese BERT
  url: http://arxiv.org/abs/1906.08101v3
- title: Revisiting Pre-Trained Models for Chinese Natural Language Processing
  url: http://arxiv.org/abs/2004.13922v2
Publisher: hfl
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a5e1e8bd",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"\n",
"### Please use 'Bert' related functions to load this model!\n",
"\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "be498a8f",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0199d11d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b71b0698",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/rbt3\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "5d6bd99f",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "73e04675",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "9784d9b7",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "068895c6",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "3593ecc9",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/hfl/rbt3](https://huggingface.co/hfl/rbt3),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "faeb5f50",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"\n",
"### Please use 'Bert' related functions to load this model!\n",
"\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "fbf98c0e",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f6b3ac7",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f380cab7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/rbt3\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "a39bca7c",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "370bfe67",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "a8781cbe",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "4a1fe5aa",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "86de1995",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/hfl/rbt3](https://huggingface.co/hfl/rbt3) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: bookcorpus,wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: BERT base model (cased)
  description_en: BERT base model (cased)
  from_repo: https://huggingface.co/bert-base-cased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: bert-base-cased
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: huggingface
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "bb008e6f",
"metadata": {},
"source": [
"# BERT base model (cased)\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "079266fb",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is case-sensitive: it makes a difference between\n",
"english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "5c8220aa",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "8564477f",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "7365685d",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "1c979d12",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "cdc00722",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "9253a517",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9fbfcd0a",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6185db74",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
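{
"cell_type": "markdown",
"id": "added-demo-md-14166",
"metadata": {},
"source": [
"The cell above only pushes random token ids through the encoder to confirm that the weights load. As a minimal sketch of a more realistic call (the sentence and tokenizer usage below are illustrative assumptions, reusing the `model` object created above with PaddleNLP's `AutoTokenizer`), you can encode a real sentence and inspect the encoder outputs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "added-demo-code-14166",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (assumption): tokenize a real sentence and reuse the `model`\n",
"# object from the previous cell instead of feeding random ids.\n",
"import paddle\n",
"from paddlenlp.transformers import AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-base-cased\")\n",
"encoded = tokenizer(\"Paris is the capital of France.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"# BertModel returns (sequence_output, pooled_output); the exact return format\n",
"# may vary across PaddleNLP versions.\n",
"sequence_output, pooled_output = model(input_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},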
{
"cell_type": "markdown",
"id": "8e0ca3bd",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "f14e9f06",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=bert-base-cased\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"> 此模型介绍及权重来源于 https://huggingface.co/bert-base-cased ,并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "58235e68",
"metadata": {},
"source": [
"# BERT base model (cased)\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "36c7d585",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is case-sensitive: it makes a difference between\n",
"english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "d361a880",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "47b0cf99",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "d1911491",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "94e45c66",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "9fec6197",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "5e17ee3b",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "62ae31d8",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3c52bdd5",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
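{
"cell_type": "markdown",
"id": "added-demo-md-14336",
"metadata": {},
"source": [
"The cell above only runs random token ids through the encoder as a smoke test. As a minimal sketch of a more realistic call (the sentence and tokenizer usage below are illustrative assumptions, reusing the `model` object created above with PaddleNLP's `AutoTokenizer`), you can encode a real sentence and inspect the encoder outputs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "added-demo-code-14336",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (assumption): tokenize a real sentence and reuse the `model`\n",
"# object from the previous cell instead of feeding random ids.\n",
"import paddle\n",
"from paddlenlp.transformers import AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-base-cased\")\n",
"encoded = tokenizer(\"BERT makes a difference between english and English.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"# BertModel returns (sequence_output, pooled_output); the exact return format\n",
"# may vary across PaddleNLP versions.\n",
"sequence_output, pooled_output = model(input_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},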
{
"cell_type": "markdown",
"id": "da7c4875",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "86873e48",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=bert-base-cased\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"\n",
"> The model introduction and model weights originate from https://huggingface.co/bert-base-cased and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: German
License: mit
Model_Info: Model_Info:
name: "bert-base-german-cased" description: German BERT
description: "German BERT" description_en: German BERT
description_en: "German BERT" from_repo: https://huggingface.co/bert-base-german-cased
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/bert-base-german-cased" name: bert-base-german-cased
Paper: null
Publisher: huggingface
Task: Task:
- tag_en: "Natural Language Processing" - sub_tag: 槽位填充
tag: "自然语言处理" sub_tag_en: Fill-Mask
sub_tag_en: "Fill-Mask" tag: 自然语言处理
sub_tag: "槽位填充" tag_en: Natural Language Processing
Example:
Datasets: ""
Publisher: "huggingface"
License: "mit"
Language: "German"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
{
"cells": [
{
"cell_type": "markdown",
"id": "0870a629",
"metadata": {},
"source": [
"# German BERT\n",
"![bert_image](https://static.tildacdn.com/tild6438-3730-4164-b266-613634323466/german_bert.png)\n",
"## Overview\n",
"**Language model:** bert-base-cased\n",
"**Language:** German\n",
"**Training data:** Wiki, OpenLegalData, News (~ 12GB)\n",
"**Eval data:** Conll03 (NER), GermEval14 (NER), GermEval18 (Classification), GNAD (Classification)\n",
"**Infrastructure**: 1x TPU v2\n",
"**Published**: Jun 14th, 2019\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "b2a6c897",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1790135e",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "99c714ac",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-german-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
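{
"cell_type": "markdown",
"id": "added-demo-md-14479",
"metadata": {},
"source": [
"The cell above only runs random token ids through the encoder as a smoke test. As a minimal sketch of a more realistic call (the German sentence and tokenizer usage below are illustrative assumptions, reusing the `model` object created above with PaddleNLP's `AutoTokenizer`), you can encode a real sentence and inspect the encoder outputs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "added-demo-code-14479",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (assumption): tokenize a real German sentence and reuse the\n",
"# `model` object from the previous cell instead of feeding random ids.\n",
"import paddle\n",
"from paddlenlp.transformers import AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-base-german-cased\")\n",
"encoded = tokenizer(\"Berlin ist die Hauptstadt von Deutschland.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"# BertModel returns (sequence_output, pooled_output); the exact return format\n",
"# may vary across PaddleNLP versions.\n",
"sequence_output, pooled_output = model(input_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},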
{
"cell_type": "markdown",
"id": "54c2f398",
"metadata": {},
"source": [
"## Authors\n",
"- Branden Chan: `branden.chan [at] deepset.ai`\n",
"- Timo Möller: `timo.moeller [at] deepset.ai`\n",
"- Malte Pietsch: `malte.pietsch [at] deepset.ai`\n",
"- Tanay Soni: `tanay.soni [at] deepset.ai`\n"
]
},
{
"cell_type": "markdown",
"id": "94b669bc",
"metadata": {},
"source": [
"## About us\n",
"![deepset logo](https://raw.githubusercontent.com/deepset-ai/FARM/master/docs/img/deepset_logo.png)\n"
]
},
{
"cell_type": "markdown",
"id": "ce90710a",
"metadata": {},
"source": [
"We bring NLP to the industry via open source!\n",
"Our focus: Industry specific language models & large scale QA systems.\n"
]
},
{
"cell_type": "markdown",
"id": "5dc8ba63",
"metadata": {},
"source": [
"Some of our work:\n",
"- [German BERT (aka \"bert-base-german-cased\")](https://deepset.ai/german-bert)\n",
"- [FARM](https://github.com/deepset-ai/FARM)\n",
"- [Haystack](https://github.com/deepset-ai/haystack/)\n"
]
},
{
"cell_type": "markdown",
"id": "56a1a360",
"metadata": {},
"source": [
"Get in touch:\n",
"[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Website](https://deepset.ai)\n",
"\n",
"> 此模型介绍及权重来源于 https://huggingface.co/bert-base-german-cased ,并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "7aa268f7",
"metadata": {},
"source": [
"# German BERT\n",
"![bert_image](https://static.tildacdn.com/tild6438-3730-4164-b266-613634323466/german_bert.png)\n",
"## Overview\n",
"**Language model:** bert-base-cased\n",
"**Language:** German\n",
"**Training data:** Wiki, OpenLegalData, News (~ 12GB)\n",
"**Eval data:** Conll03 (NER), GermEval14 (NER), GermEval18 (Classification), GNAD (Classification)\n",
"**Infrastructure**: 1x TPU v2\n",
"**Published**: Jun 14th, 2019\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "f407e80e",
"metadata": {},
"source": [
"**Update April 3rd, 2020**: we updated the vocabulary file on deepset's s3 to conform with the default tokenization of punctuation tokens.\n",
"For details see the related [FARM issue](https://github.com/deepset-ai/FARM/issues/60). If you want to use the old vocab we have also uploaded a deepset/bert-base-german-cased-oldvocab model.\n"
]
},
{
"cell_type": "markdown",
"id": "18d2ad8e",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b80052bd",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4ea9d4e3",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-german-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
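{
"cell_type": "markdown",
"id": "added-demo-md-14616",
"metadata": {},
"source": [
"The cell above only runs random token ids through the encoder as a smoke test. As a minimal sketch of a more realistic call (the German sentence and tokenizer usage below are illustrative assumptions, reusing the `model` object created above with PaddleNLP's `AutoTokenizer`), you can encode a real sentence and inspect the encoder outputs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "added-demo-code-14616",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (assumption): tokenize a real German sentence and reuse the\n",
"# `model` object from the previous cell instead of feeding random ids.\n",
"import paddle\n",
"from paddlenlp.transformers import AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-base-german-cased\")\n",
"encoded = tokenizer(\"Berlin ist die Hauptstadt von Deutschland.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"# BertModel returns (sequence_output, pooled_output); the exact return format\n",
"# may vary across PaddleNLP versions.\n",
"sequence_output, pooled_output = model(input_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},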
{
"cell_type": "markdown",
"id": "9d560e75",
"metadata": {},
"source": [
"## Authors\n",
"- Branden Chan: `branden.chan [at] deepset.ai`\n",
"- Timo Möller: `timo.moeller [at] deepset.ai`\n",
"- Malte Pietsch: `malte.pietsch [at] deepset.ai`\n",
"- Tanay Soni: `tanay.soni [at] deepset.ai`\n"
]
},
{
"cell_type": "markdown",
"id": "a0e43273",
"metadata": {},
"source": [
"## About us\n",
"![deepset logo](https://raw.githubusercontent.com/deepset-ai/FARM/master/docs/img/deepset_logo.png)\n"
]
},
{
"cell_type": "markdown",
"id": "c1b05e60",
"metadata": {},
"source": [
"We bring NLP to the industry via open source!\n",
"Our focus: Industry specific language models & large scale QA systems.\n"
]
},
{
"cell_type": "markdown",
"id": "5196bee9",
"metadata": {},
"source": [
"Some of our work:\n",
"- [German BERT (aka \"bert-base-german-cased\")](https://deepset.ai/german-bert)\n",
"- [FARM](https://github.com/deepset-ai/FARM)\n",
"- [Haystack](https://github.com/deepset-ai/haystack/)\n"
]
},
{
"cell_type": "markdown",
"id": "18fe01d5",
"metadata": {},
"source": [
"Get in touch:\n",
"[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Website](https://deepset.ai)\n",
"\n",
"> The model introduction and model weights originate from https://huggingface.co/bert-base-german-cased and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info: Model_Info:
name: "bert-base-multilingual-cased" description: BERT multilingual base model (cased)
description: "BERT multilingual base model (cased)" description_en: BERT multilingual base model (cased)
description_en: "BERT multilingual base model (cased)" from_repo: https://huggingface.co/bert-base-multilingual-cased
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/bert-base-multilingual-cased" name: bert-base-multilingual-cased
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: "wikipedia"
Publisher: "huggingface"
License: "apache-2.0"
Language: ""
Paper: Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding' - title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
url: 'http://arxiv.org/abs/1810.04805v2' url: http://arxiv.org/abs/1810.04805v2
IfTraining: 0 Publisher: huggingface
IfOnlineDemo: 0 Task:
\ No newline at end of file - sub_tag: 槽位填充
sub_tag_en: Fill-Mask
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "92e18984",
"metadata": {},
"source": [
"# BERT multilingual base model (cased)\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "cc38bad3",
"metadata": {},
"source": [
"Pretrained model on the top 104 languages with the largest Wikipedia using a masked language modeling (MLM) objective.\n",
"It was introduced in [this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is case sensitive: it makes a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "a54cdf6e",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "3be641ef",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "93fd337b",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of multilingual data in a self-supervised fashion. This means\n",
"it was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "2222d4b6",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "2f9ea64e",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the languages in the training set that can then be used to\n",
"extract features useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a\n",
"standard classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "7363abb0",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "780c0123",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5a325830",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-multilingual-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
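{
"cell_type": "markdown",
"id": "added-demo-md-14831",
"metadata": {},
"source": [
"The cell above only runs random token ids through the encoder as a smoke test. As a minimal sketch of a more realistic call (the non-English sentence and tokenizer usage below are illustrative assumptions, reusing the `model` object created above with PaddleNLP's `AutoTokenizer`), you can encode a real sentence and inspect the encoder outputs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "added-demo-code-14831",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (assumption): tokenize a Spanish sentence with the multilingual\n",
"# tokenizer and reuse the `model` object from the previous cell.\n",
"import paddle\n",
"from paddlenlp.transformers import AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-base-multilingual-cased\")\n",
"encoded = tokenizer(\"París es la capital de España.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"# BertModel returns (sequence_output, pooled_output); the exact return format\n",
"# may vary across PaddleNLP versions.\n",
"sequence_output, pooled_output = model(input_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},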
{
"cell_type": "markdown",
"id": "81ca575a",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "216555c3",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于 https://huggingface.co/bert-base-multilingual-cased ,并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "19d62907",
"metadata": {},
"source": [
"# BERT multilingual base model (cased)\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "09809b94",
"metadata": {},
"source": [
"Pretrained model on the top 104 languages with the largest Wikipedia using a masked language modeling (MLM) objective.\n",
"It was introduced in [this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is case sensitive: it makes a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "d3a52162",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "f67f02dc",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "bf05022f",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of multilingual data in a self-supervised fashion. This means\n",
"it was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "081a7a88",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "79e6eda9",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the languages in the training set that can then be used to\n",
"extract features useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a\n",
"standard classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "1696fb24",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4f7d20fd",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3c369c9a",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-multilingual-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
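{
"cell_type": "markdown",
"id": "added-demo-md-14997",
"metadata": {},
"source": [
"The cell above only runs random token ids through the encoder as a smoke test. As a minimal sketch of a more realistic call (the non-English sentence and tokenizer usage below are illustrative assumptions, reusing the `model` object created above with PaddleNLP's `AutoTokenizer`), you can encode a real sentence and inspect the encoder outputs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "added-demo-code-14997",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (assumption): tokenize a Spanish sentence with the multilingual\n",
"# tokenizer and reuse the `model` object from the previous cell.\n",
"import paddle\n",
"from paddlenlp.transformers import AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-base-multilingual-cased\")\n",
"encoded = tokenizer(\"París es la capital de España.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"# BertModel returns (sequence_output, pooled_output); the exact return format\n",
"# may vary across PaddleNLP versions.\n",
"sequence_output, pooled_output = model(input_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},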
{
"cell_type": "markdown",
"id": "6338f981",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "c55dc64e",
"metadata": {},
"source": [
"\n",
"> The model introduction and model weights originate from https://huggingface.co/bert-base-multilingual-cased and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info: Model_Info:
name: "bert-base-multilingual-uncased" description: BERT multilingual base model (uncased)
description: "BERT multilingual base model (uncased)" description_en: BERT multilingual base model (uncased)
description_en: "BERT multilingual base model (uncased)" from_repo: https://huggingface.co/bert-base-multilingual-uncased
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/bert-base-multilingual-uncased" name: bert-base-multilingual-uncased
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: "wikipedia"
Publisher: "huggingface"
License: "apache-2.0"
Language: ""
Paper: Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding' - title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
url: 'http://arxiv.org/abs/1810.04805v2' url: http://arxiv.org/abs/1810.04805v2
IfTraining: 0 Publisher: huggingface
IfOnlineDemo: 0 Task:
\ No newline at end of file - sub_tag: 槽位填充
sub_tag_en: Fill-Mask
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "867cb6e6",
"metadata": {},
"source": [
"# BERT multilingual base model (uncased)\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "207ffc57",
"metadata": {},
"source": [
"Pretrained model on the top 102 languages with the largest Wikipedia using a masked language modeling (MLM) objective.\n",
"It was introduced in [this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "8b2e2c13",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "40d071c9",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "af4a1260",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of multilingual data in a self-supervised fashion. This means\n",
"it was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "81abfbcb",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "64988b6b",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the languages in the training set that can then be used to\n",
"extract features useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a\n",
"standard classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "79c3e104",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d7b2d0ec",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "52f8d16d",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-multilingual-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
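{
"cell_type": "markdown",
"id": "added-demo-md-15195",
"metadata": {},
"source": [
"The cell above only runs random token ids through the encoder as a smoke test. As a minimal sketch of a more realistic call (the lowercase, non-English sentence and tokenizer usage below are illustrative assumptions, reusing the `model` object created above with PaddleNLP's `AutoTokenizer`), you can encode a real sentence and inspect the encoder outputs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "added-demo-code-15195",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (assumption): tokenize a French sentence with the multilingual\n",
"# uncased tokenizer and reuse the `model` object from the previous cell.\n",
"import paddle\n",
"from paddlenlp.transformers import AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-base-multilingual-uncased\")\n",
"encoded = tokenizer(\"paris est la capitale de la france.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"# BertModel returns (sequence_output, pooled_output); the exact return format\n",
"# may vary across PaddleNLP versions.\n",
"sequence_output, pooled_output = model(input_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},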
{
"cell_type": "markdown",
"id": "f11b298a",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "548a9d6c",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于 https://huggingface.co/bert-base-multilingual-uncased ,并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "ff0c69a5",
"metadata": {},
"source": [
"# BERT multilingual base model (uncased)\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "1ad499a9",
"metadata": {},
"source": [
"Pretrained model on the top 102 languages with the largest Wikipedia using a masked language modeling (MLM) objective.\n",
"It was introduced in [this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "a8878d0c",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "4581e670",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "c8d5f59f",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of multilingual data in a self-supervised fashion. This means\n",
"it was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "836834df",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "bafe70e4",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the languages in the training set that can then be used to\n",
"extract features useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a\n",
"standard classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "cf2a29e2",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dc792d6e",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7a6faf50",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-multilingual-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
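{
"cell_type": "markdown",
"id": "added-demo-md-15361",
"metadata": {},
"source": [
"The cell above only runs random token ids through the encoder as a smoke test. As a minimal sketch of a more realistic call (the lowercase, non-English sentence and tokenizer usage below are illustrative assumptions, reusing the `model` object created above with PaddleNLP's `AutoTokenizer`), you can encode a real sentence and inspect the encoder outputs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "added-demo-code-15361",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (assumption): tokenize a French sentence with the multilingual\n",
"# uncased tokenizer and reuse the `model` object from the previous cell.\n",
"import paddle\n",
"from paddlenlp.transformers import AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-base-multilingual-uncased\")\n",
"encoded = tokenizer(\"paris est la capitale de la france.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"# BertModel returns (sequence_output, pooled_output); the exact return format\n",
"# may vary across PaddleNLP versions.\n",
"sequence_output, pooled_output = model(input_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},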
{
"cell_type": "markdown",
"id": "5b616f23",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "67e01093",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/bert-base-multilingual-uncased and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: bookcorpus,wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info: Model_Info:
name: "bert-base-uncased" description: BERT base model (uncased)
description: "BERT base model (uncased)" description_en: BERT base model (uncased)
description_en: "BERT base model (uncased)" from_repo: https://huggingface.co/bert-base-uncased
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/bert-base-uncased" name: bert-base-uncased
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: "bookcorpus,wikipedia"
Publisher: "huggingface"
License: "apache-2.0"
Language: "English"
Paper: Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding' - title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
url: 'http://arxiv.org/abs/1810.04805v2' url: http://arxiv.org/abs/1810.04805v2
IfTraining: 0 Publisher: huggingface
IfOnlineDemo: 0 Task:
\ No newline at end of file - sub_tag: 槽位填充
sub_tag_en: Fill-Mask
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a14866e7",
"metadata": {},
"source": [
"# BERT base model (uncased)\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "d348c680",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "9a790b40",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "985d2894",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "985bd7ee",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labeling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "2e1ee5f4",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally masks the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "ae584a51",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences, for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "a4d48848",
"metadata": {},
"source": [
"## Model variations\n"
]
},
{
"cell_type": "markdown",
"id": "dcb46068",
"metadata": {},
"source": [
"BERT has originally been released in base and large variations, for cased and uncased input text. The uncased models also strips out an accent markers.\n",
"Chinese and multilingual uncased and cased versions followed shortly after.\n",
"Modified preprocessing with whole word masking has replaced subpiece masking in a following work, with the release of two models.\n",
"Other 24 smaller models are released afterward.\n"
]
},
{
"cell_type": "markdown",
"id": "bdf3ec7e",
"metadata": {},
"source": [
"The detailed release history can be found on the [google-research/bert readme](https://github.com/google-research/bert/blob/master/README.md) on github.\n"
]
},
{
"cell_type": "markdown",
"id": "d66e6fc4",
"metadata": {},
"source": [
"| Model | #params | Language |\n",
"|------------------------|--------------------------------|-------|\n",
"| [`bert-base-uncased`](https://huggingface.co/bert-base-uncased) | 110M | English |\n",
"| [`bert-large-uncased`](https://huggingface.co/bert-large-uncased) | 340M | English | sub\n",
"| [`bert-base-cased`](https://huggingface.co/bert-base-cased) | 110M | English |\n",
"| [`bert-large-cased`](https://huggingface.co/bert-large-cased) | 340M | English |\n",
"| [`bert-base-chinese`](https://huggingface.co/bert-base-chinese) | 110M | Chinese |\n",
"| [`bert-base-multilingual-cased`](https://huggingface.co/bert-base-multilingual-cased) | 110M | Multiple |\n",
"| [`bert-large-uncased-whole-word-masking`](https://huggingface.co/bert-large-uncased-whole-word-masking) | 340M | English |\n",
"| [`bert-large-cased-whole-word-masking`](https://huggingface.co/bert-large-cased-whole-word-masking) | 340M | English |\n"
]
},
{
"cell_type": "markdown",
"id": "93c97712",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e4daab88",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "09dec4f3",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
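{
"cell_type": "markdown",
"id": "added-demo-md-15602",
"metadata": {},
"source": [
"The cell above only runs random token ids through the encoder as a smoke test. As a minimal sketch of a more realistic call (the sentence and tokenizer usage below are illustrative assumptions, reusing the `model` object created above with PaddleNLP's `AutoTokenizer`), you can encode a real sentence and inspect the encoder outputs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "added-demo-code-15602",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (assumption): tokenize a real sentence and reuse the `model`\n",
"# object from the previous cell instead of feeding random ids.\n",
"import paddle\n",
"from paddlenlp.transformers import AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-base-uncased\")\n",
"encoded = tokenizer(\"the quick brown fox jumps over the lazy dog.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"# BertModel returns (sequence_output, pooled_output); the exact return format\n",
"# may vary across PaddleNLP versions.\n",
"sequence_output, pooled_output = model(input_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},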
{
"cell_type": "markdown",
"id": "85541d34",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "82898490",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=bert-base-uncased\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"> 此模型介绍及权重来源于 https://huggingface.co/bert-base-uncased ,并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "86c2dd31",
"metadata": {},
"source": [
"# BERT base model (uncased)\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "e25590e2",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "632646c9",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "6d37733d",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "20eb0099",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labeling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "a43bc44c",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally masks the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "3ea31760",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences, for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "c44e01b0",
"metadata": {},
"source": [
"## Model variations\n"
]
},
{
"cell_type": "markdown",
"id": "6cb3e530",
"metadata": {},
"source": [
"BERT has originally been released in base and large variations, for cased and uncased input text. The uncased models also strips out an accent markers.\n",
"Chinese and multilingual uncased and cased versions followed shortly after.\n",
"Modified preprocessing with whole word masking has replaced subpiece masking in a following work, with the release of two models.\n",
"Other 24 smaller models are released afterward.\n"
]
},
{
"cell_type": "markdown",
"id": "557a417a",
"metadata": {},
"source": [
"The detailed release history can be found on the [google-research/bert readme](https://github.com/google-research/bert/blob/master/README.md) on github.\n"
]
},
{
"cell_type": "markdown",
"id": "0f4bf9e0",
"metadata": {},
"source": [
"| Model | #params | Language |\n",
"|------------------------|--------------------------------|-------|\n",
"| [`bert-base-uncased`](https://huggingface.co/bert-base-uncased) | 110M | English |\n",
"| [`bert-large-uncased`](https://huggingface.co/bert-large-uncased) | 340M | English | sub\n",
"| [`bert-base-cased`](https://huggingface.co/bert-base-cased) | 110M | English |\n",
"| [`bert-large-cased`](https://huggingface.co/bert-large-cased) | 340M | English |\n",
"| [`bert-base-chinese`](https://huggingface.co/bert-base-chinese) | 110M | Chinese |\n",
"| [`bert-base-multilingual-cased`](https://huggingface.co/bert-base-multilingual-cased) | 110M | Multiple |\n",
"| [`bert-large-uncased-whole-word-masking`](https://huggingface.co/bert-large-uncased-whole-word-masking) | 340M | English |\n",
"| [`bert-large-cased-whole-word-masking`](https://huggingface.co/bert-large-cased-whole-word-masking) | 340M | English |\n"
]
},
{
"cell_type": "markdown",
"id": "909c1c8d",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "68db3da7",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "04d6a56d",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
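{
"cell_type": "markdown",
"id": "added-demo-md-15816",
"metadata": {},
"source": [
"The cell above only runs random token ids through the encoder as a smoke test. As a minimal sketch of a more realistic call (the sentence and tokenizer usage below are illustrative assumptions, reusing the `model` object created above with PaddleNLP's `AutoTokenizer`), you can encode a real sentence and inspect the encoder outputs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "added-demo-code-15816",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (assumption): tokenize a real sentence and reuse the `model`\n",
"# object from the previous cell instead of feeding random ids.\n",
"import paddle\n",
"from paddlenlp.transformers import AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-base-uncased\")\n",
"encoded = tokenizer(\"the quick brown fox jumps over the lazy dog.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"# BertModel returns (sequence_output, pooled_output); the exact return format\n",
"# may vary across PaddleNLP versions.\n",
"sequence_output, pooled_output = model(input_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},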
{
"cell_type": "markdown",
"id": "76d1a4dc",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "1bcee897",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=bert-base-uncased\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"> The model introduction and model weights originate from https://huggingface.co/bert-base-uncased and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: bookcorpus,wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info: Model_Info:
name: "bert-large-cased-whole-word-masking-finetuned-squad" description: BERT large model (cased) whole word masking finetuned on SQuAD
description: "BERT large model (cased) whole word masking finetuned on SQuAD" description_en: BERT large model (cased) whole word masking finetuned on SQuAD
description_en: "BERT large model (cased) whole word masking finetuned on SQuAD" from_repo: https://huggingface.co/bert-large-cased-whole-word-masking-finetuned-squad
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/bert-large-cased-whole-word-masking-finetuned-squad" name: bert-large-cased-whole-word-masking-finetuned-squad
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Question Answering"
sub_tag: "回答问题"
Example:
Datasets: "bookcorpus,wikipedia"
Publisher: "huggingface"
License: "apache-2.0"
Language: "English"
Paper: Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding' - title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
url: 'http://arxiv.org/abs/1810.04805v2' url: http://arxiv.org/abs/1810.04805v2
IfTraining: 0 Publisher: huggingface
IfOnlineDemo: 0 Task:
\ No newline at end of file - sub_tag: 回答问题
sub_tag_en: Question Answering
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "7b02f8e4",
"metadata": {},
"source": [
"# BERT large model (cased) whole word masking finetuned on SQuAD\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "7804aeec",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is cased: it makes a difference between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "9ee7c4ee",
"metadata": {},
"source": [
"Differently to other BERT models, this model was trained with a new technique: Whole Word Masking. In this case, all of the tokens corresponding to a word are masked at once. The overall masking rate remains the same.\n"
]
},
{
"cell_type": "markdown",
"id": "2198ff25",
"metadata": {},
"source": [
"The training is identical -- each masked WordPiece token is predicted independently.\n"
]
},
{
"cell_type": "markdown",
"id": "159c04c3",
"metadata": {},
"source": [
"After pre-training, this model was fine-tuned on the SQuAD dataset with one of our fine-tuning scripts. See below for more information regarding this fine-tuning.\n"
]
},
{
"cell_type": "markdown",
"id": "cec53443",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "0a6d113e",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "3776a729",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "8ef5e147",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "f494c97f",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "eeffccad",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "7754e7ed",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "dc30e3d4",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6c0a8e7e",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3eb39f84",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-cased-whole-word-masking-finetuned-squad\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "82b3ff37",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "cb789a5a",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于 https://huggingface.co/bert-large-cased-whole-word-masking-finetuned-squad ,并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "1dfd11c1",
"metadata": {},
"source": [
"# BERT large model (cased) whole word masking finetuned on SQuAD\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "7105fb8c",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is cased: it makes a difference between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "e3d8b394",
"metadata": {},
"source": [
"Differently to other BERT models, this model was trained with a new technique: Whole Word Masking. In this case, all of the tokens corresponding to a word are masked at once. The overall masking rate remains the same.\n"
]
},
{
"cell_type": "markdown",
"id": "be078628",
"metadata": {},
"source": [
"The training is identical -- each masked WordPiece token is predicted independently.\n"
]
},
{
"cell_type": "markdown",
"id": "278aee7f",
"metadata": {},
"source": [
"After pre-training, this model was fine-tuned on the SQuAD dataset with one of our fine-tuning scripts. See below for more information regarding this fine-tuning.\n"
]
},
{
"cell_type": "markdown",
"id": "ce69aca2",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "89b52c17",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "2ebe9e94",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "7131c024",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "4a8e4aea",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "717dd1f6",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "6778930f",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "1ffc0609",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "678acd58",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c5318a0c",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-cased-whole-word-masking-finetuned-squad\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
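{
"cell_type": "markdown",
"id": "9f1e2ab3",
"metadata": {},
"source": [
"The snippet above only checks that the backbone loads and runs on random token ids. The sketch below shows how this SQuAD-finetuned checkpoint could be used for extractive question answering; it assumes the checkpoint also loads with `AutoModelForQuestionAnswering` and `AutoTokenizer` from `paddlenlp.transformers`, and the question/context strings are just placeholders."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2b7c4d1e",
"metadata": {},
"outputs": [],
"source": [
"# Minimal extractive-QA sketch (assumes AutoModelForQuestionAnswering/AutoTokenizer support this checkpoint).\n",
"import paddle\n",
"from paddlenlp.transformers import AutoTokenizer, AutoModelForQuestionAnswering\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-large-cased-whole-word-masking-finetuned-squad\")\n",
"qa_model = AutoModelForQuestionAnswering.from_pretrained(\"bert-large-cased-whole-word-masking-finetuned-squad\")\n",
"\n",
"question = \"Where is the Eiffel Tower located?\"\n",
"context = \"The Eiffel Tower is a wrought-iron lattice tower located in Paris, France.\"\n",
"encoded = tokenizer(question, context)\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
"\n",
"# The QA head returns start/end logits over the input tokens.\n",
"start_logits, end_logits = qa_model(input_ids, token_type_ids=token_type_ids)\n",
"start = paddle.argmax(start_logits, axis=-1).item()\n",
"end = paddle.argmax(end_logits, axis=-1).item()\n",
"answer_tokens = tokenizer.convert_ids_to_tokens(encoded[\"input_ids\"][start : end + 1])\n",
"print(tokenizer.convert_tokens_to_string(answer_tokens))"
]
},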
{
"cell_type": "markdown",
"id": "f930fd97",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "b3240bd3",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/bert-large-cased-whole-word-masking-finetuned-squad and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: bookcorpus,wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: BERT large model (cased) whole word masking
  description_en: BERT large model (cased) whole word masking
  from_repo: https://huggingface.co/bert-large-cased-whole-word-masking
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: bert-large-cased-whole-word-masking
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: huggingface
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "1d5ffd6a",
"metadata": {},
"source": [
"# BERT large model (cased) whole word masking\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "9e7590bd",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is cased: it makes a difference between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "2456751c",
"metadata": {},
"source": [
"Differently to other BERT models, this model was trained with a new technique: Whole Word Masking. In this case, all of the tokens corresponding to a word are masked at once. The overall masking rate remains the same.\n"
]
},
{
"cell_type": "markdown",
"id": "204d6ee6",
"metadata": {},
"source": [
"The training is identical -- each masked WordPiece token is predicted independently.\n"
]
},
{
"cell_type": "markdown",
"id": "743ff269",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "bce1ffcc",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "d5d83b7c",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "38a98598",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "89b5e554",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "3f205174",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "6b9cf751",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "74a0400e",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d8952fcc",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "365e04c2",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-cased-whole-word-masking\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "1cef8f18",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "0d54ff2d",
"metadata": {},
"source": [
"\n",
"> 此模型介绍及权重来源于 https://huggingface.co/bert-large-cased-whole-word-masking ,并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "58f64e54",
"metadata": {},
"source": [
"# BERT large model (cased) whole word masking\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "6814fe73",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is cased: it makes a difference between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "7a6b1b28",
"metadata": {},
"source": [
"Differently to other BERT models, this model was trained with a new technique: Whole Word Masking. In this case, all of the tokens corresponding to a word are masked at once. The overall masking rate remains the same.\n"
]
},
{
"cell_type": "markdown",
"id": "e6c8ddc5",
"metadata": {},
"source": [
"The training is identical -- each masked WordPiece token is predicted independently.\n"
]
},
{
"cell_type": "markdown",
"id": "dfcd9c6b",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "d758dbd9",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "c4e44287",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "d07abc2a",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "5fcb83d6",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "1be2f6a5",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "cd047a65",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "93925c79",
"metadata": {},
"source": [
"## Intended uses & limitations\n"
]
},
{
"cell_type": "markdown",
"id": "f6c1f9b9",
"metadata": {},
"source": [
"You can use the raw model for either masked language modeling or next sentence prediction, but it's mostly intended to\n",
"be fine-tuned on a downstream task. See the [model hub](https://huggingface.co/models?filter=bert) to look for\n",
"fine-tuned versions on a task that interests you.\n"
]
},
{
"cell_type": "markdown",
"id": "a682ee5c",
"metadata": {},
"source": [
"Note that this model is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked)\n",
"to make decisions, such as sequence classification, token classification or question answering. For tasks such as text\n",
"generation you should look at model like GPT2.\n"
]
},
{
"cell_type": "markdown",
"id": "394e6456",
"metadata": {},
"source": [
"### How to use\n"
]
},
{
"cell_type": "markdown",
"id": "9e5fdb9a",
"metadata": {},
"source": [
"You can use this model directly with a pipeline for masked language modeling:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "77af91fe",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ae5caf8d",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-cased-whole-word-masking\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
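{
"cell_type": "markdown",
"id": "3c8d5f2a",
"metadata": {},
"source": [
"Instead of random token ids, you can encode real text with the matching tokenizer and inspect the contextual features the backbone produces. This is a minimal sketch that relies only on `AutoTokenizer` and the `AutoModel` call shown above; the example sentence is just an illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5e9a7b4c",
"metadata": {},
"outputs": [],
"source": [
"# Encode a real sentence and extract BERT features (per-token sequence output + pooled output).\n",
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-large-cased-whole-word-masking\")\n",
"model = AutoModel.from_pretrained(\"bert-large-cased-whole-word-masking\")\n",
"\n",
"encoded = tokenizer(\"PaddleNLP makes it easy to use BERT.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
"\n",
"# BertModel returns the per-token hidden states and a pooled [CLS] representation.\n",
"sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
"print(sequence_output.shape)  # [1, seq_len, 1024]\n",
"print(pooled_output.shape)    # [1, 1024]"
]
},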
{
"cell_type": "markdown",
"id": "0f43705d",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "54ae4165",
"metadata": {},
"source": [
"\n",
"> The model introduction and model weights originate from https://huggingface.co/bert-large-cased-whole-word-masking and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: bookcorpus,wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: BERT large model (cased)
  description_en: BERT large model (cased)
  from_repo: https://huggingface.co/bert-large-cased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: bert-large-cased
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: huggingface
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "360e146a",
"metadata": {},
"source": [
"# BERT large model (cased)\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "bb3eb868",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is cased: it makes a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "0f512012",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "8dfae0e4",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "29d97a32",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "84dd3c36",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "dbb66981",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "4a3d9a5c",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "85a286cd",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "e9f5c5f1",
"metadata": {},
"source": [
"## Intended uses & limitations\n"
]
},
{
"cell_type": "markdown",
"id": "d3ae1617",
"metadata": {},
"source": [
"You can use the raw model for either masked language modeling or next sentence prediction, but it's mostly intended to\n",
"be fine-tuned on a downstream task. See the [model hub](https://huggingface.co/models?filter=bert) to look for\n",
"fine-tuned versions on a task that interests you.\n"
]
},
{
"cell_type": "markdown",
"id": "1d814aa3",
"metadata": {},
"source": [
"Note that this model is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked)\n",
"to make decisions, such as sequence classification, token classification or question answering. For tasks such as text\n",
"generation you should look at model like GPT2.\n"
]
},
{
"cell_type": "markdown",
"id": "7c9cb698",
"metadata": {},
"source": [
"### How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "266349de",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e0d0fb84",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "d58fffcd",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "8591ee7f",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于 https://huggingface.co/bert-large-cased ,并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "2460ffb6",
"metadata": {},
"source": [
"# BERT large model (cased)\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "07c2aecf",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is cased: it makes a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "fb6201f0",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "ffd4c0b9",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "0b465123",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "7a5eb557",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "d40678bb",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "8fc24335",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "355e9553",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "47e2e497",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c4d80b50",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f73f3925",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
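{
"cell_type": "markdown",
"id": "7a1b3c9d",
"metadata": {},
"source": [
"The model description above notes that a standard classifier can be trained on the features BERT produces. The sketch below only illustrates that idea: it encodes a sentence with `AutoTokenizer`, takes the pooled [CLS] feature, and passes it through an untrained linear head. The head, label count, and example sentence are placeholders, not part of the released checkpoint."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8d2e4f6a",
"metadata": {},
"outputs": [],
"source": [
"# Feed the pooled BERT feature into a (placeholder) sentence-classification head.\n",
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-large-cased\")\n",
"backbone = AutoModel.from_pretrained(\"bert-large-cased\")\n",
"\n",
"encoded = tokenizer(\"This movie was surprisingly good.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
"\n",
"sequence_output, pooled_output = backbone(input_ids, token_type_ids=token_type_ids)\n",
"\n",
"# Hypothetical 2-class head; in practice it would be trained on labeled sentences.\n",
"classifier_head = paddle.nn.Linear(1024, 2)\n",
"logits = classifier_head(pooled_output)\n",
"print(logits.shape)  # [1, 2]"
]
},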
{
"cell_type": "markdown",
"id": "2873617b",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "bc4aea4d",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/bert-large-cased and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: bookcorpus,wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: BERT large model (uncased) whole word masking finetuned on SQuAD
  description_en: BERT large model (uncased) whole word masking finetuned on SQuAD
  from_repo: https://huggingface.co/bert-large-uncased-whole-word-masking-finetuned-squad
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: bert-large-uncased-whole-word-masking-finetuned-squad
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: huggingface
Task:
- sub_tag: 回答问题
  sub_tag_en: Question Answering
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "aad9532a",
"metadata": {},
"source": [
"# BERT large model (uncased) whole word masking finetuned on SQuAD\n"
]
},
{
"cell_type": "markdown",
"id": "724df271",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "f2b9e3bf",
"metadata": {},
"source": [
"Differently to other BERT models, this model was trained with a new technique: Whole Word Masking. In this case, all of the tokens corresponding to a word are masked at once. The overall masking rate remains the same.\n"
]
},
{
"cell_type": "markdown",
"id": "6566eb12",
"metadata": {},
"source": [
"The training is identical -- each masked WordPiece token is predicted independently.\n"
]
},
{
"cell_type": "markdown",
"id": "7b45422b",
"metadata": {},
"source": [
"After pre-training, this model was fine-tuned on the SQuAD dataset with one of our fine-tuning scripts. See below for more information regarding this fine-tuning.\n"
]
},
{
"cell_type": "markdown",
"id": "c9957f91",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "43cba468",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "457bfeee",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "77c83270",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "0ba87de6",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "f363132f",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "83a4e49f",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "68565c6d",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "457a1c54",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a9369c0d",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-uncased-whole-word-masking-finetuned-squad\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "c5fefb8f",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "654c0920",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于 https://huggingface.co/bert-large-uncased-whole-word-masking-finetuned-squad ,并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "2d4f4368",
"metadata": {},
"source": [
"# BERT large model (uncased) whole word masking finetuned on SQuAD\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "afef45e0",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "c94536b9",
"metadata": {},
"source": [
"Differently to other BERT models, this model was trained with a new technique: Whole Word Masking. In this case, all of the tokens corresponding to a word are masked at once. The overall masking rate remains the same.\n"
]
},
{
"cell_type": "markdown",
"id": "50254dea",
"metadata": {},
"source": [
"The training is identical -- each masked WordPiece token is predicted independently.\n"
]
},
{
"cell_type": "markdown",
"id": "4b482be9",
"metadata": {},
"source": [
"After pre-training, this model was fine-tuned on the SQuAD dataset with one of our fine-tuning scripts. See below for more information regarding this fine-tuning.\n"
]
},
{
"cell_type": "markdown",
"id": "adfc36af",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "22f554a7",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "eccd3048",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "3d4098e8",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "1047d1ad",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "7046db0c",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "09659088",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "65769919",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4449cfac",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1e8dcf70",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-uncased-whole-word-masking-finetuned-squad\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
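{
"cell_type": "markdown",
"id": "6b4d8e1f",
"metadata": {},
"source": [
"Since this checkpoint is finetuned on SQuAD, a more representative smoke test is extractive question answering rather than random token ids. The following sketch assumes the checkpoint also loads with `AutoModelForQuestionAnswering` and `AutoTokenizer` from `paddlenlp.transformers`; the question/context strings are placeholders."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4f7a2c8b",
"metadata": {},
"outputs": [],
"source": [
"# Minimal extractive-QA sketch (assumes AutoModelForQuestionAnswering/AutoTokenizer support this checkpoint).\n",
"import paddle\n",
"from paddlenlp.transformers import AutoTokenizer, AutoModelForQuestionAnswering\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-large-uncased-whole-word-masking-finetuned-squad\")\n",
"qa_model = AutoModelForQuestionAnswering.from_pretrained(\"bert-large-uncased-whole-word-masking-finetuned-squad\")\n",
"\n",
"question = \"What does BERT stand for?\"\n",
"context = \"BERT stands for Bidirectional Encoder Representations from Transformers.\"\n",
"encoded = tokenizer(question, context)\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
"\n",
"# The QA head predicts start/end positions of the answer span in the context.\n",
"start_logits, end_logits = qa_model(input_ids, token_type_ids=token_type_ids)\n",
"start = paddle.argmax(start_logits, axis=-1).item()\n",
"end = paddle.argmax(end_logits, axis=-1).item()\n",
"answer_tokens = tokenizer.convert_ids_to_tokens(encoded[\"input_ids\"][start : end + 1])\n",
"print(tokenizer.convert_tokens_to_string(answer_tokens))"
]
},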
{
"cell_type": "markdown",
"id": "49471f4b",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "d783c8fc",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/bert-large-uncased-whole-word-masking-finetuned-squad and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: bookcorpus,wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: BERT large model (uncased) whole word masking
  description_en: BERT large model (uncased) whole word masking
  from_repo: https://huggingface.co/bert-large-uncased-whole-word-masking
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: bert-large-uncased-whole-word-masking
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: huggingface
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "cf43e770",
"metadata": {},
"source": [
"# BERT large model (uncased) whole word masking\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "af8c3816",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "c103e84b",
"metadata": {},
"source": [
"Differently to other BERT models, this model was trained with a new technique: Whole Word Masking. In this case, all of the tokens corresponding to a word are masked at once. The overall masking rate remains the same.\n"
]
},
{
"cell_type": "markdown",
"id": "19a76368",
"metadata": {},
"source": [
"The training is identical -- each masked WordPiece token is predicted independently.\n"
]
},
{
"cell_type": "markdown",
"id": "67f11a2c",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "778cf97d",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "dddbb307",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "40becad1",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "3fc265b6",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "65e4a308",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "6d0b86c1",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "dd94b8be",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bc669f99",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4580650d",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-uncased-whole-word-masking\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "475fd35d",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "f09b9b09",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于 https://huggingface.co/bert-large-uncased-whole-word-masking ,并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "ceefe52d",
"metadata": {},
"source": [
"# BERT large model (uncased) whole word masking\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "14552c09",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "78d1e4a0",
"metadata": {},
"source": [
"Differently to other BERT models, this model was trained with a new technique: Whole Word Masking. In this case, all of the tokens corresponding to a word are masked at once. The overall masking rate remains the same.\n"
]
},
{
"cell_type": "markdown",
"id": "cdbe484a",
"metadata": {},
"source": [
"The training is identical -- each masked WordPiece token is predicted independently.\n"
]
},
{
"cell_type": "markdown",
"id": "fdbba80d",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "aba33624",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "459ca6e6",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "65f2ae1a",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "86e8d7eb",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "b81821d8",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "3a576172",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "0038f06c",
"metadata": {},
"source": [
"## Intended uses & limitations\n"
]
},
{
"cell_type": "markdown",
"id": "ba8c18de",
"metadata": {},
"source": [
"You can use the raw model for either masked language modeling or next sentence prediction, but it's mostly intended to\n",
"be fine-tuned on a downstream task. See the [model hub](https://huggingface.co/models?filter=bert) to look for\n",
"fine-tuned versions on a task that interests you.\n"
]
},
{
"cell_type": "markdown",
"id": "bb72ad39",
"metadata": {},
"source": [
"Note that this model is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked)\n",
"to make decisions, such as sequence classification, token classification or question answering. For tasks such as text\n",
"generation you should look at model like GPT2.\n"
]
},
{
"cell_type": "markdown",
"id": "54b59ca8",
"metadata": {},
"source": [
"### How to use\n"
]
},
{
"cell_type": "markdown",
"id": "a0ff2a80",
"metadata": {},
"source": [
"You can use this model directly with a pipeline for masked language modeling:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "990ce14a",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7d468ffb",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-uncased-whole-word-masking\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "93d6e9e4",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "c9d05272",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/bert-large-uncased-whole-word-masking and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: bookcorpus,wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info: Model_Info:
name: "bert-large-uncased" description: BERT large model (uncased)
description: "BERT large model (uncased)" description_en: BERT large model (uncased)
description_en: "BERT large model (uncased)" from_repo: https://huggingface.co/bert-large-uncased
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/bert-large-uncased" name: bert-large-uncased
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: "bookcorpus,wikipedia"
Publisher: "huggingface"
License: "apache-2.0"
Language: "English"
Paper: Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding' - title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
url: 'http://arxiv.org/abs/1810.04805v2' url: http://arxiv.org/abs/1810.04805v2
IfTraining: 0 Publisher: huggingface
IfOnlineDemo: 0 Task:
\ No newline at end of file - sub_tag: 槽位填充
sub_tag_en: Fill-Mask
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "c000df74",
"metadata": {},
"source": [
"# BERT large model (uncased)\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "bd7436a9",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "87c430c2",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "e2004f07",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "ad86c301",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "8f12ab3c",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "3fc80525",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "c31d15b4",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "822f7f40",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "7fcdeb04",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "db4ceaa3",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dc6a0473",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "1156d387",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "9d07ca08",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于 https://huggingface.co/bert-large-uncased ,并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "a4fae520",
"metadata": {},
"source": [
"# BERT large model (uncased)\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "c410d1ae",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "40166ab8",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "dacb968e",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "c519206d",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "2dd87a78",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "507ce60a",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "7fb7a8a0",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "ebe2c593",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "547e3cc8",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "669cb05f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "09a4bc02",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
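{
"cell_type": "markdown",
"id": "aa10c005",
"metadata": {},
"source": [
"As noted above, the encoder's outputs can be used as features for downstream tasks. The cell below is an illustrative variant of the quick start that feeds a real sentence through the tokenizer instead of random ids; it assumes PaddleNLP's default tuple return format of `(sequence_output, pooled_output)`, which may differ in other versions of the library.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aa10c006",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"# Sketch: extract features for one sentence (assumes the tuple return format\n",
"# (sequence_output, pooled_output) used by PaddleNLP's BertModel by default).\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-large-uncased\")\n",
"model = AutoModel.from_pretrained(\"bert-large-uncased\")\n",
"model.eval()\n",
"\n",
"inputs = tokenizer(\"PaddleNLP makes it easy to reuse pretrained models.\")\n",
"input_ids = paddle.to_tensor([inputs[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([inputs[\"token_type_ids\"]])\n",
"\n",
"with paddle.no_grad():\n",
"    sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
"\n",
"print(sequence_output.shape)  # [1, sequence_length, 1024] token-level features\n",
"print(pooled_output.shape)    # [1, 1024] sentence-level feature for a classifier"
]
},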
{
"cell_type": "markdown",
"id": "3ae36313",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "bed31ba3",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/bert-large-uncased and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info: Model_Info:
name: "distilbert-base-multilingual-cased" description: Model Card for DistilBERT base multilingual (cased)
description: "Model Card for DistilBERT base multilingual (cased)" description_en: Model Card for DistilBERT base multilingual (cased)
description_en: "Model Card for DistilBERT base multilingual (cased)" from_repo: https://huggingface.co/distilbert-base-multilingual-cased
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/distilbert-base-multilingual-cased" name: distilbert-base-multilingual-cased
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: "wikipedia"
Publisher: "huggingface"
License: "apache-2.0"
Language: ""
Paper: Paper:
- title: 'DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter' - title: 'DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter'
url: 'http://arxiv.org/abs/1910.01108v4' url: http://arxiv.org/abs/1910.01108v4
- title: 'Quantifying the Carbon Emissions of Machine Learning' - title: Quantifying the Carbon Emissions of Machine Learning
url: 'http://arxiv.org/abs/1910.09700v2' url: http://arxiv.org/abs/1910.09700v2
IfTraining: 0 Publisher: huggingface
IfOnlineDemo: 0 Task:
\ No newline at end of file - sub_tag: 槽位填充
sub_tag_en: Fill-Mask
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "922fd8e5",
"metadata": {},
"source": [
"# Model Card for DistilBERT base multilingual (cased)\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。\n"
]
},
{
"cell_type": "markdown",
"id": "a1024bec",
"metadata": {},
"source": [
"## Model Description\n"
]
},
{
"cell_type": "markdown",
"id": "bcdfe024",
"metadata": {},
"source": [
"This model is a distilled version of the [BERT base multilingual model](https://huggingface.co/bert-base-multilingual-cased/). The code for the distillation process can be found [here](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation). This model is cased: it does make a difference between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "5051aaa6",
"metadata": {},
"source": [
"The model is trained on the concatenation of Wikipedia in 104 different languages listed [here](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages).\n",
"The model has 6 layers, 768 dimension and 12 heads, totalizing 134M parameters (compared to 177M parameters for mBERT-base).\n",
"On average, this model, referred to as DistilmBERT, is twice as fast as mBERT-base.\n"
]
},
{
"cell_type": "markdown",
"id": "cdddc273",
"metadata": {},
"source": [
"We encourage potential users of this model to check out the [BERT base multilingual model card](https://huggingface.co/bert-base-multilingual-cased) to learn more about usage, limitations and potential biases.\n"
]
},
{
"cell_type": "markdown",
"id": "8eebedbf",
"metadata": {},
"source": [
"- **Developed by:** Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf (Hugging Face)\n",
"- **Model type:** Transformer-based language model\n",
"- **Language(s) (NLP):** 104 languages; see full list [here](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages)\n",
"- **License:** Apache 2.0\n",
"- **Related Models:** [BERT base multilingual model](https://huggingface.co/bert-base-multilingual-cased)\n",
"- **Resources for more information:**\n",
"- [GitHub Repository](https://github.com/huggingface/transformers/blob/main/examples/research_projects/distillation/README.md)\n",
"- [Associated Paper](https://arxiv.org/abs/1910.01108)\n"
]
},
{
"cell_type": "markdown",
"id": "e9f48c0b",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f4dde273",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b940cddf",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"distilbert-base-multilingual-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "7ab62874",
"metadata": {},
"source": [
"# Citation\n",
"\n",
"```\n",
"@article{Sanh2019DistilBERTAD,\n",
" title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},\n",
" author={Victor Sanh and Lysandre Debut and Julien Chaumond and Thomas Wolf},\n",
" journal={ArXiv},\n",
" year={2019},\n",
" volume={abs/1910.01108}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "8bdb4ee1",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于 https://huggingface.co/distilbert-base-multilingual-cased ,并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "4260a150",
"metadata": {},
"source": [
"# Model Card for DistilBERT base multilingual (cased)\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "53f1b1c2",
"metadata": {},
"source": [
"This model is a distilled version of the [BERT base multilingual model](https://huggingface.co/bert-base-multilingual-cased/). The code for the distillation process can be found [here](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation). This model is cased: it does make a difference between english and English.\n"
]
},
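{
"cell_type": "markdown",
"id": "aa10c007",
"metadata": {},
"source": [
"A quick, illustrative way to see the cased behaviour (assuming PaddleNLP is installed, see the How to use section below) is to compare how the tokenizer splits a capitalized and a lower-cased spelling of the same word: because casing is preserved in the vocabulary, the two spellings map to different token sequences.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aa10c008",
"metadata": {},
"outputs": [],
"source": [
"from paddlenlp.transformers import AutoTokenizer\n",
"\n",
"# Sketch: the cased tokenizer distinguishes \"English\" from \"english\".\n",
"tokenizer = AutoTokenizer.from_pretrained(\"distilbert-base-multilingual-cased\")\n",
"\n",
"print(tokenizer.tokenize(\"English\"))\n",
"print(tokenizer.tokenize(\"english\"))  # different pieces, since casing is kept"
]
},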
{
"cell_type": "markdown",
"id": "f417583b",
"metadata": {},
"source": [
"The model is trained on the concatenation of Wikipedia in 104 different languages listed [here](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages).\n",
"The model has 6 layers, 768 dimension and 12 heads, totalizing 134M parameters (compared to 177M parameters for mBERT-base).\n",
"On average, this model, referred to as DistilmBERT, is twice as fast as mBERT-base.\n",
"\n",
"- **Developed by:** Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf (Hugging Face)\n",
"- **Model type:** Transformer-based language model\n",
"- **Language(s) (NLP):** 104 languages; see full list [here](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages)\n",
"- **License:** Apache 2.0\n",
"- **Related Models:** [BERT base multilingual model](https://huggingface.co/bert-base-multilingual-cased)\n",
"- **Resources for more information:**\n",
"- [GitHub Repository](https://github.com/huggingface/transformers/blob/main/examples/research_projects/distillation/README.md)\n",
"- [Associated Paper](https://arxiv.org/abs/1910.01108)\n"
]
},
{
"cell_type": "markdown",
"id": "f47ce9b7",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a1353b5f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e23a860f",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"distilbert-base-multilingual-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "38c30ea4",
"metadata": {},
"source": [
"# Citation\n",
"\n",
"```\n",
"@article{Sanh2019DistilBERTAD,\n",
" title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},\n",
" author={Victor Sanh and Lysandre Debut and Julien Chaumond and Thomas Wolf},\n",
" journal={ArXiv},\n",
" year={2019},\n",
" volume={abs/1910.01108}\n",
"}\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "0ee03d6a",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/distilbert-base-multilingual-cased and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: openwebtext
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info: Model_Info:
name: "distilgpt2" description: DistilGPT2
description: "DistilGPT2" description_en: DistilGPT2
description_en: "DistilGPT2" from_repo: https://huggingface.co/distilgpt2
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/distilgpt2" name: distilgpt2
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: "openwebtext"
Publisher: "huggingface"
License: "apache-2.0"
Language: "English"
Paper: Paper:
- title: 'DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter' - title: 'DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter'
url: 'http://arxiv.org/abs/1910.01108v4' url: http://arxiv.org/abs/1910.01108v4
- title: 'Can Model Compression Improve NLP Fairness' - title: Can Model Compression Improve NLP Fairness
url: 'http://arxiv.org/abs/2201.08542v1' url: http://arxiv.org/abs/2201.08542v1
- title: 'Mitigating Gender Bias in Distilled Language Models via Counterfactual Role Reversal' - title: Mitigating Gender Bias in Distilled Language Models via Counterfactual Role
url: 'http://arxiv.org/abs/2203.12574v1' Reversal
- title: 'Quantifying the Carbon Emissions of Machine Learning' url: http://arxiv.org/abs/2203.12574v1
url: 'http://arxiv.org/abs/1910.09700v2' - title: Quantifying the Carbon Emissions of Machine Learning
- title: 'Distilling the Knowledge in a Neural Network' url: http://arxiv.org/abs/1910.09700v2
url: 'http://arxiv.org/abs/1503.02531v1' - title: Distilling the Knowledge in a Neural Network
IfTraining: 0 url: http://arxiv.org/abs/1503.02531v1
IfOnlineDemo: 0 Publisher: huggingface
\ No newline at end of file Task:
- sub_tag: 文本生成
sub_tag_en: Text Generation
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "72047643",
"metadata": {},
"source": [
"# DistilGPT2\n",
"\n",
"详细内容请看[GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "20c299c9",
"metadata": {},
"source": [
"DistilGPT2 (short for Distilled-GPT2) is an English-language model pre-trained with the supervision of the smallest version of Generative Pre-trained Transformer 2 (GPT-2). Like GPT-2, DistilGPT2 can be used to generate text. Users of this model card should also consider information about the design, training, and limitations of GPT-2.\n"
]
},
{
"cell_type": "markdown",
"id": "c624b3d1",
"metadata": {},
"source": [
"## Model Details\n"
]
},
{
"cell_type": "markdown",
"id": "92002396",
"metadata": {},
"source": [
"- **Developed by:** Hugging Face\n",
"- **Model type:** Transformer-based Language Model\n",
"- **Language:** English\n",
"- **License:** Apache 2.0\n",
"- **Model Description:** DistilGPT2 is an English-language model pre-trained with the supervision of the 124 million parameter version of GPT-2. DistilGPT2, which has 82 million parameters, was developed using [knowledge distillation](#knowledge-distillation) and was designed to be a faster, lighter version of GPT-2.\n",
"- **Resources for more information:** See this repository for more about Distil\\* (a class of compressed models including Distilled-GPT2), [Sanh et al. (2019)](https://arxiv.org/abs/1910.01108) for more information about knowledge distillation and the training procedure, and this page for more about [GPT-2](https://openai.com/blog/better-language-models/).\n"
]
},
{
"cell_type": "markdown",
"id": "a1a84778",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f9c6043d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a9f0754d",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"distilgpt2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "03d3d465",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{sanh2019distilbert,\n",
"title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},\n",
"author={Sanh, Victor and Debut, Lysandre and Chaumond, Julien and Wolf, Thomas},\n",
"booktitle={NeurIPS EMC^2 Workshop},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "7966636a",
"metadata": {},
"source": [
"## Glossary\n"
]
},
{
"cell_type": "markdown",
"id": "533038ef",
"metadata": {},
"source": [
"-\t<a name=\"knowledge-distillation\">**Knowledge Distillation**</a>: As described in [Sanh et al. (2019)](https://arxiv.org/pdf/1910.01108.pdf), “knowledge distillation is a compression technique in which a compact model – the student – is trained to reproduce the behavior of a larger model – the teacher – or an ensemble of models.” Also see [Bucila et al. (2006)](https://www.cs.cornell.edu/~caruana/compression.kdd06.pdf) and [Hinton et al. (2015)](https://arxiv.org/abs/1503.02531).\n"
]
},
{
"cell_type": "markdown",
"id": "a7ff7cc1",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=distilgpt2\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/distilgpt2](https://huggingface.co/distilgpt2),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "1b34fb8a",
"metadata": {},
"source": [
"# DistilGPT2\n",
"\n",
"You can get more details from [GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)."
]
},
{
"cell_type": "markdown",
"id": "f3ab8949",
"metadata": {},
"source": [
"DistilGPT2 (short for Distilled-GPT2) is an English-language model pre-trained with the supervision of the smallest version of Generative Pre-trained Transformer 2 (GPT-2). Like GPT-2, DistilGPT2 can be used to generate text. Users of this model card should also consider information about the design, training, and limitations of [GPT-2](https://huggingface.co/gpt2).\n"
]
},
{
"cell_type": "markdown",
"id": "c6fbc1da",
"metadata": {},
"source": [
"## Model Details\n"
]
},
{
"cell_type": "markdown",
"id": "e2929e2f",
"metadata": {},
"source": [
"- **Developed by:** Hugging Face\n",
"- **Model type:** Transformer-based Language Model\n",
"- **Language:** English\n",
"- **License:** Apache 2.0\n",
"- **Model Description:** DistilGPT2 is an English-language model pre-trained with the supervision of the 124 million parameter version of GPT-2. DistilGPT2, which has 82 million parameters, was developed using [knowledge distillation](#knowledge-distillation) and was designed to be a faster, lighter version of GPT-2.\n",
"- **Resources for more information:** See [this repository](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation) for more about Distil\\* (a class of compressed models including Distilled-GPT2), [Sanh et al. (2019)](https://arxiv.org/abs/1910.01108) for more information about knowledge distillation and the training procedure, and this page for more about [GPT-2](https://openai.com/blog/better-language-models/).\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5e226406",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "51f32d75",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"distilgpt2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
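{
"cell_type": "markdown",
"id": "aa10c009",
"metadata": {},
"source": [
"Building on the quick start above, it can help to look at how DistilGPT2's byte-level BPE tokenizer segments text before the model is applied. The cell below is an illustrative sketch with a made-up sentence; it assumes the tokenizer for this checkpoint can be loaded with `AutoTokenizer`.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aa10c00a",
"metadata": {},
"outputs": [],
"source": [
"from paddlenlp.transformers import AutoTokenizer\n",
"\n",
"# Sketch: inspect DistilGPT2's byte-level BPE tokenization of a sample sentence.\n",
"tokenizer = AutoTokenizer.from_pretrained(\"distilgpt2\")\n",
"\n",
"text = \"Knowledge distillation trains a small student to mimic a large teacher.\"\n",
"tokens = tokenizer.tokenize(text)\n",
"token_ids = tokenizer.convert_tokens_to_ids(tokens)\n",
"\n",
"print(tokens)     # byte-level BPE pieces (leading spaces are encoded inside the pieces)\n",
"print(token_ids)"
]
},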
{
"cell_type": "markdown",
"id": "adb84dc8",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{sanh2019distilbert,\n",
"title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},\n",
"author={Sanh, Victor and Debut, Lysandre and Chaumond, Julien and Wolf, Thomas},\n",
"booktitle={NeurIPS EMC^2 Workshop},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "7d2aaec2",
"metadata": {},
"source": [
"## Glossary\n"
]
},
{
"cell_type": "markdown",
"id": "004026dd",
"metadata": {},
"source": [
"-\t<a name=\"knowledge-distillation\">**Knowledge Distillation**</a>: As described in [Sanh et al. (2019)](https://arxiv.org/pdf/1910.01108.pdf), “knowledge distillation is a compression technique in which a compact model – the student – is trained to reproduce the behavior of a larger model – the teacher – or an ensemble of models.” Also see [Bucila et al. (2006)](https://www.cs.cornell.edu/~caruana/compression.kdd06.pdf) and [Hinton et al. (2015)](https://arxiv.org/abs/1503.02531).\n"
]
},
{
"cell_type": "markdown",
"id": "f8d12799",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=distilgpt2\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/distilgpt2](https://huggingface.co/distilgpt2) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: openwebtext
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info: Model_Info:
name: "distilroberta-base" description: Model Card for DistilRoBERTa base
description: "Model Card for DistilRoBERTa base" description_en: Model Card for DistilRoBERTa base
description_en: "Model Card for DistilRoBERTa base" from_repo: https://huggingface.co/distilroberta-base
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/distilroberta-base" name: distilroberta-base
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: "openwebtext"
Publisher: "huggingface"
License: "apache-2.0"
Language: "English"
Paper: Paper:
- title: 'DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter' - title: 'DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter'
url: 'http://arxiv.org/abs/1910.01108v4' url: http://arxiv.org/abs/1910.01108v4
- title: 'Quantifying the Carbon Emissions of Machine Learning' - title: Quantifying the Carbon Emissions of Machine Learning
url: 'http://arxiv.org/abs/1910.09700v2' url: http://arxiv.org/abs/1910.09700v2
IfTraining: 0 Publisher: huggingface
IfOnlineDemo: 0 Task:
\ No newline at end of file - sub_tag: 槽位填充
sub_tag_en: Fill-Mask
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "7f49bb4b",
"metadata": {},
"source": [
"# Model Card for DistilRoBERTa base\n"
]
},
{
"cell_type": "markdown",
"id": "88c832ab",
"metadata": {},
"source": [
"## Model Description\n"
]
},
{
"cell_type": "markdown",
"id": "3a2333a1",
"metadata": {},
"source": [
"This model is a distilled version of the RoBERTa-base model. It follows the same training procedure as DistilBERT.\n",
"The code for the distillation process can be found [here](https://github.com/huggingface/transformers/tree/master/examples/distillation).\n",
"This model is case-sensitive: it makes a difference between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "9ac70255",
"metadata": {},
"source": [
"The model has 6 layers, 768 dimension and 12 heads, totalizing 82M parameters (compared to 125M parameters for RoBERTa-base).\n",
"On average DistilRoBERTa is twice as fast as Roberta-base.\n"
]
},
{
"cell_type": "markdown",
"id": "a0757c23",
"metadata": {},
"source": [
"We encourage users of this model card to check out the RoBERTa-base model card to learn more about usage, limitations and potential biases.\n"
]
},
{
"cell_type": "markdown",
"id": "2865466d",
"metadata": {},
"source": [
"- **Developed by:** Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf (Hugging Face)\n",
"- **Model type:** Transformer-based language model\n",
"- **Language(s) (NLP):** English\n",
"- **License:** Apache 2.0\n",
"- **Related Models:** RoBERTa-base model card\n",
"- [Associated Paper](https://arxiv.org/abs/1910.01108)\n"
]
},
{
"cell_type": "markdown",
"id": "a204fad3",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b2e488ed",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "43d7726b",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"distilroberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "e30fb0eb",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=distilroberta-base\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/distilroberta-base](https://huggingface.co/distilroberta-base),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "4bd898ca",
"metadata": {},
"source": [
"# Model Card for DistilRoBERTa base\n"
]
},
{
"cell_type": "markdown",
"id": "7d39a086",
"metadata": {},
"source": [
"## Model Description\n"
]
},
{
"cell_type": "markdown",
"id": "e2043d14",
"metadata": {},
"source": [
"This model is a distilled version of the [RoBERTa-base model](https://huggingface.co/roberta-base). It follows the same training procedure as [DistilBERT](https://huggingface.co/distilbert-base-uncased).\n",
"The code for the distillation process can be found [here](https://github.com/huggingface/transformers/tree/master/examples/distillation).\n",
"This model is case-sensitive: it makes a difference between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "10aefe84",
"metadata": {},
"source": [
"The model has 6 layers, 768 dimension and 12 heads, totalizing 82M parameters (compared to 125M parameters for RoBERTa-base).\n",
"On average DistilRoBERTa is twice as fast as Roberta-base.\n"
]
},
{
"cell_type": "markdown",
"id": "d7ebd775",
"metadata": {},
"source": [
"We encourage users of this model card to check out the [RoBERTa-base model card](https://huggingface.co/roberta-base) to learn more about usage, limitations and potential biases.\n"
]
},
{
"cell_type": "markdown",
"id": "423d28b1",
"metadata": {},
"source": [
"- **Developed by:** Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf (Hugging Face)\n",
"- **Model type:** Transformer-based language model\n",
"- **Language(s) (NLP):** English\n",
"- **License:** Apache 2.0\n",
"- **Related Models:** [RoBERTa-base model card](https://huggingface.co/roberta-base)\n",
"- **Resources for more information:**\n",
"- [GitHub Repository](https://github.com/huggingface/transformers/blob/main/examples/research_projects/distillation/README.md)\n",
"- [Associated Paper](https://arxiv.org/abs/1910.01108)\n"
]
},
{
"cell_type": "markdown",
"id": "715b4360",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9ad9b1a9",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "94e4d093",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"distilroberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
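{
"cell_type": "markdown",
"id": "aa10c00b",
"metadata": {},
"source": [
"As a sanity check on the size figures quoted above (6 layers, 768 hidden dimensions, roughly 82M parameters), the illustrative cell below counts the trainable parameters of the model loaded in the quick start; the exact total can vary slightly depending on which embeddings and heads are included in the checkpoint.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aa10c00c",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"# Sketch: count the parameters of DistilRoBERTa-base as loaded through PaddleNLP.\n",
"model = AutoModel.from_pretrained(\"distilroberta-base\")\n",
"\n",
"total = sum(int(np.prod(p.shape)) for p in model.parameters())\n",
"print(f\"{total / 1e6:.1f}M parameters\")  # roughly 82M for the base encoder"
]
},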
{
"cell_type": "markdown",
"id": "e258a20c",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=distilroberta-base\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/distilroberta-base](https://huggingface.co/distilroberta-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info: Model_Info:
name: "gpt2-large" description: GPT-2 Large
description: "GPT-2 Large" description_en: GPT-2 Large
description_en: "GPT-2 Large" from_repo: https://huggingface.co/gpt2-large
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/gpt2-large" name: gpt2-large
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "huggingface"
License: "mit"
Language: "English"
Paper: Paper:
- title: 'Quantifying the Carbon Emissions of Machine Learning' - title: Quantifying the Carbon Emissions of Machine Learning
url: 'http://arxiv.org/abs/1910.09700v2' url: http://arxiv.org/abs/1910.09700v2
IfTraining: 0 Publisher: huggingface
IfOnlineDemo: 0 Task:
\ No newline at end of file - sub_tag: 文本生成
sub_tag_en: Text Generation
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "32b8730a",
"metadata": {},
"source": [
"# GPT-2 Large\n",
"\n",
"详细内容请看[GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "de66cac3",
"metadata": {},
"source": [
"## Table of Contents\n",
"- [Model Details](#model-details)\n",
"- [How To Get Started With the Model](#how-to-get-started-with-the-model)\n",
"- [Uses](#uses)\n",
"- [Risks, Limitations and Biases](#risks-limitations-and-biases)\n",
"- [Training](#training)\n",
"- [Evaluation](#evaluation)\n",
"- [Environmental Impact](#environmental-impact)\n",
"- [Technical Specifications](#technical-specifications)\n",
"- [Citation Information](#citation-information)\n",
"- [Model Card Authors](#model-card-author)\n"
]
},
{
"cell_type": "markdown",
"id": "8afa58ef",
"metadata": {},
"source": [
"## Model Details\n"
]
},
{
"cell_type": "markdown",
"id": "e4e46496",
"metadata": {},
"source": [
"**Model Description:** GPT-2 Large is the **774M parameter** version of GPT-2, a transformer-based language model created and released by OpenAI. The model is a pretrained model on English language using a causal language modeling (CLM) objective.\n"
]
},
{
"cell_type": "markdown",
"id": "15b8f634",
"metadata": {},
"source": [
"- **Developed by:** OpenAI, see [associated research paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) and [GitHub repo](https://github.com/openai/gpt-2) for model developers.\n",
"- **Model Type:** Transformer-based language model\n",
"- **Language(s):** English\n",
"- **License:** [Modified MIT License](https://github.com/openai/gpt-2/blob/master/LICENSE)\n",
"- **Related Models:** [GPT-2](https://huggingface.co/gpt2), [GPT-Medium](https://huggingface.co/gpt2-medium) and [GPT-XL](https://huggingface.co/gpt2-xl)\n",
"- **Resources for more information:**\n",
"- [Research Paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n",
"- [OpenAI Blog Post](https://openai.com/blog/better-language-models/)\n",
"- [GitHub Repo](https://github.com/openai/gpt-2)\n",
"- [OpenAI Model Card for GPT-2](https://github.com/openai/gpt-2/blob/master/model_card.md)\n",
"- Test the full generation capabilities here: https://transformer.huggingface.co/doc/gpt2-large\n"
]
},
{
"cell_type": "markdown",
"id": "6c2023d9",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b17e6efb",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "33c1f565",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"gpt2-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "8060d283",
"metadata": {},
"source": [
"## Citatioin\n",
"\n",
"```\n",
"@article{radford2019language,\n",
"title={Language models are unsupervised multitask learners},\n",
"author={Radford, Alec and Wu, Jeffrey and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya and others},\n",
"journal={OpenAI blog},\n",
"volume={1},\n",
"number={8},\n",
"pages={9},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "083f0d9c",
"metadata": {},
"source": [
"## Model Card Authors\n"
]
},
{
"cell_type": "markdown",
"id": "f9e4bb43",
"metadata": {},
"source": [
"This model card was written by the Hugging Face team.\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/gpt2-large](https://huggingface.co/gpt2-large),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "dc26013b",
"metadata": {},
"source": [
"# GPT-2 Large\n",
"\n",
"You can get more details from [GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)."
]
},
{
"cell_type": "markdown",
"id": "38e29e37",
"metadata": {},
"source": [
"## Table of Contents\n",
"- [Model Details](#model-details)\n",
"- [How To Get Started With the Model](#how-to-get-started-with-the-model)\n",
"- [Uses](#uses)\n",
"- [Risks, Limitations and Biases](#risks-limitations-and-biases)\n",
"- [Training](#training)\n",
"- [Evaluation](#evaluation)\n",
"- [Environmental Impact](#environmental-impact)\n",
"- [Technical Specifications](#technical-specifications)\n",
"- [Citation Information](#citation-information)\n",
"- [Model Card Authors](#model-card-author)\n"
]
},
{
"cell_type": "markdown",
"id": "590c3fbd",
"metadata": {},
"source": [
"## Model Details\n"
]
},
{
"cell_type": "markdown",
"id": "1a2cd621",
"metadata": {},
"source": [
"**Model Description:** GPT-2 Large is the **774M parameter** version of GPT-2, a transformer-based language model created and released by OpenAI. The model is a pretrained model on English language using a causal language modeling (CLM) objective.\n"
]
},
{
"cell_type": "markdown",
"id": "0155f43f",
"metadata": {},
"source": [
"- **Developed by:** OpenAI, see [associated research paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) and [GitHub repo](https://github.com/openai/gpt-2) for model developers.\n",
"- **Model Type:** Transformer-based language model\n",
"- **Language(s):** English\n",
"- **License:** [Modified MIT License](https://github.com/openai/gpt-2/blob/master/LICENSE)\n",
"- **Related Models:** https://huggingface.co/gpt2, GPT-Medium and GPT-XL\n",
"- **Resources for more information:**\n",
"- [Research Paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n",
"- [OpenAI Blog Post](https://openai.com/blog/better-language-models/)\n",
"- [GitHub Repo](https://github.com/openai/gpt-2)\n",
"- [OpenAI Model Card for GPT-2](https://github.com/openai/gpt-2/blob/master/model_card.md)\n"
]
},
{
"cell_type": "markdown",
"id": "18e2772d",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "30207821",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2ae65fe6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"gpt2-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
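{
"cell_type": "markdown",
"id": "aa10c00d",
"metadata": {},
"source": [
"As a follow-up to the quick start above, the illustrative cell below feeds a real (made-up) prompt through the tokenizer instead of random ids and prints the shape of the returned hidden states. The guard on the output is an assumption to cover both a plain tensor and a tuple return value, since the exact return format can vary across PaddleNLP versions.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aa10c00e",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"# Sketch: run a real prompt through GPT-2 Large and inspect the hidden states.\n",
"tokenizer = AutoTokenizer.from_pretrained(\"gpt2-large\")\n",
"model = AutoModel.from_pretrained(\"gpt2-large\")\n",
"model.eval()\n",
"\n",
"prompt = \"PaddleNLP provides pretrained language models such as\"\n",
"input_ids = paddle.to_tensor([tokenizer(prompt)[\"input_ids\"]])\n",
"\n",
"with paddle.no_grad():\n",
"    outputs = model(input_ids)\n",
"\n",
"hidden_states = outputs[0] if isinstance(outputs, tuple) else outputs\n",
"print(hidden_states.shape)  # [1, prompt_length, 1280] for the 774M-parameter model"
]
},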
{
"cell_type": "markdown",
"id": "e8b7c92b",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@article{radford2019language,\n",
"title={Language models are unsupervised multitask learners},\n",
"author={Radford, Alec and Wu, Jeffrey and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya and others},\n",
"journal={OpenAI blog},\n",
"volume={1},\n",
"number={8},\n",
"pages={9},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "7cded70d",
"metadata": {},
"source": [
"## Model Card Authors\n"
]
},
{
"cell_type": "markdown",
"id": "ff9ab2d4",
"metadata": {},
"source": [
"This model card was written by the Hugging Face team.\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/gpt2-large](https://huggingface.co/gpt2-large) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info: Model_Info:
name: "gpt2-medium" description: GPT-2 Medium
description: "GPT-2 Medium" description_en: GPT-2 Medium
description_en: "GPT-2 Medium" from_repo: https://huggingface.co/gpt2-medium
icon: "" icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
from_repo: "https://huggingface.co/gpt2-medium" name: gpt2-medium
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "huggingface"
License: "mit"
Language: "English"
Paper: Paper:
- title: 'Quantifying the Carbon Emissions of Machine Learning' - title: Quantifying the Carbon Emissions of Machine Learning
url: 'http://arxiv.org/abs/1910.09700v2' url: http://arxiv.org/abs/1910.09700v2
IfTraining: 0 Publisher: huggingface
IfOnlineDemo: 0 Task:
\ No newline at end of file - sub_tag: 文本生成
sub_tag_en: Text Generation
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "25324e9c",
"metadata": {},
"source": [
"# GPT-2 Medium\n",
"\n",
"详细内容请看[GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "806177e3",
"metadata": {},
"source": [
"## Model Details\n"
]
},
{
"cell_type": "markdown",
"id": "dbcaecb0",
"metadata": {},
"source": [
"**Model Description:** GPT-2 Medium is the **355M parameter** version of GPT-2, a transformer-based language model created and released by OpenAI. The model is a pretrained model on English language using a causal language modeling (CLM) objective.\n"
]
},
{
"cell_type": "markdown",
"id": "ab73e9f0",
"metadata": {},
"source": [
"- **Developed by:** OpenAI, see [associated research paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) and [GitHub repo](https://github.com/openai/gpt-2) for model developers.\n",
"- **Model Type:** Transformer-based language model\n",
"- **Language(s):** English\n",
"- **License:** [Modified MIT License](https://github.com/openai/gpt-2/blob/master/LICENSE)\n",
"- **Related Models:** [GPT2](https://huggingface.co/gpt2), [GPT2-Large](https://huggingface.co/gpt2-large) and [GPT2-XL](https://huggingface.co/gpt2-xl)\n",
"- **Resources for more information:**\n",
"- [Research Paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n",
"- [OpenAI Blog Post](https://openai.com/blog/better-language-models/)\n",
"- [GitHub Repo](https://github.com/openai/gpt-2)\n",
"- [OpenAI Model Card for GPT-2](https://github.com/openai/gpt-2/blob/master/model_card.md)\n",
"- Test the full generation capabilities here: https://transformer.huggingface.co/doc/gpt2-large\n"
]
},
{
"cell_type": "markdown",
"id": "70c3fd36",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1bae5ee0",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "11b32577",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"gpt2-medium\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "08f90ea0",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@article{radford2019language,\n",
"title={Language models are unsupervised multitask learners},\n",
"author={Radford, Alec and Wu, Jeffrey and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya and others},\n",
"journal={OpenAI blog},\n",
"volume={1},\n",
"number={8},\n",
"pages={9},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "64d79312",
"metadata": {},
"source": [
"## Model Card Authors\n"
]
},
{
"cell_type": "markdown",
"id": "d14dd2ac",
"metadata": {},
"source": [
"This model card was written by the Hugging Face team.\n",
"\n",
"> 此模型介绍及权重来源于 https://huggingface.co/gpt2-medium ,并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "46995787",
"metadata": {},
"source": [
"# GPT-2 Medium\n",
"\n",
"You can get more details from [GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)."
]
},
{
"cell_type": "markdown",
"id": "f695ad73",
"metadata": {},
"source": [
"## Model Details\n"
]
},
{
"cell_type": "markdown",
"id": "5a8170d9",
"metadata": {},
"source": [
"**Model Description:** GPT-2 Medium is the **355M parameter** version of GPT-2, a transformer-based language model created and released by OpenAI. The model is a pretrained model on English language using a causal language modeling (CLM) objective.\n"
]
},
{
"cell_type": "markdown",
"id": "1d0dc244",
"metadata": {},
"source": [
"- **Developed by:** OpenAI, see [associated research paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) and [GitHub repo](https://github.com/openai/gpt-2) for model developers.\n",
"- **Model Type:** Transformer-based language model\n",
"- **Language(s):** English\n",
"- **License:** [Modified MIT License](https://github.com/openai/gpt-2/blob/master/LICENSE)\n",
"- **Related Models:** [GPT2](https://huggingface.co/gpt2), [GPT2-Large](https://huggingface.co/gpt2-large) and [GPT2-XL](https://huggingface.co/gpt2-xl)\n",
"- **Resources for more information:**\n",
"- [Research Paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n",
"- [OpenAI Blog Post](https://openai.com/blog/better-language-models/)\n",
"- [GitHub Repo](https://github.com/openai/gpt-2)\n",
"- [OpenAI Model Card for GPT-2](https://github.com/openai/gpt-2/blob/master/model_card.md)\n",
"- Test the full generation capabilities here: https://transformer.huggingface.co/doc/gpt2-large\n"
]
},
{
"cell_type": "markdown",
"id": "adc5a3f9",
"metadata": {},
"source": [
"## How to Get Started with the Model\n"
]
},
{
"cell_type": "markdown",
"id": "7566eafd",
"metadata": {},
"source": [
"Use the code below to get started with the model. \n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ab4c71ee",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b0167528",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"gpt2-medium\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
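{
"cell_type": "markdown",
"id": "a3f1c9d2",
"metadata": {},
"source": [
"The cell above only runs the bare `AutoModel` on random token ids as a quick smoke test. As an additional, hedged sketch (not part of the original model card), the next cell shows how the same checkpoint could be used for text generation; it assumes the converted `gpt2-medium` weights and tokenizer files also load with PaddleNLP's `GPTLMHeadModel` and `GPTTokenizer` classes."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c4e8b2a1",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import GPTLMHeadModel, GPTTokenizer\n",
"\n",
"# Assumption: the converted gpt2-medium checkpoint is compatible with the LM-head class.\n",
"tokenizer = GPTTokenizer.from_pretrained(\"gpt2-medium\")\n",
"model = GPTLMHeadModel.from_pretrained(\"gpt2-medium\")\n",
"model.eval()\n",
"\n",
"prompt = \"GPT-2 Medium is a language model that\"\n",
"input_ids = paddle.to_tensor([tokenizer(prompt)[\"input_ids\"]])\n",
"\n",
"# Greedy decoding of up to 32 new tokens; PaddleNLP's generate() returns (ids, scores).\n",
"output_ids, _ = model.generate(input_ids, max_length=32, decode_strategy=\"greedy_search\")\n",
"print(prompt + tokenizer.convert_ids_to_string(output_ids[0].tolist()))"
]
},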
{
"cell_type": "markdown",
"id": "52cdcf9e",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@article{radford2019language,\n",
"title={Language models are unsupervised multitask learners},\n",
"author={Radford, Alec and Wu, Jeffrey and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya and others},\n",
"journal={OpenAI blog},\n",
"volume={1},\n",
"number={8},\n",
"pages={9},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "eb327c10",
"metadata": {},
"source": [
"## Model Card Authors\n"
]
},
{
"cell_type": "markdown",
"id": "50fb7de8",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/gpt2-medium and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info:
  description: GPT-2
  description_en: GPT-2
  from_repo: https://huggingface.co/gpt2
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: gpt2
Paper: null
Publisher: huggingface
Task:
- sub_tag: 文本生成
  sub_tag_en: Text Generation
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a4cd103f",
"metadata": {},
"source": [
"# GPT-2\n",
"\n",
"详细内容请看[GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "e10dfe6d",
"metadata": {},
"source": [
"Pretrained model on English language using a causal language modeling (CLM) objective. It was introduced in\n",
"[this paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n",
"and first released at [this page](https://openai.com/blog/better-language-models/).\n"
]
},
{
"cell_type": "markdown",
"id": "d1b13043",
"metadata": {},
"source": [
"Disclaimer: The team releasing GPT-2 also wrote a\n",
"[model card](https://github.com/openai/gpt-2/blob/master/model_card.md) for their model. Content from this model card\n",
"has been written by the Hugging Face team to complete the information they provided and give specific examples of bias.\n"
]
},
{
"cell_type": "markdown",
"id": "016271a5",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "e3a53155",
"metadata": {},
"source": [
"GPT-2 is a transformers model pretrained on a very large corpus of English data in a self-supervised fashion. This\n",
"means it was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots\n",
"of publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely,\n",
"it was trained to guess the next word in sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "6836ad17",
"metadata": {},
"source": [
"More precisely, inputs are sequences of continuous text of a certain length and the targets are the same sequence,\n",
"shifted one token (word or piece of word) to the right. The model uses internally a mask-mechanism to make sure the\n",
"predictions for the token `i` only uses the inputs from `1` to `i` but not the future tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "26946ce6",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks. The model is best at what it was pretrained for however, which is generating texts from a\n",
"prompt.\n"
]
},
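{
"cell_type": "markdown",
"id": "b7d2e4f0",
"metadata": {},
"source": [
"As a small illustration added to this card (not part of the original), the next cell spells out the objective described above in plain Python: the target at each position is simply the next token of the same sequence."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e9a5c311",
"metadata": {},
"outputs": [],
"source": [
"# Toy example of the next-word objective: labels are the inputs shifted by one position,\n",
"# so the prediction made at position i is trained to match token i + 1.\n",
"tokens = [\"GPT-2\", \"was\", \"trained\", \"to\", \"guess\", \"the\", \"next\", \"word\"]\n",
"inputs, labels = tokens[:-1], tokens[1:]\n",
"for i, label in enumerate(labels):\n",
"    print(f\"position {i}: sees {inputs[: i + 1]} -> should predict {label!r}\")"
]
},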
{
"cell_type": "markdown",
"id": "571b41cf",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a6233e8e",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2e906136",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"gpt2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "78f26b7f",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@article{radford2019language,\n",
"title={Language Models are Unsupervised Multitask Learners},\n",
"author={Radford, Alec and Wu, Jeff and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "2f646c57",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=gpt2\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/gpt2](https://huggingface.co/gpt2),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "2d373572",
"metadata": {},
"source": [
"# GPT-2\n",
"\n",
"You can get more details from [GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)."
]
},
{
"cell_type": "markdown",
"id": "00be5831",
"metadata": {},
"source": [
"Test the whole generation capabilities here: https://transformer.huggingface.co/doc/gpt2-large\n"
]
},
{
"cell_type": "markdown",
"id": "b5857cc2",
"metadata": {},
"source": [
"Pretrained model on English language using a causal language modeling (CLM) objective. It was introduced in\n",
"[this paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n",
"and first released at [this page](https://openai.com/blog/better-language-models/).\n"
]
},
{
"cell_type": "markdown",
"id": "b0abac76",
"metadata": {},
"source": [
"Disclaimer: The team releasing GPT-2 also wrote a\n",
"[model card](https://github.com/openai/gpt-2/blob/master/model_card.md) for their model. Content from this model card\n",
"has been written by the Hugging Face team to complete the information they provided and give specific examples of bias.\n"
]
},
{
"cell_type": "markdown",
"id": "fa2c7f4b",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "294521bd",
"metadata": {},
"source": [
"GPT-2 is a transformers model pretrained on a very large corpus of English data in a self-supervised fashion. This\n",
"means it was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots\n",
"of publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely,\n",
"it was trained to guess the next word in sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "b1204c32",
"metadata": {},
"source": [
"More precisely, inputs are sequences of continuous text of a certain length and the targets are the same sequence,\n",
"shifted one token (word or piece of word) to the right. The model uses internally a mask-mechanism to make sure the\n",
"predictions for the token `i` only uses the inputs from `1` to `i` but not the future tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "a019cc9e",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks. The model is best at what it was pretrained for however, which is generating texts from a\n",
"prompt.\n"
]
},
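{
"cell_type": "markdown",
"id": "f2c6d8b3",
"metadata": {},
"source": [
"As an illustrative addition to this card (not part of the original), the next cell uses plain Paddle ops to show the two mechanics described above: the lower-triangular (causal) attention mask, and the one-token offset between what the model sees and what it must predict."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a8e3f1c7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"\n",
"# Row i of the lower-triangular matrix marks the positions token i may attend to:\n",
"# positions 0..i only, never future tokens.\n",
"seq_len = 6\n",
"causal_mask = paddle.tril(paddle.ones([seq_len, seq_len]))\n",
"print(causal_mask)\n",
"\n",
"# The same idea expressed on data: each position is trained to predict the token that follows it.\n",
"token_ids = paddle.randint(100, 200, shape=[1, seq_len])\n",
"context = token_ids[:, :-1]  # what the model sees\n",
"targets = token_ids[:, 1:]   # what it should predict\n",
"print(context, targets)"
]
},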
{
"cell_type": "markdown",
"id": "54ae8500",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d33fddda",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d0e160c6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"gpt2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "fcb8a843",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@article{radford2019language,\n",
"title={Language Models are Unsupervised Multitask Learners},\n",
"author={Radford, Alec and Wu, Jeff and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "513848f8",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=gpt2\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/gpt2](https://huggingface.co/gpt2) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: indonlu
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Indonesian
License: mit
Model_Info:
  description: IndoBERT Base Model (phase1 - uncased)
  description_en: IndoBERT Base Model (phase1 - uncased)
  from_repo: https://huggingface.co/indobenchmark/indobert-base-p1
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: indobenchmark/indobert-base-p1
Paper:
- title: 'IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding'
  url: http://arxiv.org/abs/2009.05387v3
Publisher: indobenchmark
Task:
- sub_tag: 特征抽取
  sub_tag_en: Feature Extraction
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "3f5a12e4",
"metadata": {},
"source": [
"# IndoBERT Base Model (phase1 - uncased)\n"
]
},
{
"cell_type": "markdown",
"id": "e2fcac01",
"metadata": {},
"source": [
"[IndoBERT](https://arxiv.org/abs/2009.05387) is a state-of-the-art language model for Indonesian based on the BERT model. The pretrained model is trained using a masked language modeling (MLM) objective and next sentence prediction (NSP) objective.\n"
]
},
{
"cell_type": "markdown",
"id": "6a9d6a02",
"metadata": {},
"source": [
"## All Pre-trained Models\n"
]
},
{
"cell_type": "markdown",
"id": "3020975b",
"metadata": {},
"source": [
"| Model | #params | Arch. | Training data |\n",
"|--------------------------------|--------------------------------|-------|-----------------------------------|\n",
"| `indobenchmark/indobert-base-p1` | 124.5M | Base | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-base-p2` | 124.5M | Base | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-large-p1` | 335.2M | Large | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-large-p2` | 335.2M | Large | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-lite-base-p1` | 11.7M | Base | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-lite-base-p2` | 11.7M | Base | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-lite-large-p1` | 17.7M | Large | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-lite-large-p2` | 17.7M | Large | Indo4B (23.43 GB of text) |\n"
]
},
{
"cell_type": "markdown",
"id": "d0e3771a",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f1f38760",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a11bc38f",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"indobenchmark/indobert-base-p1\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "d1fe4366",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{wilie2020indonlu,\n",
"title={IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding},\n",
"author={Bryan Wilie and Karissa Vincentio and Genta Indra Winata and Samuel Cahyawijaya and X. Li and Zhi Yuan Lim and S. Soleman and R. Mahendra and Pascale Fung and Syafri Bahar and A. Purwarianti},\n",
"booktitle={Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing},\n",
"year={2020}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "95f83dc9",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/indobenchmark/indobert-base-p1](https://huggingface.co/indobenchmark/indobert-base-p1),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "d6793868",
"metadata": {},
"source": [
"# IndoBERT Base Model (phase1 - uncased)\n"
]
},
{
"cell_type": "markdown",
"id": "48b35590",
"metadata": {},
"source": [
"[IndoBERT](https://arxiv.org/abs/2009.05387) is a state-of-the-art language model for Indonesian based on the BERT model. The pretrained model is trained using a masked language modeling (MLM) objective and next sentence prediction (NSP) objective.\n"
]
},
{
"cell_type": "markdown",
"id": "e5dc323c",
"metadata": {},
"source": [
"## All Pre-trained Models\n"
]
},
{
"cell_type": "markdown",
"id": "7db5d6e5",
"metadata": {},
"source": [
"| Model | #params | Arch. | Training data |\n",
"|--------------------------------|--------------------------------|-------|-----------------------------------|\n",
"| `indobenchmark/indobert-base-p1` | 124.5M | Base | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-base-p2` | 124.5M | Base | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-large-p1` | 335.2M | Large | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-large-p2` | 335.2M | Large | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-lite-base-p1` | 11.7M | Base | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-lite-base-p2` | 11.7M | Base | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-lite-large-p1` | 17.7M | Large | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-lite-large-p2` | 17.7M | Large | Indo4B (23.43 GB of text) |\n"
]
},
{
"cell_type": "markdown",
"id": "fc8827fd",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e5b6e205",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6701163d",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"indobenchmark/indobert-base-p1\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
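{
"cell_type": "markdown",
"id": "c1d7a9e4",
"metadata": {},
"source": [
"The cell above only feeds random token ids to check that the weights load. As an optional sketch added to this card (not part of the original), the next cell encodes a real Indonesian sentence for feature extraction; it assumes the community checkpoint's tokenizer files resolve through PaddleNLP's `AutoTokenizer`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d5b8f2a6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"# Assumption: the converted indobenchmark/indobert-base-p1 tokenizer loads via AutoTokenizer.\n",
"tokenizer = AutoTokenizer.from_pretrained(\"indobenchmark/indobert-base-p1\")\n",
"model = AutoModel.from_pretrained(\"indobenchmark/indobert-base-p1\")\n",
"model.eval()\n",
"\n",
"encoded = tokenizer(\"Budi sedang belajar di perpustakaan.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
"\n",
"# BERT-style models in PaddleNLP typically return (sequence_output, pooled_output).\n",
"sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},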
{
"cell_type": "markdown",
"id": "fb28cf5b",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{wilie2020indonlu,\n",
"title={IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding},\n",
"author={Bryan Wilie and Karissa Vincentio and Genta Indra Winata and Samuel Cahyawijaya and X. Li and Zhi Yuan Lim and S. Soleman and R. Mahendra and Pascale Fung and Syafri Bahar and A. Purwarianti},\n",
"booktitle={Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing},\n",
"year={2020}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "e155d1ce",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/indobenchmark/indobert-base-p1](https://huggingface.co/indobenchmark/indobert-base-p1) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# 模型列表
## johngiorgi/declutr-base
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|johngiorgi/declutr-base| | 625.22MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/vocab.txt) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models johngiorgi/declutr-base
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
# model list
## johngiorgi/declutr-base
| model | description | model_size | download |
| --- | --- | --- | --- |
|johngiorgi/declutr-base| | 625.22MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/vocab.txt) |
You can also download all of the model files with the `paddlenlp` CLI tool; the steps are as follows:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models johngiorgi/declutr-base
```
If you run into any problems with the download, you can open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "johngiorgi/declutr-base"
description: "DeCLUTR-base"
description_en: "DeCLUTR-base"
icon: ""
from_repo: "https://huggingface.co/johngiorgi/declutr-base"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Sentence Similarity"
sub_tag: "句子相似度"
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Feature Extraction"
sub_tag: "特征抽取"
Example:
Datasets: "openwebtext"
Publisher: "johngiorgi"
License: "apache-2.0"
Language: "English"
Paper:
- title: 'DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations'
url: 'http://arxiv.org/abs/2006.03659v4'
IfTraining: 0
IfOnlineDemo: 0