Unverified commit 1352e3d3, authored by 骑马小猫, committed by GitHub

add paddlenlp community models (#5660)

* update project

* update icon and keyword
Parent 747a474a
# Model List
## CLTL/MedRoBERTa.nl
| Model Name | Description | Model Size | Download |
| --- | --- | --- | --- |
|CLTL/MedRoBERTa.nl| | 633.14MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/vocab.txt) |
You can also download the model weights with the `paddlenlp` CLI tool, as follows:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download with the CLI
```shell
paddlenlp download --cache-dir ./pretrained_models CLTL/MedRoBERTa.nl
```
If you have any download problems, you can open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP).
# Model List
## CLTL/MedRoBERTa.nl
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|CLTL/MedRoBERTa.nl| | 633.14MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/CLTL/MedRoBERTa.nl/vocab.txt) |
Or you can download all of the model files with the following steps:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models CLTL/MedRoBERTa.nl
```
If you have any problems with it, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
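After downloading, the weights can be loaded straight from the local directory. A minimal sketch (assuming the CLI placed the files under `./pretrained_models/CLTL/MedRoBERTa.nl`; the exact layout may differ):
```python
from paddlenlp.transformers import AutoModel, AutoTokenizer

# Load the converted weights and tokenizer from the local cache directory
# instead of fetching them again.
local_dir = "./pretrained_models/CLTL/MedRoBERTa.nl"
model = AutoModel.from_pretrained(local_dir)
tokenizer = AutoTokenizer.from_pretrained(local_dir)
```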
Model_Info:
name: "CLTL/MedRoBERTa.nl"
description: "MedRoBERTa.nl"
description_en: "MedRoBERTa.nl"
icon: ""
from_repo: "https://huggingface.co/CLTL/MedRoBERTa.nl"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: ""
Publisher: "CLTL"
License: "mit"
Language: "Dutch"
Paper:
IfTraining: 0
IfOnlineDemo: 0
Datasets: conll2003
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info:
  name: Jean-Baptiste/roberta-large-ner-english
  description: 'roberta-large-ner-english: model fine-tuned from roberta-large for NER task'
  description_en: 'roberta-large-ner-english: model fine-tuned from roberta-large for NER task'
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  from_repo: https://huggingface.co/Jean-Baptiste/roberta-large-ner-english
Paper: null
Publisher: Jean-Baptiste
Task:
- tag: 自然语言处理
  tag_en: Natural Language Processing
  sub_tag: Token分类
  sub_tag_en: Token Classification
{
"cells": [
{
"cell_type": "markdown",
"id": "743f4950",
"metadata": {},
"source": [
"# roberta-large-ner-english: model fine-tuned from roberta-large for NER task\n"
]
},
{
"cell_type": "markdown",
"id": "0d517a6d",
"metadata": {},
"source": [
"## Introduction\n"
]
},
{
"cell_type": "markdown",
"id": "bbb5e934",
"metadata": {},
"source": [
"roberta-large-ner-english is an english NER model that was fine-tuned from roberta-large on conll2003 dataset.\n",
"Model was validated on emails/chat data and outperformed other models on this type of data specifically.\n",
"In particular the model seems to work better on entity that don't start with an upper case.\n"
]
},
{
"cell_type": "markdown",
"id": "a13117c3",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b9e58955",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "db077413",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"Jean-Baptiste/roberta-large-ner-english\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "86ae5e96",
"metadata": {},
"source": [
"For those who could be interested, here is a short article on how I used the results of this model to train a LSTM model for signature detection in emails:\n",
"https://medium.com/@jean-baptiste.polle/lstm-model-for-email-signature-detection-8e990384fefa\n",
"\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/Jean-Baptiste/roberta-large-ner-english](https://huggingface.co/Jean-Baptiste/roberta-large-ner-english),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "b0541e6a",
"metadata": {},
"source": [
"# roberta-large-ner-english: model fine-tuned from roberta-large for NER task\n"
]
},
{
"cell_type": "markdown",
"id": "c85540d7",
"metadata": {},
"source": [
"## Introduction\n"
]
},
{
"cell_type": "markdown",
"id": "c2e2ebde",
"metadata": {},
"source": [
"roberta-large-ner-english is an english NER model that was fine-tuned from roberta-large on conll2003 dataset.\n",
"Model was validated on emails/chat data and outperformed other models on this type of data specifically.\n",
"In particular the model seems to work better on entity that don't start with an upper case.\n"
]
},
{
"cell_type": "markdown",
"id": "4f6d5dbe",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a159cf92",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "daa60299",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"Jean-Baptiste/roberta-large-ner-english\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
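  {
   "cell_type": "markdown",
   "id": "1a2b3c4d",
   "metadata": {},
   "source": [
    "The snippet above only feeds random ids through the bare encoder. Below is a minimal token-classification sketch on real text, assuming the fine-tuned NER head was kept in the converted weights and is resolved by `AutoModelForTokenClassification`; label names are omitted here.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5e6f7a8b",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, AutoModelForTokenClassification\n",
    "\n",
    "# Hypothetical usage sketch: assumes the fine-tuned NER head is part of the converted weights.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"Jean-Baptiste/roberta-large-ner-english\")\n",
    "ner_model = AutoModelForTokenClassification.from_pretrained(\"Jean-Baptiste/roberta-large-ner-english\")\n",
    "\n",
    "# Encode a sentence and predict one label id per token.\n",
    "encoded = tokenizer(\"Apple was founded by Steve Jobs in Cupertino.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "logits = ner_model(input_ids)\n",
    "print(paddle.argmax(logits, axis=-1))"
   ]
  },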
{
"cell_type": "markdown",
"id": "2a66154e",
"metadata": {},
"source": [
"For those who could be interested, here is a short article on how I used the results of this model to train a LSTM model for signature detection in emails:\n",
"https://medium.com/@jean-baptiste.polle/lstm-model-for-email-signature-detection-8e990384fefa\n",
"\n",
"> The model introduction and model weights originate from https://huggingface.co/Jean-Baptiste/roberta-large-ner-english and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Chinese
License: apache-2.0
Model_Info:
  name: Langboat/mengzi-bert-base-fin
  description: Mengzi-BERT base fin model (Chinese)
  description_en: Mengzi-BERT base fin model (Chinese)
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  from_repo: https://huggingface.co/Langboat/mengzi-bert-base-fin
Paper:
- title: 'Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese'
  url: http://arxiv.org/abs/2110.06696v2
Publisher: Langboat
Task:
- tag: 自然语言处理
  tag_en: Natural Language Processing
  sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
{
"cells": [
{
"cell_type": "markdown",
"id": "18d5c43e",
"metadata": {},
"source": [
"# Mengzi-BERT base fin model (Chinese)\n",
"Continue trained mengzi-bert-base with 20G financial news and research reports. Masked language modeling(MLM), part-of-speech(POS) tagging and sentence order prediction(SOP) are used as training task.\n"
]
},
{
"cell_type": "markdown",
"id": "9aa78f76",
"metadata": {},
"source": [
"[Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese](https://arxiv.org/abs/2110.06696)\n"
]
},
{
"cell_type": "markdown",
"id": "12bbac99",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3b18fe48",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1bb0e345",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"Langboat/mengzi-bert-base-fin\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "a8d785f4",
"metadata": {},
"source": [
"```\n",
"@misc{zhang2021mengzi,\n",
"title={Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese},\n",
"author={Zhuosheng Zhang and Hanqing Zhang and Keming Chen and Yuhang Guo and Jingyun Hua and Yulong Wang and Ming Zhou},\n",
"year={2021},\n",
"eprint={2110.06696},\n",
"archivePrefix={arXiv},\n",
"primaryClass={cs.CL}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "ceb1547c",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/Langboat/mengzi-bert-base-fin](https://huggingface.co/Langboat/mengzi-bert-base-fin),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "752656a4",
"metadata": {},
"source": [
"# Mengzi-BERT base fin model (Chinese)\n",
"Continue trained mengzi-bert-base with 20G financial news and research reports. Masked language modeling(MLM), part-of-speech(POS) tagging and sentence order prediction(SOP) are used as training task.\n"
]
},
{
"cell_type": "markdown",
"id": "26c65092",
"metadata": {},
"source": [
"[Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese](https://arxiv.org/abs/2110.06696)\n"
]
},
{
"cell_type": "markdown",
"id": "ea5404c7",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ebeb5daa",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d2c66056",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"Langboat/mengzi-bert-base-fin\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
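  {
   "cell_type": "markdown",
   "id": "9c8d7e6f",
   "metadata": {},
   "source": [
    "Below is a minimal sketch that encodes real financial text with the matching tokenizer instead of random ids, assuming the checkpoint behaves like a standard BERT encoder returning `(sequence_output, pooled_output)`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0a1b2c3d",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"Langboat/mengzi-bert-base-fin\")\n",
    "model = AutoModel.from_pretrained(\"Langboat/mengzi-bert-base-fin\")\n",
    "\n",
    "# Encode a financial sentence; BERT-style models return (sequence_output, pooled_output).\n",
    "encoded = tokenizer(\"央行今日开展逆回购操作。\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "sequence_output, pooled_output = model(input_ids)\n",
    "print(sequence_output.shape, pooled_output.shape)"
   ]
  },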
{
"cell_type": "markdown",
"id": "a39809dc",
"metadata": {},
"source": [
"```\n",
"@misc{zhang2021mengzi,\n",
"title={Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese},\n",
"author={Zhuosheng Zhang and Hanqing Zhang and Keming Chen and Yuhang Guo and Jingyun Hua and Yulong Wang and Ming Zhou},\n",
"year={2021},\n",
"eprint={2110.06696},\n",
"archivePrefix={arXiv},\n",
"primaryClass={cs.CL}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "f25bda96",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/Langboat/mengzi-bert-base-fin and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# Model List
## PlanTL-GOB-ES/roberta-base-biomedical-clinical-es
| Model Name | Description | Model Size | Download |
| --- | --- | --- | --- |
|PlanTL-GOB-ES/roberta-base-biomedical-clinical-es| | 633.14MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/vocab.txt) |
You can also download the model weights with the `paddlenlp` CLI tool, as follows:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download with the CLI
```shell
paddlenlp download --cache-dir ./pretrained_models PlanTL-GOB-ES/roberta-base-biomedical-clinical-es
```
If you have any download problems, you can open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP).
# Model List
## PlanTL-GOB-ES/roberta-base-biomedical-clinical-es
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|PlanTL-GOB-ES/roberta-base-biomedical-clinical-es| | 633.14MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es/vocab.txt) |
Or you can download all of the model files with the following steps:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models PlanTL-GOB-ES/roberta-base-biomedical-clinical-es
```
If you have any problems with it, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
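After downloading, the weights can be loaded straight from the local directory. A minimal sketch (assuming the CLI placed the files under `./pretrained_models/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es`; the exact layout may differ):
```python
from paddlenlp.transformers import AutoModel, AutoTokenizer

# Load the converted weights and tokenizer from the local cache directory
# instead of fetching them again.
local_dir = "./pretrained_models/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es"
model = AutoModel.from_pretrained(local_dir)
tokenizer = AutoTokenizer.from_pretrained(local_dir)
```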
Model_Info:
name: "PlanTL-GOB-ES/roberta-base-biomedical-clinical-es"
description: "Biomedical-clinical language model for Spanish"
description_en: "Biomedical-clinical language model for Spanish"
icon: ""
from_repo: "https://huggingface.co/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: ""
Publisher: "PlanTL-GOB-ES"
License: "apache-2.0"
Language: "Spanish"
Paper:
- title: 'Biomedical and Clinical Language Models for Spanish: On the Benefits of Domain-Specific Pretraining in a Mid-Resource Scenario'
url: 'http://arxiv.org/abs/2109.03570v2'
- title: 'Spanish Biomedical Crawled Corpus: A Large, Diverse Dataset for Spanish Biomedical Language Models'
url: 'http://arxiv.org/abs/2109.07765v1'
IfTraining: 0
IfOnlineDemo: 0
# Model List
## PlanTL-GOB-ES/roberta-base-biomedical-es
| Model Name | Description | Model Size | Download |
| --- | --- | --- | --- |
|PlanTL-GOB-ES/roberta-base-biomedical-es| | 633.14MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/vocab.json) |
You can also download the model weights with the `paddlenlp` CLI tool, as follows:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download with the CLI
```shell
paddlenlp download --cache-dir ./pretrained_models PlanTL-GOB-ES/roberta-base-biomedical-es
```
If you have any download problems, you can open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP).
# Model List
## PlanTL-GOB-ES/roberta-base-biomedical-es
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|PlanTL-GOB-ES/roberta-base-biomedical-es| | 633.14MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-biomedical-es/vocab.json) |
Or you can download all of the model files with the following steps:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models PlanTL-GOB-ES/roberta-base-biomedical-es
```
If you have any problems with it, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
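After downloading, the weights can be loaded straight from the local directory. A minimal sketch (assuming the CLI placed the files under `./pretrained_models/PlanTL-GOB-ES/roberta-base-biomedical-es`; the exact layout may differ):
```python
from paddlenlp.transformers import AutoModel, AutoTokenizer

# Load the converted weights and tokenizer from the local cache directory
# instead of fetching them again.
local_dir = "./pretrained_models/PlanTL-GOB-ES/roberta-base-biomedical-es"
model = AutoModel.from_pretrained(local_dir)
tokenizer = AutoTokenizer.from_pretrained(local_dir)
```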
Model_Info:
name: "PlanTL-GOB-ES/roberta-base-biomedical-es"
description: "Biomedical language model for Spanish"
description_en: "Biomedical language model for Spanish"
icon: ""
from_repo: "https://huggingface.co/PlanTL-GOB-ES/roberta-base-biomedical-es"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: ""
Publisher: "PlanTL-GOB-ES"
License: "apache-2.0"
Language: "Spanish"
Paper:
- title: 'Biomedical and Clinical Language Models for Spanish: On the Benefits of Domain-Specific Pretraining in a Mid-Resource Scenario'
url: 'http://arxiv.org/abs/2109.03570v2'
- title: 'Spanish Biomedical Crawled Corpus: A Large, Diverse Dataset for Spanish Biomedical Language Models'
url: 'http://arxiv.org/abs/2109.07765v1'
IfTraining: 0
IfOnlineDemo: 0
# Model List
## PlanTL-GOB-ES/roberta-base-ca
| Model Name | Description | Model Size | Download |
| --- | --- | --- | --- |
|PlanTL-GOB-ES/roberta-base-ca| | 633.14MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/vocab.json) |
You can also download the model weights with the `paddlenlp` CLI tool, as follows:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download with the CLI
```shell
paddlenlp download --cache-dir ./pretrained_models PlanTL-GOB-ES/roberta-base-ca
```
If you have any download problems, you can open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP).
# Model List
## PlanTL-GOB-ES/roberta-base-ca
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|PlanTL-GOB-ES/roberta-base-ca| | 633.14MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/PlanTL-GOB-ES/roberta-base-ca/vocab.json) |
Or you can download all of the model files with the following steps:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models PlanTL-GOB-ES/roberta-base-ca
```
If you have any problems with it, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
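After downloading, the weights can be loaded straight from the local directory. A minimal sketch (assuming the CLI placed the files under `./pretrained_models/PlanTL-GOB-ES/roberta-base-ca`; the exact layout may differ):
```python
from paddlenlp.transformers import AutoModel, AutoTokenizer

# Load the converted weights and tokenizer from the local cache directory
# instead of fetching them again.
local_dir = "./pretrained_models/PlanTL-GOB-ES/roberta-base-ca"
model = AutoModel.from_pretrained(local_dir)
tokenizer = AutoTokenizer.from_pretrained(local_dir)
```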
Model_Info:
name: "PlanTL-GOB-ES/roberta-base-ca"
description: "BERTa: RoBERTa-based Catalan language model"
description_en: "BERTa: RoBERTa-based Catalan language model"
icon: ""
from_repo: "https://huggingface.co/PlanTL-GOB-ES/roberta-base-ca"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: ""
Publisher: "PlanTL-GOB-ES"
License: "apache-2.0"
Language: "Catalan"
Paper:
IfTraining: 0
IfOnlineDemo: 0
Datasets: xnli
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Spanish
License: mit
Model_Info:
  name: Recognai/bert-base-spanish-wwm-cased-xnli
  description: bert-base-spanish-wwm-cased-xnli
  description_en: bert-base-spanish-wwm-cased-xnli
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  from_repo: https://huggingface.co/Recognai/bert-base-spanish-wwm-cased-xnli
Paper: null
Publisher: Recognai
Task:
- tag: 自然语言处理
  tag_en: Natural Language Processing
  sub_tag: 零样本分类
  sub_tag_en: Zero-Shot Classification
- tag: 自然语言处理
  tag_en: Natural Language Processing
  sub_tag: 文本分类
  sub_tag_en: Text Classification
{
"cells": [
{
"cell_type": "markdown",
"id": "0b1e9532",
"metadata": {},
"source": [
"# bert-base-spanish-wwm-cased-xnli\n"
]
},
{
"cell_type": "markdown",
"id": "2b09a9af",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "e348457b",
"metadata": {},
"source": [
"This model is a fine-tuned version of the [spanish BERT model](https://huggingface.co/dccuchile/bert-base-spanish-wwm-cased) with the Spanish portion of the XNLI dataset. \n"
]
},
{
"cell_type": "markdown",
"id": "6643a3b7",
"metadata": {},
"source": [
"### How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8475d429",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ced3e559",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"Recognai/bert-base-spanish-wwm-cased-xnli\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "47419faf",
"metadata": {},
"source": [
"## Eval results\n"
]
},
{
"cell_type": "markdown",
"id": "9b87e64b",
"metadata": {},
"source": [
"Accuracy for the test set:\n"
]
},
{
"cell_type": "markdown",
"id": "7be74f6f",
"metadata": {},
"source": [
"| | XNLI-es |\n",
"|-----------------------------|---------|\n",
"|bert-base-spanish-wwm-cased-xnli | 79.9% |\n",
"> 此模型介绍及权重来源于[https://huggingface.co/Recognai/bert-base-spanish-wwm-cased-xnli](https://huggingface.co/Recognai/bert-base-spanish-wwm-cased-xnli),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "7a8a1587",
"metadata": {},
"source": [
"# bert-base-spanish-wwm-cased-xnli\n"
]
},
{
"cell_type": "markdown",
"id": "210c8e3a",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "fe16ef03",
"metadata": {},
"source": [
"This model is a fine-tuned version of the spanish BERT model with the Spanish portion of the XNLI dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "b23d27b0",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "37e5b840",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "117b1e15",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"Recognai/bert-base-spanish-wwm-cased-xnli\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
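  {
   "cell_type": "markdown",
   "id": "4d5e6f70",
   "metadata": {},
   "source": [
    "Since this checkpoint was fine-tuned on XNLI, a sentence-pair classification head is the natural way to use it. Below is a minimal sketch, assuming `AutoModelForSequenceClassification` picks up the fine-tuned head from the converted weights; the mapping from class indices to entailment/neutral/contradiction labels is not shown.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "81a2b3c4",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "import paddle.nn.functional as F\n",
    "from paddlenlp.transformers import AutoTokenizer, AutoModelForSequenceClassification\n",
    "\n",
    "# Hypothetical usage sketch: assumes the XNLI classification head is part of the converted weights.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"Recognai/bert-base-spanish-wwm-cased-xnli\")\n",
    "nli_model = AutoModelForSequenceClassification.from_pretrained(\"Recognai/bert-base-spanish-wwm-cased-xnli\")\n",
    "\n",
    "premise = \"El equipo ganó el partido por tres goles.\"\n",
    "hypothesis = \"El equipo perdió el partido.\"\n",
    "\n",
    "# Encode the premise/hypothesis pair and turn the logits into class probabilities.\n",
    "encoded = tokenizer(premise, text_pair=hypothesis)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
    "logits = nli_model(input_ids, token_type_ids=token_type_ids)\n",
    "print(F.softmax(logits, axis=-1))"
   ]
  },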
{
"cell_type": "markdown",
"id": "65669489",
"metadata": {},
"source": [
"## Eval results\n",
"\n",
"Accuracy for the test set:\n",
"\n",
"| | XNLI-es |\n",
"|-----------------------------|---------|\n",
"|bert-base-spanish-wwm-cased-xnli | 79.9% |\n",
"\n",
"> The model introduction and model weights originate from https://huggingface.co/Recognai/bert-base-spanish-wwm-cased-xnli and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# Model List
## allenai/macaw-3b
| Model Name | Description | Model Size | Download |
| --- | --- | --- | --- |
|allenai/macaw-3b| | 10.99G | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/allenai/macaw-3b/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/allenai/macaw-3b/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/allenai/macaw-3b/tokenizer_config.json) |
You can also download the model weights with the `paddlenlp` CLI tool, as follows:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download with the CLI
```shell
paddlenlp download --cache-dir ./pretrained_models allenai/macaw-3b
```
If you have any download problems, you can open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP).
# Model List
## allenai/macaw-3b
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|allenai/macaw-3b| | 10.99G | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/allenai/macaw-3b/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/allenai/macaw-3b/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/allenai/macaw-3b/tokenizer_config.json) |
Or you can download all of the model files with the following steps:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models allenai/macaw-3b
```
If you have any problems with it, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
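After downloading, the model weights can be loaded straight from the local directory. A minimal sketch (assuming the CLI placed the files under `./pretrained_models/allenai/macaw-3b`; since the tokenizer vocabulary files are not in the list above, the tokenizer is loaded by model name here):
```python
from paddlenlp.transformers import AutoModel, AutoTokenizer

# Load the converted weights from the local cache directory.
model = AutoModel.from_pretrained("./pretrained_models/allenai/macaw-3b")
# The tokenizer is still resolved from the community model name.
tokenizer = AutoTokenizer.from_pretrained("allenai/macaw-3b")
```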
Model_Info:
name: "allenai/macaw-3b"
description: "macaw-3b"
description_en: "macaw-3b"
icon: ""
from_repo: "https://huggingface.co/allenai/macaw-3b"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text2Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "allenai"
License: "apache-2.0"
Language: "English"
Paper:
IfTraining: 0
IfOnlineDemo: 0
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  name: allenai/macaw-large
  description: macaw-large
  description_en: macaw-large
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  from_repo: https://huggingface.co/allenai/macaw-large
Paper: null
Publisher: allenai
Task:
- tag: 自然语言处理
  tag_en: Natural Language Processing
  sub_tag: 文本生成
  sub_tag_en: Text2Text Generation
{
"cells": [
{
"cell_type": "markdown",
"id": "d50965ae",
"metadata": {},
"source": [
"# macaw-large\n",
"\n",
"## Model description\n",
"\n",
"Macaw (<b>M</b>ulti-<b>a</b>ngle <b>c</b>(q)uestion <b>a</b>ns<b>w</b>ering) is a ready-to-use model capable of\n",
"general question answering,\n",
"showing robustness outside the domains it was trained on. It has been trained in \"multi-angle\" fashion,\n",
"which means it can handle a flexible set of input and output \"slots\"\n",
"(question, answer, multiple-choice options, context, and explanation) .\n",
"\n",
"Macaw was built on top of [T5](https://github.com/google-research/text-to-text-transfer-transformer) and comes in\n",
"three sizes: macaw-11b, macaw-3b,\n",
"and macaw-large, as well as an answer-focused version featured on\n",
"various leaderboards macaw-answer-11b.\n",
"\n",
"See https://github.com/allenai/macaw for more details."
]
},
{
"cell_type": "markdown",
"id": "1c0bce56",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cb7a2c88",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b0fd69ae",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"allenai/macaw-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "955d0705",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/allenai/macaw-large](https://huggingface.co/allenai/macaw-large),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "f5a296e3",
"metadata": {},
"source": [
"# macaw-large\n",
"\n",
"## Model description\n",
"\n",
"Macaw (<b>M</b>ulti-<b>a</b>ngle <b>c</b>(q)uestion <b>a</b>ns<b>w</b>ering) is a ready-to-use model capable of\n",
"general question answering,\n",
"showing robustness outside the domains it was trained on. It has been trained in \"multi-angle\" fashion,\n",
"which means it can handle a flexible set of input and output \"slots\"\n",
"(question, answer, multiple-choice options, context, and explanation) .\n",
"\n",
"Macaw was built on top of [T5](https://github.com/google-research/text-to-text-transfer-transformer) and comes in\n",
"three sizes: macaw-11b, macaw-3b,\n",
"and macaw-large, as well as an answer-focused version featured on\n",
"various leaderboards macaw-answer-11b.\n",
"\n",
"See https://github.com/allenai/macaw for more details."
]
},
{
"cell_type": "markdown",
"id": "27cf8ebc",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "027c735c",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8f52c07a",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"allenai/macaw-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
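  {
   "cell_type": "markdown",
   "id": "c1d2e3f4",
   "metadata": {},
   "source": [
    "Below is a minimal generation sketch using the T5 classes directly. It assumes the converted weights work with PaddleNLP's `generate` API; Macaw's exact slot format and decoding details may differ from this simplified example.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a5b6c7d8",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import T5ForConditionalGeneration, T5Tokenizer\n",
    "\n",
    "tokenizer = T5Tokenizer.from_pretrained(\"allenai/macaw-large\")\n",
    "model = T5ForConditionalGeneration.from_pretrained(\"allenai/macaw-large\")\n",
    "\n",
    "# Ask for the \"answer\" slot given a \"question\" slot (Macaw's multi-angle format).\n",
    "prompt = \"$answer$ ; $question$ = What is the color of a cloudy sky?\"\n",
    "encoded = tokenizer(prompt)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "\n",
    "output_ids, _ = model.generate(input_ids=input_ids, max_length=50)\n",
    "print(tokenizer.convert_ids_to_string(output_ids[0].numpy().tolist()))"
   ]
  },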
{
"cell_type": "markdown",
"id": "ce759903",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/allenai/macaw-large](https://huggingface.co/allenai/macaw-large) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  name: allenai/specter
  description: SPECTER
  description_en: SPECTER
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  from_repo: https://huggingface.co/allenai/specter
Paper:
- title: 'SPECTER: Document-level Representation Learning using Citation-informed Transformers'
  url: http://arxiv.org/abs/2004.07180v4
Publisher: allenai
Task:
- tag: 自然语言处理
  tag_en: Natural Language Processing
  sub_tag: 特征抽取
  sub_tag_en: Feature Extraction
{
"cells": [
{
"cell_type": "markdown",
"id": "a5b54f39",
"metadata": {},
"source": [
"## SPECTER\n",
"\n",
"SPECTER is a pre-trained language model to generate document-level embedding of documents. It is pre-trained on a a powerful signal of document-level relatedness: the citation graph. Unlike existing pretrained language models, SPECTER can be easily applied to downstream applications without task-specific fine-tuning.\n",
"\n",
"Paper: [SPECTER: Document-level Representation Learning using Citation-informed Transformers](https://arxiv.org/pdf/2004.07180.pdf)\n",
"\n",
"Original Repo: [Github](https://github.com/allenai/specter)\n",
"\n",
"Evaluation Benchmark: [SciDocs](https://github.com/allenai/scidocs)\n",
"\n",
"Authors: *Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld*"
]
},
{
"cell_type": "markdown",
"id": "e279b43d",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3dcf4e0b",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7348a84e",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"allenai/specter\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "89c70552",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/allenai/specter](https://huggingface.co/allenai/specter),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "a09f5723",
"metadata": {},
"source": [
"## SPECTER\n",
"\n",
"SPECTER is a pre-trained language model to generate document-level embedding of documents. It is pre-trained on a a powerful signal of document-level relatedness: the citation graph. Unlike existing pretrained language models, SPECTER can be easily applied to downstream applications without task-specific fine-tuning.\n",
"\n",
"Paper: [SPECTER: Document-level Representation Learning using Citation-informed Transformers](https://arxiv.org/pdf/2004.07180.pdf)\n",
"\n",
"Original Repo: [Github](https://github.com/allenai/specter)\n",
"\n",
"Evaluation Benchmark: [SciDocs](https://github.com/allenai/scidocs)\n",
"\n",
"Authors: *Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld*"
]
},
{
"cell_type": "markdown",
"id": "b62bbb59",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2dff923a",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e60739cc",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"allenai/specter\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
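  {
   "cell_type": "markdown",
   "id": "e9f0a1b2",
   "metadata": {},
   "source": [
    "SPECTER embeds a paper from its title and abstract. Below is a minimal sketch, assuming the usual convention of joining title and abstract with the tokenizer's separator token and taking the first token's final hidden state as the document embedding.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c3d4e5f6",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"allenai/specter\")\n",
    "model = AutoModel.from_pretrained(\"allenai/specter\")\n",
    "\n",
    "title = \"BERT: Pre-training of Deep Bidirectional Transformers\"\n",
    "abstract = \"We introduce a new language representation model called BERT.\"\n",
    "\n",
    "# Join title and abstract with the separator token and embed the document.\n",
    "encoded = tokenizer(title + tokenizer.sep_token + abstract)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "sequence_output, pooled_output = model(input_ids)\n",
    "doc_embedding = sequence_output[:, 0]  # first-token ([CLS]) representation\n",
    "print(doc_embedding.shape)"
   ]
  },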
{
"cell_type": "markdown",
"id": "cd668864",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/allenai/specter](https://huggingface.co/allenai/specter) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  name: alvaroalon2/biobert_chemical_ner
  description: ''
  description_en: ''
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  from_repo: https://huggingface.co/alvaroalon2/biobert_chemical_ner
Paper: null
Publisher: alvaroalon2
Task:
- tag: 自然语言处理
  tag_en: Natural Language Processing
  sub_tag: Token分类
  sub_tag_en: Token Classification
{
"cells": [
{
"cell_type": "markdown",
"id": "0b8f2339",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"BioBERT model fine-tuned in NER task with BC5CDR-chemicals and BC4CHEMD corpus.\n",
"\n",
"This was fine-tuned in order to use it in a BioNER/BioNEN system which is available at: https://github.com/librairy/bio-ner"
]
},
{
"cell_type": "markdown",
"id": "934c3f34",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a8516341",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "70114f31",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"alvaroalon2/biobert_chemical_ner\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "fb7b2eb8",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> 此模型介绍及权重来源于:[https://huggingface.co/alvaroalon2/biobert_chemical_ner](https://huggingface.co/alvaroalon2/biobert_chemical_ner),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "f769316b",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"BioBERT model fine-tuned in NER task with BC5CDR-chemicals and BC4CHEMD corpus.\n",
"\n",
"This was fine-tuned in order to use it in a BioNER/BioNEN system which is available at: https://github.com/librairy/bio-ner"
]
},
{
"cell_type": "markdown",
"id": "3a77ed26",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "202a3ef9",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fc11d032",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"alvaroalon2/biobert_chemical_ner\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
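  {
   "cell_type": "markdown",
   "id": "b7c8d9e0",
   "metadata": {},
   "source": [
    "Below is a minimal token-classification sketch on real text, assuming the fine-tuned NER head was kept in the converted weights and is resolved by `AutoModelForTokenClassification`; label names are omitted.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f1a2b3c4",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, AutoModelForTokenClassification\n",
    "\n",
    "# Hypothetical usage sketch: assumes the fine-tuned NER head is part of the converted weights.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"alvaroalon2/biobert_chemical_ner\")\n",
    "ner_model = AutoModelForTokenClassification.from_pretrained(\"alvaroalon2/biobert_chemical_ner\")\n",
    "\n",
    "# Encode a sentence and predict one label id per token.\n",
    "encoded = tokenizer(\"Aspirin and ibuprofen are widely used analgesics.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "logits = ner_model(input_ids)\n",
    "print(paddle.argmax(logits, axis=-1))"
   ]
  },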
{
"cell_type": "markdown",
"id": "762dee96",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/alvaroalon2/biobert_chemical_ner](https://huggingface.co/alvaroalon2/biobert_chemical_ner) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ncbi_disease
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  name: alvaroalon2/biobert_diseases_ner
  description: ''
  description_en: ''
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  from_repo: https://huggingface.co/alvaroalon2/biobert_diseases_ner
Paper: null
Publisher: alvaroalon2
Task:
- tag: 自然语言处理
  tag_en: Natural Language Processing
  sub_tag: Token分类
  sub_tag_en: Token Classification
{
"cells": [
{
"cell_type": "markdown",
"id": "578bdb21",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"BioBERT model fine-tuned in NER task with BC5CDR-diseases and NCBI-diseases corpus\n",
"\n",
"This was fine-tuned in order to use it in a BioNER/BioNEN system which is available at: https://github.com/librairy/bio-ner"
]
},
{
"cell_type": "markdown",
"id": "d18b8736",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9b304ea9",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "49b790e5",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"alvaroalon2/biobert_diseases_ner\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "ab48464f",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/alvaroalon2/biobert_diseases_ner](https://huggingface.co/alvaroalon2/biobert_diseases_ner),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "98591560",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"BioBERT model fine-tuned in NER task with BC5CDR-diseases and NCBI-diseases corpus\n",
"\n",
"This was fine-tuned in order to use it in a BioNER/BioNEN system which is available at: https://github.com/librairy/bio-ner"
]
},
{
"cell_type": "markdown",
"id": "da577da0",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0ee7d4df",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c6dfd3c0",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"alvaroalon2/biobert_diseases_ner\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
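  {
   "cell_type": "markdown",
   "id": "d5e6f7a8",
   "metadata": {},
   "source": [
    "Below is a minimal token-classification sketch on real text, assuming the fine-tuned NER head was kept in the converted weights and is resolved by `AutoModelForTokenClassification`; label names are omitted.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b9c0d1e2",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, AutoModelForTokenClassification\n",
    "\n",
    "# Hypothetical usage sketch: assumes the fine-tuned NER head is part of the converted weights.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"alvaroalon2/biobert_diseases_ner\")\n",
    "ner_model = AutoModelForTokenClassification.from_pretrained(\"alvaroalon2/biobert_diseases_ner\")\n",
    "\n",
    "# Encode a sentence and predict one label id per token.\n",
    "encoded = tokenizer(\"The patient was diagnosed with type 2 diabetes and hypertension.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "logits = ner_model(input_ids)\n",
    "print(paddle.argmax(logits, axis=-1))"
   ]
  },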
{
"cell_type": "markdown",
"id": "7a58f3ef",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/alvaroalon2/biobert_diseases_ner](https://huggingface.co/alvaroalon2/biobert_diseases_ner) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  name: alvaroalon2/biobert_genetic_ner
  description: ''
  description_en: ''
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  from_repo: https://huggingface.co/alvaroalon2/biobert_genetic_ner
Paper: null
Publisher: alvaroalon2
Task:
- tag: 自然语言处理
  tag_en: Natural Language Processing
  sub_tag: Token分类
  sub_tag_en: Token Classification
{
"cells": [
{
"cell_type": "markdown",
"id": "795618b9",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"BioBERT model fine-tuned in NER task with JNLPBA and BC2GM corpus for genetic class entities.\n",
"\n",
"This was fine-tuned in order to use it in a BioNER/BioNEN system which is available at: https://github.com/librairy/bio-ner"
]
},
{
"cell_type": "markdown",
"id": "bf1bde1a",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "90bf4208",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c3f9ddc9",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"alvaroalon2/biobert_genetic_ner\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "45bef570",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/alvaroalon2/biobert_genetic_ner](https://huggingface.co/alvaroalon2/biobert_genetic_ner),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "eeb5731b",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"BioBERT model fine-tuned in NER task with JNLPBA and BC2GM corpus for genetic class entities.\n",
"\n",
"This was fine-tuned in order to use it in a BioNER/BioNEN system which is available at: https://github.com/librairy/bio-ner"
]
},
{
"cell_type": "markdown",
"id": "3501c0f5",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "da1caa55",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a8a173da",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"alvaroalon2/biobert_genetic_ner\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
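  {
   "cell_type": "markdown",
   "id": "f3a4b5c6",
   "metadata": {},
   "source": [
    "Below is a minimal token-classification sketch on real text, assuming the fine-tuned NER head was kept in the converted weights and is resolved by `AutoModelForTokenClassification`; label names are omitted.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d7e8f9a0",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, AutoModelForTokenClassification\n",
    "\n",
    "# Hypothetical usage sketch: assumes the fine-tuned NER head is part of the converted weights.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"alvaroalon2/biobert_genetic_ner\")\n",
    "ner_model = AutoModelForTokenClassification.from_pretrained(\"alvaroalon2/biobert_genetic_ner\")\n",
    "\n",
    "# Encode a sentence and predict one label id per token.\n",
    "encoded = tokenizer(\"Mutations in the BRCA1 gene increase the risk of breast cancer.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "logits = ner_model(input_ids)\n",
    "print(paddle.argmax(logits, axis=-1))"
   ]
  },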
{
"cell_type": "markdown",
"id": "0c74ebfe",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/alvaroalon2/biobert_genetic_ner](https://huggingface.co/alvaroalon2/biobert_genetic_ner) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info:
  name: amberoad/bert-multilingual-passage-reranking-msmarco
  description: Passage Reranking Multilingual BERT 🔃 🌍
  description_en: Passage Reranking Multilingual BERT 🔃 🌍
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  from_repo: https://huggingface.co/amberoad/bert-multilingual-passage-reranking-msmarco
Paper:
- title: Passage Re-ranking with BERT
  url: http://arxiv.org/abs/1901.04085v5
Publisher: amberoad
Task:
- tag: 自然语言处理
  tag_en: Natural Language Processing
  sub_tag: 文本分类
  sub_tag_en: Text Classification
{
"cells": [
{
"cell_type": "markdown",
"id": "83244d63",
"metadata": {},
"source": [
"# Passage Reranking Multilingual BERT 🔃 🌍\n"
]
},
{
"cell_type": "markdown",
"id": "4c8c922a",
"metadata": {},
"source": [
"## Model description\n",
"**Input:** Supports over 100 Languages. See [List of supported languages](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages) for all available.\n"
]
},
{
"cell_type": "markdown",
"id": "8b40d5de",
"metadata": {},
"source": [
"**Purpose:** This module takes a search query [1] and a passage [2] and calculates if the passage matches the query.\n",
"It can be used as an improvement for Elasticsearch Results and boosts the relevancy by up to 100%.\n"
]
},
{
"cell_type": "markdown",
"id": "c9d89366",
"metadata": {},
"source": [
"**Architecture:** On top of BERT there is a Densly Connected NN which takes the 768 Dimensional [CLS] Token as input and provides the output ([Arxiv](https://arxiv.org/abs/1901.04085)).\n"
]
},
{
"cell_type": "markdown",
"id": "29745195",
"metadata": {},
"source": [
"**Output:** Just a single value between between -10 and 10. Better matching query,passage pairs tend to have a higher a score.\n"
]
},
{
"cell_type": "markdown",
"id": "010a4d92",
"metadata": {},
"source": [
"## Intended uses & limitations\n",
"Both query[1] and passage[2] have to fit in 512 Tokens.\n",
"As you normally want to rerank the first dozens of search results keep in mind the inference time of approximately 300 ms/query.\n"
]
},
{
"cell_type": "markdown",
"id": "a9f2dea7",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8d023555",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4c83eef3",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"amberoad/bert-multilingual-passage-reranking-msmarco\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "2611b122",
"metadata": {},
"source": [
"## Training data\n"
]
},
{
"cell_type": "markdown",
"id": "ba62fbe0",
"metadata": {},
"source": [
"This model is trained using the [**Microsoft MS Marco Dataset**](https://microsoft.github.io/msmarco/ \"Microsoft MS Marco\"). This training dataset contains approximately 400M tuples of a query, relevant and non-relevant passages. All datasets used for training and evaluating are listed in this [table](https://github.com/microsoft/MSMARCO-Passage-Ranking#data-information-and-formating). The used dataset for training is called *Train Triples Large*, while the evaluation was made on *Top 1000 Dev*. There are 6,900 queries in total in the development dataset, where each query is mapped to top 1,000 passage retrieved using BM25 from MS MARCO corpus.\n"
]
},
{
"cell_type": "markdown",
"id": "afc188f2",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/amberoad/bert-multilingual-passage-reranking-msmarco](https://huggingface.co/amberoad/bert-multilingual-passage-reranking-msmarco),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "22c47298",
"metadata": {},
"source": [
"# Passage Reranking Multilingual BERT 🔃 🌍\n"
]
},
{
"cell_type": "markdown",
"id": "0bb73e0f",
"metadata": {},
"source": [
"## Model description\n",
"**Input:** Supports over 100 Languages. See [List of supported languages](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages) for all available.\n"
]
},
{
"cell_type": "markdown",
"id": "fedf5cb8",
"metadata": {},
"source": [
"**Purpose:** This module takes a search query [1] and a passage [2] and calculates if the passage matches the query.\n",
"It can be used as an improvement for Elasticsearch Results and boosts the relevancy by up to 100%.\n"
]
},
{
"cell_type": "markdown",
"id": "146e3be4",
"metadata": {},
"source": [
"**Architecture:** On top of BERT there is a Densly Connected NN which takes the 768 Dimensional [CLS] Token as input and provides the output ([Arxiv](https://arxiv.org/abs/1901.04085)).\n"
]
},
{
"cell_type": "markdown",
"id": "772c5c82",
"metadata": {},
"source": [
"**Output:** Just a single value between between -10 and 10. Better matching query,passage pairs tend to have a higher a score.\n"
]
},
{
"cell_type": "markdown",
"id": "e5974e46",
"metadata": {},
"source": [
"## Intended uses & limitations\n",
"Both query[1] and passage[2] have to fit in 512 Tokens.\n",
"As you normally want to rerank the first dozens of search results keep in mind the inference time of approximately 300 ms/query.\n"
]
},
{
"cell_type": "markdown",
"id": "7d878609",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e0941f1f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3bc201bf",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"amberoad/bert-multilingual-passage-reranking-msmarco\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
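  {
   "cell_type": "markdown",
   "id": "rerank-pair-sketch-md",
   "metadata": {},
   "source": [
    "The cell above only pushes random token ids through the backbone. Below is a minimal sketch of the actual reranking use case: scoring one (query, passage) pair with a sequence-classification head. `AutoTokenizer`/`AutoModelForSequenceClassification` and the `num_classes` argument are generic PaddleNLP APIs rather than instructions from the original card, and it is an assumption that this community checkpoint loads cleanly as a two-class classifier.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "rerank-pair-sketch-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hedged sketch: assumes the converted checkpoint can be loaded as a\n",
    "# two-class (not relevant / relevant) sequence-classification model.\n",
    "import paddle\n",
    "from paddlenlp.transformers import AutoModelForSequenceClassification, AutoTokenizer\n",
    "\n",
    "model_name = \"amberoad/bert-multilingual-passage-reranking-msmarco\"\n",
    "tokenizer = AutoTokenizer.from_pretrained(model_name)\n",
    "reranker = AutoModelForSequenceClassification.from_pretrained(model_name, num_classes=2)\n",
    "reranker.eval()\n",
    "\n",
    "query = \"How many people live in Berlin?\"\n",
    "passage = \"Berlin has a population of 3,520,031 registered inhabitants.\"\n",
    "\n",
    "# Encode query and passage together as a single sentence pair.\n",
    "encoded = tokenizer(query, text_pair=passage)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
    "\n",
    "# A higher logit for the second (relevant) class means a better match.\n",
    "logits = reranker(input_ids, token_type_ids=token_type_ids)\n",
    "print(logits)"
   ]
  },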
{
"cell_type": "markdown",
"id": "674ccc3a",
"metadata": {},
"source": [
"## Training data\n"
]
},
{
"cell_type": "markdown",
"id": "4404adda",
"metadata": {},
"source": [
"This model is trained using the [**Microsoft MS Marco Dataset**](https://microsoft.github.io/msmarco/ \"Microsoft MS Marco\"). This training dataset contains approximately 400M tuples of a query, relevant and non-relevant passages. All datasets used for training and evaluating are listed in this [table](https://github.com/microsoft/MSMARCO-Passage-Ranking#data-information-and-formating). The used dataset for training is called *Train Triples Large*, while the evaluation was made on *Top 1000 Dev*. There are 6,900 queries in total in the development dataset, where each query is mapped to top 1,000 passage retrieved using BM25 from MS MARCO corpus.\n"
]
},
{
"cell_type": "markdown",
"id": "79af5e42",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/amberoad/bert-multilingual-passage-reranking-msmarco](https://huggingface.co/amberoad/bert-multilingual-passage-reranking-msmarco) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# 模型列表
## asi/gpt-fr-cased-base
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|asi/gpt-fr-cased-base| | 4.12G | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/vocab.json) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models asi/gpt-fr-cased-base
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
##
| model | description | model_size | download |
| --- | --- | --- | --- |
|asi/gpt-fr-cased-base| | 4.12G | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-base/vocab.json) |
Or you can download all of the model files with the following steps:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models asi/gpt-fr-cased-base
```
If you have any problems with it, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "asi/gpt-fr-cased-base"
description: "Model description"
description_en: "Model description"
icon: ""
from_repo: "https://huggingface.co/asi/gpt-fr-cased-base"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "asi"
License: "apache-2.0"
Language: "French"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
# 模型列表
## asi/gpt-fr-cased-small
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|asi/gpt-fr-cased-small| | 620.45MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/vocab.json) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models asi/gpt-fr-cased-small
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
##
| model | description | model_size | download |
| --- | --- | --- | --- |
|asi/gpt-fr-cased-small| | 620.45MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/asi/gpt-fr-cased-small/vocab.json) |
Or you can download all of the model files with the following steps:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models asi/gpt-fr-cased-small
```
If you have any problems with it, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "asi/gpt-fr-cased-small"
description: "Model description"
description_en: "Model description"
icon: ""
from_repo: "https://huggingface.co/asi/gpt-fr-cased-small"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "asi"
License: "apache-2.0"
Language: "French"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: German
License: mit
Model_Info:
name: "benjamin/gerpt2-large"
description: "GerPT2"
description_en: "GerPT2"
icon: ""
from_repo: "https://huggingface.co/benjamin/gerpt2-large"
description: GerPT2
description_en: GerPT2
from_repo: https://huggingface.co/benjamin/gerpt2-large
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: benjamin/gerpt2-large
Paper: null
Publisher: benjamin
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "benjamin"
License: "mit"
Language: "German"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- sub_tag: 文本生成
sub_tag_en: Text Generation
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "e42aa4df",
"metadata": {},
"source": [
"# GerPT2\n"
]
},
{
"cell_type": "markdown",
"id": "08fd6403",
"metadata": {},
"source": [
"See the GPT2 model card for considerations on limitations and bias. See the GPT2 documentation for details on GPT2.\n"
]
},
{
"cell_type": "markdown",
"id": "8295e28d",
"metadata": {},
"source": [
"## Comparison to dbmdz/german-gpt2\n"
]
},
{
"cell_type": "markdown",
"id": "c0f50f67",
"metadata": {},
"source": [
"I evaluated both GerPT2-large and the other German GPT2, dbmdz/german-gpt2 on the [CC-100](http://data.statmt.org/cc-100/) dataset and on the German Wikipedia:\n"
]
},
{
"cell_type": "markdown",
"id": "6ecdc149",
"metadata": {},
"source": [
"| | CC-100 (PPL) | Wikipedia (PPL) |\n",
"|-------------------|--------------|-----------------|\n",
"| dbmdz/german-gpt2 | 49.47 | 62.92 |\n",
"| GerPT2 | 24.78 | 35.33 |\n",
"| GerPT2-large | __16.08__ | __23.26__ |\n",
"| | | |\n"
]
},
{
"cell_type": "markdown",
"id": "3cddd6a8",
"metadata": {},
"source": [
"See the script `evaluate.py` in the [GerPT2 Github repository](https://github.com/bminixhofer/gerpt2) for the code.\n"
]
},
{
"cell_type": "markdown",
"id": "d838da15",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "476bf523",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8f509fec",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"benjamin/gerpt2-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "d135a538",
"metadata": {},
"source": [
"```\n",
"@misc{Minixhofer_GerPT2_German_large_2020,\n",
"author = {Minixhofer, Benjamin},\n",
"doi = {10.5281/zenodo.5509984},\n",
"month = {12},\n",
"title = {{GerPT2: German large and small versions of GPT2}},\n",
"url = {https://github.com/bminixhofer/gerpt2},\n",
"year = {2020}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "63e09ad7",
"metadata": {},
"source": [
"## Acknowledgements\n"
]
},
{
"cell_type": "markdown",
"id": "d9dc51e1",
"metadata": {},
"source": [
"Thanks to [Hugging Face](https://huggingface.co) for awesome tools and infrastructure.\n",
"Huge thanks to [Artus Krohn-Grimberghe](https://twitter.com/artuskg) at [LYTiQ](https://www.lytiq.de/) for making this possible by sponsoring the resources used for training.\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/benjamin/gerpt2-large](https://huggingface.co/benjamin/gerpt2-large),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "85c2e1a7",
"metadata": {},
"source": [
"# GerPT2\n"
]
},
{
"cell_type": "markdown",
"id": "595fe7cb",
"metadata": {},
"source": [
"See the GPT2 model card for considerations on limitations and bias. See the GPT2 documentation for details on GPT2.\n"
]
},
{
"cell_type": "markdown",
"id": "5b4f950b",
"metadata": {},
"source": [
"## Comparison to dbmdz/german-gpt2\n"
]
},
{
"cell_type": "markdown",
"id": "95be6eb8",
"metadata": {},
"source": [
"I evaluated both GerPT2-large and the other German GPT2, dbmdz/german-gpt2 on the [CC-100](http://data.statmt.org/cc-100/) dataset and on the German Wikipedia:\n"
]
},
{
"cell_type": "markdown",
"id": "8acd14be",
"metadata": {},
"source": [
"| | CC-100 (PPL) | Wikipedia (PPL) |\n",
"|-------------------|--------------|-----------------|\n",
"| dbmdz/german-gpt2 | 49.47 | 62.92 |\n",
"| GerPT2 | 24.78 | 35.33 |\n",
"| GerPT2-large | __16.08__ | __23.26__ |\n",
"| | | |\n"
]
},
{
"cell_type": "markdown",
"id": "6fa10d79",
"metadata": {},
"source": [
"See the script `evaluate.py` in the [GerPT2 Github repository](https://github.com/bminixhofer/gerpt2) for the code.\n"
]
},
{
"cell_type": "markdown",
"id": "a8514e1e",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4bc62c63",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "63f78302",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"benjamin/gerpt2-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
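  {
   "cell_type": "markdown",
   "id": "gerpt2-generate-sketch-md",
   "metadata": {},
   "source": [
    "The cell above only runs the backbone on random ids. As a hedged sketch of German text generation, the cell below uses PaddleNLP's generic `GPTTokenizer`/`GPTLMHeadModel` classes and the `generate()` API; the prompt and all decoding hyperparameters are illustrative choices rather than settings from the original GerPT2 documentation, and it is an assumption that the converted weights load into the LM-head class.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "gerpt2-generate-sketch-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Minimal generation sketch, assuming the converted GPT2 weights load into\n",
    "# PaddleNLP's GPT LM-head model class.\n",
    "import paddle\n",
    "from paddlenlp.transformers import GPTLMHeadModel, GPTTokenizer\n",
    "\n",
    "model_name = \"benjamin/gerpt2-large\"\n",
    "tokenizer = GPTTokenizer.from_pretrained(model_name)\n",
    "model = GPTLMHeadModel.from_pretrained(model_name)\n",
    "model.eval()\n",
    "\n",
    "prompt = \"Wikipedia ist ein Projekt zum Aufbau einer Enzyklopädie\"\n",
    "input_ids = paddle.to_tensor([tokenizer(prompt)[\"input_ids\"]])\n",
    "\n",
    "# Sampling-based decoding; all hyperparameters here are illustrative.\n",
    "output_ids, _ = model.generate(\n",
    "    input_ids=input_ids,\n",
    "    max_length=50,\n",
    "    decode_strategy=\"sampling\",\n",
    "    top_k=50,\n",
    "    top_p=0.95)\n",
    "\n",
    "# generate() returns only the continuation, so prepend the prompt.\n",
    "tokens = tokenizer.convert_ids_to_tokens(output_ids[0].tolist())\n",
    "print(prompt + tokenizer.convert_tokens_to_string(tokens))"
   ]
  },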
{
"cell_type": "markdown",
"id": "563152f3",
"metadata": {},
"source": [
"```\n",
"@misc{Minixhofer_GerPT2_German_large_2020,\n",
"author = {Minixhofer, Benjamin},\n",
"doi = {10.5281/zenodo.5509984},\n",
"month = {12},\n",
"title = {{GerPT2: German large and small versions of GPT2}},\n",
"url = {https://github.com/bminixhofer/gerpt2},\n",
"year = {2020}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "b0d67d21",
"metadata": {},
"source": [
"## Acknowledgements\n"
]
},
{
"cell_type": "markdown",
"id": "474c1c61",
"metadata": {},
"source": [
"Thanks to [Hugging Face](https://huggingface.co) for awesome tools and infrastructure.\n",
"Huge thanks to [Artus Krohn-Grimberghe](https://twitter.com/artuskg) at [LYTiQ](https://www.lytiq.de/) for making this possible by sponsoring the resources used for training.\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/benjamin/gerpt2-large](https://huggingface.co/benjamin/gerpt2-large) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# 模型列表
## benjamin/gerpt2
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|benjamin/gerpt2| | 621.95MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/vocab.json) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models benjamin/gerpt2
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
##
| model | description | model_size | download |
| --- | --- | --- | --- |
|benjamin/gerpt2| | 621.95MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/benjamin/gerpt2/vocab.json) |
Or you can download all of the model files with the following steps:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models benjamin/gerpt2
```
If you have any problems with it, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "benjamin/gerpt2"
description: "GerPT2"
description_en: "GerPT2"
icon: ""
from_repo: "https://huggingface.co/benjamin/gerpt2"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "benjamin"
License: "mit"
Language: "German"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Korean
License: apache-2.0
Model_Info:
name: "beomi/kcbert-base"
description: "KcBERT: Korean comments BERT"
description_en: "KcBERT: Korean comments BERT"
icon: ""
from_repo: "https://huggingface.co/beomi/kcbert-base"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: ""
Publisher: "beomi"
License: "apache-2.0"
Language: "Korean"
description: 'KcBERT: Korean comments BERT'
description_en: 'KcBERT: Korean comments BERT'
from_repo: https://huggingface.co/beomi/kcbert-base
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: beomi/kcbert-base
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
url: 'http://arxiv.org/abs/1810.04805v2'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
url: http://arxiv.org/abs/1810.04805v2
Publisher: beomi
Task:
- sub_tag: 槽位填充
sub_tag_en: Fill-Mask
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "8a51a2c8",
"metadata": {},
"source": [
"# KcBERT: Korean comments BERT\n"
]
},
{
"cell_type": "markdown",
"id": "29c7e5a4",
"metadata": {},
"source": [
"Kaggle에 학습을 위해 정제한(아래 `clean`처리를 거친) Dataset을 공개하였습니다!\n"
]
},
{
"cell_type": "markdown",
"id": "95a25c77",
"metadata": {},
"source": [
"직접 다운받으셔서 다양한 Task에 학습을 진행해보세요 :)\n"
]
},
{
"cell_type": "markdown",
"id": "edd96db1",
"metadata": {},
"source": [
"공개된 한국어 BERT는 대부분 한국어 위키, 뉴스 기사, 책 등 잘 정제된 데이터를 기반으로 학습한 모델입니다. 한편, 실제로 NSMC와 같은 댓글형 데이터셋은 정제되지 않았고 구어체 특징에 신조어가 많으며, 오탈자 등 공식적인 글쓰기에서 나타나지 않는 표현들이 빈번하게 등장합니다.\n"
]
},
{
"cell_type": "markdown",
"id": "a2df738b",
"metadata": {},
"source": [
"KcBERT는 위와 같은 특성의 데이터셋에 적용하기 위해, 네이버 뉴스에서 댓글과 대댓글을 수집해, 토크나이저와 BERT모델을 처음부터 학습한 Pretrained BERT 모델입니다.\n"
]
},
{
"cell_type": "markdown",
"id": "a0eb4ad8",
"metadata": {},
"source": [
"KcBERT는 Huggingface의 Transformers 라이브러리를 통해 간편히 불러와 사용할 수 있습니다. (별도의 파일 다운로드가 필요하지 않습니다.)\n"
]
},
{
"cell_type": "markdown",
"id": "d1c07267",
"metadata": {},
"source": [
"## KcBERT Performance\n"
]
},
{
"cell_type": "markdown",
"id": "52872aa3",
"metadata": {},
"source": [
"- Finetune 코드는 https://github.com/Beomi/KcBERT-finetune 에서 찾아보실 수 있습니다.\n"
]
},
{
"cell_type": "markdown",
"id": "fa15ccaf",
"metadata": {},
"source": [
"| | Size<br/>(용량) | **NSMC**<br/>(acc) | **Naver NER**<br/>(F1) | **PAWS**<br/>(acc) | **KorNLI**<br/>(acc) | **KorSTS**<br/>(spearman) | **Question Pair**<br/>(acc) | **KorQuaD (Dev)**<br/>(EM/F1) |\n",
"| :-------------------- | :---: | :----------------: | :--------------------: | :----------------: | :------------------: | :-----------------------: | :-------------------------: | :---------------------------: |\n",
"| KcBERT-Base | 417M | 89.62 | 84.34 | 66.95 | 74.85 | 75.57 | 93.93 | 60.25 / 84.39 |\n",
"| KcBERT-Large | 1.2G | **90.68** | 85.53 | 70.15 | 76.99 | 77.49 | 94.06 | 62.16 / 86.64 |\n",
"| KoBERT | 351M | 89.63 | 86.11 | 80.65 | 79.00 | 79.64 | 93.93 | 52.81 / 80.27 |\n",
"| XLM-Roberta-Base | 1.03G | 89.49 | 86.26 | 82.95 | 79.92 | 79.09 | 93.53 | 64.70 / 88.94 |\n",
"| HanBERT | 614M | 90.16 | **87.31** | 82.40 | **80.89** | 83.33 | 94.19 | 78.74 / 92.02 |\n",
"| KoELECTRA-Base | 423M | **90.21** | 86.87 | 81.90 | 80.85 | 83.21 | 94.20 | 61.10 / 89.59 |\n",
"| KoELECTRA-Base-v2 | 423M | 89.70 | 87.02 | **83.90** | 80.61 | **84.30** | **94.72** | **84.34 / 92.58** |\n",
"| DistilKoBERT | 108M | 88.41 | 84.13 | 62.55 | 70.55 | 73.21 | 92.48 | 54.12 / 77.80 |\n"
]
},
{
"cell_type": "markdown",
"id": "5193845f",
"metadata": {},
"source": [
"\\*HanBERT의 Size는 Bert Model과 Tokenizer DB를 합친 것입니다.\n"
]
},
{
"cell_type": "markdown",
"id": "93aecc1a",
"metadata": {},
"source": [
"\\***config의 세팅을 그대로 하여 돌린 결과이며, hyperparameter tuning을 추가적으로 할 시 더 좋은 성능이 나올 수 있습니다.**\n"
]
},
{
"cell_type": "markdown",
"id": "6f889bbd",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "465d2dee",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f884ed37",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"beomi/kcbert-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "a92e65b7",
"metadata": {},
"source": [
"```\n",
"@inproceedings{lee2020kcbert,\n",
"title={KcBERT: Korean Comments BERT},\n",
"author={Lee, Junbum},\n",
"booktitle={Proceedings of the 32nd Annual Conference on Human and Cognitive Language Technology},\n",
"pages={437--440},\n",
"year={2020}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "21364621",
"metadata": {},
"source": [
"- 논문집 다운로드 링크: http://hclt.kr/dwn/?v=bG5iOmNvbmZlcmVuY2U7aWR4OjMy (*혹은 http://hclt.kr/symp/?lnb=conference )\n"
]
},
{
"cell_type": "markdown",
"id": "45cdafe0",
"metadata": {},
"source": [
"## Acknowledgement\n"
]
},
{
"cell_type": "markdown",
"id": "a741fcf0",
"metadata": {},
"source": [
"KcBERT Model을 학습하는 GCP/TPU 환경은 TFRC 프로그램의 지원을 받았습니다.\n"
]
},
{
"cell_type": "markdown",
"id": "1c9655e9",
"metadata": {},
"source": [
"모델 학습 과정에서 많은 조언을 주신 [Monologg](https://github.com/monologg/) 님 감사합니다 :)\n"
]
},
{
"cell_type": "markdown",
"id": "85cb1e08",
"metadata": {},
"source": [
"## Reference\n"
]
},
{
"cell_type": "markdown",
"id": "227d89d2",
"metadata": {},
"source": [
"### Github Repos\n"
]
},
{
"cell_type": "markdown",
"id": "5e8f4de7",
"metadata": {},
"source": [
"- [BERT by Google](https://github.com/google-research/bert)\n",
"- [KoBERT by SKT](https://github.com/SKTBrain/KoBERT)\n",
"- [KoELECTRA by Monologg](https://github.com/monologg/KoELECTRA/)\n"
]
},
{
"cell_type": "markdown",
"id": "730bfede",
"metadata": {},
"source": [
"- [Transformers by Huggingface](https://github.com/huggingface/transformers)\n",
"- [Tokenizers by Hugginface](https://github.com/huggingface/tokenizers)\n"
]
},
{
"cell_type": "markdown",
"id": "66dbd496",
"metadata": {},
"source": [
"### Papers\n"
]
},
{
"cell_type": "markdown",
"id": "84fe619a",
"metadata": {},
"source": [
"- [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805)\n"
]
},
{
"cell_type": "markdown",
"id": "63bb3dd3",
"metadata": {},
"source": [
"### Blogs\n"
]
},
{
"cell_type": "markdown",
"id": "a5aa5385",
"metadata": {},
"source": [
"- [Monologg님의 KoELECTRA 학습기](https://monologg.kr/categories/NLP/ELECTRA/)\n"
]
},
{
"cell_type": "markdown",
"id": "bcbd3600",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/beomi/kcbert-base](https://huggingface.co/beomi/kcbert-base),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "21e8b000",
"metadata": {},
"source": [
"# KcBERT: Korean comments BERT\n"
]
},
{
"cell_type": "markdown",
"id": "336ee0b8",
"metadata": {},
"source": [
"Kaggle에 학습을 위해 정제한(아래 `clean`처리를 거친) Dataset을 공개하였습니다!\n"
]
},
{
"cell_type": "markdown",
"id": "691c1f27",
"metadata": {},
"source": [
"직접 다운받으셔서 다양한 Task에 학습을 진행해보세요 :)\n"
]
},
{
"cell_type": "markdown",
"id": "36ec915c",
"metadata": {},
"source": [
"공개된 한국어 BERT는 대부분 한국어 위키, 뉴스 기사, 책 등 잘 정제된 데이터를 기반으로 학습한 모델입니다. 한편, 실제로 NSMC와 같은 댓글형 데이터셋은 정제되지 않았고 구어체 특징에 신조어가 많으며, 오탈자 등 공식적인 글쓰기에서 나타나지 않는 표현들이 빈번하게 등장합니다.\n"
]
},
{
"cell_type": "markdown",
"id": "b5b8d7d7",
"metadata": {},
"source": [
"KcBERT는 위와 같은 특성의 데이터셋에 적용하기 위해, 네이버 뉴스에서 댓글과 대댓글을 수집해, 토크나이저와 BERT모델을 처음부터 학습한 Pretrained BERT 모델입니다.\n"
]
},
{
"cell_type": "markdown",
"id": "b0095da8",
"metadata": {},
"source": [
"KcBERT는 Huggingface의 Transformers 라이브러리를 통해 간편히 불러와 사용할 수 있습니다. (별도의 파일 다운로드가 필요하지 않습니다.)\n"
]
},
{
"cell_type": "markdown",
"id": "4bf51d97",
"metadata": {},
"source": [
"## KcBERT Performance\n"
]
},
{
"cell_type": "markdown",
"id": "9679c8b9",
"metadata": {},
"source": [
"- Finetune 코드는 https://github.com/Beomi/KcBERT-finetune 에서 찾아보실 수 있습니다.\n"
]
},
{
"cell_type": "markdown",
"id": "486782a2",
"metadata": {},
"source": [
"| | Size<br/>(용량) | **NSMC**<br/>(acc) | **Naver NER**<br/>(F1) | **PAWS**<br/>(acc) | **KorNLI**<br/>(acc) | **KorSTS**<br/>(spearman) | **Question Pair**<br/>(acc) | **KorQuaD (Dev)**<br/>(EM/F1) |\n",
"| :-------------------- | :---: | :----------------: | :--------------------: | :----------------: | :------------------: | :-----------------------: | :-------------------------: | :---------------------------: |\n",
"| KcBERT-Base | 417M | 89.62 | 84.34 | 66.95 | 74.85 | 75.57 | 93.93 | 60.25 / 84.39 |\n",
"| KcBERT-Large | 1.2G | **90.68** | 85.53 | 70.15 | 76.99 | 77.49 | 94.06 | 62.16 / 86.64 |\n",
"| KoBERT | 351M | 89.63 | 86.11 | 80.65 | 79.00 | 79.64 | 93.93 | 52.81 / 80.27 |\n",
"| XLM-Roberta-Base | 1.03G | 89.49 | 86.26 | 82.95 | 79.92 | 79.09 | 93.53 | 64.70 / 88.94 |\n",
"| HanBERT | 614M | 90.16 | **87.31** | 82.40 | **80.89** | 83.33 | 94.19 | 78.74 / 92.02 |\n",
"| KoELECTRA-Base | 423M | **90.21** | 86.87 | 81.90 | 80.85 | 83.21 | 94.20 | 61.10 / 89.59 |\n",
"| KoELECTRA-Base-v2 | 423M | 89.70 | 87.02 | **83.90** | 80.61 | **84.30** | **94.72** | **84.34 / 92.58** |\n",
"| DistilKoBERT | 108M | 88.41 | 84.13 | 62.55 | 70.55 | 73.21 | 92.48 | 54.12 / 77.80 |\n"
]
},
{
"cell_type": "markdown",
"id": "e86103a2",
"metadata": {},
"source": [
"\\*HanBERT의 Size는 Bert Model과 Tokenizer DB를 합친 것입니다.\n"
]
},
{
"cell_type": "markdown",
"id": "1078bc5d",
"metadata": {},
"source": [
"\\***config의 세팅을 그대로 하여 돌린 결과이며, hyperparameter tuning을 추가적으로 할 시 더 좋은 성능이 나올 수 있습니다.**\n"
]
},
{
"cell_type": "markdown",
"id": "8ac2ee11",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e171068a",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "38c7ad79",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"beomi/kcbert-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
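  {
   "cell_type": "markdown",
   "id": "kcbert-mlm-sketch-md",
   "metadata": {},
   "source": [
    "The cell above only feeds random ids through the encoder. Since KcBERT is a Fill-Mask model, the cell below sketches predicting a masked token. `AutoModelForMaskedLM` is the generic PaddleNLP class, the example sentence and masked position are made up for illustration, and it is an assumption that the converted checkpoint still contains the MLM head weights.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "kcbert-mlm-sketch-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hedged sketch: predict a masked token with the pretrained MLM head.\n",
    "# It is an assumption that the converted checkpoint keeps the MLM head\n",
    "# weights and that the installed PaddleNLP provides AutoModelForMaskedLM.\n",
    "import paddle\n",
    "from paddlenlp.transformers import AutoModelForMaskedLM, AutoTokenizer\n",
    "\n",
    "model_name = \"beomi/kcbert-base\"\n",
    "tokenizer = AutoTokenizer.from_pretrained(model_name)\n",
    "mlm = AutoModelForMaskedLM.from_pretrained(model_name)\n",
    "mlm.eval()\n",
    "\n",
    "# Tokenize a short Korean sentence, then mask one middle token for the demo.\n",
    "ids = tokenizer(\"한국어 자연어 처리 모델입니다.\")[\"input_ids\"]\n",
    "mask_pos = 2  # arbitrary position chosen purely for illustration\n",
    "ids[mask_pos] = tokenizer.mask_token_id\n",
    "\n",
    "# Vocabulary logits for every position: [1, seq_len, vocab_size].\n",
    "logits = mlm(paddle.to_tensor([ids]))\n",
    "top_values, top_indices = paddle.topk(logits[0, mask_pos], k=5)\n",
    "print(tokenizer.convert_ids_to_tokens(top_indices.tolist()))"
   ]
  },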
{
"cell_type": "markdown",
"id": "a794d15a",
"metadata": {},
"source": [
"```\n",
"@inproceedings{lee2020kcbert,\n",
"title={KcBERT: Korean Comments BERT},\n",
"author={Lee, Junbum},\n",
"booktitle={Proceedings of the 32nd Annual Conference on Human and Cognitive Language Technology},\n",
"pages={437--440},\n",
"year={2020}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "c0183cbe",
"metadata": {},
"source": [
"- 논문집 다운로드 링크: http://hclt.kr/dwn/?v=bG5iOmNvbmZlcmVuY2U7aWR4OjMy (*혹은 http://hclt.kr/symp/?lnb=conference )\n"
]
},
{
"cell_type": "markdown",
"id": "ba768b26",
"metadata": {},
"source": [
"## Acknowledgement\n"
]
},
{
"cell_type": "markdown",
"id": "ea148064",
"metadata": {},
"source": [
"KcBERT Model을 학습하는 GCP/TPU 환경은 TFRC 프로그램의 지원을 받았습니다.\n"
]
},
{
"cell_type": "markdown",
"id": "78732669",
"metadata": {},
"source": [
"모델 학습 과정에서 많은 조언을 주신 [Monologg](https://github.com/monologg/) 님 감사합니다 :)\n"
]
},
{
"cell_type": "markdown",
"id": "5ffa9ed9",
"metadata": {},
"source": [
"## Reference\n"
]
},
{
"cell_type": "markdown",
"id": "ea69da89",
"metadata": {},
"source": [
"### Github Repos\n"
]
},
{
"cell_type": "markdown",
"id": "d72d564c",
"metadata": {},
"source": [
"- [BERT by Google](https://github.com/google-research/bert)\n",
"- [KoBERT by SKT](https://github.com/SKTBrain/KoBERT)\n",
"- [KoELECTRA by Monologg](https://github.com/monologg/KoELECTRA/)\n"
]
},
{
"cell_type": "markdown",
"id": "38503607",
"metadata": {},
"source": [
"- [Transformers by Huggingface](https://github.com/huggingface/transformers)\n",
"- [Tokenizers by Hugginface](https://github.com/huggingface/tokenizers)\n"
]
},
{
"cell_type": "markdown",
"id": "a71a565f",
"metadata": {},
"source": [
"### Papers\n"
]
},
{
"cell_type": "markdown",
"id": "9aa4d324",
"metadata": {},
"source": [
"- [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805)\n"
]
},
{
"cell_type": "markdown",
"id": "6b1ba932",
"metadata": {},
"source": [
"### Blogs\n"
]
},
{
"cell_type": "markdown",
"id": "5c9e32e1",
"metadata": {},
"source": [
"- [Monologg님의 KoELECTRA 학습기](https://monologg.kr/categories/NLP/ELECTRA/)\n"
]
},
{
"cell_type": "markdown",
"id": "0b551dcf",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/beomi/kcbert-base](https://huggingface.co/beomi/kcbert-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# 模型列表
## bhadresh-savani/roberta-base-emotion
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|bhadresh-savani/roberta-base-emotion| | 475.53MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/vocab.txt) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models bhadresh-savani/roberta-base-emotion
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
##
| model | description | model_size | download |
| --- | --- | --- | --- |
|bhadresh-savani/roberta-base-emotion| | 475.53MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/bhadresh-savani/roberta-base-emotion/vocab.txt) |
Or you can download all of the model files with the following steps:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models bhadresh-savani/roberta-base-emotion
```
If you have any problems with it, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "bhadresh-savani/roberta-base-emotion"
description: "robert-base-emotion"
description_en: "robert-base-emotion"
icon: ""
from_repo: "https://huggingface.co/bhadresh-savani/roberta-base-emotion"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Classification"
sub_tag: "文本分类"
Example:
Datasets: "emotion,emotion"
Publisher: "bhadresh-savani"
License: "apache-2.0"
Language: "English"
Paper:
- title: 'RoBERTa: A Robustly Optimized BERT Pretraining Approach'
url: 'http://arxiv.org/abs/1907.11692v1'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
# 模型列表
## cahya/bert-base-indonesian-522M
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|cahya/bert-base-indonesian-522M| | 518.25MB | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/bert-base-indonesian-522M/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/cahya/bert-base-indonesian-522M/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/bert-base-indonesian-522M/tokenizer_config.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/cahya/bert-base-indonesian-522M/vocab.txt) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models cahya/bert-base-indonesian-522M
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
##
| model | description | model_size | download |
| --- | --- | --- | --- |
|cahya/bert-base-indonesian-522M| | 518.25MB | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/bert-base-indonesian-522M/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/cahya/bert-base-indonesian-522M/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/bert-base-indonesian-522M/tokenizer_config.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/cahya/bert-base-indonesian-522M/vocab.txt) |
Or you can download all of the model files with the following steps:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models cahya/bert-base-indonesian-522M
```
If you have any problems with it, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "cahya/bert-base-indonesian-522M"
description: "Indonesian BERT base model (uncased)"
description_en: "Indonesian BERT base model (uncased)"
icon: ""
from_repo: "https://huggingface.co/cahya/bert-base-indonesian-522M"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: "wikipedia"
Publisher: "cahya"
License: "mit"
Language: "Indonesian"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
# 模型列表
## cahya/gpt2-small-indonesian-522M
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|cahya/gpt2-small-indonesian-522M| | 621.95MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/vocab.json) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models cahya/gpt2-small-indonesian-522M
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
##
| model | description | model_size | download |
| --- | --- | --- | --- |
|cahya/gpt2-small-indonesian-522M| | 621.95MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/cahya/gpt2-small-indonesian-522M/vocab.json) |
Or you can download all of the model files with the following steps:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models cahya/gpt2-small-indonesian-522M
```
If you have any problems with it, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "cahya/gpt2-small-indonesian-522M"
description: "Indonesian GPT2 small model"
description_en: "Indonesian GPT2 small model"
icon: ""
from_repo: "https://huggingface.co/cahya/gpt2-small-indonesian-522M"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "cahya"
License: "mit"
Language: "Indonesian"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
# 模型列表
## ceshine/t5-paraphrase-paws-msrp-opinosis
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|ceshine/t5-paraphrase-paws-msrp-opinosis| | 1.11G | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-paws-msrp-opinosis/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-paws-msrp-opinosis/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-paws-msrp-opinosis/tokenizer_config.json) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models ceshine/t5-paraphrase-paws-msrp-opinosis
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
##
| model | description | model_size | download |
| --- | --- | --- | --- |
|ceshine/t5-paraphrase-paws-msrp-opinosis| | 1.11G | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-paws-msrp-opinosis/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-paws-msrp-opinosis/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-paws-msrp-opinosis/tokenizer_config.json) |
Or you can download all of the model files with the following steps:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models ceshine/t5-paraphrase-paws-msrp-opinosis
```
If you have any problems with it, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "ceshine/t5-paraphrase-paws-msrp-opinosis"
description: "T5-base Parapharasing model fine-tuned on PAWS, MSRP, and Opinosis"
description_en: "T5-base Parapharasing model fine-tuned on PAWS, MSRP, and Opinosis"
icon: ""
from_repo: "https://huggingface.co/ceshine/t5-paraphrase-paws-msrp-opinosis"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text2Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "ceshine"
License: "apache-2.0"
Language: "English"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
# 模型列表
## ceshine/t5-paraphrase-quora-paws
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|ceshine/t5-paraphrase-quora-paws| | 1.11G | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-quora-paws/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-quora-paws/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-quora-paws/tokenizer_config.json) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models ceshine/t5-paraphrase-quora-paws
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
##
| model | description | model_size | download |
| --- | --- | --- | --- |
|ceshine/t5-paraphrase-quora-paws| | 1.11G | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-quora-paws/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-quora-paws/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/ceshine/t5-paraphrase-quora-paws/tokenizer_config.json) |
Or you can download all of the model files with the following steps:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models ceshine/t5-paraphrase-quora-paws
```
If you have any problems with it, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "ceshine/t5-paraphrase-quora-paws"
description: "T5-base Parapharasing model fine-tuned on PAWS and Quora"
description_en: "T5-base Parapharasing model fine-tuned on PAWS and Quora"
icon: ""
from_repo: "https://huggingface.co/ceshine/t5-paraphrase-quora-paws"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text2Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "ceshine"
License: "apache-2.0"
Language: "English"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Russian,English
License: mit
Model_Info:
name: "cointegrated/rubert-tiny"
description: "pip install transformers sentencepiece"
description_en: "pip install transformers sentencepiece"
icon: ""
from_repo: "https://huggingface.co/cointegrated/rubert-tiny"
description: pip install transformers sentencepiece
description_en: pip install transformers sentencepiece
from_repo: https://huggingface.co/cointegrated/rubert-tiny
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: cointegrated/rubert-tiny
Paper: null
Publisher: cointegrated
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Feature Extraction"
sub_tag: "特征抽取"
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Sentence Similarity"
sub_tag: "句子相似度"
Example:
Datasets: ""
Publisher: "cointegrated"
License: "mit"
Language: "Russian,English"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- sub_tag: 特征抽取
sub_tag_en: Feature Extraction
tag: 自然语言处理
tag_en: Natural Language Processing
- sub_tag: 槽位填充
sub_tag_en: Fill-Mask
tag: 自然语言处理
tag_en: Natural Language Processing
- sub_tag: 句子相似度
sub_tag_en: Sentence Similarity
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "83973edc",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"This is a very small distilled version of the bert-base-multilingual-cased model for Russian and English (45 MB, 12M parameters). There is also an **updated version of this model**, rubert-tiny2, with a larger vocabulary and better quality on practically all Russian NLU tasks.\n"
]
},
{
"cell_type": "markdown",
"id": "59944441",
"metadata": {},
"source": [
"This model is useful if you want to fine-tune it for a relatively simple Russian task (e.g. NER or sentiment classification), and you care more about speed and size than about accuracy. It is approximately x10 smaller and faster than a base-sized BERT. Its `[CLS]` embeddings can be used as a sentence representation aligned between Russian and English.\n"
]
},
{
"cell_type": "markdown",
"id": "c0e2918f",
"metadata": {},
"source": [
"It was trained on the [Yandex Translate corpus](https://translate.yandex.ru/corpus), [OPUS-100](https://huggingface.co/datasets/opus100) and Tatoeba, using MLM loss distilled from bert-base-multilingual-cased, translation ranking loss, and `[CLS]` embeddings distilled from LaBSE, rubert-base-cased-sentence, Laser and USE.\n"
]
},
{
"cell_type": "markdown",
"id": "b0c0158e",
"metadata": {},
"source": [
"There is a more detailed [description in Russian](https://habr.com/ru/post/562064/).\n"
]
},
{
"cell_type": "markdown",
"id": "28ce4026",
"metadata": {},
"source": [
"Sentence embeddings can be produced as follows:\n"
]
},
{
"cell_type": "markdown",
"id": "d521437a",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "da5acdb0",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "df2d3cc6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cointegrated/rubert-tiny\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "065bda47",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cointegrated/rubert-tiny](https://huggingface.co/cointegrated/rubert-tiny),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "b59db37b",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"This is a very small distilled version of the bert-base-multilingual-cased model for Russian and English (45 MB, 12M parameters). There is also an **updated version of this model**, rubert-tiny2, with a larger vocabulary and better quality on practically all Russian NLU tasks.\n"
]
},
{
"cell_type": "markdown",
"id": "5e7c8c35",
"metadata": {},
"source": [
"This model is useful if you want to fine-tune it for a relatively simple Russian task (e.g. NER or sentiment classification), and you care more about speed and size than about accuracy. It is approximately x10 smaller and faster than a base-sized BERT. Its `[CLS]` embeddings can be used as a sentence representation aligned between Russian and English.\n"
]
},
{
"cell_type": "markdown",
"id": "bc3c5717",
"metadata": {},
"source": [
"It was trained on the [Yandex Translate corpus](https://translate.yandex.ru/corpus), [OPUS-100](https://huggingface.co/datasets/opus100) and Tatoeba, using MLM loss (distilled from bert-base-multilingual-cased\n",
"), translation ranking loss, and `[CLS]` embeddings distilled from LaBSE, rubert-base-cased-sentence, Laser and USE.\n"
]
},
{
"cell_type": "markdown",
"id": "2db0a3ee",
"metadata": {},
"source": [
"There is a more detailed [description in Russian](https://habr.com/ru/post/562064/).\n"
]
},
{
"cell_type": "markdown",
"id": "c3a52477",
"metadata": {},
"source": [
"Sentence embeddings can be produced as follows:\n"
]
},
{
"cell_type": "markdown",
"id": "add13de4",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c0a8f905",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "481d0ca6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cointegrated/rubert-tiny\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
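  {
   "cell_type": "markdown",
   "id": "rubert-tiny-embed-sketch-md",
   "metadata": {},
   "source": [
    "The card states that the `[CLS]` embeddings can serve as sentence representations, but the cell above only uses random ids. Below is a minimal sketch of producing and comparing sentence embeddings; the `[CLS]` pooling and the normalization are assumptions based on that statement, not a verified reproduction of the upstream sentence-transformers recipe.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "rubert-tiny-embed-sketch-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hedged sketch: sentence embeddings from the normalized [CLS] vector.\n",
    "import paddle\n",
    "import paddle.nn.functional as F\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "model_name = \"cointegrated/rubert-tiny\"\n",
    "tokenizer = AutoTokenizer.from_pretrained(model_name)\n",
    "model = AutoModel.from_pretrained(model_name)\n",
    "model.eval()\n",
    "\n",
    "def embed(text):\n",
    "    input_ids = paddle.to_tensor([tokenizer(text)[\"input_ids\"]])\n",
    "    sequence_output, _ = model(input_ids)   # [1, seq_len, hidden_size]\n",
    "    return F.normalize(sequence_output[:, 0], axis=-1)  # [CLS] vector\n",
    "\n",
    "# Cosine similarity between a Russian and an English sentence.\n",
    "ru, en = embed(\"привет мир\"), embed(\"hello world\")\n",
    "print(paddle.matmul(ru, en, transpose_y=True))"
   ]
  },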
{
"cell_type": "markdown",
"id": "e6df17e3",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cointegrated/rubert-tiny](https://huggingface.co/cointegrated/rubert-tiny) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Russian
License: mit
Model_Info:
name: "cointegrated/rubert-tiny2"
description: "pip install transformers sentencepiece"
description_en: "pip install transformers sentencepiece"
icon: ""
from_repo: "https://huggingface.co/cointegrated/rubert-tiny2"
description: pip install transformers sentencepiece
description_en: pip install transformers sentencepiece
from_repo: https://huggingface.co/cointegrated/rubert-tiny2
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: cointegrated/rubert-tiny2
Paper: null
Publisher: cointegrated
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Feature Extraction"
sub_tag: "特征抽取"
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Sentence Similarity"
sub_tag: "句子相似度"
Example:
Datasets: ""
Publisher: "cointegrated"
License: "mit"
Language: "Russian"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- sub_tag: 特征抽取
sub_tag_en: Feature Extraction
tag: 自然语言处理
tag_en: Natural Language Processing
- sub_tag: 槽位填充
sub_tag_en: Fill-Mask
tag: 自然语言处理
tag_en: Natural Language Processing
- sub_tag: 句子相似度
sub_tag_en: Sentence Similarity
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "9eef057a",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"This is an updated version of cointegrated/rubert-tiny: a small Russian BERT-based encoder with high-quality sentence embeddings. This [post in Russian](https://habr.com/ru/post/669674/) gives more details.\n"
]
},
{
"cell_type": "markdown",
"id": "08d9a049",
"metadata": {},
"source": [
"The differences from the previous version include:\n",
"- a larger vocabulary: 83828 tokens instead of 29564;\n",
"- larger supported sequences: 2048 instead of 512;\n",
"- sentence embeddings approximate LaBSE closer than before;\n",
"- meaningful segment embeddings (tuned on the NLI task)\n",
"- the model is focused only on Russian.\n"
]
},
{
"cell_type": "markdown",
"id": "8a7ba50b",
"metadata": {},
"source": [
"The model should be used as is to produce sentence embeddings (e.g. for KNN classification of short texts) or fine-tuned for a downstream task.\n"
]
},
{
"cell_type": "markdown",
"id": "184e1cc6",
"metadata": {},
"source": [
"Sentence embeddings can be produced as follows:\n"
]
},
{
"cell_type": "markdown",
"id": "a9613056",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d60b7b64",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "716f2b63",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cointegrated/rubert-tiny2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "0ba8c599",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cointegrated/rubert-tiny2](https://huggingface.co/cointegrated/rubert-tiny2),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "db267b71",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"This is an updated version of cointegrated/rubert-tiny: a small Russian BERT-based encoder with high-quality sentence embeddings. This [post in Russian](https://habr.com/ru/post/669674/) gives more details.\n"
]
},
{
"cell_type": "markdown",
"id": "801acf5c",
"metadata": {},
"source": [
"The differences from the previous version include:\n",
"- a larger vocabulary: 83828 tokens instead of 29564;\n",
"- larger supported sequences: 2048 instead of 512;\n",
"- sentence embeddings approximate LaBSE closer than before;\n",
"- meaningful segment embeddings (tuned on the NLI task)\n",
"- the model is focused only on Russian.\n"
]
},
{
"cell_type": "markdown",
"id": "f2c7dbc1",
"metadata": {},
"source": [
"The model should be used as is to produce sentence embeddings (e.g. for KNN classification of short texts) or fine-tuned for a downstream task.\n"
]
},
{
"cell_type": "markdown",
"id": "9ff63df2",
"metadata": {},
"source": [
"Sentence embeddings can be produced as follows:\n"
]
},
{
"cell_type": "markdown",
"id": "2b073558",
"metadata": {},
"source": [
"## how to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c98c0cce",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "81978806",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cointegrated/rubert-tiny2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
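{
"cell_type": "markdown",
"id": "a1emb0note",
"metadata": {},
"source": [
"Below is a minimal sketch (not part of the original card) of one way sentence embeddings could be obtained with PaddleNLP: tokenize a sentence, run the encoder, and mean-pool the token representations. The sentence text and the pooling choice are illustrative assumptions, and it presumes the converted checkpoint is a BERT-style encoder returning `(sequence_output, pooled_output)`.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a1emb0code",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (assumption): mean-pooled sentence embedding with PaddleNLP.\n",
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"cointegrated/rubert-tiny2\")\n",
"model = AutoModel.from_pretrained(\"cointegrated/rubert-tiny2\")\n",
"model.eval()\n",
"\n",
"text = \"привет мир\"  # illustrative sentence\n",
"encoded = tokenizer(text)\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
"\n",
"with paddle.no_grad():\n",
"    sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
"\n",
"# Mean pooling over the token axis, then L2-normalization.\n",
"embedding = sequence_output.mean(axis=1)\n",
"embedding = embedding / paddle.linalg.norm(embedding, axis=1, keepdim=True)\n",
"print(embedding.shape)"
]
},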
{
"cell_type": "markdown",
"id": "33dbe378",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cointegrated/rubert-tiny2](https://huggingface.co/cointegrated/rubert-tiny2) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info:
name: "cross-encoder/ms-marco-MiniLM-L-12-v2"
description: "Cross-Encoder for MS Marco"
description_en: "Cross-Encoder for MS Marco"
icon: ""
from_repo: "https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2"
description: Cross-Encoder for MS Marco
description_en: Cross-Encoder for MS Marco
from_repo: https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: cross-encoder/ms-marco-MiniLM-L-12-v2
Paper: null
Publisher: cross-encoder
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Classification"
sub_tag: "文本分类"
Example:
Datasets: ""
Publisher: "cross-encoder"
License: "apache-2.0"
Language: ""
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- sub_tag: 文本分类
sub_tag_en: Text Classification
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "b14e9fee",
"metadata": {},
"source": [
"# Cross-Encoder for MS Marco\n"
]
},
{
"cell_type": "markdown",
"id": "770d5215",
"metadata": {},
"source": [
"This model was trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.\n"
]
},
{
"cell_type": "markdown",
"id": "0e8686b5",
"metadata": {},
"source": [
"The model can be used for Information Retrieval: Given a query, encode the query will all possible passages (e.g. retrieved with ElasticSearch). Then sort the passages in a decreasing order. See [SBERT.net Retrieve & Re-rank](https://www.sbert.net/examples/applications/retrieve_rerank/README.html) for more details. The training code is available here: [SBERT.net Training MS Marco](https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/ms_marco)\n"
]
},
{
"cell_type": "markdown",
"id": "c437c78a",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1f4581da",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "295c7df7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/ms-marco-MiniLM-L-12-v2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "706017d9",
"metadata": {},
"source": [
"## Performance\n",
"In the following table, we provide various pre-trained Cross-Encoders together with their performance on the [TREC Deep Learning 2019](https://microsoft.github.io/TREC-2019-Deep-Learning/) and the [MS Marco Passage Reranking](https://github.com/microsoft/MSMARCO-Passage-Ranking/) dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "2aa6bf22",
"metadata": {},
"source": [
"| Model-Name | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev) | Docs / Sec |\n",
"| ------------- |:-------------| -----| --- |\n",
"| **Version 2 models** | | |\n",
"| cross-encoder/ms-marco-TinyBERT-L-2-v2 | 69.84 | 32.56 | 9000\n",
"| cross-encoder/ms-marco-MiniLM-L-2-v2 | 71.01 | 34.85 | 4100\n",
"| cross-encoder/ms-marco-MiniLM-L-4-v2 | 73.04 | 37.70 | 2500\n",
"| cross-encoder/ms-marco-MiniLM-L-6-v2 | 74.30 | 39.01 | 1800\n",
"| cross-encoder/ms-marco-MiniLM-L-12-v2 | 74.31 | 39.02 | 960\n",
"| **Version 1 models** | | |\n",
"| cross-encoder/ms-marco-TinyBERT-L-2 | 67.43 | 30.15 | 9000\n",
"| cross-encoder/ms-marco-TinyBERT-L-4 | 68.09 | 34.50 | 2900\n",
"| cross-encoder/ms-marco-TinyBERT-L-6 | 69.57 | 36.13 | 680\n",
"| cross-encoder/ms-marco-electra-base | 71.99 | 36.41 | 340\n",
"| **Other models** | | |\n",
"| nboost/pt-tinybert-msmarco | 63.63 | 28.80 | 2900\n",
"| nboost/pt-bert-base-uncased-msmarco | 70.94 | 34.75 | 340\n",
"| nboost/pt-bert-large-msmarco | 73.36 | 36.48 | 100\n",
"| Capreolus/electra-base-msmarco | 71.23 | 36.89 | 340\n",
"| amberoad/bert-multilingual-passage-reranking-msmarco | 68.40 | 35.54 | 330\n",
"| sebastian-hofstaetter/distilbert-cat-margin_mse-T2-msmarco | 72.82 | 37.88 | 720\n"
]
},
{
"cell_type": "markdown",
"id": "65eda465",
"metadata": {},
"source": [
"Note: Runtime was computed on a V100 GPU.\n",
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "366980e6",
"metadata": {},
"source": [
"# Cross-Encoder for MS Marco\n"
]
},
{
"cell_type": "markdown",
"id": "4c7d726e",
"metadata": {},
"source": [
"This model was trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.\n"
]
},
{
"cell_type": "markdown",
"id": "1535e90f",
"metadata": {},
"source": [
"The model can be used for Information Retrieval: Given a query, encode the query will all possible passages (e.g. retrieved with ElasticSearch). Then sort the passages in a decreasing order. See [SBERT.net Retrieve & Re-rank](https://www.sbert.net/examples/applications/retrieve_rerank/README.html) for more details. The training code is available here: [SBERT.net Training MS Marco](https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/ms_marco)\n"
]
},
{
"cell_type": "markdown",
"id": "3eda3140",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "74d5bcd7",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "59553cde",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/ms-marco-MiniLM-L-12-v2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
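{
"cell_type": "markdown",
"id": "b2rank0note",
"metadata": {},
"source": [
"Below is a minimal re-ranking sketch (not from the original card). It assumes the converted checkpoint can also be loaded with `AutoModelForSequenceClassification` and that its single output logit is the relevance score; the query and passages are made-up examples.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b2rank0code",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (assumption): score (query, passage) pairs and sort by relevance.\n",
"import paddle\n",
"from paddlenlp.transformers import AutoModelForSequenceClassification, AutoTokenizer\n",
"\n",
"name = \"cross-encoder/ms-marco-MiniLM-L-12-v2\"\n",
"tokenizer = AutoTokenizer.from_pretrained(name)\n",
"model = AutoModelForSequenceClassification.from_pretrained(name)\n",
"model.eval()\n",
"\n",
"query = \"How many people live in Berlin?\"\n",
"passages = [\n",
"    \"Berlin had a population of 3,520,031 registered inhabitants.\",\n",
"    \"Berlin is well known for its museums.\",\n",
"]\n",
"\n",
"scores = []\n",
"for passage in passages:\n",
"    enc = tokenizer(text=query, text_pair=passage)\n",
"    input_ids = paddle.to_tensor([enc[\"input_ids\"]])\n",
"    token_type_ids = paddle.to_tensor([enc[\"token_type_ids\"]])\n",
"    with paddle.no_grad():\n",
"        logits = model(input_ids, token_type_ids=token_type_ids)\n",
"    scores.append(float(logits[0][0]))\n",
"\n",
"# Higher score = more relevant; sort the passages in decreasing order.\n",
"for score, passage in sorted(zip(scores, passages), reverse=True):\n",
"    print(round(score, 4), passage)"
]
},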
{
"cell_type": "markdown",
"id": "0b6883fa",
"metadata": {},
"source": [
"## Performance\n",
"In the following table, we provide various pre-trained Cross-Encoders together with their performance on the [TREC Deep Learning 2019](https://microsoft.github.io/TREC-2019-Deep-Learning/) and the [MS Marco Passage Reranking](https://github.com/microsoft/MSMARCO-Passage-Ranking/) dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "e04ad9db",
"metadata": {},
"source": [
"| Model-Name | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev) | Docs / Sec |\n",
"| ------------- |:-------------| -----| --- |\n",
"| **Version 2 models** | | |\n",
"| cross-encoder/ms-marco-TinyBERT-L-2-v2 | 69.84 | 32.56 | 9000\n",
"| cross-encoder/ms-marco-MiniLM-L-2-v2 | 71.01 | 34.85 | 4100\n",
"| cross-encoder/ms-marco-MiniLM-L-4-v2 | 73.04 | 37.70 | 2500\n",
"| cross-encoder/ms-marco-MiniLM-L-6-v2 | 74.30 | 39.01 | 1800\n",
"| cross-encoder/ms-marco-MiniLM-L-12-v2 | 74.31 | 39.02 | 960\n",
"| **Version 1 models** | | |\n",
"| cross-encoder/ms-marco-TinyBERT-L-2 | 67.43 | 30.15 | 9000\n",
"| cross-encoder/ms-marco-TinyBERT-L-4 | 68.09 | 34.50 | 2900\n",
"| cross-encoder/ms-marco-TinyBERT-L-6 | 69.57 | 36.13 | 680\n",
"| cross-encoder/ms-marco-electra-base | 71.99 | 36.41 | 340\n",
"| **Other models** | | |\n",
"| nboost/pt-tinybert-msmarco | 63.63 | 28.80 | 2900\n",
"| nboost/pt-bert-base-uncased-msmarco | 70.94 | 34.75 | 340\n",
"| nboost/pt-bert-large-msmarco | 73.36 | 36.48 | 100\n",
"| Capreolus/electra-base-msmarco | 71.23 | 36.89 | 340\n",
"| amberoad/bert-multilingual-passage-reranking-msmarco | 68.40 | 35.54 | 330\n",
"| sebastian-hofstaetter/distilbert-cat-margin_mse-T2-msmarco | 72.82 | 37.88 | 720\n"
]
},
{
"cell_type": "markdown",
"id": "18e7124d",
"metadata": {},
"source": [
"Note: Runtime was computed on a V100 GPU.\n",
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info:
name: "cross-encoder/ms-marco-TinyBERT-L-2"
description: "Cross-Encoder for MS Marco"
description_en: "Cross-Encoder for MS Marco"
icon: ""
from_repo: "https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2"
description: Cross-Encoder for MS Marco
description_en: Cross-Encoder for MS Marco
from_repo: https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: cross-encoder/ms-marco-TinyBERT-L-2
Paper: null
Publisher: cross-encoder
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Classification"
sub_tag: "文本分类"
Example:
Datasets: ""
Publisher: "cross-encoder"
License: "apache-2.0"
Language: ""
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- sub_tag: 文本分类
sub_tag_en: Text Classification
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "32947f83",
"metadata": {},
"source": [
"# Cross-Encoder for MS Marco\n"
]
},
{
"cell_type": "markdown",
"id": "d34eaa08",
"metadata": {},
"source": [
"This model was trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.\n"
]
},
{
"cell_type": "markdown",
"id": "dcf2e434",
"metadata": {},
"source": [
"The model can be used for Information Retrieval: Given a query, encode the query will all possible passages (e.g. retrieved with ElasticSearch). Then sort the passages in a decreasing order. See [SBERT.net Retrieve & Re-rank](https://www.sbert.net/examples/applications/retrieve_rerank/README.html) for more details. The training code is available here: [SBERT.net Training MS Marco](https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/ms_marco)\n"
]
},
{
"cell_type": "markdown",
"id": "bb938635",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "463fcbb2",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e3ac7704",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/ms-marco-TinyBERT-L-2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "e185e8d7",
"metadata": {},
"source": [
"## Performance\n",
"In the following table, we provide various pre-trained Cross-Encoders together with their performance on the [TREC Deep Learning 2019](https://microsoft.github.io/TREC-2019-Deep-Learning/) and the [MS Marco Passage Reranking](https://github.com/microsoft/MSMARCO-Passage-Ranking/) dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "1b6ce4a0",
"metadata": {},
"source": [
"| Model-Name | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev) | Docs / Sec |\n",
"| ------------- |:-------------| -----| --- |\n",
"| **Version 2 models** | | |\n",
"| cross-encoder/ms-marco-TinyBERT-L-2-v2 | 69.84 | 32.56 | 9000\n",
"| cross-encoder/ms-marco-MiniLM-L-2-v2 | 71.01 | 34.85 | 4100\n",
"| cross-encoder/ms-marco-MiniLM-L-4-v2 | 73.04 | 37.70 | 2500\n",
"| cross-encoder/ms-marco-MiniLM-L-6-v2 | 74.30 | 39.01 | 1800\n",
"| cross-encoder/ms-marco-MiniLM-L-12-v2 | 74.31 | 39.02 | 960\n",
"| **Version 1 models** | | |\n",
"| cross-encoder/ms-marco-TinyBERT-L-2 | 67.43 | 30.15 | 9000\n",
"| cross-encoder/ms-marco-TinyBERT-L-4 | 68.09 | 34.50 | 2900\n",
"| cross-encoder/ms-marco-TinyBERT-L-6 | 69.57 | 36.13 | 680\n",
"| cross-encoder/ms-marco-electra-base | 71.99 | 36.41 | 340\n",
"| **Other models** | | |\n",
"| nboost/pt-tinybert-msmarco | 63.63 | 28.80 | 2900\n",
"| nboost/pt-bert-base-uncased-msmarco | 70.94 | 34.75 | 340\n",
"| nboost/pt-bert-large-msmarco | 73.36 | 36.48 | 100\n",
"| Capreolus/electra-base-msmarco | 71.23 | 36.89 | 340\n",
"| amberoad/bert-multilingual-passage-reranking-msmarco | 68.40 | 35.54 | 330\n",
"| sebastian-hofstaetter/distilbert-cat-margin_mse-T2-msmarco | 72.82 | 37.88 | 720\n"
]
},
{
"cell_type": "markdown",
"id": "478f9bd9",
"metadata": {},
"source": [
"Note: Runtime was computed on a V100 GPU.\n",
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2](https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "545c6ec0",
"metadata": {},
"source": [
"# Cross-Encoder for MS Marco\n"
]
},
{
"cell_type": "markdown",
"id": "cbd27361",
"metadata": {},
"source": [
"This model was trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.\n"
]
},
{
"cell_type": "markdown",
"id": "185acb77",
"metadata": {},
"source": [
"The model can be used for Information Retrieval: Given a query, encode the query will all possible passages (e.g. retrieved with ElasticSearch). Then sort the passages in a decreasing order. See [SBERT.net Retrieve & Re-rank](https://www.sbert.net/examples/applications/retrieve_rerank/README.html) for more details. The training code is available here: [SBERT.net Training MS Marco](https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/ms_marco)\n"
]
},
{
"cell_type": "markdown",
"id": "1fb83fc3",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2cf01d71",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d24e4eb7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/ms-marco-TinyBERT-L-2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "7eb19416",
"metadata": {},
"source": [
"## Performance\n",
"In the following table, we provide various pre-trained Cross-Encoders together with their performance on the [TREC Deep Learning 2019](https://microsoft.github.io/TREC-2019-Deep-Learning/) and the [MS Marco Passage Reranking](https://github.com/microsoft/MSMARCO-Passage-Ranking/) dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "e51901bb",
"metadata": {},
"source": [
"| Model-Name | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev) | Docs / Sec |\n",
"| ------------- |:-------------| -----| --- |\n",
"| **Version 2 models** | | |\n",
"| cross-encoder/ms-marco-TinyBERT-L-2-v2 | 69.84 | 32.56 | 9000\n",
"| cross-encoder/ms-marco-MiniLM-L-2-v2 | 71.01 | 34.85 | 4100\n",
"| cross-encoder/ms-marco-MiniLM-L-4-v2 | 73.04 | 37.70 | 2500\n",
"| cross-encoder/ms-marco-MiniLM-L-6-v2 | 74.30 | 39.01 | 1800\n",
"| cross-encoder/ms-marco-MiniLM-L-12-v2 | 74.31 | 39.02 | 960\n",
"| **Version 1 models** | | |\n",
"| cross-encoder/ms-marco-TinyBERT-L-2 | 67.43 | 30.15 | 9000\n",
"| cross-encoder/ms-marco-TinyBERT-L-4 | 68.09 | 34.50 | 2900\n",
"| cross-encoder/ms-marco-TinyBERT-L-6 | 69.57 | 36.13 | 680\n",
"| cross-encoder/ms-marco-electra-base | 71.99 | 36.41 | 340\n",
"| **Other models** | | |\n",
"| nboost/pt-tinybert-msmarco | 63.63 | 28.80 | 2900\n",
"| nboost/pt-bert-base-uncased-msmarco | 70.94 | 34.75 | 340\n",
"| nboost/pt-bert-large-msmarco | 73.36 | 36.48 | 100\n",
"| Capreolus/electra-base-msmarco | 71.23 | 36.89 | 340\n",
"| amberoad/bert-multilingual-passage-reranking-msmarco | 68.40 | 35.54 | 330\n",
"| sebastian-hofstaetter/distilbert-cat-margin_mse-T2-msmarco | 72.82 | 37.88 | 720\n"
]
},
{
"cell_type": "markdown",
"id": "f2318843",
"metadata": {},
"source": [
"Note: Runtime was computed on a V100 GPU.\n",
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2](https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: multi_nli,snli
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
name: "cross-encoder/nli-MiniLM2-L6-H768"
description: "Cross-Encoder for Natural Language Inference"
description_en: "Cross-Encoder for Natural Language Inference"
icon: ""
from_repo: "https://huggingface.co/cross-encoder/nli-MiniLM2-L6-H768"
description: Cross-Encoder for Natural Language Inference
description_en: Cross-Encoder for Natural Language Inference
from_repo: https://huggingface.co/cross-encoder/nli-MiniLM2-L6-H768
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: cross-encoder/nli-MiniLM2-L6-H768
Paper: null
Publisher: cross-encoder
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Zero-Shot Classification"
sub_tag: "零样本分类"
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Classification"
sub_tag: "文本分类"
Example:
Datasets: "multi_nli,snli"
Publisher: "cross-encoder"
License: "apache-2.0"
Language: "English"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- sub_tag: 零样本分类
sub_tag_en: Zero-Shot Classification
tag: 自然语言处理
tag_en: Natural Language Processing
- sub_tag: 文本分类
sub_tag_en: Text Classification
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "f11c50a6",
"metadata": {},
"source": [
"# Cross-Encoder for Natural Language Inference\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "e01fe90a",
"metadata": {},
"source": [
"## Training Data\n",
"The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral.\n"
]
},
{
"cell_type": "markdown",
"id": "ff850419",
"metadata": {},
"source": [
"## Performance\n",
"For evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).\n"
]
},
{
"cell_type": "markdown",
"id": "a0b92b0d",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "markdown",
"id": "d3857388",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d2c99a51",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aeda53c1",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/nli-MiniLM2-L6-H768\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "760a7b59",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/nli-MiniLM2-L6-H768](https://huggingface.co/cross-encoder/nli-MiniLM2-L6-H768),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "7d3f71fa",
"metadata": {},
"source": [
"# Cross-Encoder for Natural Language Inference\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "daf01f92",
"metadata": {},
"source": [
"## Training Data\n",
"The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral.\n"
]
},
{
"cell_type": "markdown",
"id": "805a7294",
"metadata": {},
"source": [
"## Performance\n",
"For evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).\n"
]
},
{
"cell_type": "markdown",
"id": "46a403e0",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "markdown",
"id": "abbbbd38",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a2522fb4",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1557ae2a",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/nli-MiniLM2-L6-H768\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "4259d72d",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/nli-MiniLM2-L6-H768](https://huggingface.co/cross-encoder/nli-MiniLM2-L6-H768) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: multi_nli,snli
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
name: "cross-encoder/nli-distilroberta-base"
description: "Cross-Encoder for Natural Language Inference"
description_en: "Cross-Encoder for Natural Language Inference"
icon: ""
from_repo: "https://huggingface.co/cross-encoder/nli-distilroberta-base"
description: Cross-Encoder for Natural Language Inference
description_en: Cross-Encoder for Natural Language Inference
from_repo: https://huggingface.co/cross-encoder/nli-distilroberta-base
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: cross-encoder/nli-distilroberta-base
Paper: null
Publisher: cross-encoder
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Zero-Shot Classification"
sub_tag: "零样本分类"
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Classification"
sub_tag: "文本分类"
Example:
Datasets: "multi_nli,snli"
Publisher: "cross-encoder"
License: "apache-2.0"
Language: "English"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- sub_tag: 零样本分类
sub_tag_en: Zero-Shot Classification
tag: 自然语言处理
tag_en: Natural Language Processing
- sub_tag: 文本分类
sub_tag_en: Text Classification
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "dfce17cd",
"metadata": {},
"source": [
"# Cross-Encoder for Natural Language Inference\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "ec682169",
"metadata": {},
"source": [
"## Training Data\n",
"The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral.\n"
]
},
{
"cell_type": "markdown",
"id": "ba993930",
"metadata": {},
"source": [
"## Performance\n",
"For evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).\n"
]
},
{
"cell_type": "markdown",
"id": "15de6eec",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "markdown",
"id": "6ab89b97",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f53af30f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f31b1839",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/nli-distilroberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "4254d407",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/nli-distilroberta-base](https://huggingface.co/cross-encoder/nli-distilroberta-base),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "a4ae7e65",
"metadata": {},
"source": [
"# Cross-Encoder for Natural Language Inference\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "f2d88a35",
"metadata": {},
"source": [
"## Training Data\n",
"The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral.\n"
]
},
{
"cell_type": "markdown",
"id": "d982bc91",
"metadata": {},
"source": [
"## Performance\n",
"For evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).\n"
]
},
{
"cell_type": "markdown",
"id": "1f3796c9",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "markdown",
"id": "14206f74",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7e5f7a2f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "05497be6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/nli-distilroberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
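{
"cell_type": "markdown",
"id": "c3nli0note",
"metadata": {},
"source": [
"Below is a minimal NLI scoring sketch (not from the original card). It assumes the converted checkpoint can also be loaded with `AutoModelForSequenceClassification`, that it outputs three logits, and that the label order matches the upstream card (contradiction, entailment, neutral); the sentence pair is a made-up example.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c3nli0code",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (assumption): 3-way NLI scores for a premise/hypothesis pair.\n",
"import paddle\n",
"import paddle.nn.functional as F\n",
"from paddlenlp.transformers import AutoModelForSequenceClassification, AutoTokenizer\n",
"\n",
"name = \"cross-encoder/nli-distilroberta-base\"\n",
"tokenizer = AutoTokenizer.from_pretrained(name)\n",
"model = AutoModelForSequenceClassification.from_pretrained(name)\n",
"model.eval()\n",
"\n",
"premise = \"A man is eating pizza\"\n",
"hypothesis = \"A man eats something\"\n",
"enc = tokenizer(text=premise, text_pair=hypothesis)\n",
"input_ids = paddle.to_tensor([enc[\"input_ids\"]])\n",
"\n",
"with paddle.no_grad():\n",
"    logits = model(input_ids)\n",
"\n",
"# Assumed label order: contradiction, entailment, neutral.\n",
"probs = F.softmax(logits, axis=-1)\n",
"labels = [\"contradiction\", \"entailment\", \"neutral\"]\n",
"print(dict(zip(labels, probs[0].tolist())))"
]
},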
{
"cell_type": "markdown",
"id": "ea7e434c",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/nli-distilroberta-base](https://huggingface.co/cross-encoder/nli-distilroberta-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: multi_nli,snli
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
name: "cross-encoder/nli-roberta-base"
description: "Cross-Encoder for Natural Language Inference"
description_en: "Cross-Encoder for Natural Language Inference"
icon: ""
from_repo: "https://huggingface.co/cross-encoder/nli-roberta-base"
description: Cross-Encoder for Natural Language Inference
description_en: Cross-Encoder for Natural Language Inference
from_repo: https://huggingface.co/cross-encoder/nli-roberta-base
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: cross-encoder/nli-roberta-base
Paper: null
Publisher: cross-encoder
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Zero-Shot Classification"
sub_tag: "零样本分类"
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Classification"
sub_tag: "文本分类"
Example:
Datasets: "multi_nli,snli"
Publisher: "cross-encoder"
License: "apache-2.0"
Language: "English"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- sub_tag: 零样本分类
sub_tag_en: Zero-Shot Classification
tag: 自然语言处理
tag_en: Natural Language Processing
- sub_tag: 文本分类
sub_tag_en: Text Classification
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "4fd29af9",
"metadata": {},
"source": [
"# Cross-Encoder for Natural Language Inference\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "26cf9863",
"metadata": {},
"source": [
"## Training Data\n",
"The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral.\n"
]
},
{
"cell_type": "markdown",
"id": "913c77b3",
"metadata": {},
"source": [
"## Performance\n",
"For evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).\n"
]
},
{
"cell_type": "markdown",
"id": "1edcf5c1",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "markdown",
"id": "a3d044ef",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "549f470f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "358989b6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/nli-roberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "453c5b27",
"metadata": {},
"source": [
"此模型介绍及权重来源于[https://huggingface.co/cross-encoder/nli-roberta-base](https://huggingface.co/cross-encoder/nli-roberta-base),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "d174d9c5",
"metadata": {},
"source": [
"# Cross-Encoder for Natural Language Inference\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "6b47f4c6",
"metadata": {},
"source": [
"## Training Data\n",
"The model was trained on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/) datasets. For a given sentence pair, it will output three scores corresponding to the labels: contradiction, entailment, neutral.\n"
]
},
{
"cell_type": "markdown",
"id": "39bc9190",
"metadata": {},
"source": [
"## Performance\n",
"For evaluation results, see [SBERT.net - Pretrained Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#nli).\n"
]
},
{
"cell_type": "markdown",
"id": "0d84928d",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "markdown",
"id": "3b2a033c",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4d9e33fd",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f84786a3",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/nli-roberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "dac6563c",
"metadata": {},
"source": [
"The model introduction and model weights originate from [https://huggingface.co/cross-encoder/nli-roberta-base](https://huggingface.co/cross-encoder/nli-roberta-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# 模型列表
## cross-encoder/qnli-distilroberta-base
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|cross-encoder/qnli-distilroberta-base| | 313.28MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/vocab.txt) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models cross-encoder/qnli-distilroberta-base
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
## cross-encoder/qnli-distilroberta-base
| model | description | model_size | download |
| --- | --- | --- | --- |
|cross-encoder/qnli-distilroberta-base| | 313.28MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/qnli-distilroberta-base/vocab.txt) |
Alternatively, you can download all of the model files with the following steps:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models cross-encoder/qnli-distilroberta-base
```
If you run into any problems, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "cross-encoder/qnli-distilroberta-base"
description: "Cross-Encoder for Quora Duplicate Questions Detection"
description_en: "Cross-Encoder for Quora Duplicate Questions Detection"
icon: ""
from_repo: "https://huggingface.co/cross-encoder/qnli-distilroberta-base"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Classification"
sub_tag: "文本分类"
Example:
Datasets: ""
Publisher: "cross-encoder"
License: "apache-2.0"
Language: ""
Paper:
- title: 'GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding'
url: 'http://arxiv.org/abs/1804.07461v3'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info:
name: "cross-encoder/quora-distilroberta-base"
description: "Cross-Encoder for Quora Duplicate Questions Detection"
description_en: "Cross-Encoder for Quora Duplicate Questions Detection"
icon: ""
from_repo: "https://huggingface.co/cross-encoder/quora-distilroberta-base"
description: Cross-Encoder for Quora Duplicate Questions Detection
description_en: Cross-Encoder for Quora Duplicate Questions Detection
from_repo: https://huggingface.co/cross-encoder/quora-distilroberta-base
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: cross-encoder/quora-distilroberta-base
Paper: null
Publisher: cross-encoder
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Classification"
sub_tag: "文本分类"
Example:
Datasets: ""
Publisher: "cross-encoder"
License: "apache-2.0"
Language: ""
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- sub_tag: 文本分类
sub_tag_en: Text Classification
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "9108ec88",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "dcc58a5d",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [Quora Duplicate Questions](https://www.quora.com/q/quoradata/First-Quora-Dataset-Release-Question-Pairs) dataset. The model will predict a score between 0 and 1 how likely the two given questions are duplicates.\n"
]
},
{
"cell_type": "markdown",
"id": "4f967914",
"metadata": {},
"source": [
"Note: The model is not suitable to estimate the similarity of questions, e.g. the two questions \"How to learn Java\" and \"How to learn Python\" will result in a rahter low score, as these are not duplicates.\n"
]
},
{
"cell_type": "markdown",
"id": "fe95bb7e",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "8efd69d2",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "92142a26",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "436ba799",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/quora-distilroberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "a5a90cce",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/quora-distilroberta-base](https://huggingface.co/cross-encoder/quora-distilroberta-base),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "104bbe82",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "71def254",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [Quora Duplicate Questions](https://www.quora.com/q/quoradata/First-Quora-Dataset-Release-Question-Pairs) dataset. The model will predict a score between 0 and 1 how likely the two given questions are duplicates.\n"
]
},
{
"cell_type": "markdown",
"id": "10f2b17c",
"metadata": {},
"source": [
"Note: The model is not suitable to estimate the similarity of questions, e.g. the two questions \"How to learn Java\" and \"How to learn Python\" will result in a rahter low score, as these are not duplicates.\n"
]
},
{
"cell_type": "markdown",
"id": "9e28c83a",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "bc8ce622",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3f66406a",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7ba92b4f",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/quora-distilroberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
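{
"cell_type": "markdown",
"id": "d4quora0note",
"metadata": {},
"source": [
"Below is a minimal duplicate-detection sketch (not from the original card) that turns the Java/Python note above into code. It assumes the converted checkpoint can also be loaded with `AutoModelForSequenceClassification` and that squashing its single logit with a sigmoid gives the duplicate probability; the question pairs are made-up examples.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4quora0code",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (assumption): duplicate probability for question pairs.\n",
"import paddle\n",
"import paddle.nn.functional as F\n",
"from paddlenlp.transformers import AutoModelForSequenceClassification, AutoTokenizer\n",
"\n",
"name = \"cross-encoder/quora-distilroberta-base\"\n",
"tokenizer = AutoTokenizer.from_pretrained(name)\n",
"model = AutoModelForSequenceClassification.from_pretrained(name)\n",
"model.eval()\n",
"\n",
"pairs = [\n",
"    (\"How to learn Java\", \"How to learn Python\"),    # similar, but not duplicates\n",
"    (\"How to learn Java\", \"How can I learn Java?\"),  # likely duplicates\n",
"]\n",
"for q1, q2 in pairs:\n",
"    enc = tokenizer(text=q1, text_pair=q2)\n",
"    input_ids = paddle.to_tensor([enc[\"input_ids\"]])\n",
"    with paddle.no_grad():\n",
"        logits = model(input_ids)\n",
"    prob = float(F.sigmoid(logits)[0][0])  # assumed duplicate probability\n",
"    print(round(prob, 4), q1, \"|\", q2)"
]
},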
{
"cell_type": "markdown",
"id": "93656328",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/quora-distilroberta-base](https://huggingface.co/cross-encoder/quora-distilroberta-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info:
name: "cross-encoder/quora-roberta-base"
description: "Cross-Encoder for Quora Duplicate Questions Detection"
description_en: "Cross-Encoder for Quora Duplicate Questions Detection"
icon: ""
from_repo: "https://huggingface.co/cross-encoder/quora-roberta-base"
description: Cross-Encoder for Quora Duplicate Questions Detection
description_en: Cross-Encoder for Quora Duplicate Questions Detection
from_repo: https://huggingface.co/cross-encoder/quora-roberta-base
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: cross-encoder/quora-roberta-base
Paper: null
Publisher: cross-encoder
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Classification"
sub_tag: "文本分类"
Example:
Datasets: ""
Publisher: "cross-encoder"
License: "apache-2.0"
Language: ""
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- sub_tag: 文本分类
sub_tag_en: Text Classification
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "87e3266c",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "e743234a",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [Quora Duplicate Questions](https://www.quora.com/q/quoradata/First-Quora-Dataset-Release-Question-Pairs) dataset. The model will predict a score between 0 and 1 how likely the two given questions are duplicates.\n"
]
},
{
"cell_type": "markdown",
"id": "f755b608",
"metadata": {},
"source": [
"Note: The model is not suitable to estimate the similarity of questions, e.g. the two questions \"How to learn Java\" and \"How to learn Python\" will result in a rahter low score, as these are not duplicates.\n"
]
},
{
"cell_type": "markdown",
"id": "5a08f2c7",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "c4021393",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f704b5f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "161c640b",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/quora-roberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "93a5e3b7",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/quora-roberta-base](https://huggingface.co/cross-encoder/quora-roberta-base),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "74b2ba5f",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "36bf7390",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [Quora Duplicate Questions](https://www.quora.com/q/quoradata/First-Quora-Dataset-Release-Question-Pairs) dataset. The model will predict a score between 0 and 1 how likely the two given questions are duplicates.\n"
]
},
{
"cell_type": "markdown",
"id": "5aa29571",
"metadata": {},
"source": [
"Note: The model is not suitable to estimate the similarity of questions, e.g. the two questions \"How to learn Java\" and \"How to learn Python\" will result in a rahter low score, as these are not duplicates.\n"
]
},
{
"cell_type": "markdown",
"id": "1fe76310",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "e7067bef",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a9ea7b3d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b30bfcd4",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/quora-roberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "ecb795de",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/quora-roberta-base](https://huggingface.co/cross-encoder/quora-roberta-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info:
name: "cross-encoder/stsb-TinyBERT-L-4"
description: "Cross-Encoder for Quora Duplicate Questions Detection"
description_en: "Cross-Encoder for Quora Duplicate Questions Detection"
icon: ""
from_repo: "https://huggingface.co/cross-encoder/stsb-TinyBERT-L-4"
description: Cross-Encoder for Quora Duplicate Questions Detection
description_en: Cross-Encoder for Quora Duplicate Questions Detection
from_repo: https://huggingface.co/cross-encoder/stsb-TinyBERT-L-4
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: cross-encoder/stsb-TinyBERT-L-4
Paper: null
Publisher: cross-encoder
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Classification"
sub_tag: "文本分类"
Example:
Datasets: ""
Publisher: "cross-encoder"
License: "apache-2.0"
Language: ""
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- sub_tag: 文本分类
sub_tag_en: Text Classification
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a3deebdc",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "4fc17643",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [STS benchmark dataset](http://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark). The model will predict a score between 0 and 1 how for the semantic similarity of two sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "f66fb11e",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "fd12128b",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d0d04e39",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "d07e31aa",
"metadata": {
"collapsed": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/root/miniconda3/envs/paddle/lib/python3.7/site-packages/tqdm/auto.py:22: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
" from .autonotebook import tqdm as notebook_tqdm\n",
"\u001b[32m[2022-11-21 02:38:07,127] [ INFO]\u001b[0m - Downloading model_config.json from https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/stsb-TinyBERT-L-4/model_config.json\u001b[0m\n",
"100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 432/432 [00:00<00:00, 425kB/s]\n",
"\u001b[32m[2022-11-21 02:38:07,197] [ INFO]\u001b[0m - We are using <class 'paddlenlp.transformers.bert.modeling.BertModel'> to load 'cross-encoder/stsb-TinyBERT-L-4'.\u001b[0m\n",
"\u001b[32m[2022-11-21 02:38:07,198] [ INFO]\u001b[0m - Downloading https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/stsb-TinyBERT-L-4/model_state.pdparams and saved to /root/.paddlenlp/models/cross-encoder/stsb-TinyBERT-L-4\u001b[0m\n",
"\u001b[32m[2022-11-21 02:38:07,198] [ INFO]\u001b[0m - Downloading model_state.pdparams from https://bj.bcebos.com/paddlenlp/models/community/cross-encoder/stsb-TinyBERT-L-4/model_state.pdparams\u001b[0m\n",
"100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 54.8M/54.8M [00:00<00:00, 64.7MB/s]\n",
"\u001b[32m[2022-11-21 02:38:08,199] [ INFO]\u001b[0m - Already cached /root/.paddlenlp/models/cross-encoder/stsb-TinyBERT-L-4/model_config.json\u001b[0m\n",
"W1121 02:38:08.202270 64563 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.2, Runtime API Version: 10.2\n",
"W1121 02:38:08.207437 64563 gpu_resources.cc:91] device: 0, cuDNN Version: 7.6.\n",
"\u001b[32m[2022-11-21 02:38:09,661] [ INFO]\u001b[0m - Weights from pretrained model not used in BertModel: ['classifier.weight', 'classifier.bias']\u001b[0m\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"(Tensor(shape=[1, 20, 312], dtype=float32, place=Place(gpu:0), stop_gradient=False,\n",
" [[[-0.73827386, -0.57349819, 0.47456041, ..., -0.07317579,\n",
" 0.23808761, -0.43587247],\n",
" [-0.71079123, -0.37019217, 0.44499084, ..., -0.07541266,\n",
" 0.22209664, -0.48883811],\n",
" [-0.61283624, 0.01138088, 0.46346331, ..., -0.15316986,\n",
" 0.38455290, -0.23527470],\n",
" ...,\n",
" [-0.19267607, -0.42171016, 0.40080610, ..., -0.04322027,\n",
" 0.16102640, -0.43728969],\n",
" [-0.76348048, 0.00028179, 0.50795513, ..., 0.02495949,\n",
" 0.32419923, -0.44668996],\n",
" [-0.72070849, -0.48510927, 0.47747549, ..., -0.01621611,\n",
" 0.31407145, -0.38287419]]]), Tensor(shape=[1, 312], dtype=float32, place=Place(gpu:0), stop_gradient=False,\n",
" [[ 0.38359359, 0.16227540, -0.58949089, -0.67293817, 0.70552814,\n",
" 0.74028063, -0.60770833, 0.50480992, 0.71489060, -0.73976040,\n",
" -0.11784898, 0.73014355, -0.65726435, 0.17490843, -0.44103470,\n",
" 0.62014306, 0.35533482, -0.44271812, -0.61711168, -0.70586687,\n",
" 0.69903672, 0.00862758, 0.69424403, 0.31887573, 0.38736165,\n",
" 0.02848060, -0.69896543, 0.69952166, 0.56477094, 0.68585342,\n",
" 0.66026199, 0.67826200, 0.67839348, 0.74852920, -0.04272985,\n",
" 0.76357287, 0.38685408, -0.69717598, 0.69945419, 0.44048944,\n",
" -0.66915488, 0.11735962, 0.37215349, 0.73054057, 0.71345085,\n",
" 0.66489315, 0.19956835, 0.71552449, 0.64762783, -0.46583632,\n",
" -0.09976894, -0.45265704, 0.54242563, 0.42835563, -0.60076892,\n",
" 0.69768012, -0.72207040, -0.52898210, 0.34657273, 0.05400079,\n",
" 0.57360554, -0.72731823, -0.71799070, -0.37212241, -0.70602018,\n",
" -0.71248102, 0.02778789, -0.73165607, 0.46581894, -0.72120243,\n",
" 0.60769719, -0.63354278, 0.75307459, 0.00700274, -0.00984141,\n",
" -0.58984685, 0.36321065, 0.60098255, -0.72467339, 0.18362086,\n",
" 0.10687865, -0.63730168, -0.62655306, -0.00187578, -0.51795095,\n",
" -0.64884937, 0.69950461, 0.72286713, 0.72522557, -0.45434299,\n",
" -0.43063730, -0.10669708, -0.51012146, 0.66286671, 0.69542134,\n",
" 0.21393165, -0.02928682, 0.67238331, 0.20404275, -0.63556075,\n",
" 0.55774790, 0.26141557, 0.70166790, -0.03091500, 0.65226245,\n",
" -0.69878876, 0.32701582, -0.68492270, 0.67152256, 0.66395414,\n",
" -0.68914133, -0.63889050, 0.71558940, 0.50034380, -0.12911484,\n",
" 0.70831281, 0.68631476, -0.41206849, 0.23268108, 0.67747647,\n",
" -0.29744238, 0.65135175, -0.70074749, 0.56074560, -0.63501489,\n",
" 0.74985635, -0.60603380, 0.66920304, -0.72418481, -0.59756589,\n",
" -0.70151484, -0.38735744, -0.66458094, -0.71190053, -0.69316322,\n",
" 0.43108079, -0.21692288, 0.70705998, -0.14984211, 0.75786442,\n",
" 0.69729054, -0.68925959, -0.46773866, 0.66707891, -0.07957093,\n",
" 0.73757517, 0.10062494, -0.73353016, 0.10992812, -0.48824292,\n",
" 0.62493157, 0.43311006, -0.15723324, -0.48392498, -0.65230477,\n",
" -0.41098344, -0.65238249, -0.41507134, -0.55544889, -0.32195652,\n",
" -0.74827588, -0.64071310, -0.49207535, -0.69750905, -0.57037342,\n",
" 0.35724813, 0.74778593, 0.49369636, -0.69870174, 0.24547403,\n",
" 0.73229605, 0.15653144, 0.41334581, 0.64413625, 0.53084993,\n",
" -0.64746642, -0.58720803, 0.63381183, 0.76515305, -0.68342912,\n",
" 0.65923864, -0.74662960, -0.72339952, 0.32203752, -0.63402468,\n",
" -0.71399093, -0.50430977, 0.26967043, -0.21176267, 0.65678287,\n",
" 0.09193933, 0.23962519, 0.59481263, -0.61463839, -0.28634411,\n",
" 0.69451737, 0.47513142, 0.30889973, -0.18030594, -0.50777411,\n",
" 0.71548641, -0.34869543, -0.01252351, 0.12018032, 0.69536412,\n",
" 0.53745425, 0.54889160, -0.10619923, 0.68386155, -0.68498713,\n",
" 0.23352134, 0.67296249, -0.12094481, -0.69636226, -0.06552890,\n",
" 0.00965041, -0.52394331, 0.72305930, -0.17239039, -0.73262835,\n",
" 0.50841606, 0.39529455, -0.70830429, 0.51234418, 0.68391299,\n",
" -0.72483873, -0.51841038, -0.58264560, -0.74197364, 0.46386808,\n",
" -0.23263671, 0.21232133, -0.69674802, 0.33948907, 0.75922930,\n",
" -0.43505231, -0.53149903, -0.65927148, 0.09607304, -0.68945718,\n",
" 0.66966355, 0.68096715, 0.66396469, 0.13001618, -0.68894261,\n",
" -0.66597682, 0.61407733, 0.69670630, 0.63995171, 0.33257753,\n",
" 0.66776848, 0.57427299, 0.32768273, 0.69438887, 0.41346189,\n",
" -0.71529591, -0.09860074, -0.72291893, 0.16860481, -0.67641008,\n",
" 0.70644248, -0.24303547, 0.28892463, 0.56054235, 0.55539572,\n",
" 0.70762485, -0.50166684, -0.70544142, -0.74241722, -0.74010289,\n",
" 0.70217764, -0.09219251, 0.47989756, -0.17431454, 0.76019192,\n",
" -0.09623899, -0.64994997, -0.03216666, 0.70323825, -0.66661566,\n",
" 0.71163839, -0.08982500, -0.35390857, 0.61377501, -0.49430367,\n",
" 0.49526611, 0.75078416, -0.05324765, -0.75398672, 0.70934319,\n",
" 0.21146417, -0.59094489, 0.39163795, -0.67382598, -0.63484156,\n",
" -0.27295890, 0.75101918, 0.70603085, 0.71781063, -0.57344818,\n",
" -0.22560060, -0.62196493, 0.68178481, 0.61596531, -0.12730023,\n",
" -0.69500911, 0.73689735, 0.12627751, -0.26101601, -0.24929181,\n",
" 0.68093145, 0.05896470]]))\n"
]
}
],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/stsb-TinyBERT-L-4\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "aeccdfe1",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/stsb-TinyBERT-L-4](https://huggingface.co/cross-encoder/stsb-TinyBERT-L-4),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "6e8592db",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "c3be9ab9",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [STS benchmark dataset](http://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark). The model will predict a score between 0 and 1 how for the semantic similarity of two sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "3f2d2712",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "0127bf3d",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e6968e7e",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "39e99053",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/stsb-TinyBERT-L-4\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
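  {
   "cell_type": "markdown",
   "id": "b3f1c2d7",
   "metadata": {},
   "source": [
    "The snippet above only feeds random token ids through the backbone. The sketch below encodes a real sentence pair with the matching tokenizer, the way a cross-encoder expects (assumptions: the tokenizer files are available under the same community name, and the example sentences are hypothetical; the 0-1 similarity score itself would additionally require the classification head on top of this backbone):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d4e5f6a7",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"cross-encoder/stsb-TinyBERT-L-4\")\n",
    "model = AutoModel.from_pretrained(\"cross-encoder/stsb-TinyBERT-L-4\")\n",
    "\n",
    "# A cross-encoder reads both sentences as one sequence, separated by segment ids.\n",
    "encoded = tokenizer(\"A man is eating food.\", text_pair=\"A man is eating a piece of bread.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
    "\n",
    "sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
    "print(pooled_output.shape)"
   ]
  },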
{
"cell_type": "markdown",
"id": "35446f31",
"metadata": {},
"source": [
"You can use this model also without sentence_transformers and by just using ``AutoModel`` class\n",
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/stsb-TinyBERT-L-4](https://huggingface.co/cross-encoder/stsb-TinyBERT-L-4) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info:
  description: Cross-Encoder for Quora Duplicate Questions Detection
  description_en: Cross-Encoder for Quora Duplicate Questions Detection
  from_repo: https://huggingface.co/cross-encoder/stsb-distilroberta-base
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: cross-encoder/stsb-distilroberta-base
Paper: null
Publisher: cross-encoder
Task:
- sub_tag: 文本分类
  sub_tag_en: Text Classification
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "7c9e1c38",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "c62db00c",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [STS benchmark dataset](http://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark). The model will predict a score between 0 and 1 how for the semantic similarity of two sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "03f81dda",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "ac99e012",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "37931dd1",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ff0714d5",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/stsb-distilroberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "e783f36c",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/stsb-distilroberta-base](https://huggingface.co/cross-encoder/stsb-distilroberta-base),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "cc55c2df",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "6e6d61e4",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [STS benchmark dataset](http://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark). The model will predict a score between 0 and 1 how for the semantic similarity of two sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "73fa7630",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "33248e47",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "48f2d520",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f16202eb",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/stsb-distilroberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "8586b106",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/stsb-distilroberta-base](https://huggingface.co/cross-encoder/stsb-distilroberta-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info:
  description: Cross-Encoder for Quora Duplicate Questions Detection
  description_en: Cross-Encoder for Quora Duplicate Questions Detection
  from_repo: https://huggingface.co/cross-encoder/stsb-roberta-base
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: cross-encoder/stsb-roberta-base
Paper: null
Publisher: cross-encoder
Task:
- sub_tag: 文本分类
  sub_tag_en: Text Classification
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "0ce6be0e",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "6e5557d3",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [STS benchmark dataset](http://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark). The model will predict a score between 0 and 1 how for the semantic similarity of two sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "dac1f27b",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "c279cc30",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "64e1d35f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3c22da03",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/stsb-roberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "49af1fc0",
"metadata": {},
"source": [
"You can use this model also without sentence_transformers and by just using ``AutoModel`` class\n",
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/stsb-roberta-base](https://huggingface.co/cross-encoder/stsb-roberta-base),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "c3137a69",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "5406455e",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [STS benchmark dataset](http://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark). The model will predict a score between 0 and 1 how for the semantic similarity of two sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "565af020",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "bd866838",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "07301a77",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0b756d3a",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/stsb-roberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "6ba822d5",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/stsb-roberta-base](https://huggingface.co/cross-encoder/stsb-roberta-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info:
  description: Cross-Encoder for Quora Duplicate Questions Detection
  description_en: Cross-Encoder for Quora Duplicate Questions Detection
  from_repo: https://huggingface.co/cross-encoder/stsb-roberta-large
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: cross-encoder/stsb-roberta-large
Paper: null
Publisher: cross-encoder
Task:
- sub_tag: 文本分类
  sub_tag_en: Text Classification
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a8a5f540",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "e4d8f5f6",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [STS benchmark dataset](http://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark). The model will predict a score between 0 and 1 how for the semantic similarity of two sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "182943f7",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "764e0664",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "61787745",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f4671372",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/stsb-roberta-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "9e8e26d0",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/cross-encoder/stsb-roberta-large](https://huggingface.co/cross-encoder/stsb-roberta-large),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "291a48fa",
"metadata": {},
"source": [
"# Cross-Encoder for Quora Duplicate Questions Detection\n",
"This model was trained using [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.\n"
]
},
{
"cell_type": "markdown",
"id": "92f483ed",
"metadata": {},
"source": [
"## Training Data\n",
"This model was trained on the [STS benchmark dataset](http://ixa2.si.ehu.eus/stswiki/index.php/STSbenchmark). The model will predict a score between 0 and 1 how for the semantic similarity of two sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "5dbde912",
"metadata": {},
"source": [
"## Usage and Performance\n"
]
},
{
"cell_type": "markdown",
"id": "3e04e94a",
"metadata": {},
"source": [
"Pre-trained models can be used like this:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4209e47d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e2649c47",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"cross-encoder/stsb-roberta-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "678e37ab",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/cross-encoder/stsb-roberta-large](https://huggingface.co/cross-encoder/stsb-roberta-large) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# 模型列表
## csarron/roberta-base-squad-v1
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|csarron/roberta-base-squad-v1| | 475.51MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/vocab.txt) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models csarron/roberta-base-squad-v1
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
\ No newline at end of file
# model list
## csarron/roberta-base-squad-v1
| model | description | model_size | download |
| --- | --- | --- | --- |
|csarron/roberta-base-squad-v1| | 475.51MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/csarron/roberta-base-squad-v1/vocab.txt) |
or you can download all of the model files with the following steps:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models csarron/roberta-base-squad-v1
```
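* load the downloaded weights from the local directory (a minimal sketch; the path assumes the `--cache-dir` layout produced by the command above)
```python
from paddlenlp.transformers import AutoModel, AutoTokenizer

# Assumed local layout created by `paddlenlp download --cache-dir ./pretrained_models ...`
local_dir = "./pretrained_models/csarron/roberta-base-squad-v1"
tokenizer = AutoTokenizer.from_pretrained(local_dir)
model = AutoModel.from_pretrained(local_dir)
```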
If you have any problems with it, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "csarron/roberta-base-squad-v1"
description: "RoBERTa-base fine-tuned on SQuAD v1"
description_en: "RoBERTa-base fine-tuned on SQuAD v1"
icon: ""
from_repo: "https://huggingface.co/csarron/roberta-base-squad-v1"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Question Answering"
sub_tag: "回答问题"
Example:
Datasets: "squad"
Publisher: "csarron"
License: "mit"
Language: "English"
Paper:
- title: 'RoBERTa: A Robustly Optimized BERT Pretraining Approach'
url: 'http://arxiv.org/abs/1907.11692v1'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: German
License: mit
Model_Info:
  description: 🤗 + 📚 dbmdz German BERT models
  description_en: 🤗 + 📚 dbmdz German BERT models
  from_repo: https://huggingface.co/dbmdz/bert-base-german-cased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: dbmdz/bert-base-german-cased
Paper: null
Publisher: dbmdz
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "9fb28341",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz German BERT models\n",
"\n",
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources another German BERT models 🎉\n",
"\n",
"# German BERT\n",
"\n",
"## Stats\n",
"\n",
"In addition to the recently released [German BERT](https://deepset.ai/german-bert)\n",
"model by [deepset](https://deepset.ai/) we provide another German-language model.\n",
"\n",
"The source data for the model consists of a recent Wikipedia dump, EU Bookshop corpus,\n",
"Open Subtitles, CommonCrawl, ParaCrawl and News Crawl. This results in a dataset with\n",
"a size of 16GB and 2,350,234,427 tokens.\n",
"\n",
"For sentence splitting, we use [spacy](https://spacy.io/). Our preprocessing steps\n",
"(sentence piece model for vocab generation) follow those used for training\n",
"[SciBERT](https://github.com/allenai/scibert). The model is trained with an initial\n",
"sequence length of 512 subwords and was performed for 1.5M steps."
]
},
{
"cell_type": "markdown",
"id": "589fadf4",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "646e12d4",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5935d3e0",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-german-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "b05add24",
"metadata": {},
"source": [
"# Reference\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/dbmdz/bert-base-german-cased](https://huggingface.co/dbmdz/bert-base-german-cased),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "e875e0cc",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz German BERT models\n",
"\n",
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources another German BERT models 🎉\n",
"\n",
"# German BERT\n",
"\n",
"## Stats\n",
"\n",
"In addition to the recently released [German BERT](https://deepset.ai/german-bert)\n",
"model by [deepset](https://deepset.ai/) we provide another German-language model.\n",
"\n",
"The source data for the model consists of a recent Wikipedia dump, EU Bookshop corpus,\n",
"Open Subtitles, CommonCrawl, ParaCrawl and News Crawl. This results in a dataset with\n",
"a size of 16GB and 2,350,234,427 tokens.\n",
"\n",
"For sentence splitting, we use [spacy](https://spacy.io/). Our preprocessing steps\n",
"(sentence piece model for vocab generation) follow those used for training\n",
"[SciBERT](https://github.com/allenai/scibert). The model is trained with an initial\n",
"sequence length of 512 subwords and was performed for 1.5M steps."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8dcad967",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c7c65281",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-german-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
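  {
   "cell_type": "markdown",
   "id": "a9c3e5b1",
   "metadata": {},
   "source": [
    "As a slightly more realistic sketch than the random ids above, the matching tokenizer can encode a real German sentence, and the converted weights can be cached locally for offline reuse (the example sentence and the local path are only assumptions for illustration):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f7b2d8c4",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"dbmdz/bert-base-german-cased\")\n",
    "model = AutoModel.from_pretrained(\"dbmdz/bert-base-german-cased\")\n",
    "\n",
    "# Encode a real German sentence instead of random ids.\n",
    "encoded = tokenizer(\"Der Hauptbahnhof liegt im Zentrum der Stadt.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "sequence_output, pooled_output = model(input_ids)\n",
    "print(sequence_output.shape)\n",
    "\n",
    "# Cache the converted weights and vocab for offline reuse (path is an example).\n",
    "model.save_pretrained(\"./bert-base-german-cased-local\")\n",
    "tokenizer.save_pretrained(\"./bert-base-german-cased-local\")"
   ]
  },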
{
"cell_type": "markdown",
"id": "1b52feb8",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "bc00304a",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/dbmdz/bert-base-german-cased](https://huggingface.co/dbmdz/bert-base-german-cased) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: German
License: mit
Model_Info:
  description: 🤗 + 📚 dbmdz German BERT models
  description_en: 🤗 + 📚 dbmdz German BERT models
  from_repo: https://huggingface.co/dbmdz/bert-base-german-uncased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: dbmdz/bert-base-german-uncased
Paper: null
Publisher: dbmdz
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "46b7bbb6",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz German BERT models\n",
"\n",
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources another German BERT models 🎉\n",
"\n",
"# German BERT\n",
"\n",
"## Stats\n",
"\n",
"In addition to the recently released [German BERT](https://deepset.ai/german-bert)\n",
"model by [deepset](https://deepset.ai/) we provide another German-language model.\n",
"\n",
"The source data for the model consists of a recent Wikipedia dump, EU Bookshop corpus,\n",
"Open Subtitles, CommonCrawl, ParaCrawl and News Crawl. This results in a dataset with\n",
"a size of 16GB and 2,350,234,427 tokens.\n",
"\n",
"For sentence splitting, we use [spacy](https://spacy.io/). Our preprocessing steps\n",
"(sentence piece model for vocab generation) follow those used for training\n",
"[SciBERT](https://github.com/allenai/scibert). The model is trained with an initial\n",
"sequence length of 512 subwords and was performed for 1.5M steps."
]
},
{
"cell_type": "markdown",
"id": "bc37d3e3",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2afff18c",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "967f058e",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-german-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "483dbced",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "04e50d8c",
"metadata": {},
"source": [
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/dbmdz/bert-base-german-uncased](https://huggingface.co/dbmdz/bert-base-german-uncased),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "5e0d446c",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz German BERT models\n",
"\n",
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources another German BERT models 🎉\n",
"\n",
"# German BERT\n",
"\n",
"## Stats\n",
"\n",
"In addition to the recently released [German BERT](https://deepset.ai/german-bert)\n",
"model by [deepset](https://deepset.ai/) we provide another German-language model.\n",
"\n",
"The source data for the model consists of a recent Wikipedia dump, EU Bookshop corpus,\n",
"Open Subtitles, CommonCrawl, ParaCrawl and News Crawl. This results in a dataset with\n",
"a size of 16GB and 2,350,234,427 tokens.\n",
"\n",
"For sentence splitting, we use [spacy](https://spacy.io/). Our preprocessing steps\n",
"(sentence piece model for vocab generation) follow those used for training\n",
"[SciBERT](https://github.com/allenai/scibert). The model is trained with an initial\n",
"sequence length of 512 subwords and was performed for 1.5M steps."
]
},
{
"cell_type": "markdown",
"id": "524680d5",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "39332440",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "19cf118e",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-german-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "fb81d709",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "747fd5d3",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/dbmdz/bert-base-german-uncased](https://huggingface.co/dbmdz/bert-base-german-uncased) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Italian
License: mit
Model_Info:
  description: 🤗 + 📚 dbmdz BERT and ELECTRA models
  description_en: 🤗 + 📚 dbmdz BERT and ELECTRA models
  from_repo: https://huggingface.co/dbmdz/bert-base-italian-uncased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: dbmdz/bert-base-italian-uncased
Paper: null
Publisher: dbmdz
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "dea2fc9e",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz BERT and ELECTRA models\n"
]
},
{
"cell_type": "markdown",
"id": "00744cbd",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources Italian BERT and ELECTRA models 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "d7106b74",
"metadata": {},
"source": [
"# Italian BERT\n"
]
},
{
"cell_type": "markdown",
"id": "7ee0fd67",
"metadata": {},
"source": [
"The source data for the Italian BERT model consists of a recent Wikipedia dump and\n",
"various texts from the [OPUS corpora](http://opus.nlpl.eu/) collection. The final\n",
"training corpus has a size of 13GB and 2,050,057,573 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "a3961910",
"metadata": {},
"source": [
"For sentence splitting, we use NLTK (faster compared to spacy).\n",
"Our cased and uncased models are training with an initial sequence length of 512\n",
"subwords for ~2-3M steps.\n"
]
},
{
"cell_type": "markdown",
"id": "480e4fea",
"metadata": {},
"source": [
"For the XXL Italian models, we use the same training data from OPUS and extend\n",
"it with data from the Italian part of the [OSCAR corpus](https://traces1.inria.fr/oscar/).\n",
"Thus, the final training corpus has a size of 81GB and 13,138,379,147 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "d710804e",
"metadata": {},
"source": [
"Note: Unfortunately, a wrong vocab size was used when training the XXL models.\n",
"This explains the mismatch of the \"real\" vocab size of 31102, compared to the\n",
"vocab size specified in `config.json`. However, the model is working and all\n",
"evaluations were done under those circumstances.\n",
"See [this issue](https://github.com/dbmdz/berts/issues/7) for more information.\n"
]
},
{
"cell_type": "markdown",
"id": "2d9c79e5",
"metadata": {},
"source": [
"The Italian ELECTRA model was trained on the \"XXL\" corpus for 1M steps in total using a batch\n",
"size of 128. We pretty much following the ELECTRA training procedure as used for\n",
"[BERTurk](https://github.com/stefan-it/turkish-bert/tree/master/electra).\n"
]
},
{
"cell_type": "markdown",
"id": "3ee71cee",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9ffe9a93",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "82d327d4",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-italian-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "56d92161",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "ad146f63",
"metadata": {},
"source": [
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/dbmdz/bert-base-italian-uncased](https://huggingface.co/dbmdz/bert-base-italian-uncased),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "8601b7e0",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz BERT and ELECTRA models\n"
]
},
{
"cell_type": "markdown",
"id": "2e2ee06f",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources Italian BERT and ELECTRA models 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "a7b6e470",
"metadata": {},
"source": [
"# Italian BERT\n"
]
},
{
"cell_type": "markdown",
"id": "d1afb03c",
"metadata": {},
"source": [
"The source data for the Italian BERT model consists of a recent Wikipedia dump and\n",
"various texts from the [OPUS corpora](http://opus.nlpl.eu/) collection. The final\n",
"training corpus has a size of 13GB and 2,050,057,573 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "a900d41a",
"metadata": {},
"source": [
"For sentence splitting, we use NLTK (faster compared to spacy).\n",
"Our cased and uncased models are training with an initial sequence length of 512\n",
"subwords for ~2-3M steps.\n"
]
},
{
"cell_type": "markdown",
"id": "d4ea3425",
"metadata": {},
"source": [
"For the XXL Italian models, we use the same training data from OPUS and extend\n",
"it with data from the Italian part of the [OSCAR corpus](https://traces1.inria.fr/oscar/).\n",
"Thus, the final training corpus has a size of 81GB and 13,138,379,147 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "f1d5804d",
"metadata": {},
"source": [
"Note: Unfortunately, a wrong vocab size was used when training the XXL models.\n",
"This explains the mismatch of the \"real\" vocab size of 31102, compared to the\n",
"vocab size specified in `config.json`. However, the model is working and all\n",
"evaluations were done under those circumstances.\n",
"See [this issue](https://github.com/dbmdz/berts/issues/7) for more information.\n"
]
},
{
"cell_type": "markdown",
"id": "cc4f3d3d",
"metadata": {},
"source": [
"The Italian ELECTRA model was trained on the \"XXL\" corpus for 1M steps in total using a batch\n",
"size of 128. We pretty much following the ELECTRA training procedure as used for\n",
"[BERTurk](https://github.com/stefan-it/turkish-bert/tree/master/electra).\n"
]
},
{
"cell_type": "markdown",
"id": "76e431e8",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1b014af1",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ca7904c6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-italian-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "261390e6",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "f5c0c815",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/dbmdz/bert-base-italian-uncased](https://huggingface.co/dbmdz/bert-base-italian-uncased) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Italian
License: mit
Model_Info:
  description: 🤗 + 📚 dbmdz BERT and ELECTRA models
  description_en: 🤗 + 📚 dbmdz BERT and ELECTRA models
  from_repo: https://huggingface.co/dbmdz/bert-base-italian-xxl-cased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: dbmdz/bert-base-italian-xxl-cased
Paper: null
Publisher: dbmdz
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "4e448d86",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz BERT and ELECTRA models\n"
]
},
{
"cell_type": "markdown",
"id": "9bcf089b",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources Italian BERT and ELECTRA models 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "fb9adbdd",
"metadata": {},
"source": [
"# Italian BERT\n"
]
},
{
"cell_type": "markdown",
"id": "e5a80c49",
"metadata": {},
"source": [
"The source data for the Italian BERT model consists of a recent Wikipedia dump and\n",
"various texts from the [OPUS corpora](http://opus.nlpl.eu/) collection. The final\n",
"training corpus has a size of 13GB and 2,050,057,573 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "3513aa96",
"metadata": {},
"source": [
"For sentence splitting, we use NLTK (faster compared to spacy).\n",
"Our cased and uncased models are training with an initial sequence length of 512\n",
"subwords for ~2-3M steps.\n"
]
},
{
"cell_type": "markdown",
"id": "ca0e58ee",
"metadata": {},
"source": [
"For the XXL Italian models, we use the same training data from OPUS and extend\n",
"it with data from the Italian part of the [OSCAR corpus](https://traces1.inria.fr/oscar/).\n",
"Thus, the final training corpus has a size of 81GB and 13,138,379,147 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "744e0851",
"metadata": {},
"source": [
"Note: Unfortunately, a wrong vocab size was used when training the XXL models.\n",
"This explains the mismatch of the \"real\" vocab size of 31102, compared to the\n",
"vocab size specified in `config.json`. However, the model is working and all\n",
"evaluations were done under those circumstances.\n",
"See [this issue](https://github.com/dbmdz/berts/issues/7) for more information.\n"
]
},
{
"cell_type": "markdown",
"id": "3bb28396",
"metadata": {},
"source": [
"The Italian ELECTRA model was trained on the \"XXL\" corpus for 1M steps in total using a batch\n",
"size of 128. We pretty much following the ELECTRA training procedure as used for\n",
"[BERTurk](https://github.com/stefan-it/turkish-bert/tree/master/electra).\n"
]
},
{
"cell_type": "markdown",
"id": "4c0c2ecb",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e059cf91",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "95a883f8",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-italian-xxl-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "6b2c856e",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "ffdf7223",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/dbmdz/bert-base-italian-xxl-cased](https://huggingface.co/dbmdz/bert-base-italian-xxl-cased),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "41ca2df0",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz BERT and ELECTRA models\n"
]
},
{
"cell_type": "markdown",
"id": "58e60a32",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources Italian BERT and ELECTRA models 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "c7b5f379",
"metadata": {},
"source": [
"# Italian BERT\n"
]
},
{
"cell_type": "markdown",
"id": "5bf65013",
"metadata": {},
"source": [
"The source data for the Italian BERT model consists of a recent Wikipedia dump and\n",
"various texts from the [OPUS corpora](http://opus.nlpl.eu/) collection. The final\n",
"training corpus has a size of 13GB and 2,050,057,573 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "a3aadc8d",
"metadata": {},
"source": [
"For sentence splitting, we use NLTK (faster compared to spacy).\n",
"Our cased and uncased models are training with an initial sequence length of 512\n",
"subwords for ~2-3M steps.\n"
]
},
{
"cell_type": "markdown",
"id": "4d670485",
"metadata": {},
"source": [
"For the XXL Italian models, we use the same training data from OPUS and extend\n",
"it with data from the Italian part of the [OSCAR corpus](https://traces1.inria.fr/oscar/).\n",
"Thus, the final training corpus has a size of 81GB and 13,138,379,147 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "a366dc7d",
"metadata": {},
"source": [
"Note: Unfortunately, a wrong vocab size was used when training the XXL models.\n",
"This explains the mismatch of the \"real\" vocab size of 31102, compared to the\n",
"vocab size specified in `config.json`. However, the model is working and all\n",
"evaluations were done under those circumstances.\n",
"See [this issue](https://github.com/dbmdz/berts/issues/7) for more information.\n"
]
},
{
"cell_type": "markdown",
"id": "eaee3adf",
"metadata": {},
"source": [
"The Italian ELECTRA model was trained on the \"XXL\" corpus for 1M steps in total using a batch\n",
"size of 128. We pretty much following the ELECTRA training procedure as used for\n",
"[BERTurk](https://github.com/stefan-it/turkish-bert/tree/master/electra).\n"
]
},
{
"cell_type": "markdown",
"id": "c5151f8a",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f9e48e19",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c02e6f47",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-italian-xxl-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
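  {
   "cell_type": "markdown",
   "id": "c8d1a2f9",
   "metadata": {},
   "source": [
    "Related to the vocab-size note above, a quick sanity check (a sketch; `vocab_size` follows the usual PaddleNLP tokenizer attribute and may differ across versions) prints the vocabulary size the tokenizer actually loads:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e2a6b7d0",
   "metadata": {},
   "outputs": [],
   "source": [
    "from paddlenlp.transformers import AutoTokenizer\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"dbmdz/bert-base-italian-xxl-cased\")\n",
    "# Compare with the vocab size declared in the model's config (see the note above).\n",
    "print(\"tokenizer vocab size:\", tokenizer.vocab_size)"
   ]
  },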
{
"cell_type": "markdown",
"id": "13e03e4b",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "66705724",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/dbmdz/bert-base-italian-xxl-cased](https://huggingface.co/dbmdz/bert-base-italian-xxl-cased) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Turkish
License: mit
Model_Info:
  description: 🤗 + 📚 dbmdz Turkish BERT model
  description_en: 🤗 + 📚 dbmdz Turkish BERT model
  from_repo: https://huggingface.co/dbmdz/bert-base-turkish-128k-cased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: dbmdz/bert-base-turkish-128k-cased
Paper: null
Publisher: dbmdz
Task: null
{
"cells": [
{
"cell_type": "markdown",
"id": "bda1db47",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz Turkish BERT model\n"
]
},
{
"cell_type": "markdown",
"id": "ba254a15",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources a cased model for Turkish 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "bf2818ba",
"metadata": {},
"source": [
"# 🇹🇷 BERTurk\n"
]
},
{
"cell_type": "markdown",
"id": "788f7baa",
"metadata": {},
"source": [
"BERTurk is a community-driven cased BERT model for Turkish.\n"
]
},
{
"cell_type": "markdown",
"id": "5e051a7d",
"metadata": {},
"source": [
"Some datasets used for pretraining and evaluation are contributed from the\n",
"awesome Turkish NLP community, as well as the decision for the model name: BERTurk.\n"
]
},
{
"cell_type": "markdown",
"id": "1edbcf52",
"metadata": {},
"source": [
"## Stats\n"
]
},
{
"cell_type": "markdown",
"id": "5b7c3ff4",
"metadata": {},
"source": [
"The current version of the model is trained on a filtered and sentence\n",
"segmented version of the Turkish [OSCAR corpus](https://traces1.inria.fr/oscar/),\n",
"a recent Wikipedia dump, various [OPUS corpora](http://opus.nlpl.eu/) and a\n",
"special corpus provided by [Kemal Oflazer](http://www.andrew.cmu.edu/user/ko/).\n"
]
},
{
"cell_type": "markdown",
"id": "9413f21a",
"metadata": {},
"source": [
"The final training corpus has a size of 35GB and 44,04,976,662 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "25593952",
"metadata": {},
"source": [
"For this model we use a vocab size of 128k.\n"
]
},
{
"cell_type": "markdown",
"id": "962cf00d",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4a4e8e3",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7157d7da",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-turkish-128k-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "e47155ee",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "081638b2",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/dbmdz/bert-base-turkish-128k-cased](https://huggingface.co/dbmdz/bert-base-turkish-128k-cased),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "911a1be9",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz Turkish BERT model\n"
]
},
{
"cell_type": "markdown",
"id": "4f09b0f1",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources a cased model for Turkish 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "fa2a78a0",
"metadata": {},
"source": [
"# 🇹🇷 BERTurk\n"
]
},
{
"cell_type": "markdown",
"id": "8b2f8c68",
"metadata": {},
"source": [
"BERTurk is a community-driven cased BERT model for Turkish.\n"
]
},
{
"cell_type": "markdown",
"id": "fe23e365",
"metadata": {},
"source": [
"Some datasets used for pretraining and evaluation are contributed from the\n",
"awesome Turkish NLP community, as well as the decision for the model name: BERTurk.\n"
]
},
{
"cell_type": "markdown",
"id": "2da0ce24",
"metadata": {},
"source": [
"## Stats\n"
]
},
{
"cell_type": "markdown",
"id": "d3f6af43",
"metadata": {},
"source": [
"The current version of the model is trained on a filtered and sentence\n",
"segmented version of the Turkish [OSCAR corpus](https://traces1.inria.fr/oscar/),\n",
"a recent Wikipedia dump, various [OPUS corpora](http://opus.nlpl.eu/) and a\n",
"special corpus provided by [Kemal Oflazer](http://www.andrew.cmu.edu/user/ko/).\n"
]
},
{
"cell_type": "markdown",
"id": "0d8d60c1",
"metadata": {},
"source": [
"The final training corpus has a size of 35GB and 44,04,976,662 tokens.\n"
]
},
{
"cell_type": "markdown",
"id": "ce42504f",
"metadata": {},
"source": [
"For this model we use a vocab size of 128k.\n"
]
},
{
"cell_type": "markdown",
"id": "4815bfdb",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7a084604",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8d041c78",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-turkish-128k-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
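{
"cell_type": "markdown",
"id": "a3f10001",
"metadata": {},
"source": [
"The cell above only feeds random token ids into the encoder as a smoke test. Below is a minimal sketch of encoding real Turkish text with the matching tokenizer; it assumes the converted tokenizer files are available under the same model name and only inspects the encoder outputs.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a3f10002",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"# Hedged sketch: encode a real Turkish sentence instead of random ids.\n",
"name = \"dbmdz/bert-base-turkish-128k-cased\"\n",
"tokenizer = AutoTokenizer.from_pretrained(name)\n",
"model = AutoModel.from_pretrained(name)\n",
"\n",
"encoded = tokenizer(\"Merhaba dünya, bu bir deneme cümlesidir.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
"\n",
"# The BERT encoder returns (sequence_output, pooled_output).\n",
"sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},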
{
"cell_type": "markdown",
"id": "da82079c",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "b6632d46",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/dbmdz/bert-base-turkish-128k-cased](https://huggingface.co/dbmdz/bert-base-turkish-128k-cased) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Model_Info:
name: "dbmdz/bert-base-turkish-cased"
description: "🤗 + 📚 dbmdz Turkish BERT model"
description_en: "🤗 + 📚 dbmdz Turkish BERT model"
icon: ""
from_repo: "https://huggingface.co/dbmdz/bert-base-turkish-cased"
Task:
Example:
Datasets: ""
Publisher: "dbmdz"
License: "mit"
Language: "Turkish"
Paper:
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
Language: Turkish
License: mit
Model_Info:
description: 🤗 + 📚 dbmdz Turkish BERT model
description_en: 🤗 + 📚 dbmdz Turkish BERT model
from_repo: https://huggingface.co/dbmdz/bert-base-turkish-cased
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: dbmdz/bert-base-turkish-cased
Paper: null
Publisher: dbmdz
Task: null
{
"cells": [
{
"cell_type": "markdown",
"id": "e9075248",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz Turkish BERT model\n"
]
},
{
"cell_type": "markdown",
"id": "0f68224a",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources a cased model for Turkish 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "a800751f",
"metadata": {},
"source": [
"# 🇹🇷 BERTurk\n"
]
},
{
"cell_type": "markdown",
"id": "0f418bcc",
"metadata": {},
"source": [
"BERTurk is a community-driven cased BERT model for Turkish.\n"
]
},
{
"cell_type": "markdown",
"id": "059528cb",
"metadata": {},
"source": [
"Some datasets used for pretraining and evaluation are contributed from the\n",
"awesome Turkish NLP community, as well as the decision for the model name: BERTurk.\n"
]
},
{
"cell_type": "markdown",
"id": "ec8d00db",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1cb273ef",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "45fd943c",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-turkish-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "0653e10b",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "14b8d869",
"metadata": {},
"source": [
"\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/dbmdz/bert-base-turkish-cased](https://huggingface.co/dbmdz/bert-base-turkish-cased),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "22b9df2e",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz Turkish BERT model\n"
]
},
{
"cell_type": "markdown",
"id": "509b461d",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources a cased model for Turkish 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "84ab205a",
"metadata": {},
"source": [
"# 🇹🇷 BERTurk\n"
]
},
{
"cell_type": "markdown",
"id": "aafa4b5d",
"metadata": {},
"source": [
"BERTurk is a community-driven cased BERT model for Turkish.\n"
]
},
{
"cell_type": "markdown",
"id": "16251feb",
"metadata": {},
"source": [
"Some datasets used for pretraining and evaluation are contributed from the\n",
"awesome Turkish NLP community, as well as the decision for the model name: BERTurk.\n"
]
},
{
"cell_type": "markdown",
"id": "1bdf0158",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aa2b4d91",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0f55d31d",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-turkish-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
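{
"cell_type": "markdown",
"id": "b4f20001",
"metadata": {},
"source": [
"Since this checkpoint is typically used for fill-mask, the sketch below masks one token of a Turkish sentence and prints the top prediction. It assumes your PaddleNLP version exposes `AutoModelForMaskedLM` and that it can load this converted checkpoint; if not, the plain `AutoModel` usage above still applies.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b4f20002",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModelForMaskedLM, AutoTokenizer\n",
"\n",
"# Hedged sketch: fill-mask with BERTurk (assumes AutoModelForMaskedLM is available\n",
"# in your PaddleNLP version and can load this converted checkpoint).\n",
"name = \"dbmdz/bert-base-turkish-cased\"\n",
"tokenizer = AutoTokenizer.from_pretrained(name)\n",
"model = AutoModelForMaskedLM.from_pretrained(name)\n",
"\n",
"encoded = tokenizer(\"Bugün hava çok güzel .\")\n",
"input_ids = encoded[\"input_ids\"]\n",
"mask_pos = len(input_ids) - 3  # mask the token just before the final \".\" and [SEP]\n",
"input_ids[mask_pos] = tokenizer.mask_token_id\n",
"\n",
"logits = model(paddle.to_tensor([input_ids]))  # [batch, seq_len, vocab_size]\n",
"top_id = int(paddle.argmax(logits[0, mask_pos]))\n",
"print(tokenizer.convert_ids_to_tokens([top_id]))"
]
},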
{
"cell_type": "markdown",
"id": "9aae3e54",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "839b89b9",
"metadata": {},
"source": [
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/dbmdz/bert-base-turkish-cased](https://huggingface.co/dbmdz/bert-base-turkish-cased) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Model_Info:
name: "dbmdz/bert-base-turkish-uncased"
description: "🤗 + 📚 dbmdz Turkish BERT model"
description_en: "🤗 + 📚 dbmdz Turkish BERT model"
icon: ""
from_repo: "https://huggingface.co/dbmdz/bert-base-turkish-uncased"
Task:
Example:
Datasets: ""
Publisher: "dbmdz"
License: "mit"
Language: "Turkish"
Paper:
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
Language: Turkish
License: mit
Model_Info:
description: 🤗 + 📚 dbmdz Turkish BERT model
description_en: 🤗 + 📚 dbmdz Turkish BERT model
from_repo: https://huggingface.co/dbmdz/bert-base-turkish-uncased
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: dbmdz/bert-base-turkish-uncased
Paper: null
Publisher: dbmdz
Task: null
{
"cells": [
{
"cell_type": "markdown",
"id": "f3dbf349",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz Turkish BERT model\n"
]
},
{
"cell_type": "markdown",
"id": "479d8e10",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources an uncased model for Turkish 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "fc31d938",
"metadata": {},
"source": [
"# 🇹🇷 BERTurk\n"
]
},
{
"cell_type": "markdown",
"id": "c05c98f4",
"metadata": {},
"source": [
"BERTurk is a community-driven uncased BERT model for Turkish.\n"
]
},
{
"cell_type": "markdown",
"id": "116bbd89",
"metadata": {},
"source": [
"Some datasets used for pretraining and evaluation are contributed from the\n",
"awesome Turkish NLP community, as well as the decision for the model name: BERTurk.\n"
]
},
{
"cell_type": "markdown",
"id": "eab29ae1",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "531e7c2f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c23a11a9",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-turkish-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "1a4d2556",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "4e10e25f",
"metadata": {},
"source": [
"\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/dbmdz/bert-base-turkish-uncased](https://huggingface.co/dbmdz/bert-base-turkish-uncased),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "f1968bb1",
"metadata": {},
"source": [
"# 🤗 + 📚 dbmdz Turkish BERT model\n"
]
},
{
"cell_type": "markdown",
"id": "37119e6e",
"metadata": {},
"source": [
"In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\n",
"Library open sources an uncased model for Turkish 🎉\n"
]
},
{
"cell_type": "markdown",
"id": "e2428d3f",
"metadata": {},
"source": [
"# 🇹🇷 BERTurk\n"
]
},
{
"cell_type": "markdown",
"id": "455a98e2",
"metadata": {},
"source": [
"BERTurk is a community-driven uncased BERT model for Turkish.\n"
]
},
{
"cell_type": "markdown",
"id": "3c7b1272",
"metadata": {},
"source": [
"Some datasets used for pretraining and evaluation are contributed from the\n",
"awesome Turkish NLP community, as well as the decision for the model name: BERTurk.\n"
]
},
{
"cell_type": "markdown",
"id": "cdd8f852",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "81436ade",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bd223538",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dbmdz/bert-base-turkish-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "7edb6fa7",
"metadata": {},
"source": [
"# Reference"
]
},
{
"cell_type": "markdown",
"id": "95b108cb",
"metadata": {},
"source": [
"\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/dbmdz/bert-base-turkish-uncased](https://huggingface.co/dbmdz/bert-base-turkish-uncased) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: mit
Model_Info:
name: "deepparag/Aeona"
description: "Aeona | Chatbot"
description_en: "Aeona | Chatbot"
icon: ""
from_repo: "https://huggingface.co/deepparag/Aeona"
description: Aeona | Chatbot
description_en: Aeona | Chatbot
from_repo: https://huggingface.co/deepparag/Aeona
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: deepparag/Aeona
Paper: null
Publisher: deepparag
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "deepparag"
License: "mit"
Language: ""
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- sub_tag: 文本生成
sub_tag_en: Text Generation
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "9d9bb2aa",
"metadata": {},
"source": [
"# Aeona | Chatbot\n"
]
},
{
"cell_type": "markdown",
"id": "7361d804",
"metadata": {},
"source": [
"An generative AI made using [microsoft/DialoGPT-small](https://huggingface.co/microsoft/DialoGPT-small).\n"
]
},
{
"cell_type": "markdown",
"id": "008bcb8d",
"metadata": {},
"source": [
"Recommended to use along with an [AIML Chatbot](https://github.com/deepsarda/Aeona-Aiml) to reduce load, get better replies, add name and personality to your bot.\n",
"Using an AIML Chatbot will allow you to hardcode some replies also.\n"
]
},
{
"cell_type": "markdown",
"id": "b4bfb9cd",
"metadata": {},
"source": [
"## Evaluation\n",
"Below is a comparison of Aeona vs. other baselines on the mixed dataset given above using automatic evaluation metrics.\n"
]
},
{
"cell_type": "markdown",
"id": "4d478ffa",
"metadata": {},
"source": [
"| Model | Perplexity |\n",
"|---|---|\n",
"| Seq2seq Baseline [3] | 29.8 |\n",
"| Wolf et al. [5] | 16.3 |\n",
"| GPT-2 baseline | 99.5 |\n",
"| DialoGPT baseline | 56.6 |\n",
"| DialoGPT finetuned | 11.4 |\n",
"| PersonaGPT | 10.2 |\n",
"| **Aeona** | **7.9** |\n"
]
},
{
"cell_type": "markdown",
"id": "ebb927ce",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ea2a9d8e",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dc15795c",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"deepparag/Aeona\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "074bd20d",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/deepparag/Aeona](https://huggingface.co/deepparag/Aeona),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "f8079990",
"metadata": {},
"source": [
"# Aeona | Chatbot\n"
]
},
{
"cell_type": "markdown",
"id": "6a69f81a",
"metadata": {},
"source": [
"An generative AI made using [microsoft/DialoGPT-small](https://huggingface.co/microsoft/DialoGPT-small).\n"
]
},
{
"cell_type": "markdown",
"id": "a65479b8",
"metadata": {},
"source": [
"Recommended to use along with an [AIML Chatbot](https://github.com/deepsarda/Aeona-Aiml) to reduce load, get better replies, add name and personality to your bot.\n",
"Using an AIML Chatbot will allow you to hardcode some replies also.\n"
]
},
{
"cell_type": "markdown",
"id": "ea390590",
"metadata": {},
"source": [
"## Evaluation\n",
"Below is a comparison of Aeona vs. other baselines on the mixed dataset given above using automatic evaluation metrics.\n"
]
},
{
"cell_type": "markdown",
"id": "5b64325a",
"metadata": {},
"source": [
"| Model | Perplexity |\n",
"|---|---|\n",
"| Seq2seq Baseline [3] | 29.8 |\n",
"| Wolf et al. [5] | 16.3 |\n",
"| GPT-2 baseline | 99.5 |\n",
"| DialoGPT baseline | 56.6 |\n",
"| DialoGPT finetuned | 11.4 |\n",
"| PersonaGPT | 10.2 |\n",
"| **Aeona** | **7.9** |\n"
]
},
{
"cell_type": "markdown",
"id": "bf7f0d0e",
"metadata": {},
"source": [
"## Usage\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "16b58290",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cf18c96e",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"deepparag/Aeona\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
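{
"cell_type": "markdown",
"id": "c5f30001",
"metadata": {},
"source": [
"DialoGPT-style chatbots conventionally join dialogue turns with the tokenizer's end-of-text token. The sketch below only encodes a single user message that way and runs the base model; generating an actual reply would additionally need a language-model head and a decoding loop, which are not shown here and whose class names may differ across PaddleNLP versions.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c5f30002",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"# Hedged sketch: encode one dialogue turn the DialoGPT way (turns joined by eos_token).\n",
"name = \"deepparag/Aeona\"\n",
"tokenizer = AutoTokenizer.from_pretrained(name)\n",
"model = AutoModel.from_pretrained(name)\n",
"\n",
"user_message = \"Hello, how are you?\"\n",
"input_ids = tokenizer(user_message)[\"input_ids\"] + [tokenizer.eos_token_id]\n",
"input_ids = paddle.to_tensor([input_ids])\n",
"\n",
"# The base model only returns hidden states; reply generation needs an LM head on top.\n",
"hidden_states = model(input_ids)\n",
"print(hidden_states.shape)"
]
},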
{
"cell_type": "markdown",
"id": "fae16f6e",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/deepparag/Aeona](https://huggingface.co/deepparag/Aeona) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# Model List
## deepparag/DumBot
| Model Name | Description | Model Size | Download |
| --- | --- | --- | --- |
|deepparag/DumBot| | 621.95MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/vocab.json) |
You can also download the corresponding model weights with the `paddlenlp` CLI tool, as follows:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models deepparag/DumBot
```
If you have any problems downloading, you can open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP).
\ No newline at end of file
# model list
## deepparag/DumBot
| model | description | model_size | download |
| --- | --- | --- | --- |
|deepparag/DumBot| | 621.95MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/deepparag/DumBot/vocab.json) |
or you can download all of the model files with the following steps:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download the model with the CLI tool
```shell
paddlenlp download --cache-dir ./pretrained_models deepparag/DumBot
```
If you have any problems with it, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
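You can then load the downloaded weights directly from the local cache directory. The snippet below is a minimal sketch; it assumes the files listed above end up in `./pretrained_models/deepparag/DumBot` (adjust the path to wherever the CLI actually places them):
```python
import paddle
from paddlenlp.transformers import AutoModel

# Hedged sketch: load the locally downloaded DumBot weights from the cache directory.
model = AutoModel.from_pretrained("./pretrained_models/deepparag/DumBot")
input_ids = paddle.randint(100, 200, shape=[1, 20])
print(model(input_ids))
```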
Model_Info:
name: "deepparag/DumBot"
description: "THIS AI IS OUTDATED. See [Aeona](https://huggingface.co/deepparag/Aeona)"
description_en: "THIS AI IS OUTDATED. See [Aeona](https://huggingface.co/deepparag/Aeona)"
icon: ""
from_repo: "https://huggingface.co/deepparag/DumBot"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "deepparag"
License: "mit"
Language: ""
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
Datasets: squad_v2
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info:
name: "deepset/roberta-base-squad2-distilled"
description: "Overview"
description_en: "Overview"
icon: ""
from_repo: "https://huggingface.co/deepset/roberta-base-squad2-distilled"
description: Overview
description_en: Overview
from_repo: https://huggingface.co/deepset/roberta-base-squad2-distilled
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: deepset/roberta-base-squad2-distilled
Paper: null
Publisher: deepset
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Question Answering"
sub_tag: "回答问题"
Example:
Datasets: "squad_v2"
Publisher: "deepset"
License: "mit"
Language: "English"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- sub_tag: 回答问题
sub_tag_en: Question Answering
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "85b7cc2e",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"Language model: deepset/roberta-base-squad2-distilled\n",
"\n",
"Language: English\n",
"\n",
"Training data: SQuAD 2.0 training set Eval data: SQuAD 2.0 dev set Infrastructure: 4x V100 GPU\n",
"\n",
"Published: Dec 8th, 2021"
]
},
{
"cell_type": "markdown",
"id": "a455ff64",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d51fa907",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4590c7eb",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"deepset/roberta-base-squad2-distilled\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "ac6e34fd",
"metadata": {},
"source": [
"## Authors\n",
"- Timo Möller: `timo.moeller [at] deepset.ai`\n",
"- Julian Risch: `julian.risch [at] deepset.ai`\n",
"- Malte Pietsch: `malte.pietsch [at] deepset.ai`\n",
"- Michel Bartels: `michel.bartels [at] deepset.ai`\n",
"## About us\n",
"![deepset logo](https://workablehr.s3.amazonaws.com/uploads/account/logo/476306/logo)\n",
"We bring NLP to the industry via open source!\n",
"Our focus: Industry specific language models & large scale QA systems.\n"
]
},
{
"cell_type": "markdown",
"id": "3d22bf87",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/deepset/roberta-base-squad2-distilled](https://huggingface.co/deepset/roberta-base-squad2-distilled),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "b917c220",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"Language model: deepset/roberta-base-squad2-distilled\n",
"\n",
"Language: English\n",
"\n",
"Training data: SQuAD 2.0 training set Eval data: SQuAD 2.0 dev set Infrastructure: 4x V100 GPU\n",
"\n",
"Published: Dec 8th, 2021"
]
},
{
"cell_type": "markdown",
"id": "94e41c66",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8b2c9009",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b9472a8e",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"deepset/roberta-base-squad2-distilled\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
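{
"cell_type": "markdown",
"id": "d6f40001",
"metadata": {},
"source": [
"Because this is an extractive question-answering model trained on SQuAD 2.0, a span-extraction sketch is more illustrative than the random-input call above. It assumes `AutoModelForQuestionAnswering` is available in your PaddleNLP version and can load this converted checkpoint; the question and context strings are invented for illustration.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d6f40002",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModelForQuestionAnswering, AutoTokenizer\n",
"\n",
"# Hedged sketch: extractive QA (assumes AutoModelForQuestionAnswering can load this checkpoint).\n",
"name = \"deepset/roberta-base-squad2-distilled\"\n",
"tokenizer = AutoTokenizer.from_pretrained(name)\n",
"model = AutoModelForQuestionAnswering.from_pretrained(name)\n",
"\n",
"question = \"Why is model conversion important?\"\n",
"context = \"The option to convert models between frameworks gives freedom to the user.\"\n",
"encoded = tokenizer(question, text_pair=context)\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"start_logits, end_logits = model(input_ids)\n",
"start = int(paddle.argmax(start_logits[0]))\n",
"end = int(paddle.argmax(end_logits[0]))\n",
"span_tokens = tokenizer.convert_ids_to_tokens(encoded[\"input_ids\"][start:end + 1])\n",
"print(tokenizer.convert_tokens_to_string(span_tokens))  # rough detokenized answer span"
]
},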
{
"cell_type": "markdown",
"id": "942ce61d",
"metadata": {},
"source": [
"## Authors\n",
"- Timo Möller: `timo.moeller [at] deepset.ai`\n",
"- Julian Risch: `julian.risch [at] deepset.ai`\n",
"- Malte Pietsch: `malte.pietsch [at] deepset.ai`\n",
"- Michel Bartels: `michel.bartels [at] deepset.ai`\n",
"## About us\n",
"![deepset logo](https://workablehr.s3.amazonaws.com/uploads/account/logo/476306/logo)\n",
"We bring NLP to the industry via open source!\n",
"Our focus: Industry specific language models & large scale QA systems.\n"
]
},
{
"cell_type": "markdown",
"id": "d65be46f",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/deepset/roberta-base-squad2-distilled](https://huggingface.co/deepset/roberta-base-squad2-distilled) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: conll2003
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info:
name: "dslim/bert-base-NER"
description: "bert-base-NER"
description_en: "bert-base-NER"
icon: ""
from_repo: "https://huggingface.co/dslim/bert-base-NER"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Token Classification"
sub_tag: "Token分类"
Example:
Datasets: "conll2003"
Publisher: "dslim"
License: "mit"
Language: "English"
description: bert-base-NER
description_en: bert-base-NER
from_repo: https://huggingface.co/dslim/bert-base-NER
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: dslim/bert-base-NER
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
url: 'http://arxiv.org/abs/1810.04805v2'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
url: http://arxiv.org/abs/1810.04805v2
Publisher: dslim
Task:
- sub_tag: Token分类
sub_tag_en: Token Classification
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "4dd2d9a8",
"metadata": {},
"source": [
"# bert-base-NER\n"
]
},
{
"cell_type": "markdown",
"id": "39c0b4be",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "0961b24f",
"metadata": {},
"source": [
"**bert-base-NER** is a fine-tuned BERT model that is ready to use for **Named Entity Recognition** and achieves **state-of-the-art performance** for the NER task. It has been trained to recognize four types of entities: location (LOC), organizations (ORG), person (PER) and Miscellaneous (MISC).\n"
]
},
{
"cell_type": "markdown",
"id": "58641459",
"metadata": {},
"source": [
"Specifically, this model is a *bert-base-cased* model that was fine-tuned on the English version of the standard [CoNLL-2003 Named Entity Recognition](https://www.aclweb.org/anthology/W03-0419.pdf) dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "9da0ddda",
"metadata": {},
"source": [
"If you'd like to use a larger BERT-large model fine-tuned on the same dataset, a **bert-large-NER** version is also available.\n"
]
},
{
"cell_type": "markdown",
"id": "4d5adc68",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "88f3ea49",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "76ef1f0e",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dslim/bert-base-NER\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "137e381c",
"metadata": {},
"source": [
"## Citation\n",
"```\n",
"@inproceedings{tjong-kim-sang-de-meulder-2003-introduction,\n",
"title = \"Introduction to the {C}o{NLL}-2003 Shared Task: Language-Independent Named Entity Recognition\",\n",
"author = \"Tjong Kim Sang, Erik F. and\n",
"De Meulder, Fien\",\n",
"booktitle = \"Proceedings of the Seventh Conference on Natural Language Learning at {HLT}-{NAACL} 2003\",\n",
"year = \"2003\",\n",
"url = \"https://www.aclweb.org/anthology/W03-0419\",\n",
"pages = \"142--147\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "3a632df2",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/dslim/bert-base-NER](https://huggingface.co/dslim/bert-base-NER),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "c5180cf2",
"metadata": {},
"source": [
"# bert-base-NER\n"
]
},
{
"cell_type": "markdown",
"id": "dbf08fd8",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "11690dda",
"metadata": {},
"source": [
"**bert-base-NER** is a fine-tuned BERT model that is ready to use for **Named Entity Recognition** and achieves **state-of-the-art performance** for the NER task. It has been trained to recognize four types of entities: location (LOC), organizations (ORG), person (PER) and Miscellaneous (MISC).\n"
]
},
{
"cell_type": "markdown",
"id": "738f98db",
"metadata": {},
"source": [
"Specifically, this model is a *bert-base-cased* model that was fine-tuned on the English version of the standard [CoNLL-2003 Named Entity Recognition](https://www.aclweb.org/anthology/W03-0419.pdf) dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "03c5db03",
"metadata": {},
"source": [
"If you'd like to use a larger BERT-large model fine-tuned on the same dataset, a **bert-large-NER** version is also available.\n"
]
},
{
"cell_type": "markdown",
"id": "da040b29",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "726ee6e9",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "73564a0c",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dslim/bert-base-NER\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
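{
"cell_type": "markdown",
"id": "e7f50001",
"metadata": {},
"source": [
"Since the model card describes token-level entity tagging, the sketch below runs a sentence through a token-classification head and prints one tag per wordpiece (special tokens included). It assumes `AutoModelForTokenClassification` can load this converted checkpoint and that the checkpoint uses the CoNLL-2003 BIO label order listed in `labels`, which should be verified against the model config.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e7f50002",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModelForTokenClassification, AutoTokenizer\n",
"\n",
"# Hedged sketch: NER tagging. The label order below is an assumption and should be\n",
"# checked against the checkpoint's config before relying on it.\n",
"name = \"dslim/bert-base-NER\"\n",
"labels = [\"O\", \"B-MISC\", \"I-MISC\", \"B-PER\", \"I-PER\", \"B-ORG\", \"I-ORG\", \"B-LOC\", \"I-LOC\"]\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(name)\n",
"model = AutoModelForTokenClassification.from_pretrained(name)\n",
"\n",
"encoded = tokenizer(\"My name is Wolfgang and I live in Berlin\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"logits = model(input_ids)  # [batch, seq_len, num_labels]\n",
"pred_ids = paddle.argmax(logits, axis=-1)[0].tolist()\n",
"tokens = tokenizer.convert_ids_to_tokens(encoded[\"input_ids\"])\n",
"for token, pred in zip(tokens, pred_ids):\n",
"    print(token, labels[pred])"
]
},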
{
"cell_type": "markdown",
"id": "c08bc233",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{tjong-kim-sang-de-meulder-2003-introduction,\n",
"title = \"Introduction to the {C}o{NLL}-2003 Shared Task: Language-Independent Named Entity Recognition\",\n",
"author = \"Tjong Kim Sang, Erik F. and\n",
"De Meulder, Fien\",\n",
"booktitle = \"Proceedings of the Seventh Conference on Natural Language Learning at {HLT}-{NAACL} 2003\",\n",
"year = \"2003\",\n",
"url = \"https://www.aclweb.org/anthology/W03-0419\",\n",
"pages = \"142--147\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "a56e1055",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/dslim/bert-base-NER](https://huggingface.co/dslim/bert-base-NER) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: conll2003
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info:
name: "dslim/bert-large-NER"
description: "bert-base-NER"
description_en: "bert-base-NER"
icon: ""
from_repo: "https://huggingface.co/dslim/bert-large-NER"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Token Classification"
sub_tag: "Token分类"
Example:
Datasets: "conll2003"
Publisher: "dslim"
License: "mit"
Language: "English"
description: bert-base-NER
description_en: bert-base-NER
from_repo: https://huggingface.co/dslim/bert-large-NER
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: dslim/bert-large-NER
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
url: 'http://arxiv.org/abs/1810.04805v2'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
url: http://arxiv.org/abs/1810.04805v2
Publisher: dslim
Task:
- sub_tag: Token分类
sub_tag_en: Token Classification
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "70a24d70",
"metadata": {},
"source": [
"# bert-base-NER\n"
]
},
{
"cell_type": "markdown",
"id": "72df2cc8",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "d1aabc01",
"metadata": {},
"source": [
"**bert-large-NER** is a fine-tuned BERT model that is ready to use for **Named Entity Recognition** and achieves **state-of-the-art performance** for the NER task. It has been trained to recognize four types of entities: location (LOC), organizations (ORG), person (PER) and Miscellaneous (MISC).\n"
]
},
{
"cell_type": "markdown",
"id": "2d53a70b",
"metadata": {},
"source": [
"Specifically, this model is a *bert-large-cased* model that was fine-tuned on the English version of the standard [CoNLL-2003 Named Entity Recognition](https://www.aclweb.org/anthology/W03-0419.pdf) dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "60c2ceb7",
"metadata": {},
"source": [
"If you'd like to use a smaller BERT model fine-tuned on the same dataset, a **bert-base-NER** version is also available.\n"
]
},
{
"cell_type": "markdown",
"id": "42d984a9",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9ef70bc3",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1177b32e",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dslim/bert-large-NER\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "353c5156",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{tjong-kim-sang-de-meulder-2003-introduction,\n",
"title = \"Introduction to the {C}o{NLL}-2003 Shared Task: Language-Independent Named Entity Recognition\",\n",
"author = \"Tjong Kim Sang, Erik F. and\n",
"De Meulder, Fien\",\n",
"booktitle = \"Proceedings of the Seventh Conference on Natural Language Learning at {HLT}-{NAACL} 2003\",\n",
"year = \"2003\",\n",
"url = \"https://www.aclweb.org/anthology/W03-0419\",\n",
"pages = \"142--147\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "5705ae48",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/dslim/bert-large-NER](https://huggingface.co/dslim/bert-large-NER),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "574c41aa",
"metadata": {},
"source": [
"# bert-base-NER\n"
]
},
{
"cell_type": "markdown",
"id": "430be48c",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "4bdfd881",
"metadata": {},
"source": [
"**bert-large-NER** is a fine-tuned BERT model that is ready to use for **Named Entity Recognition** and achieves **state-of-the-art performance** for the NER task. It has been trained to recognize four types of entities: location (LOC), organizations (ORG), person (PER) and Miscellaneous (MISC).\n"
]
},
{
"cell_type": "markdown",
"id": "84ed4f8f",
"metadata": {},
"source": [
"Specifically, this model is a *bert-large-cased* model that was fine-tuned on the English version of the standard [CoNLL-2003 Named Entity Recognition](https://www.aclweb.org/anthology/W03-0419.pdf) dataset.\n"
]
},
{
"cell_type": "markdown",
"id": "b4acb8a4",
"metadata": {},
"source": [
"If you'd like to use a smaller BERT model fine-tuned on the same dataset, a **bert-base-NER** version is also available.\n"
]
},
{
"cell_type": "markdown",
"id": "46ad4f1f",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b7e64545",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "380aa0dc",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dslim/bert-large-NER\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "220eb907",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{tjong-kim-sang-de-meulder-2003-introduction,\n",
"title = \"Introduction to the {C}o{NLL}-2003 Shared Task: Language-Independent Named Entity Recognition\",\n",
"author = \"Tjong Kim Sang, Erik F. and\n",
"De Meulder, Fien\",\n",
"booktitle = \"Proceedings of the Seventh Conference on Natural Language Learning at {HLT}-{NAACL} 2003\",\n",
"year = \"2003\",\n",
"url = \"https://www.aclweb.org/anthology/W03-0419\",\n",
"pages = \"142--147\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "dc0b9503",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/dslim/bert-large-NER](https://huggingface.co/dslim/bert-large-NER) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Romanian
License: mit
Model_Info:
name: "dumitrescustefan/bert-base-romanian-cased-v1"
description: "bert-base-romanian-cased-v1"
description_en: "bert-base-romanian-cased-v1"
icon: ""
from_repo: "https://huggingface.co/dumitrescustefan/bert-base-romanian-cased-v1"
description: bert-base-romanian-cased-v1
description_en: bert-base-romanian-cased-v1
from_repo: https://huggingface.co/dumitrescustefan/bert-base-romanian-cased-v1
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: dumitrescustefan/bert-base-romanian-cased-v1
Paper: null
Publisher: dumitrescustefan
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: ""
Publisher: "dumitrescustefan"
License: "mit"
Language: "Romanian"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- sub_tag: 槽位填充
sub_tag_en: Fill-Mask
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "2a485f7a",
"metadata": {},
"source": [
"# bert-base-romanian-cased-v1\n"
]
},
{
"cell_type": "markdown",
"id": "5f911938",
"metadata": {},
"source": [
"The BERT **base**, **cased** model for Romanian, trained on a 15GB corpus.\n"
]
},
{
"cell_type": "markdown",
"id": "e2cccf2e",
"metadata": {},
"source": [
"### How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "64b86b26",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f84498aa",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dumitrescustefan/bert-base-romanian-cased-v1\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "40176abc",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{dumitrescu-etal-2020-birth,\n",
"title = \"The birth of {R}omanian {BERT}\",\n",
"author = \"Dumitrescu, Stefan and\n",
"Avram, Andrei-Marius and\n",
"Pyysalo, Sampo\",\n",
"booktitle = \"Findings of the Association for Computational Linguistics: EMNLP 2020\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://aclanthology.org/2020.findings-emnlp.387\",\n",
"doi = \"10.18653/v1/2020.findings-emnlp.387\",\n",
"pages = \"4324--4328\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "550b09f3",
"metadata": {},
"source": [
"#### Acknowledgements\n"
]
},
{
"cell_type": "markdown",
"id": "fda88b7c",
"metadata": {},
"source": [
"- We'd like to thank [Sampo Pyysalo](https://github.com/spyysalo) from TurkuNLP for helping us out with the compute needed to pretrain the v1.0 BERT models. He's awesome!\n",
"> 此模型介绍及权重来源于[https://huggingface.co/dumitrescustefan/bert-base-romanian-cased-v1](https://huggingface.co/dumitrescustefan/bert-base-romanian-cased-v1),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "28a330b8",
"metadata": {},
"source": [
"# bert-base-romanian-cased-v1\n"
]
},
{
"cell_type": "markdown",
"id": "36f0d74f",
"metadata": {},
"source": [
"The BERT **base**, **cased** model for Romanian, trained on a 15GB corpus."
]
},
{
"cell_type": "markdown",
"id": "0104e14e",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b4ca4271",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9f3ca553",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dumitrescustefan/bert-base-romanian-cased-v1\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
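{
"cell_type": "markdown",
"id": "f8f60001",
"metadata": {},
"source": [
"Beyond the random-input smoke test above, a quick sanity check is to look at how the cased WordPiece vocabulary segments Romanian text with diacritics. A minimal sketch, assuming the converted tokenizer files are available under the same model name:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f8f60002",
"metadata": {},
"outputs": [],
"source": [
"from paddlenlp.transformers import AutoTokenizer\n",
"\n",
"# Hedged sketch: inspect how the cased Romanian vocabulary segments a sentence with diacritics.\n",
"tokenizer = AutoTokenizer.from_pretrained(\"dumitrescustefan/bert-base-romanian-cased-v1\")\n",
"print(tokenizer.tokenize(\"Acesta este un exemplu de propoziție în limba română.\"))"
]
},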
{
"cell_type": "markdown",
"id": "51754d3f",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{dumitrescu-etal-2020-birth,\n",
"title = \"The birth of {R}omanian {BERT}\",\n",
"author = \"Dumitrescu, Stefan and\n",
"Avram, Andrei-Marius and\n",
"Pyysalo, Sampo\",\n",
"booktitle = \"Findings of the Association for Computational Linguistics: EMNLP 2020\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://aclanthology.org/2020.findings-emnlp.387\",\n",
"doi = \"10.18653/v1/2020.findings-emnlp.387\",\n",
"pages = \"4324--4328\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "2143146f",
"metadata": {},
"source": [
"#### Acknowledgements\n"
]
},
{
"cell_type": "markdown",
"id": "d983ac22",
"metadata": {},
"source": [
"- We'd like to thank [Sampo Pyysalo](https://github.com/spyysalo) from TurkuNLP for helping us out with the compute needed to pretrain the v1.0 BERT models. He's awesome!\n",
"> The model introduction and model weights originate from [https://huggingface.co/dumitrescustefan/bert-base-romanian-cased-v1](https://huggingface.co/dumitrescustefan/bert-base-romanian-cased-v1) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Romanian
License: mit
Model_Info:
name: "dumitrescustefan/bert-base-romanian-uncased-v1"
description: "bert-base-romanian-uncased-v1"
description_en: "bert-base-romanian-uncased-v1"
icon: ""
from_repo: "https://huggingface.co/dumitrescustefan/bert-base-romanian-uncased-v1"
description: bert-base-romanian-uncased-v1
description_en: bert-base-romanian-uncased-v1
from_repo: https://huggingface.co/dumitrescustefan/bert-base-romanian-uncased-v1
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: dumitrescustefan/bert-base-romanian-uncased-v1
Paper: null
Publisher: dumitrescustefan
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: ""
Publisher: "dumitrescustefan"
License: "mit"
Language: "Romanian"
Paper:
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- sub_tag: 槽位填充
sub_tag_en: Fill-Mask
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "922f44e2",
"metadata": {},
"source": [
"# bert-base-romanian-uncased-v1\n"
]
},
{
"cell_type": "markdown",
"id": "2f5259bd",
"metadata": {},
"source": [
"The BERT **base**, **uncased** model for Romanian, trained on a 15GB corpus, version ![v1.0](https://img.shields.io/badge/v1.0-21%20Apr%202020-ff6666)\n"
]
},
{
"cell_type": "markdown",
"id": "408f4468",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "acd14372",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cc5d539c",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dumitrescustefan/bert-base-romanian-uncased-v1\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "adbbab44",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{dumitrescu-etal-2020-birth,\n",
"title = \"The birth of {R}omanian {BERT}\",\n",
"author = \"Dumitrescu, Stefan and\n",
"Avram, Andrei-Marius and\n",
"Pyysalo, Sampo\",\n",
"booktitle = \"Findings of the Association for Computational Linguistics: EMNLP 2020\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://aclanthology.org/2020.findings-emnlp.387\",\n",
"doi = \"10.18653/v1/2020.findings-emnlp.387\",\n",
"pages = \"4324--4328\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "4276651e",
"metadata": {},
"source": [
"#### Acknowledgements\n"
]
},
{
"cell_type": "markdown",
"id": "84a91796",
"metadata": {},
"source": [
"- We'd like to thank [Sampo Pyysalo](https://github.com/spyysalo) from TurkuNLP for helping us out with the compute needed to pretrain the v1.0 BERT models. He's awesome!\n",
"> 此模型介绍及权重来源于[https://huggingface.co/dumitrescustefan/bert-base-romanian-uncased-v1](https://huggingface.co/dumitrescustefan/bert-base-romanian-uncased-v1),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "cf268d0f",
"metadata": {},
"source": [
"# bert-base-romanian-uncased-v1\n"
]
},
{
"cell_type": "markdown",
"id": "453405af",
"metadata": {},
"source": [
"The BERT **base**, **uncased** model for Romanian, trained on a 15GB corpus, version ![v1.0](https://img.shields.io/badge/v1.0-21%20Apr%202020-ff6666)\n"
]
},
{
"cell_type": "markdown",
"id": "4100824e",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cd182732",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1c16cd09",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"dumitrescustefan/bert-base-romanian-uncased-v1\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "ba32e8ff",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{dumitrescu-etal-2020-birth,\n",
"title = \"The birth of {R}omanian {BERT}\",\n",
"author = \"Dumitrescu, Stefan and\n",
"Avram, Andrei-Marius and\n",
"Pyysalo, Sampo\",\n",
"booktitle = \"Findings of the Association for Computational Linguistics: EMNLP 2020\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://aclanthology.org/2020.findings-emnlp.387\",\n",
"doi = \"10.18653/v1/2020.findings-emnlp.387\",\n",
"pages = \"4324--4328\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "faa200c7",
"metadata": {},
"source": [
"#### Acknowledgements\n"
]
},
{
"cell_type": "markdown",
"id": "cb74a943",
"metadata": {},
"source": [
"- We'd like to thank [Sampo Pyysalo](https://github.com/spyysalo) from TurkuNLP for helping us out with the compute needed to pretrain the v1.0 BERT models. He's awesome!\n",
"> The model introduction and model weights originate from [https://huggingface.co/dumitrescustefan/bert-base-romanian-uncased-v1](https://huggingface.co/dumitrescustefan/bert-base-romanian-uncased-v1) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info:
name: "emilyalsentzer/Bio_ClinicalBERT"
description: "ClinicalBERT - Bio + Clinical BERT Model"
description_en: "ClinicalBERT - Bio + Clinical BERT Model"
icon: ""
from_repo: "https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: ""
Publisher: "emilyalsentzer"
License: "mit"
Language: "English"
description: ClinicalBERT - Bio + Clinical BERT Model
description_en: ClinicalBERT - Bio + Clinical BERT Model
from_repo: https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: emilyalsentzer/Bio_ClinicalBERT
Paper:
- title: 'Publicly Available Clinical BERT Embeddings'
url: 'http://arxiv.org/abs/1904.03323v3'
- title: 'BioBERT: a pre-trained biomedical language representation model for biomedical text mining'
url: 'http://arxiv.org/abs/1901.08746v4'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- title: Publicly Available Clinical BERT Embeddings
url: http://arxiv.org/abs/1904.03323v3
- title: 'BioBERT: a pre-trained biomedical language representation model for biomedical
text mining'
url: http://arxiv.org/abs/1901.08746v4
Publisher: emilyalsentzer
Task:
- sub_tag: 槽位填充
sub_tag_en: Fill-Mask
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "22b0e4db",
"metadata": {},
"source": [
"# ClinicalBERT - Bio + Clinical BERT Model\n"
]
},
{
"cell_type": "markdown",
"id": "f9d9ac37",
"metadata": {},
"source": [
"The [Publicly Available Clinical BERT Embeddings](https://arxiv.org/abs/1904.03323) paper contains four unique clinicalBERT models: initialized with BERT-Base (`cased_L-12_H-768_A-12`) or BioBERT (`BioBERT-Base v1.0 + PubMed 200K + PMC 270K`) & trained on either all MIMIC notes or only discharge summaries.\n"
]
},
{
"cell_type": "markdown",
"id": "24aaa0b1",
"metadata": {},
"source": [
"This model card describes the Bio+Clinical BERT model, which was initialized from [BioBERT](https://arxiv.org/abs/1901.08746) & trained on all MIMIC notes.\n"
]
},
{
"cell_type": "markdown",
"id": "1449fef2",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "be5241ea",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f4c3cf6f",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"emilyalsentzer/Bio_ClinicalBERT\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "451e4ff6",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT](https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "91c5f94f",
"metadata": {},
"source": [
"# ClinicalBERT - Bio + Clinical BERT Model\n"
]
},
{
"cell_type": "markdown",
"id": "ec471b16",
"metadata": {},
"source": [
"The [Publicly Available Clinical BERT Embeddings](https://arxiv.org/abs/1904.03323) paper contains four unique clinicalBERT models: initialized with BERT-Base (`cased_L-12_H-768_A-12`) or BioBERT (`BioBERT-Base v1.0 + PubMed 200K + PMC 270K`) & trained on either all MIMIC notes or only discharge summaries.\n"
]
},
{
"cell_type": "markdown",
"id": "9ab166b8",
"metadata": {},
"source": [
"This model card describes the Bio+Clinical BERT model, which was initialized from [BioBERT](https://arxiv.org/abs/1901.08746) & trained on all MIMIC notes.\n"
]
},
{
"cell_type": "markdown",
"id": "69f6ed08",
"metadata": {},
"source": [
"## How to use the model\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "62913fa8",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7b055241",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"emilyalsentzer/Bio_ClinicalBERT\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
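{
"cell_type": "markdown",
"id": "a9f70001",
"metadata": {},
"source": [
"The description above positions this checkpoint as a clinical-domain encoder, so a slightly more realistic sketch is to embed a clinical sentence and keep the per-token hidden states. It assumes the converted tokenizer files are available under the same model name; the sentence is invented for illustration.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a9f70002",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"# Hedged sketch: embed a clinical sentence and keep the per-token hidden states.\n",
"name = \"emilyalsentzer/Bio_ClinicalBERT\"\n",
"tokenizer = AutoTokenizer.from_pretrained(name)\n",
"model = AutoModel.from_pretrained(name)\n",
"\n",
"encoded = tokenizer(\"The patient was discharged on metformin for type 2 diabetes.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
"\n",
"sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
"print(sequence_output.shape)  # [1, seq_len, hidden_size]"
]
},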
{
"cell_type": "markdown",
"id": "0716a06f",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT](https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info:
  description: ClinicalBERT - Bio + Discharge Summary BERT Model
  description_en: ClinicalBERT - Bio + Discharge Summary BERT Model
  from_repo: https://huggingface.co/emilyalsentzer/Bio_Discharge_Summary_BERT
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: emilyalsentzer/Bio_Discharge_Summary_BERT
Paper:
- title: Publicly Available Clinical BERT Embeddings
  url: http://arxiv.org/abs/1904.03323v3
- title: 'BioBERT: a pre-trained biomedical language representation model for biomedical
    text mining'
  url: http://arxiv.org/abs/1901.08746v4
Publisher: emilyalsentzer
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "67503ba7",
"metadata": {},
"source": [
"# ClinicalBERT - Bio + Discharge Summary BERT Model\n"
]
},
{
"cell_type": "markdown",
"id": "e2d38260",
"metadata": {},
"source": [
"The [Publicly Available Clinical BERT Embeddings](https://arxiv.org/abs/1904.03323) paper contains four unique clinicalBERT models: initialized with BERT-Base (`cased_L-12_H-768_A-12`) or BioBERT (`BioBERT-Base v1.0 + PubMed 200K + PMC 270K`) & trained on either all MIMIC notes or only discharge summaries.\n"
]
},
{
"cell_type": "markdown",
"id": "1c92755e",
"metadata": {},
"source": [
"This model card describes the Bio+Discharge Summary BERT model, which was initialized from [BioBERT](https://arxiv.org/abs/1901.08746) & trained on only discharge summaries from MIMIC.\n"
]
},
{
"cell_type": "markdown",
"id": "068ba168",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "24e8b203",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4bcd1b84",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"emilyalsentzer/Bio_Discharge_Summary_BERT\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "0cebe09b",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/emilyalsentzer/Bio_Discharge_Summary_BERT](https://huggingface.co/emilyalsentzer/Bio_Discharge_Summary_BERT),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "a786c8f0",
"metadata": {},
"source": [
"# ClinicalBERT - Bio + Discharge Summary BERT Model\n"
]
},
{
"cell_type": "markdown",
"id": "4d8e4f1f",
"metadata": {},
"source": [
"The [Publicly Available Clinical BERT Embeddings](https://arxiv.org/abs/1904.03323) paper contains four unique clinicalBERT models: initialized with BERT-Base (`cased_L-12_H-768_A-12`) or BioBERT (`BioBERT-Base v1.0 + PubMed 200K + PMC 270K`) & trained on either all MIMIC notes or only discharge summaries.\n"
]
},
{
"cell_type": "markdown",
"id": "83bf8287",
"metadata": {},
"source": [
"This model card describes the Bio+Discharge Summary BERT model, which was initialized from [BioBERT](https://arxiv.org/abs/1901.08746) & trained on only discharge summaries from MIMIC.\n"
]
},
{
"cell_type": "markdown",
"id": "ee7d03ef",
"metadata": {},
"source": [
"## How to use the model\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3ef75bb2",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c04f99b3",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"emilyalsentzer/Bio_Discharge_Summary_BERT\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
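  {
   "cell_type": "markdown",
   "id": "5c2d7e10",
   "metadata": {},
   "source": [
    "The card lists Fill-Mask as the task, so the sketch below loads the weights into `BertForMaskedLM` and predicts one masked token. It is only a rough illustration: the sentence is invented, and if the converted checkpoint does not include the masked-LM head, those weights will be randomly initialized with a warning.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5c2d7e11",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, BertForMaskedLM\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"emilyalsentzer/Bio_Discharge_Summary_BERT\")\n",
    "model = BertForMaskedLM.from_pretrained(\"emilyalsentzer/Bio_Discharge_Summary_BERT\")\n",
    "model.eval()\n",
    "\n",
    "# Invented discharge-summary style sentence with one masked token.\n",
    "text = \"The patient was discharged home in stable \" + tokenizer.mask_token + \".\"\n",
    "encoded = tokenizer(text)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "\n",
    "with paddle.no_grad():\n",
    "    logits = model(input_ids)\n",
    "\n",
    "# Look up the prediction at the masked position.\n",
    "mask_pos = encoded[\"input_ids\"].index(tokenizer.mask_token_id)\n",
    "pred_id = int(paddle.argmax(logits[0, mask_pos]))\n",
    "print(tokenizer.convert_ids_to_tokens([pred_id]))"
   ]
  },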
{
"cell_type": "markdown",
"id": "e4459a1c",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/emilyalsentzer/Bio_Discharge_Summary_BERT](https://huggingface.co/emilyalsentzer/Bio_Discharge_Summary_BERT) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: c4
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: Version 1.1 - LM-Adapted
  description_en: Version 1.1 - LM-Adapted
  from_repo: https://huggingface.co/google/t5-base-lm-adapt
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: google/t5-base-lm-adapt
Paper:
- title: GLU Variants Improve Transformer
  url: http://arxiv.org/abs/2002.05202v1
- title: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
  url: http://arxiv.org/abs/1910.10683v3
Publisher: google
Task:
- sub_tag: 文本生成
  sub_tag_en: Text2Text Generation
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "c59ed826",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1 - LM-Adapted\n",
"\n",
"\n",
"## Version 1.1 - LM-Adapted\n",
"\n",
"[T5 Version 1.1 - LM Adapted](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k) includes the following improvements compared to the original T5 model:\n",
"\n",
"- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"and is pretrained on both the denoising and language modeling objective.\n",
"\n",
"More specifically, this checkpoint is initialized from T5 Version 1.1 - Base\n",
"and then trained for an additional 100K steps on the LM objective discussed in the [T5 paper](https://arxiv.org/pdf/1910.10683.pdf).\n",
"This adaptation improves the ability of the model to be used for prompt tuning.\n",
"\n",
"**Note**: A popular fine-tuned version of the *T5 Version 1.1 - LM Adapted* model is BigScience's T0pp.\n",
"\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6db9a194",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b70ecb24",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-base-lm-adapt\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "46aee335",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/google/t5-base-lm-adapt](https://huggingface.co/google/t5-base-lm-adapt),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "35f226d7",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1 - LM-Adapted\n",
"\n",
"\n",
"## Version 1.1 - LM-Adapted\n",
"\n",
"[T5 Version 1.1 - LM Adapted](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k) includes the following improvements compared to the original T5 model:\n",
"\n",
"- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"and is pretrained on both the denoising and language modeling objective.\n",
"\n",
"More specifically, this checkpoint is initialized from T5 Version 1.1 - Base\n",
"and then trained for an additional 100K steps on the LM objective discussed in the [T5 paper](https://arxiv.org/pdf/1910.10683.pdf).\n",
"This adaptation improves the ability of the model to be used for prompt tuning.\n",
"\n",
"**Note**: A popular fine-tuned version of the *T5 Version 1.1 - LM Adapted* model is BigScience's T0pp.\n",
"\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b471855d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f74ec3ef",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-base-lm-adapt\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
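  {
   "cell_type": "markdown",
   "id": "8a94d2c0",
   "metadata": {},
   "source": [
    "Because this checkpoint was further trained with a language-modeling objective, a more telling smoke test than random ids is to let it continue a prompt. The minimal sketch below uses `T5ForConditionalGeneration` with greedy decoding; the prompt is arbitrary, and the `(ids, scores)` tuple returned by `generate()` may differ slightly between PaddleNLP releases.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8a94d2c1",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, T5ForConditionalGeneration\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"google/t5-base-lm-adapt\")\n",
    "model = T5ForConditionalGeneration.from_pretrained(\"google/t5-base-lm-adapt\")\n",
    "model.eval()\n",
    "\n",
    "# Arbitrary prompt; the LM-adapted checkpoint is trained to continue text.\n",
    "encoded = tokenizer(\"Transfer learning is a technique in which a model is first\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "\n",
    "# generate() is assumed to return (token ids, scores); adjust to your version if needed.\n",
    "output_ids, _ = model.generate(input_ids=input_ids, max_length=20)\n",
    "print(tokenizer.decode(output_ids[0].tolist(), skip_special_tokens=True))"
   ]
  },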
{
"cell_type": "markdown",
"id": "e431d080",
"metadata": {},
"source": [
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/google/t5-base-lm-adapt](https://huggingface.co/google/t5-base-lm-adapt) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: c4
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: Version 1.1 - LM-Adapted
  description_en: Version 1.1 - LM-Adapted
  from_repo: https://huggingface.co/google/t5-large-lm-adapt
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: google/t5-large-lm-adapt
Paper:
- title: GLU Variants Improve Transformer
  url: http://arxiv.org/abs/2002.05202v1
- title: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
  url: http://arxiv.org/abs/1910.10683v3
Publisher: google
Task:
- sub_tag: 文本生成
  sub_tag_en: Text2Text Generation
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "71352026",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1 - LM-Adapted\n",
"\n",
"\n",
"## Version 1.1 - LM-Adapted\n",
"\n",
"[T5 Version 1.1 - LM Adapted](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k) includes the following improvements compared to the original [T5 model](https://huggingface.co/t5-large):\n",
"\n",
"- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"and is pretrained on both the denoising and language modeling objective.\n",
"\n",
"More specifically, this checkpoint is initialized from T5 Version 1.1 - Large\n",
"and then trained for an additional 100K steps on the LM objective discussed in the [T5 paper](https://arxiv.org/pdf/1910.10683.pdf).\n",
"This adaptation improves the ability of the model to be used for prompt tuning.\n",
"\n",
"**Note**: A popular fine-tuned version of the *T5 Version 1.1 - LM Adapted* model is BigScience's T0pp.\n",
"\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)"
]
},
{
"cell_type": "markdown",
"id": "d41870db",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b9451fd3",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6392031c",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-large-lm-adapt\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "3b35551f",
"metadata": {},
"source": [
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/google/t5-large-lm-adapt](https://huggingface.co/google/t5-large-lm-adapt),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "70cb903f",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1 - LM-Adapted\n",
"\n",
"\n",
"## Version 1.1 - LM-Adapted\n",
"\n",
"[T5 Version 1.1 - LM Adapted](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k) includes the following improvements compared to the original [T5 model](https://huggingface.co/t5-large):\n",
"\n",
"- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"and is pretrained on both the denoising and language modeling objective.\n",
"\n",
"More specifically, this checkpoint is initialized from T5 Version 1.1 - Large\n",
"and then trained for an additional 100K steps on the LM objective discussed in the [T5 paper](https://arxiv.org/pdf/1910.10683.pdf).\n",
"This adaptation improves the ability of the model to be used for prompt tuning.\n",
"\n",
"**Note**: A popular fine-tuned version of the *T5 Version 1.1 - LM Adapted* model is BigScience's T0pp.\n",
"\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)"
]
},
{
"cell_type": "markdown",
"id": "cab0c1ea",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e37976b5",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0673661a",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-large-lm-adapt\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
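  {
   "cell_type": "markdown",
   "id": "c71f30aa",
   "metadata": {},
   "source": [
    "T5 is an encoder-decoder model, so the bare `model(input_ids)` call above leaves the decoder input implicit. The minimal sketch below passes a tokenized sentence to the encoder and a single decoder start token (T5 reuses the pad id, 0) to run one well-defined decoding step; names and shapes are illustrative only and may need adjusting to your PaddleNLP version.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c71f30ab",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, AutoModel\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"google/t5-large-lm-adapt\")\n",
    "model = AutoModel.from_pretrained(\"google/t5-large-lm-adapt\")\n",
    "model.eval()\n",
    "\n",
    "encoded = tokenizer(\"PaddleNLP makes community checkpoints easy to try.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "# T5 starts decoding from the pad token (id 0).\n",
    "decoder_input_ids = paddle.to_tensor([[0]])\n",
    "\n",
    "with paddle.no_grad():\n",
    "    outputs = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids)\n",
    "# The first element is assumed to be the decoder's last hidden state: [batch, dec_len, d_model].\n",
    "print(outputs[0].shape)"
   ]
  },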
{
"cell_type": "markdown",
"id": "7b24e77f",
"metadata": {},
"source": [
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/google/t5-large-lm-adapt](https://huggingface.co/google/t5-large-lm-adapt) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# 模型列表
## google/t5-large-ssm
| 模型名称 | 模型介绍 | 模型大小 | 模型下载 |
| --- | --- | --- | --- |
|google/t5-large-ssm| | 3.12G | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/google/t5-large-ssm/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/google/t5-large-ssm/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/google/t5-large-ssm/tokenizer_config.json) |
也可以通过`paddlenlp` cli 工具来下载对应的模型权重,使用步骤如下所示:
* 安装paddlenlp
```shell
pip install --upgrade paddlenlp
```
* 下载命令行
```shell
paddlenlp download --cache-dir ./pretrained_models google/t5-large-ssm
```
有任何下载的问题都可以到[PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)中发Issue提问。
# model list
## google/t5-large-ssm
| model | description | model_size | download |
| --- | --- | --- | --- |
|google/t5-large-ssm| | 3.12G | [model_config.json](https://bj.bcebos.com/paddlenlp/models/community/google/t5-large-ssm/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/google/t5-large-ssm/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/google/t5-large-ssm/tokenizer_config.json) |
You can also download the model weights with the `paddlenlp` CLI tool; the steps are as follows:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models google/t5-large-ssm
```
If you have any problems downloading, you can open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "google/t5-large-ssm"
description: "Abstract"
description_en: "Abstract"
icon: ""
from_repo: "https://huggingface.co/google/t5-large-ssm"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text2Text Generation"
sub_tag: "文本生成"
Example:
Datasets: "c4,wikipedia"
Publisher: "google"
License: "apache-2.0"
Language: "English"
Paper:
- title: 'REALM: Retrieval-Augmented Language Model Pre-Training'
url: 'http://arxiv.org/abs/2002.08909v1'
- title: 'Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer'
url: 'http://arxiv.org/abs/1910.10683v3'
IfTraining: 0
IfOnlineDemo: 0
Datasets: c4
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: Version 1.1 - LM-Adapted
  description_en: Version 1.1 - LM-Adapted
  from_repo: https://huggingface.co/google/t5-small-lm-adapt
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: google/t5-small-lm-adapt
Paper:
- title: GLU Variants Improve Transformer
  url: http://arxiv.org/abs/2002.05202v1
- title: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
  url: http://arxiv.org/abs/1910.10683v3
Publisher: google
Task:
- sub_tag: 文本生成
  sub_tag_en: Text2Text Generation
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "9dca1445",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1 - LM-Adapted\n",
"\n",
"\n",
"## Version 1.1 - LM-Adapted\n",
"\n",
"[T5 Version 1.1 - LM Adapted](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k) includes the following improvements compared to the original [T5 model](https://huggingface.co/t5-small):\n",
"\n",
"- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"and is pretrained on both the denoising and language modeling objective.\n",
"\n",
"More specifically, this checkpoint is initialized from T5 Version 1.1 - Small\n",
"and then trained for an additional 100K steps on the LM objective discussed in the [T5 paper](https://arxiv.org/pdf/1910.10683.pdf).\n",
"This adaptation improves the ability of the model to be used for prompt tuning.\n",
"\n",
"**Note**: A popular fine-tuned version of the *T5 Version 1.1 - LM Adapted* model is BigScience's T0pp.\n",
"\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)\n"
]
},
{
"cell_type": "markdown",
"id": "4c63de98",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8030fcb4",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8f6f14dd",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-small-lm-adapt\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "b8dd698b",
"metadata": {},
"source": [
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/google/t5-small-lm-adapt](https://huggingface.co/google/t5-small-lm-adapt),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "42de6200",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1 - LM-Adapted\n",
"\n",
"\n",
"## Version 1.1 - LM-Adapted\n",
"\n",
"[T5 Version 1.1 - LM Adapted](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k) includes the following improvements compared to the original [T5 model](https://huggingface.co/t5-small):\n",
"\n",
"- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"and is pretrained on both the denoising and language modeling objective.\n",
"\n",
"More specifically, this checkpoint is initialized from [T5 Version 1.1 - Small](https://huggingface.co/google/https://huggingface.co/google/t5-v1_1-small)\n",
"and then trained for an additional 100K steps on the LM objective discussed in the [T5 paper](https://arxiv.org/pdf/1910.10683.pdf).\n",
"This adaptation improves the ability of the model to be used for prompt tuning.\n",
"\n",
"**Note**: A popular fine-tuned version of the *T5 Version 1.1 - LM Adapted* model is BigScience's T0pp.\n",
"\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)\n"
]
},
{
"cell_type": "markdown",
"id": "39071317",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "31a774de",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a1cd7a50",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-small-lm-adapt\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
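  {
   "cell_type": "markdown",
   "id": "d3a6b812",
   "metadata": {},
   "source": [
    "As with the other LM-adapted checkpoints, a quick way to exercise this model is to let it continue a prompt. The sketch below samples a continuation instead of using greedy search; the prompt, `top_k`, and `max_length` values are arbitrary, and the exact `generate()` return format can vary between PaddleNLP releases.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d3a6b813",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, T5ForConditionalGeneration\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"google/t5-small-lm-adapt\")\n",
    "model = T5ForConditionalGeneration.from_pretrained(\"google/t5-small-lm-adapt\")\n",
    "model.eval()\n",
    "\n",
    "encoded = tokenizer(\"The Colossal Clean Crawled Corpus is\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "\n",
    "# Sample a continuation; generate() is assumed to return (token ids, scores).\n",
    "output_ids, _ = model.generate(\n",
    "    input_ids=input_ids,\n",
    "    max_length=24,\n",
    "    decode_strategy=\"sampling\",\n",
    "    top_k=20)\n",
    "print(tokenizer.decode(output_ids[0].tolist(), skip_special_tokens=True))"
   ]
  },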
{
"cell_type": "markdown",
"id": "4d283d3d",
"metadata": {},
"source": [
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/google/t5-small-lm-adapt](https://huggingface.co/google/t5-small-lm-adapt) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: c4
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: Version 1.1
  description_en: Version 1.1
  from_repo: https://huggingface.co/google/t5-v1_1-base
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: google/t5-v1_1-base
Paper:
- title: GLU Variants Improve Transformer
  url: http://arxiv.org/abs/2002.05202v1
- title: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
  url: http://arxiv.org/abs/1910.10683v3
Publisher: google
Task:
- sub_tag: 文本生成
  sub_tag_en: Text2Text Generation
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "83a0cdfd",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1\n",
"\n",
"\n",
"## Version 1.1\n",
"\n",
"[T5 Version 1.1](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md#t511) includes the following improvements compared to the original T5 model- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"**Note**: T5 Version 1.1 was only pre-trained on C4 excluding any supervised training. Therefore, this model has to be fine-tuned before it is useable on a downstream task.\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)"
]
},
{
"cell_type": "markdown",
"id": "6196f74c",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6c13cb82",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1d62f626",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-v1_1-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "016545f2",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/google/t5-v1_1-base](https://huggingface.co/google/t5-v1_1-base),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "2656a571",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1\n",
"\n",
"\n",
"## Version 1.1\n",
"\n",
"[T5 Version 1.1](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md#t511) includes the following improvements compared to the original T5 model- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"**Note**: T5 Version 1.1 was only pre-trained on C4 excluding any supervised training. Therefore, this model has to be fine-tuned before it is useable on a downstream task.\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)"
]
},
{
"cell_type": "markdown",
"id": "c0cd9f02",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9323615d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0b9994b3",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-v1_1-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
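  {
   "cell_type": "markdown",
   "id": "f0b52c77",
   "metadata": {},
   "source": [
    "T5 v1.1 was pre-trained without any supervised mixtures, so it is meant to be fine-tuned before use. The sketch below shows what a single fine-tuning style forward pass could look like with `T5ForConditionalGeneration`; the sentence pair is invented, and it assumes the PaddleNLP T5 head accepts a `labels` argument and returns the loss as the first output, as the Hugging Face implementation does.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f0b52c78",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, T5ForConditionalGeneration\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"google/t5-v1_1-base\")\n",
    "model = T5ForConditionalGeneration.from_pretrained(\"google/t5-v1_1-base\")\n",
    "\n",
    "# Invented source/target pair, only to illustrate the expected input format.\n",
    "source = tokenizer(\"summarize: PaddleNLP hosts many community model weights.\")\n",
    "target = tokenizer(\"PaddleNLP hosts community models.\")\n",
    "input_ids = paddle.to_tensor([source[\"input_ids\"]])\n",
    "labels = paddle.to_tensor([target[\"input_ids\"]])\n",
    "\n",
    "# Assumes `labels` is supported and the loss comes back as the first output.\n",
    "outputs = model(input_ids=input_ids, labels=labels)\n",
    "print(float(outputs[0]))"
   ]
  },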
{
"cell_type": "markdown",
"id": "8daa264b",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/google/t5-v1_1-base](https://huggingface.co/google/t5-v1_1-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: c4
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: Version 1.1
  description_en: Version 1.1
  from_repo: https://huggingface.co/google/t5-v1_1-large
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: google/t5-v1_1-large
Paper:
- title: GLU Variants Improve Transformer
  url: http://arxiv.org/abs/2002.05202v1
- title: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
  url: http://arxiv.org/abs/1910.10683v3
Publisher: google
Task:
- sub_tag: 文本生成
  sub_tag_en: Text2Text Generation
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "11d36429",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1\n",
"\n",
"\n",
"## Version 1.1\n",
"\n",
"[T5 Version 1.1](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md#t511) includes the following improvements compared to the original T5 model- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"**Note**: T5 Version 1.1 was only pre-trained on C4 excluding any supervised training. Therefore, this model has to be fine-tuned before it is useable on a downstream task.\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)"
]
},
{
"cell_type": "markdown",
"id": "c11ad8cf",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "480104f2",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e1323ff9",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-v1_1-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "4348828e",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/google/t5-v1_1-large](https://huggingface.co/google/t5-v1_1-large),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "5f0c769f",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1\n",
"\n",
"\n",
"## Version 1.1\n",
"\n",
"[T5 Version 1.1](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md#t511) includes the following improvements compared to the original T5 model- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"**Note**: T5 Version 1.1 was only pre-trained on C4 excluding any supervised training. Therefore, this model has to be fine-tuned before it is useable on a downstream task.\n",
"Pretraining Dataset: C4\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)"
]
},
{
"cell_type": "markdown",
"id": "27e206b9",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3cf23148",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "467b7ff7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-v1_1-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "a5616bca",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/google/t5-v1_1-large](https://huggingface.co/google/t5-v1_1-large) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: c4
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: Version 1.1
  description_en: Version 1.1
  from_repo: https://huggingface.co/google/t5-v1_1-small
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: google/t5-v1_1-small
Paper:
- title: GLU Variants Improve Transformer
  url: http://arxiv.org/abs/2002.05202v1
- title: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
  url: http://arxiv.org/abs/1910.10683v3
Publisher: google
Task:
- sub_tag: 文本生成
  sub_tag_en: Text2Text Generation
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "51d7e9ca",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1\n",
"\n",
"\n",
"## Version 1.1\n",
"\n",
"[T5 Version 1.1](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md#t511) includes the following improvements compared to the original T5 model- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"**Note**: T5 Version 1.1 was only pre-trained on C4 excluding any supervised training. Therefore, this model has to be fine-tuned before it is useable on a downstream task.\n",
"Pretraining Dataset: [C4](https://huggingface.co/datasets/c4)\n",
"\n",
"Other Community Checkpoints: [here](https://huggingface.co/models?search=t5-v1_1)\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)\n"
]
},
{
"cell_type": "markdown",
"id": "b4b5fc59",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ae31cbc9",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "81d25d09",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-v1_1-small\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
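{
"cell_type": "markdown",
"id": "2c7e90ab",
"metadata": {},
"source": [
"T5 v1.1 is an encoder-decoder model and this checkpoint has had no supervised training, so a more realistic usage sketch pairs the matching tokenizer with the conditional-generation head and runs a short decode. The cell below is only an illustrative sketch: the prompt is arbitrary, `T5Tokenizer` and `T5ForConditionalGeneration` are the assumed PaddleNLP class names, and the raw checkpoint is not expected to produce meaningful text before fine-tuning."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d41a5c3f",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import T5ForConditionalGeneration, T5Tokenizer\n",
"\n",
"# Illustrative sketch only: the prompt is arbitrary and the un-finetuned\n",
"# checkpoint will not generate meaningful text.\n",
"tokenizer = T5Tokenizer.from_pretrained(\"google/t5-v1_1-small\")\n",
"model = T5ForConditionalGeneration.from_pretrained(\"google/t5-v1_1-small\")\n",
"\n",
"encoded = tokenizer(\"summarize: PaddleNLP provides many pretrained transformer models.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"# generate() runs autoregressive decoding and returns the ids and their scores.\n",
"output_ids, scores = model.generate(input_ids=input_ids, max_length=20)\n",
"print(tokenizer.decode(output_ids[0].numpy().tolist(), skip_special_tokens=True))"
]
},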
{
"cell_type": "markdown",
"id": "6e0459f7",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/google/t5-v1_1-small](https://huggingface.co/google/t5-v1_1-small),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "95b64b6f",
"metadata": {},
"source": [
"[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1\n",
"\n",
"\n",
"## Version 1.1\n",
"\n",
"[T5 Version 1.1](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md#t511) includes the following improvements compared to the original T5 model- GEGLU activation in feed-forward hidden layer, rather than ReLU - see [here](https://arxiv.org/abs/2002.05202).\n",
"\n",
"- Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.\n",
"\n",
"- Pre-trained on C4 only without mixing in the downstream tasks.\n",
"\n",
"- no parameter sharing between embedding and classifier layer\n",
"\n",
"- \"xl\" and \"xxl\" replace \"3B\" and \"11B\". The model shapes are a bit different - larger `d_model` and smaller `num_heads` and `d_ff`.\n",
"\n",
"**Note**: T5 Version 1.1 was only pre-trained on C4 excluding any supervised training. Therefore, this model has to be fine-tuned before it is useable on a downstream task.\n",
"Pretraining Dataset: [C4](https://huggingface.co/datasets/c4)\n",
"\n",
"Other Community Checkpoints: [here](https://huggingface.co/models?search=t5-v1_1)\n",
"\n",
"Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)\n",
"\n",
"Authors: *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*\n",
"\n",
"\n",
"## Abstract\n",
"\n",
"Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.\n",
"\n",
"![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)\n"
]
},
{
"cell_type": "markdown",
"id": "88ec53f3",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "082cae7f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "05f7f4d0",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"google/t5-v1_1-small\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
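{
"cell_type": "markdown",
"id": "7b9e2d60",
"metadata": {},
"source": [
"T5 v1.1 is an encoder-decoder model and this checkpoint has had no supervised training, so a more realistic usage sketch pairs the matching tokenizer with the conditional-generation head and runs a short decode. The cell below is only an illustrative sketch: the prompt is arbitrary, `T5Tokenizer` and `T5ForConditionalGeneration` are the assumed PaddleNLP class names, and the raw checkpoint is not expected to produce meaningful text before fine-tuning."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1f4c8a27",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import T5ForConditionalGeneration, T5Tokenizer\n",
"\n",
"# Illustrative sketch only: the prompt is arbitrary and the un-finetuned\n",
"# checkpoint will not generate meaningful text.\n",
"tokenizer = T5Tokenizer.from_pretrained(\"google/t5-v1_1-small\")\n",
"model = T5ForConditionalGeneration.from_pretrained(\"google/t5-v1_1-small\")\n",
"\n",
"encoded = tokenizer(\"summarize: PaddleNLP provides many pretrained transformer models.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"# generate() runs autoregressive decoding and returns the ids and their scores.\n",
"output_ids, scores = model.generate(input_ids=input_ids, max_length=20)\n",
"print(tokenizer.decode(output_ids[0].numpy().tolist(), skip_special_tokens=True))"
]
},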
{
"cell_type": "markdown",
"id": "c7a95cdf",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/google/t5-v1_1-small](https://huggingface.co/google/t5-v1_1-small) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Chinese
License: apache-2.0
Model_Info:
  description: Chinese BERT with Whole Word Masking
  description_en: Chinese BERT with Whole Word Masking
  from_repo: https://huggingface.co/hfl/chinese-bert-wwm-ext
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: hfl/chinese-bert-wwm-ext
Paper:
- title: Pre-Training with Whole Word Masking for Chinese BERT
  url: http://arxiv.org/abs/1906.08101v3
- title: Revisiting Pre-Trained Models for Chinese Natural Language Processing
  url: http://arxiv.org/abs/2004.13922v2
Publisher: hfl
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a5e1e8bd",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "be498a8f",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0199d11d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b71b0698",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/chinese-bert-wwm-ext\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "5d6bd99f",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "456616b7",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "9784d9b7",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "15ed9adf",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "3593ecc9",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/hfl/chinese-bert-wwm-ext](https://huggingface.co/hfl/chinese-bert-wwm-ext),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "faeb5f50",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "fbf98c0e",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f6b3ac7",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f380cab7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/chinese-bert-wwm-ext\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "a39bca7c",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "5cff4b49",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "a8781cbe",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "b7acc10f",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "86de1995",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/hfl/chinese-bert-wwm-ext](https://huggingface.co/hfl/chinese-bert-wwm-ext) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Chinese
License: apache-2.0
Model_Info:
  description: Chinese BERT with Whole Word Masking
  description_en: Chinese BERT with Whole Word Masking
  from_repo: https://huggingface.co/hfl/chinese-bert-wwm
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: hfl/chinese-bert-wwm
Paper:
- title: Pre-Training with Whole Word Masking for Chinese BERT
  url: http://arxiv.org/abs/1906.08101v3
- title: Revisiting Pre-Trained Models for Chinese Natural Language Processing
  url: http://arxiv.org/abs/2004.13922v2
Publisher: hfl
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a5e1e8bd",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "be498a8f",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0199d11d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b71b0698",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/chinese-bert-wwm\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "5d6bd99f",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "376186df",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "9784d9b7",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "478fe6be",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "3593ecc9",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/hfl/chinese-bert-wwm](https://huggingface.co/hfl/chinese-bert-wwm),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "faeb5f50",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "fbf98c0e",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f6b3ac7",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f380cab7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/chinese-bert-wwm\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "a39bca7c",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "0ebe185e",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "a8781cbe",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "85d2437a",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "86de1995",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/hfl/chinese-bert-wwm-ext](https://huggingface.co/hfl/chinese-bert-wwm-ext) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Chinese
License: apache-2.0
Model_Info:
  description: Please use 'Bert' related functions to load this model!
  description_en: Please use 'Bert' related functions to load this model!
  from_repo: https://huggingface.co/hfl/chinese-roberta-wwm-ext-large
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: hfl/chinese-roberta-wwm-ext-large
Paper:
- title: Pre-Training with Whole Word Masking for Chinese BERT
  url: http://arxiv.org/abs/1906.08101v3
- title: Revisiting Pre-Trained Models for Chinese Natural Language Processing
  url: http://arxiv.org/abs/2004.13922v2
Publisher: hfl
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a5e1e8bd",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"\n",
"### Please use 'Bert' related functions to load this model!\n",
"\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "be498a8f",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0199d11d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b71b0698",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/chinese-roberta-wwm-ext-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
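{
"cell_type": "markdown",
"id": "6a2d4f91",
"metadata": {},
"source": [
"As the note above says, this checkpoint follows the BERT architecture, so the `Bert` classes are the ones to use when loading it explicitly. The cell below is a minimal sketch (the input sentence is arbitrary), assuming the `BertTokenizer` and `BertModel` classes in PaddleNLP: it tokenizes one sentence and extracts the per-token and pooled features."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c3e7b508",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import BertModel, BertTokenizer\n",
"\n",
"# Minimal sketch: load the checkpoint with the BERT classes and extract features.\n",
"tokenizer = BertTokenizer.from_pretrained(\"hfl/chinese-roberta-wwm-ext-large\")\n",
"model = BertModel.from_pretrained(\"hfl/chinese-roberta-wwm-ext-large\")\n",
"\n",
"encoded = tokenizer(\"使用整词掩码的中文预训练模型。\")  # arbitrary example sentence\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
"\n",
"# BertModel returns the per-token sequence output and the pooled [CLS] feature.\n",
"sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},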
{
"cell_type": "markdown",
"id": "5d6bd99f",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "9429c396",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "9784d9b7",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "eb3e56a1",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "3593ecc9",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/hfl/chinese-roberta-wwm-ext-large](https://huggingface.co/hfl/chinese-roberta-wwm-ext-large),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "faeb5f50",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"\n",
"### Please use 'Bert' related functions to load this model!\n",
"\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "fbf98c0e",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f6b3ac7",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f380cab7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/chinese-roberta-wwm-ext-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
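{
"cell_type": "markdown",
"id": "0d9f3b72",
"metadata": {},
"source": [
"As the note above says, this checkpoint follows the BERT architecture, so the `Bert` classes are the ones to use when loading it explicitly. The cell below is a minimal sketch (the input sentence is arbitrary), assuming the `BertTokenizer` and `BertModel` classes in PaddleNLP: it tokenizes one sentence and extracts the per-token and pooled features."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5a81c4e6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import BertModel, BertTokenizer\n",
"\n",
"# Minimal sketch: load the checkpoint with the BERT classes and extract features.\n",
"tokenizer = BertTokenizer.from_pretrained(\"hfl/chinese-roberta-wwm-ext-large\")\n",
"model = BertModel.from_pretrained(\"hfl/chinese-roberta-wwm-ext-large\")\n",
"\n",
"encoded = tokenizer(\"使用整词掩码的中文预训练模型。\")  # arbitrary example sentence\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
"\n",
"# BertModel returns the per-token sequence output and the pooled [CLS] feature.\n",
"sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},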
{
"cell_type": "markdown",
"id": "a39bca7c",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "b01c1973",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "a8781cbe",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "7ad8a810",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "86de1995",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/hfl/chinese-roberta-wwm-ext-large](https://huggingface.co/hfl/chinese-roberta-wwm-ext-large) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Chinese
License: apache-2.0
Model_Info:
  description: Please use 'Bert' related functions to load this model!
  description_en: Please use 'Bert' related functions to load this model!
  from_repo: https://huggingface.co/hfl/chinese-roberta-wwm-ext
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: hfl/chinese-roberta-wwm-ext
Paper:
- title: Pre-Training with Whole Word Masking for Chinese BERT
  url: http://arxiv.org/abs/1906.08101v3
- title: Revisiting Pre-Trained Models for Chinese Natural Language Processing
  url: http://arxiv.org/abs/2004.13922v2
Publisher: hfl
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a5e1e8bd",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "be498a8f",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0199d11d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b71b0698",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/chinese-roberta-wwm-ext\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
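{
"cell_type": "markdown",
"id": "e27d5a19",
"metadata": {},
"source": [
"This checkpoint also follows the BERT architecture, so the `Bert` classes are the ones to use when loading it explicitly. The cell below is a minimal sketch (the input sentence is arbitrary), assuming the `BertTokenizer` and `BertModel` classes in PaddleNLP: it tokenizes one sentence and extracts the per-token and pooled features."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "48b0f6c3",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import BertModel, BertTokenizer\n",
"\n",
"# Minimal sketch: load the checkpoint with the BERT classes and extract features.\n",
"tokenizer = BertTokenizer.from_pretrained(\"hfl/chinese-roberta-wwm-ext\")\n",
"model = BertModel.from_pretrained(\"hfl/chinese-roberta-wwm-ext\")\n",
"\n",
"encoded = tokenizer(\"使用整词掩码的中文预训练模型。\")  # arbitrary example sentence\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
"\n",
"# BertModel returns the per-token sequence output and the pooled [CLS] feature.\n",
"sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},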
{
"cell_type": "markdown",
"id": "5d6bd99f",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "737822b2",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "9784d9b7",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "22d0c28d",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "3593ecc9",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/hfl/chinese-roberta-wwm-ext](https://huggingface.co/hfl/chinese-roberta-wwm-ext),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "faeb5f50",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "fbf98c0e",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f6b3ac7",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f380cab7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/chinese-roberta-wwm-ext\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
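{
"cell_type": "markdown",
"id": "91c4e7d2",
"metadata": {},
"source": [
"This checkpoint also follows the BERT architecture, so the `Bert` classes are the ones to use when loading it explicitly. The cell below is a minimal sketch (the input sentence is arbitrary), assuming the `BertTokenizer` and `BertModel` classes in PaddleNLP: it tokenizes one sentence and extracts the per-token and pooled features."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3b6a0f58",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import BertModel, BertTokenizer\n",
"\n",
"# Minimal sketch: load the checkpoint with the BERT classes and extract features.\n",
"tokenizer = BertTokenizer.from_pretrained(\"hfl/chinese-roberta-wwm-ext\")\n",
"model = BertModel.from_pretrained(\"hfl/chinese-roberta-wwm-ext\")\n",
"\n",
"encoded = tokenizer(\"使用整词掩码的中文预训练模型。\")  # arbitrary example sentence\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
"\n",
"# BertModel returns the per-token sequence output and the pooled [CLS] feature.\n",
"sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},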
{
"cell_type": "markdown",
"id": "a39bca7c",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "f495aec9",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "a8781cbe",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "8eebfbf4",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "86de1995",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/hfl/chinese-roberta-wwm-ext](https://huggingface.co/hfl/chinese-roberta-wwm-ext) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Chinese
License: apache-2.0
Model_Info:
  description: This is a re-trained 3-layer RoBERTa-wwm-ext model.
  description_en: This is a re-trained 3-layer RoBERTa-wwm-ext model.
  from_repo: https://huggingface.co/hfl/rbt3
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: hfl/rbt3
Paper:
- title: Pre-Training with Whole Word Masking for Chinese BERT
  url: http://arxiv.org/abs/1906.08101v3
- title: Revisiting Pre-Trained Models for Chinese Natural Language Processing
  url: http://arxiv.org/abs/2004.13922v2
Publisher: hfl
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a5e1e8bd",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"\n",
"### Please use 'Bert' related functions to load this model!\n",
"\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "be498a8f",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0199d11d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b71b0698",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/rbt3\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
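{
"cell_type": "markdown",
"id": "a4f09c61",
"metadata": {},
"source": [
"As the note above says, this 3-layer checkpoint follows the BERT architecture, so the `Bert` classes are the ones to use when loading it explicitly. The cell below is a minimal sketch (the input sentence is arbitrary), assuming the `BertTokenizer` and `BertModel` classes in PaddleNLP: it tokenizes one sentence and extracts the per-token and pooled features."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7d2e8b35",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import BertModel, BertTokenizer\n",
"\n",
"# Minimal sketch: load the checkpoint with the BERT classes and extract features.\n",
"tokenizer = BertTokenizer.from_pretrained(\"hfl/rbt3\")\n",
"model = BertModel.from_pretrained(\"hfl/rbt3\")\n",
"\n",
"encoded = tokenizer(\"使用整词掩码的中文预训练模型。\")  # arbitrary example sentence\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
"\n",
"# BertModel returns the per-token sequence output and the pooled [CLS] feature.\n",
"sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},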
{
"cell_type": "markdown",
"id": "5d6bd99f",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "73e04675",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "9784d9b7",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "068895c6",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "3593ecc9",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于[https://huggingface.co/hfl/rbt3](https://huggingface.co/hfl/rbt3),并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "faeb5f50",
"metadata": {},
"source": [
"## Chinese BERT with Whole Word Masking\n",
"\n",
"### Please use 'Bert' related functions to load this model!\n",
"\n",
"For further accelerating Chinese natural language processing, we provide **Chinese pre-trained BERT with Whole Word Masking**.\n",
"\n",
"**[Pre-Training with Whole Word Masking for Chinese BERT](https://arxiv.org/abs/1906.08101)**\n",
"Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu\n",
"\n",
"This repository is developed based on:https://github.com/google-research/bert\n",
"\n",
"You may also interested in,\n",
"- Chinese BERT series: https://github.com/ymcui/Chinese-BERT-wwm\n",
"- Chinese MacBERT: https://github.com/ymcui/MacBERT\n",
"- Chinese ELECTRA: https://github.com/ymcui/Chinese-ELECTRA\n",
"- Chinese XLNet: https://github.com/ymcui/Chinese-XLNet\n",
"- Knowledge Distillation Toolkit - TextBrewer: https://github.com/airaria/TextBrewer\n",
"\n",
"More resources by HFL: https://github.com/ymcui/HFL-Anthology\n"
]
},
{
"cell_type": "markdown",
"id": "fbf98c0e",
"metadata": {},
"source": [
"## How to Use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f6b3ac7",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f380cab7",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"hfl/rbt3\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
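{
"cell_type": "markdown",
"id": "58c1d7e9",
"metadata": {},
"source": [
"As the note above says, this 3-layer checkpoint follows the BERT architecture, so the `Bert` classes are the ones to use when loading it explicitly. The cell below is a minimal sketch (the input sentence is arbitrary), assuming the `BertTokenizer` and `BertModel` classes in PaddleNLP: it tokenizes one sentence and extracts the per-token and pooled features."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b390a246",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import BertModel, BertTokenizer\n",
"\n",
"# Minimal sketch: load the checkpoint with the BERT classes and extract features.\n",
"tokenizer = BertTokenizer.from_pretrained(\"hfl/rbt3\")\n",
"model = BertModel.from_pretrained(\"hfl/rbt3\")\n",
"\n",
"encoded = tokenizer(\"使用整词掩码的中文预训练模型。\")  # arbitrary example sentence\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
"\n",
"# BertModel returns the per-token sequence output and the pooled [CLS] feature.\n",
"sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},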
{
"cell_type": "markdown",
"id": "a39bca7c",
"metadata": {},
"source": [
"\n",
"## Citation\n",
"If you find the technical report or resource is useful, please cite the following technical report in your paper.\n",
"- Primary: https://arxiv.org/abs/2004.13922"
]
},
{
"cell_type": "markdown",
"id": "370bfe67",
"metadata": {},
"source": [
"```\n",
"@inproceedings{cui-etal-2020-revisiting,\n",
"title = \"Revisiting Pre-Trained Models for {C}hinese Natural Language Processing\",\n",
"author = \"Cui, Yiming and\n",
"Che, Wanxiang and\n",
"Liu, Ting and\n",
"Qin, Bing and\n",
"Wang, Shijin and\n",
"Hu, Guoping\",\n",
"booktitle = \"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings\",\n",
"month = nov,\n",
"year = \"2020\",\n",
"address = \"Online\",\n",
"publisher = \"Association for Computational Linguistics\",\n",
"url = \"https://www.aclweb.org/anthology/2020.findings-emnlp.58\",\n",
"pages = \"657--668\",\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "a8781cbe",
"metadata": {},
"source": [
"- Secondary: https://arxiv.org/abs/1906.08101\n"
]
},
{
"cell_type": "markdown",
"id": "4a1fe5aa",
"metadata": {},
"source": [
"```\n",
"@article{chinese-bert-wwm,\n",
"title={Pre-Training with Whole Word Masking for Chinese BERT},\n",
"author={Cui, Yiming and Che, Wanxiang and Liu, Ting and Qin, Bing and Yang, Ziqing and Wang, Shijin and Hu, Guoping},\n",
"journal={arXiv preprint arXiv:1906.08101},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "86de1995",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/hfl/rbt3](https://huggingface.co/hfl/rbt3) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: bookcorpus,wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: BERT base model (cased)
  description_en: BERT base model (cased)
  from_repo: https://huggingface.co/bert-base-cased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: bert-base-cased
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: huggingface
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "bb008e6f",
"metadata": {},
"source": [
"# BERT base model (cased)\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "079266fb",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is case-sensitive: it makes a difference between\n",
"english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "5c8220aa",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "8564477f",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "7365685d",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "1c979d12",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "cdc00722",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "9253a517",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9fbfcd0a",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6185db74",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
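  {
   "cell_type": "markdown",
   "id": "ab12cd01",
   "metadata": {},
   "source": [
    "The cell above feeds random token ids into the encoder only to show the call signature. Below is a minimal, hypothetical sketch (the example sentence and variable names are ours, not part of the original model card) that tokenizes real text with `AutoTokenizer` and runs it through the same model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab12cd02",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "# Hypothetical usage sketch: encode a real sentence instead of random ids.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-base-cased\")\n",
    "model = AutoModel.from_pretrained(\"bert-base-cased\")\n",
    "\n",
    "encoded = tokenizer(\"PaddleNLP makes it easy to try out BERT.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "# The encoder returns contextual representations for each token in the sentence.\n",
    "print(model(input_ids))"
   ]
  },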
{
"cell_type": "markdown",
"id": "8e0ca3bd",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "f14e9f06",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=bert-base-cased\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"> 此模型介绍及权重来源于 https://huggingface.co/bert-base-cased ,并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "58235e68",
"metadata": {},
"source": [
"# BERT base model (cased)\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "36c7d585",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is case-sensitive: it makes a difference between\n",
"english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "d361a880",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "47b0cf99",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "d1911491",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "94e45c66",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "9fec6197",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "5e17ee3b",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "62ae31d8",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3c52bdd5",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
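  {
   "cell_type": "markdown",
   "id": "ab12cd03",
   "metadata": {},
   "source": [
    "The cell above feeds random token ids into the encoder only to show the call signature. Below is a minimal, hypothetical sketch (the example sentence and variable names are ours, not part of the original model card) that tokenizes real text with `AutoTokenizer` and runs it through the same model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab12cd04",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "# Hypothetical usage sketch: encode a real sentence instead of random ids.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-base-cased\")\n",
    "model = AutoModel.from_pretrained(\"bert-base-cased\")\n",
    "\n",
    "encoded = tokenizer(\"PaddleNLP makes it easy to try out BERT.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "# The encoder returns contextual representations for each token in the sentence.\n",
    "print(model(input_ids))"
   ]
  },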
{
"cell_type": "markdown",
"id": "da7c4875",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "86873e48",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=bert-base-cased\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"\n",
"> The model introduction and model weights originate from https://huggingface.co/bert-base-cased and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: German
License: mit
Model_Info:
  description: German BERT
  description_en: German BERT
  from_repo: https://huggingface.co/bert-base-german-cased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: bert-base-german-cased
Paper: null
Publisher: huggingface
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "0870a629",
"metadata": {},
"source": [
"# German BERT\n",
"![bert_image](https://static.tildacdn.com/tild6438-3730-4164-b266-613634323466/german_bert.png)\n",
"## Overview\n",
"**Language model:** bert-base-cased\n",
"**Language:** German\n",
"**Training data:** Wiki, OpenLegalData, News (~ 12GB)\n",
"**Eval data:** Conll03 (NER), GermEval14 (NER), GermEval18 (Classification), GNAD (Classification)\n",
"**Infrastructure**: 1x TPU v2\n",
"**Published**: Jun 14th, 2019\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "b2a6c897",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1790135e",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "99c714ac",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-german-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
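  {
   "cell_type": "markdown",
   "id": "ab12cd05",
   "metadata": {},
   "source": [
    "The cell above feeds random token ids into the encoder only to show the call signature. Below is a minimal, hypothetical sketch (the German example sentence and variable names are ours, not part of the original model card) that tokenizes real text with `AutoTokenizer` and runs it through the same model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab12cd06",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "# Hypothetical usage sketch: encode a real German sentence instead of random ids.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-base-german-cased\")\n",
    "model = AutoModel.from_pretrained(\"bert-base-german-cased\")\n",
    "\n",
    "encoded = tokenizer(\"Heute scheint die Sonne.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "# The encoder returns contextual representations for each token in the sentence.\n",
    "print(model(input_ids))"
   ]
  },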
{
"cell_type": "markdown",
"id": "54c2f398",
"metadata": {},
"source": [
"## Authors\n",
"- Branden Chan: `branden.chan [at] deepset.ai`\n",
"- Timo Möller: `timo.moeller [at] deepset.ai`\n",
"- Malte Pietsch: `malte.pietsch [at] deepset.ai`\n",
"- Tanay Soni: `tanay.soni [at] deepset.ai`\n"
]
},
{
"cell_type": "markdown",
"id": "94b669bc",
"metadata": {},
"source": [
"## About us\n",
"![deepset logo](https://raw.githubusercontent.com/deepset-ai/FARM/master/docs/img/deepset_logo.png)\n"
]
},
{
"cell_type": "markdown",
"id": "ce90710a",
"metadata": {},
"source": [
"We bring NLP to the industry via open source!\n",
"Our focus: Industry specific language models & large scale QA systems.\n"
]
},
{
"cell_type": "markdown",
"id": "5dc8ba63",
"metadata": {},
"source": [
"Some of our work:\n",
"- [German BERT (aka \"bert-base-german-cased\")](https://deepset.ai/german-bert)\n",
"- [FARM](https://github.com/deepset-ai/FARM)\n",
"- [Haystack](https://github.com/deepset-ai/haystack/)\n"
]
},
{
"cell_type": "markdown",
"id": "56a1a360",
"metadata": {},
"source": [
"Get in touch:\n",
"[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Website](https://deepset.ai)\n",
"\n",
"> 此模型介绍及权重来源于 https://huggingface.co/bert-base-german-cased ,并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "7aa268f7",
"metadata": {},
"source": [
"# German BERT\n",
"![bert_image](https://static.tildacdn.com/tild6438-3730-4164-b266-613634323466/german_bert.png)\n",
"## Overview\n",
"**Language model:** bert-base-cased\n",
"**Language:** German\n",
"**Training data:** Wiki, OpenLegalData, News (~ 12GB)\n",
"**Eval data:** Conll03 (NER), GermEval14 (NER), GermEval18 (Classification), GNAD (Classification)\n",
"**Infrastructure**: 1x TPU v2\n",
"**Published**: Jun 14th, 2019\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "f407e80e",
"metadata": {},
"source": [
"**Update April 3rd, 2020**: we updated the vocabulary file on deepset's s3 to conform with the default tokenization of punctuation tokens.\n",
"For details see the related [FARM issue](https://github.com/deepset-ai/FARM/issues/60). If you want to use the old vocab we have also uploaded a deepset/bert-base-german-cased-oldvocab model.\n"
]
},
{
"cell_type": "markdown",
"id": "18d2ad8e",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b80052bd",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4ea9d4e3",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-german-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
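  {
   "cell_type": "markdown",
   "id": "ab12cd07",
   "metadata": {},
   "source": [
    "The cell above feeds random token ids into the encoder only to show the call signature. Below is a minimal, hypothetical sketch (the German example sentence and variable names are ours, not part of the original model card) that tokenizes real text with `AutoTokenizer` and runs it through the same model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab12cd08",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "# Hypothetical usage sketch: encode a real German sentence instead of random ids.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-base-german-cased\")\n",
    "model = AutoModel.from_pretrained(\"bert-base-german-cased\")\n",
    "\n",
    "encoded = tokenizer(\"Heute scheint die Sonne.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "# The encoder returns contextual representations for each token in the sentence.\n",
    "print(model(input_ids))"
   ]
  },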
{
"cell_type": "markdown",
"id": "9d560e75",
"metadata": {},
"source": [
"## Authors\n",
"- Branden Chan: `branden.chan [at] deepset.ai`\n",
"- Timo Möller: `timo.moeller [at] deepset.ai`\n",
"- Malte Pietsch: `malte.pietsch [at] deepset.ai`\n",
"- Tanay Soni: `tanay.soni [at] deepset.ai`\n"
]
},
{
"cell_type": "markdown",
"id": "a0e43273",
"metadata": {},
"source": [
"## About us\n",
"![deepset logo](https://raw.githubusercontent.com/deepset-ai/FARM/master/docs/img/deepset_logo.png)\n"
]
},
{
"cell_type": "markdown",
"id": "c1b05e60",
"metadata": {},
"source": [
"We bring NLP to the industry via open source!\n",
"Our focus: Industry specific language models & large scale QA systems.\n"
]
},
{
"cell_type": "markdown",
"id": "5196bee9",
"metadata": {},
"source": [
"Some of our work:\n",
"- [German BERT (aka \"bert-base-german-cased\")](https://deepset.ai/german-bert)\n",
"- [FARM](https://github.com/deepset-ai/FARM)\n",
"- [Haystack](https://github.com/deepset-ai/haystack/)\n"
]
},
{
"cell_type": "markdown",
"id": "18fe01d5",
"metadata": {},
"source": [
"Get in touch:\n",
"[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Website](https://deepset.ai)\n",
"\n",
"> The model introduction and model weights originate from https://huggingface.co/bert-base-german-cased and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info:
  description: BERT multilingual base model (cased)
  description_en: BERT multilingual base model (cased)
  from_repo: https://huggingface.co/bert-base-multilingual-cased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: bert-base-multilingual-cased
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: huggingface
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "92e18984",
"metadata": {},
"source": [
"# BERT multilingual base model (cased)\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "cc38bad3",
"metadata": {},
"source": [
"Pretrained model on the top 104 languages with the largest Wikipedia using a masked language modeling (MLM) objective.\n",
"It was introduced in [this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is case sensitive: it makes a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "a54cdf6e",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "3be641ef",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "93fd337b",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of multilingual data in a self-supervised fashion. This means\n",
"it was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "2222d4b6",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "2f9ea64e",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the languages in the training set that can then be used to\n",
"extract features useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a\n",
"standard classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "7363abb0",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "780c0123",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5a325830",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-multilingual-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
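  {
   "cell_type": "markdown",
   "id": "ab12cd09",
   "metadata": {},
   "source": [
    "The cell above feeds random token ids into the encoder only to show the call signature. Below is a minimal, hypothetical sketch (the example sentence, which could be in any of the supported languages, and the variable names are ours, not part of the original model card) that tokenizes real text with `AutoTokenizer` and runs it through the same model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab12cd10",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "# Hypothetical usage sketch: encode a real sentence instead of random ids.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-base-multilingual-cased\")\n",
    "model = AutoModel.from_pretrained(\"bert-base-multilingual-cased\")\n",
    "\n",
    "encoded = tokenizer(\"PaddleNLP works with text in many languages.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "# The encoder returns contextual representations for each token in the sentence.\n",
    "print(model(input_ids))"
   ]
  },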
{
"cell_type": "markdown",
"id": "81ca575a",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "216555c3",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于 https://huggingface.co/bert-base-multilingual-cased ,并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "19d62907",
"metadata": {},
"source": [
"# BERT multilingual base model (cased)\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "09809b94",
"metadata": {},
"source": [
"Pretrained model on the top 104 languages with the largest Wikipedia using a masked language modeling (MLM) objective.\n",
"It was introduced in [this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is case sensitive: it makes a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "d3a52162",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "f67f02dc",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "bf05022f",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of multilingual data in a self-supervised fashion. This means\n",
"it was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "081a7a88",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "79e6eda9",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the languages in the training set that can then be used to\n",
"extract features useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a\n",
"standard classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "1696fb24",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4f7d20fd",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3c369c9a",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-multilingual-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
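  {
   "cell_type": "markdown",
   "id": "ab12cd11",
   "metadata": {},
   "source": [
    "The cell above feeds random token ids into the encoder only to show the call signature. Below is a minimal, hypothetical sketch (the example sentence, which could be in any of the supported languages, and the variable names are ours, not part of the original model card) that tokenizes real text with `AutoTokenizer` and runs it through the same model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab12cd12",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "# Hypothetical usage sketch: encode a real sentence instead of random ids.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-base-multilingual-cased\")\n",
    "model = AutoModel.from_pretrained(\"bert-base-multilingual-cased\")\n",
    "\n",
    "encoded = tokenizer(\"PaddleNLP works with text in many languages.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "# The encoder returns contextual representations for each token in the sentence.\n",
    "print(model(input_ids))"
   ]
  },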
{
"cell_type": "markdown",
"id": "6338f981",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "c55dc64e",
"metadata": {},
"source": [
"\n",
"> The model introduction and model weights originate from https://huggingface.co/bert-base-multilingual-cased and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info:
  description: BERT multilingual base model (uncased)
  description_en: BERT multilingual base model (uncased)
  from_repo: https://huggingface.co/bert-base-multilingual-uncased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: bert-base-multilingual-uncased
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: huggingface
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "867cb6e6",
"metadata": {},
"source": [
"# BERT multilingual base model (uncased)\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "207ffc57",
"metadata": {},
"source": [
"Pretrained model on the top 102 languages with the largest Wikipedia using a masked language modeling (MLM) objective.\n",
"It was introduced in [this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "8b2e2c13",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "40d071c9",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "af4a1260",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of multilingual data in a self-supervised fashion. This means\n",
"it was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "81abfbcb",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "64988b6b",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the languages in the training set that can then be used to\n",
"extract features useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a\n",
"standard classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "79c3e104",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d7b2d0ec",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "52f8d16d",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-multilingual-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
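  {
   "cell_type": "markdown",
   "id": "ab12cd13",
   "metadata": {},
   "source": [
    "The cell above feeds random token ids into the encoder only to show the call signature. Below is a minimal, hypothetical sketch (the example sentence, which could be in any of the supported languages, and the variable names are ours, not part of the original model card) that tokenizes real text with `AutoTokenizer` and runs it through the same model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab12cd14",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "# Hypothetical usage sketch: encode a real sentence instead of random ids.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-base-multilingual-uncased\")\n",
    "model = AutoModel.from_pretrained(\"bert-base-multilingual-uncased\")\n",
    "\n",
    "encoded = tokenizer(\"PaddleNLP works with text in many languages.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "# The encoder returns contextual representations for each token in the sentence.\n",
    "print(model(input_ids))"
   ]
  },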
{
"cell_type": "markdown",
"id": "f11b298a",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "548a9d6c",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于 https://huggingface.co/bert-base-multilingual-uncased ,并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "ff0c69a5",
"metadata": {},
"source": [
"# BERT multilingual base model (uncased)\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "1ad499a9",
"metadata": {},
"source": [
"Pretrained model on the top 102 languages with the largest Wikipedia using a masked language modeling (MLM) objective.\n",
"It was introduced in [this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "a8878d0c",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "4581e670",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "c8d5f59f",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of multilingual data in a self-supervised fashion. This means\n",
"it was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "836834df",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "bafe70e4",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the languages in the training set that can then be used to\n",
"extract features useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a\n",
"standard classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "cf2a29e2",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dc792d6e",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7a6faf50",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-multilingual-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
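  {
   "cell_type": "markdown",
   "id": "ab12cd15",
   "metadata": {},
   "source": [
    "The cell above feeds random token ids into the encoder only to show the call signature. Below is a minimal, hypothetical sketch (the example sentence, which could be in any of the supported languages, and the variable names are ours, not part of the original model card) that tokenizes real text with `AutoTokenizer` and runs it through the same model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab12cd16",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "# Hypothetical usage sketch: encode a real sentence instead of random ids.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-base-multilingual-uncased\")\n",
    "model = AutoModel.from_pretrained(\"bert-base-multilingual-uncased\")\n",
    "\n",
    "encoded = tokenizer(\"PaddleNLP works with text in many languages.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "# The encoder returns contextual representations for each token in the sentence.\n",
    "print(model(input_ids))"
   ]
  },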
{
"cell_type": "markdown",
"id": "5b616f23",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "67e01093",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/bert-base-multilingual-uncased and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: bookcorpus,wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: BERT base model (uncased)
  description_en: BERT base model (uncased)
  from_repo: https://huggingface.co/bert-base-uncased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: bert-base-uncased
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: huggingface
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a14866e7",
"metadata": {},
"source": [
"# BERT base model (uncased)\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "d348c680",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "9a790b40",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "985d2894",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "985bd7ee",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labeling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "2e1ee5f4",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally masks the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "ae584a51",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences, for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "a4d48848",
"metadata": {},
"source": [
"## Model variations\n"
]
},
{
"cell_type": "markdown",
"id": "dcb46068",
"metadata": {},
"source": [
"BERT has originally been released in base and large variations, for cased and uncased input text. The uncased models also strips out an accent markers.\n",
"Chinese and multilingual uncased and cased versions followed shortly after.\n",
"Modified preprocessing with whole word masking has replaced subpiece masking in a following work, with the release of two models.\n",
"Other 24 smaller models are released afterward.\n"
]
},
{
"cell_type": "markdown",
"id": "bdf3ec7e",
"metadata": {},
"source": [
"The detailed release history can be found on the [google-research/bert readme](https://github.com/google-research/bert/blob/master/README.md) on github.\n"
]
},
{
"cell_type": "markdown",
"id": "d66e6fc4",
"metadata": {},
"source": [
"| Model | #params | Language |\n",
"|------------------------|--------------------------------|-------|\n",
"| [`bert-base-uncased`](https://huggingface.co/bert-base-uncased) | 110M | English |\n",
"| [`bert-large-uncased`](https://huggingface.co/bert-large-uncased) | 340M | English | sub\n",
"| [`bert-base-cased`](https://huggingface.co/bert-base-cased) | 110M | English |\n",
"| [`bert-large-cased`](https://huggingface.co/bert-large-cased) | 340M | English |\n",
"| [`bert-base-chinese`](https://huggingface.co/bert-base-chinese) | 110M | Chinese |\n",
"| [`bert-base-multilingual-cased`](https://huggingface.co/bert-base-multilingual-cased) | 110M | Multiple |\n",
"| [`bert-large-uncased-whole-word-masking`](https://huggingface.co/bert-large-uncased-whole-word-masking) | 340M | English |\n",
"| [`bert-large-cased-whole-word-masking`](https://huggingface.co/bert-large-cased-whole-word-masking) | 340M | English |\n"
]
},
{
"cell_type": "markdown",
"id": "93c97712",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e4daab88",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "09dec4f3",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
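  {
   "cell_type": "markdown",
   "id": "ab12cd17",
   "metadata": {},
   "source": [
    "The cell above feeds random token ids into the encoder only to show the call signature. Below is a minimal, hypothetical sketch (the example sentence and variable names are ours, not part of the original model card) that tokenizes real text with `AutoTokenizer` and runs it through the same model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab12cd18",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "# Hypothetical usage sketch: encode a real sentence instead of random ids.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-base-uncased\")\n",
    "model = AutoModel.from_pretrained(\"bert-base-uncased\")\n",
    "\n",
    "encoded = tokenizer(\"PaddleNLP makes it easy to try out BERT.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "# The encoder returns contextual representations for each token in the sentence.\n",
    "print(model(input_ids))"
   ]
  },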
{
"cell_type": "markdown",
"id": "85541d34",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "82898490",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=bert-base-uncased\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"> 此模型介绍及权重来源于 https://huggingface.co/bert-base-uncased ,并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "86c2dd31",
"metadata": {},
"source": [
"# BERT base model (uncased)\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "e25590e2",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "632646c9",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "6d37733d",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "20eb0099",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labeling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "a43bc44c",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally masks the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "3ea31760",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences, for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "c44e01b0",
"metadata": {},
"source": [
"## Model variations\n"
]
},
{
"cell_type": "markdown",
"id": "6cb3e530",
"metadata": {},
"source": [
"BERT has originally been released in base and large variations, for cased and uncased input text. The uncased models also strips out an accent markers.\n",
"Chinese and multilingual uncased and cased versions followed shortly after.\n",
"Modified preprocessing with whole word masking has replaced subpiece masking in a following work, with the release of two models.\n",
"Other 24 smaller models are released afterward.\n"
]
},
{
"cell_type": "markdown",
"id": "557a417a",
"metadata": {},
"source": [
"The detailed release history can be found on the [google-research/bert readme](https://github.com/google-research/bert/blob/master/README.md) on github.\n"
]
},
{
"cell_type": "markdown",
"id": "0f4bf9e0",
"metadata": {},
"source": [
"| Model | #params | Language |\n",
"|------------------------|--------------------------------|-------|\n",
"| [`bert-base-uncased`](https://huggingface.co/bert-base-uncased) | 110M | English |\n",
"| [`bert-large-uncased`](https://huggingface.co/bert-large-uncased) | 340M | English | sub\n",
"| [`bert-base-cased`](https://huggingface.co/bert-base-cased) | 110M | English |\n",
"| [`bert-large-cased`](https://huggingface.co/bert-large-cased) | 340M | English |\n",
"| [`bert-base-chinese`](https://huggingface.co/bert-base-chinese) | 110M | Chinese |\n",
"| [`bert-base-multilingual-cased`](https://huggingface.co/bert-base-multilingual-cased) | 110M | Multiple |\n",
"| [`bert-large-uncased-whole-word-masking`](https://huggingface.co/bert-large-uncased-whole-word-masking) | 340M | English |\n",
"| [`bert-large-cased-whole-word-masking`](https://huggingface.co/bert-large-cased-whole-word-masking) | 340M | English |\n"
]
},
{
"cell_type": "markdown",
"id": "909c1c8d",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "68db3da7",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "04d6a56d",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-base-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
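  {
   "cell_type": "markdown",
   "id": "ab12cd19",
   "metadata": {},
   "source": [
    "The cell above feeds random token ids into the encoder only to show the call signature. Below is a minimal, hypothetical sketch (the example sentence and variable names are ours, not part of the original model card) that tokenizes real text with `AutoTokenizer` and runs it through the same model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab12cd20",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
    "\n",
    "# Hypothetical usage sketch: encode a real sentence instead of random ids.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-base-uncased\")\n",
    "model = AutoModel.from_pretrained(\"bert-base-uncased\")\n",
    "\n",
    "encoded = tokenizer(\"PaddleNLP makes it easy to try out BERT.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "# The encoder returns contextual representations for each token in the sentence.\n",
    "print(model(input_ids))"
   ]
  },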
{
"cell_type": "markdown",
"id": "76d1a4dc",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "1bcee897",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=bert-base-uncased\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"> The model introduction and model weights originate from https://huggingface.co/bert-base-uncased and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: bookcorpus,wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: BERT large model (cased) whole word masking finetuned on SQuAD
  description_en: BERT large model (cased) whole word masking finetuned on SQuAD
  from_repo: https://huggingface.co/bert-large-cased-whole-word-masking-finetuned-squad
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: bert-large-cased-whole-word-masking-finetuned-squad
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: huggingface
Task:
- sub_tag: 回答问题
  sub_tag_en: Question Answering
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "7b02f8e4",
"metadata": {},
"source": [
"# BERT large model (cased) whole word masking finetuned on SQuAD\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "7804aeec",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is cased: it makes a difference between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "9ee7c4ee",
"metadata": {},
"source": [
"Differently to other BERT models, this model was trained with a new technique: Whole Word Masking. In this case, all of the tokens corresponding to a word are masked at once. The overall masking rate remains the same.\n"
]
},
{
"cell_type": "markdown",
"id": "2198ff25",
"metadata": {},
"source": [
"The training is identical -- each masked WordPiece token is predicted independently.\n"
]
},
{
"cell_type": "markdown",
"id": "159c04c3",
"metadata": {},
"source": [
"After pre-training, this model was fine-tuned on the SQuAD dataset with one of our fine-tuning scripts. See below for more information regarding this fine-tuning.\n"
]
},
{
"cell_type": "markdown",
"id": "cec53443",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "0a6d113e",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "3776a729",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "8ef5e147",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "f494c97f",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "eeffccad",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "7754e7ed",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "dc30e3d4",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6c0a8e7e",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3eb39f84",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-cased-whole-word-masking-finetuned-squad\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
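  {
   "cell_type": "markdown",
   "id": "added-qa-note-1",
   "metadata": {},
   "source": [
    "The cell above only runs the bare encoder on random token ids. Since this checkpoint was fine-tuned on SQuAD, a more representative use is extractive question answering. The following is a minimal sketch, assuming that `AutoTokenizer` and `AutoModelForQuestionAnswering` in your installed PaddleNLP version can load this checkpoint with a span-prediction head; the question and context strings are illustrative only."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "added-qa-code-1",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, AutoModelForQuestionAnswering\n",
    "\n",
    "# Assumption: AutoModelForQuestionAnswering resolves this checkpoint to a BERT model\n",
    "# with the start/end span-prediction head used for SQuAD-style extractive QA.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-large-cased-whole-word-masking-finetuned-squad\")\n",
    "model = AutoModelForQuestionAnswering.from_pretrained(\"bert-large-cased-whole-word-masking-finetuned-squad\")\n",
    "model.eval()\n",
    "\n",
    "question = \"Where does Sarah live?\"  # illustrative only\n",
    "context = \"My name is Sarah and I live in London.\"\n",
    "\n",
    "# Encode the question/context pair; token_type_ids mark which segment a token belongs to.\n",
    "encoded = tokenizer(question, context)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
    "\n",
    "# The QA head returns start and end logits over the input tokens.\n",
    "start_logits, end_logits = model(input_ids, token_type_ids=token_type_ids)\n",
    "start = paddle.argmax(start_logits).item()\n",
    "end = paddle.argmax(end_logits).item()\n",
    "\n",
    "# Crude decode: map the predicted token span back to wordpiece tokens.\n",
    "answer_tokens = tokenizer.convert_ids_to_tokens(encoded[\"input_ids\"][start:end + 1])\n",
    "print(\" \".join(answer_tokens))"
   ]
  },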
{
"cell_type": "markdown",
"id": "82b3ff37",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
  {
   "cell_type": "markdown",
   "id": "cb789a5a",
   "metadata": {},
   "source": [
    "> The model introduction and model weights originate from https://huggingface.co/bert-large-cased-whole-word-masking-finetuned-squad and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
  {
   "cell_type": "markdown",
   "id": "1dfd11c1",
   "metadata": {},
   "source": [
    "# BERT large model (cased) whole word masking finetuned on SQuAD\n",
    "\n",
    "You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)."
]
},
{
"cell_type": "markdown",
"id": "7105fb8c",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is cased: it makes a difference between english and English.\n"
]
},
  {
   "cell_type": "markdown",
   "id": "e3d8b394",
   "metadata": {},
   "source": [
    "Unlike other BERT models, this model was trained with a new technique: Whole Word Masking. In this case, all of the tokens corresponding to a word are masked at once. The overall masking rate remains the same.\n"
]
},
{
"cell_type": "markdown",
"id": "be078628",
"metadata": {},
"source": [
"The training is identical -- each masked WordPiece token is predicted independently.\n"
]
},
{
"cell_type": "markdown",
"id": "278aee7f",
"metadata": {},
"source": [
"After pre-training, this model was fine-tuned on the SQuAD dataset with one of our fine-tuning scripts. See below for more information regarding this fine-tuning.\n"
]
},
{
"cell_type": "markdown",
"id": "ce69aca2",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "89b52c17",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "2ebe9e94",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
  {
   "cell_type": "markdown",
   "id": "7131c024",
   "metadata": {},
   "source": [
    "- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input, then runs\n",
    "the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
    "recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
    "GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
    "sentence.\n",
    "- Next sentence prediction (NSP): the model concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "4a8e4aea",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "717dd1f6",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "6778930f",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "1ffc0609",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "678acd58",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c5318a0c",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-cased-whole-word-masking-finetuned-squad\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "f930fd97",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "b3240bd3",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/bert-large-cased-whole-word-masking-finetuned-squad and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: bookcorpus,wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: BERT large model (cased) whole word masking
  description_en: BERT large model (cased) whole word masking
  from_repo: https://huggingface.co/bert-large-cased-whole-word-masking
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: bert-large-cased-whole-word-masking
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: huggingface
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
  {
   "cell_type": "markdown",
   "id": "1d5ffd6a",
   "metadata": {},
   "source": [
    "# BERT large model (cased) whole word masking\n",
    "\n",
    "You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)."
]
},
{
"cell_type": "markdown",
"id": "9e7590bd",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is cased: it makes a difference between english and English.\n"
]
},
  {
   "cell_type": "markdown",
   "id": "2456751c",
   "metadata": {},
   "source": [
    "Unlike other BERT models, this model was trained with a new technique: Whole Word Masking. In this case, all of the tokens corresponding to a word are masked at once. The overall masking rate remains the same.\n"
]
},
{
"cell_type": "markdown",
"id": "204d6ee6",
"metadata": {},
"source": [
"The training is identical -- each masked WordPiece token is predicted independently.\n"
]
},
{
"cell_type": "markdown",
"id": "743ff269",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "bce1ffcc",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "d5d83b7c",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
  {
   "cell_type": "markdown",
   "id": "38a98598",
   "metadata": {},
   "source": [
    "- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input, then runs\n",
    "the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
    "recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
    "GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
    "sentence.\n",
    "- Next sentence prediction (NSP): the model concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "89b5e554",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "3f205174",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "6b9cf751",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "74a0400e",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d8952fcc",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "365e04c2",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-cased-whole-word-masking\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
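  {
   "cell_type": "markdown",
   "id": "added-mlm-note-3",
   "metadata": {},
   "source": [
    "The cell above only extracts hidden states from random token ids. The following is a minimal masked language modeling sketch, assuming that `AutoTokenizer` and `AutoModelForMaskedLM` in your installed PaddleNLP version can load this checkpoint together with its pretraining MLM head; the sample sentence is illustrative only."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "added-mlm-code-3",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, AutoModelForMaskedLM\n",
    "\n",
    "# Assumption: AutoModelForMaskedLM attaches the pretraining MLM head to this checkpoint.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-large-cased-whole-word-masking\")\n",
    "model = AutoModelForMaskedLM.from_pretrained(\"bert-large-cased-whole-word-masking\")\n",
    "model.eval()\n",
    "\n",
    "# Tokenize an illustrative sentence and mask the word the model should recover.\n",
    "tokens = tokenizer.tokenize(\"The capital of France is Paris.\")\n",
    "tokens[-2] = tokenizer.mask_token  # mask the word Paris\n",
    "token_ids = tokenizer.convert_tokens_to_ids([tokenizer.cls_token] + tokens + [tokenizer.sep_token])\n",
    "input_ids = paddle.to_tensor([token_ids])\n",
    "\n",
    "# prediction_scores has shape [batch_size, sequence_length, vocab_size].\n",
    "prediction_scores = model(input_ids)\n",
    "mask_position = token_ids.index(tokenizer.mask_token_id)\n",
    "predicted_id = paddle.argmax(prediction_scores[0, mask_position]).item()\n",
    "print(tokenizer.convert_ids_to_tokens([predicted_id]))"
   ]
  },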
{
"cell_type": "markdown",
"id": "1cef8f18",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
  {
   "cell_type": "markdown",
   "id": "0d54ff2d",
   "metadata": {},
   "source": [
    "\n",
    "> The model introduction and model weights originate from https://huggingface.co/bert-large-cased-whole-word-masking and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
  {
   "cell_type": "markdown",
   "id": "58f64e54",
   "metadata": {},
   "source": [
    "# BERT large model (cased) whole word masking\n",
    "\n",
    "You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)."
]
},
{
"cell_type": "markdown",
"id": "6814fe73",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is cased: it makes a difference between english and English.\n"
]
},
  {
   "cell_type": "markdown",
   "id": "7a6b1b28",
   "metadata": {},
   "source": [
    "Unlike other BERT models, this model was trained with a new technique: Whole Word Masking. In this case, all of the tokens corresponding to a word are masked at once. The overall masking rate remains the same.\n"
]
},
{
"cell_type": "markdown",
"id": "e6c8ddc5",
"metadata": {},
"source": [
"The training is identical -- each masked WordPiece token is predicted independently.\n"
]
},
{
"cell_type": "markdown",
"id": "dfcd9c6b",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "d758dbd9",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "c4e44287",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
  {
   "cell_type": "markdown",
   "id": "d07abc2a",
   "metadata": {},
   "source": [
    "- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input, then runs\n",
    "the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
    "recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
    "GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
    "sentence.\n",
    "- Next sentence prediction (NSP): the model concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "5fcb83d6",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "1be2f6a5",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "cd047a65",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "93925c79",
"metadata": {},
"source": [
"## Intended uses & limitations\n"
]
},
{
"cell_type": "markdown",
"id": "f6c1f9b9",
"metadata": {},
"source": [
"You can use the raw model for either masked language modeling or next sentence prediction, but it's mostly intended to\n",
"be fine-tuned on a downstream task. See the [model hub](https://huggingface.co/models?filter=bert) to look for\n",
"fine-tuned versions on a task that interests you.\n"
]
},
  {
   "cell_type": "markdown",
   "id": "a682ee5c",
   "metadata": {},
   "source": [
    "Note that this model is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked)\n",
    "to make decisions, such as sequence classification, token classification or question answering. For tasks such as text\n",
    "generation you should look at models like GPT2.\n"
]
},
{
"cell_type": "markdown",
"id": "394e6456",
"metadata": {},
"source": [
"### How to use\n"
]
},
{
"cell_type": "markdown",
"id": "9e5fdb9a",
"metadata": {},
"source": [
"You can use this model directly with a pipeline for masked language modeling:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "77af91fe",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ae5caf8d",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-cased-whole-word-masking\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
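  {
   "cell_type": "markdown",
   "id": "added-mlm-note-4",
   "metadata": {},
   "source": [
    "The cell above only extracts hidden states from random token ids rather than filling in a mask. The following is a minimal masked language modeling sketch, assuming that `AutoTokenizer` and `AutoModelForMaskedLM` in your installed PaddleNLP version can load this checkpoint together with its pretraining MLM head; the sample sentence is illustrative only."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "added-mlm-code-4",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, AutoModelForMaskedLM\n",
    "\n",
    "# Assumption: AutoModelForMaskedLM attaches the pretraining MLM head to this checkpoint.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-large-cased-whole-word-masking\")\n",
    "model = AutoModelForMaskedLM.from_pretrained(\"bert-large-cased-whole-word-masking\")\n",
    "model.eval()\n",
    "\n",
    "# Tokenize an illustrative sentence and mask the word the model should recover.\n",
    "tokens = tokenizer.tokenize(\"The capital of France is Paris.\")\n",
    "tokens[-2] = tokenizer.mask_token  # mask the word Paris\n",
    "token_ids = tokenizer.convert_tokens_to_ids([tokenizer.cls_token] + tokens + [tokenizer.sep_token])\n",
    "input_ids = paddle.to_tensor([token_ids])\n",
    "\n",
    "# prediction_scores has shape [batch_size, sequence_length, vocab_size].\n",
    "prediction_scores = model(input_ids)\n",
    "mask_position = token_ids.index(tokenizer.mask_token_id)\n",
    "predicted_id = paddle.argmax(prediction_scores[0, mask_position]).item()\n",
    "print(tokenizer.convert_ids_to_tokens([predicted_id]))"
   ]
  },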
{
"cell_type": "markdown",
"id": "0f43705d",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "54ae4165",
"metadata": {},
"source": [
"\n",
"> The model introduction and model weights originate from https://huggingface.co/bert-large-cased-whole-word-masking and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: bookcorpus,wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: BERT large model (cased)
  description_en: BERT large model (cased)
  from_repo: https://huggingface.co/bert-large-cased
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: bert-large-cased
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: huggingface
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
  {
   "cell_type": "markdown",
   "id": "360e146a",
   "metadata": {},
   "source": [
    "# BERT large model (cased)\n",
    "\n",
    "You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)."
]
},
{
"cell_type": "markdown",
"id": "bb3eb868",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is cased: it makes a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "0f512012",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "8dfae0e4",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "29d97a32",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
  {
   "cell_type": "markdown",
   "id": "84dd3c36",
   "metadata": {},
   "source": [
    "- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input, then runs\n",
    "the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
    "recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
    "GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
    "sentence.\n",
    "- Next sentence prediction (NSP): the model concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "dbb66981",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "4a3d9a5c",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "85a286cd",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "e9f5c5f1",
"metadata": {},
"source": [
"## Intended uses & limitations\n"
]
},
{
"cell_type": "markdown",
"id": "d3ae1617",
"metadata": {},
"source": [
"You can use the raw model for either masked language modeling or next sentence prediction, but it's mostly intended to\n",
"be fine-tuned on a downstream task. See the [model hub](https://huggingface.co/models?filter=bert) to look for\n",
"fine-tuned versions on a task that interests you.\n"
]
},
  {
   "cell_type": "markdown",
   "id": "1d814aa3",
   "metadata": {},
   "source": [
    "Note that this model is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked)\n",
    "to make decisions, such as sequence classification, token classification or question answering. For tasks such as text\n",
    "generation you should look at models like GPT2.\n"
]
},
{
"cell_type": "markdown",
"id": "7c9cb698",
"metadata": {},
"source": [
"### How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "266349de",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e0d0fb84",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
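  {
   "cell_type": "markdown",
   "id": "added-feat-note-5",
   "metadata": {},
   "source": [
    "The cell above feeds random token ids into the model, which verifies that the weights load but is not a realistic input. The following is a minimal sketch that encodes an actual sentence instead, assuming that `AutoTokenizer` in your installed PaddleNLP version provides the tokenizer for this checkpoint; the sentence is illustrative only, and the two returned tensors are the token-level and sentence-level features mentioned above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "added-feat-code-5",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, AutoModel\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-large-cased\")\n",
    "model = AutoModel.from_pretrained(\"bert-large-cased\")\n",
    "model.eval()\n",
    "\n",
    "# Encode a real sentence instead of random token ids (illustrative text).\n",
    "encoded = tokenizer(\"Replace me by any text you'd like.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
    "\n",
    "# The encoder returns two tensors:\n",
    "#   sequence_output: [batch_size, sequence_length, 1024] token-level features\n",
    "#   pooled_output:   [batch_size, 1024] sentence-level feature, e.g. for a classifier\n",
    "sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
    "print(sequence_output.shape, pooled_output.shape)"
   ]
  },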
{
"cell_type": "markdown",
"id": "d58fffcd",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
  {
   "cell_type": "markdown",
   "id": "8591ee7f",
   "metadata": {},
   "source": [
    "> The model introduction and model weights originate from https://huggingface.co/bert-large-cased and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
  {
   "cell_type": "markdown",
   "id": "2460ffb6",
   "metadata": {},
   "source": [
    "# BERT large model (cased)\n",
    "\n",
    "You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)."
]
},
{
"cell_type": "markdown",
"id": "07c2aecf",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is cased: it makes a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "fb6201f0",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "ffd4c0b9",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "0b465123",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
  {
   "cell_type": "markdown",
   "id": "7a5eb557",
   "metadata": {},
   "source": [
    "- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input, then runs\n",
    "the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
    "recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
    "GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
    "sentence.\n",
    "- Next sentence prediction (NSP): the model concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "d40678bb",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "8fc24335",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "355e9553",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "47e2e497",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c4d80b50",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f73f3925",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
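  {
   "cell_type": "markdown",
   "id": "added-feat-note-6",
   "metadata": {},
   "source": [
    "The cell above feeds random token ids into the model, which verifies that the weights load but is not a realistic input. The following is a minimal sketch that encodes an actual sentence instead, assuming that `AutoTokenizer` in your installed PaddleNLP version provides the tokenizer for this checkpoint; the sentence is illustrative only, and the two returned tensors are the token-level and sentence-level features mentioned above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "added-feat-code-6",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, AutoModel\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-large-cased\")\n",
    "model = AutoModel.from_pretrained(\"bert-large-cased\")\n",
    "model.eval()\n",
    "\n",
    "# Encode a real sentence instead of random token ids (illustrative text).\n",
    "encoded = tokenizer(\"Replace me by any text you'd like.\")\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
    "\n",
    "# The encoder returns two tensors:\n",
    "#   sequence_output: [batch_size, sequence_length, 1024] token-level features\n",
    "#   pooled_output:   [batch_size, 1024] sentence-level feature, e.g. for a classifier\n",
    "sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
    "print(sequence_output.shape, pooled_output.shape)"
   ]
  },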
{
"cell_type": "markdown",
"id": "2873617b",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "bc4aea4d",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/bert-large-cased and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: bookcorpus,wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: BERT large model (uncased) whole word masking finetuned on SQuAD
  description_en: BERT large model (uncased) whole word masking finetuned on SQuAD
  from_repo: https://huggingface.co/bert-large-uncased-whole-word-masking-finetuned-squad
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: bert-large-uncased-whole-word-masking-finetuned-squad
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: huggingface
Task:
- sub_tag: 回答问题
  sub_tag_en: Question Answering
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "aad9532a",
"metadata": {},
"source": [
"# BERT large model (uncased) whole word masking finetuned on SQuAD\n"
]
},
{
"cell_type": "markdown",
"id": "724df271",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
  {
   "cell_type": "markdown",
   "id": "f2b9e3bf",
   "metadata": {},
   "source": [
    "Unlike other BERT models, this model was trained with a new technique: Whole Word Masking. In this case, all of the tokens corresponding to a word are masked at once. The overall masking rate remains the same.\n"
]
},
{
"cell_type": "markdown",
"id": "6566eb12",
"metadata": {},
"source": [
"The training is identical -- each masked WordPiece token is predicted independently.\n"
]
},
{
"cell_type": "markdown",
"id": "7b45422b",
"metadata": {},
"source": [
"After pre-training, this model was fine-tuned on the SQuAD dataset with one of our fine-tuning scripts. See below for more information regarding this fine-tuning.\n"
]
},
{
"cell_type": "markdown",
"id": "c9957f91",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "43cba468",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "457bfeee",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
  {
   "cell_type": "markdown",
   "id": "77c83270",
   "metadata": {},
   "source": [
    "- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input, then runs\n",
    "the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
    "recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
    "GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
    "sentence.\n",
    "- Next sentence prediction (NSP): the model concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "0ba87de6",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "f363132f",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "83a4e49f",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "68565c6d",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "457a1c54",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a9369c0d",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-uncased-whole-word-masking-finetuned-squad\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
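  {
   "cell_type": "markdown",
   "id": "added-qa-note-7",
   "metadata": {},
   "source": [
    "The cell above only runs the bare encoder on random token ids. Since this checkpoint was fine-tuned on SQuAD, a more representative use is extractive question answering. The following is a minimal sketch, assuming that `AutoTokenizer` and `AutoModelForQuestionAnswering` in your installed PaddleNLP version can load this checkpoint with a span-prediction head; the question and context strings are illustrative only."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "added-qa-code-7",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, AutoModelForQuestionAnswering\n",
    "\n",
    "# Assumption: AutoModelForQuestionAnswering resolves this checkpoint to a BERT model\n",
    "# with the start/end span-prediction head used for SQuAD-style extractive QA.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-large-uncased-whole-word-masking-finetuned-squad\")\n",
    "model = AutoModelForQuestionAnswering.from_pretrained(\"bert-large-uncased-whole-word-masking-finetuned-squad\")\n",
    "model.eval()\n",
    "\n",
    "question = \"Where does Sarah live?\"  # illustrative only\n",
    "context = \"My name is Sarah and I live in London.\"\n",
    "\n",
    "# Encode the question/context pair; token_type_ids mark which segment a token belongs to.\n",
    "encoded = tokenizer(question, context)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
    "\n",
    "# The QA head returns start and end logits over the input tokens.\n",
    "start_logits, end_logits = model(input_ids, token_type_ids=token_type_ids)\n",
    "start = paddle.argmax(start_logits).item()\n",
    "end = paddle.argmax(end_logits).item()\n",
    "\n",
    "# Crude decode: map the predicted token span back to wordpiece tokens.\n",
    "answer_tokens = tokenizer.convert_ids_to_tokens(encoded[\"input_ids\"][start:end + 1])\n",
    "print(\" \".join(answer_tokens))"
   ]
  },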
{
"cell_type": "markdown",
"id": "c5fefb8f",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
  {
   "cell_type": "markdown",
   "id": "654c0920",
   "metadata": {},
   "source": [
    "> The model introduction and model weights originate from https://huggingface.co/bert-large-uncased-whole-word-masking-finetuned-squad and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
  {
   "cell_type": "markdown",
   "id": "2d4f4368",
   "metadata": {},
   "source": [
    "# BERT large model (uncased) whole word masking finetuned on SQuAD\n",
    "\n",
    "You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)."
]
},
{
"cell_type": "markdown",
"id": "afef45e0",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
  {
   "cell_type": "markdown",
   "id": "c94536b9",
   "metadata": {},
   "source": [
    "Unlike other BERT models, this model was trained with a new technique: Whole Word Masking. In this case, all of the tokens corresponding to a word are masked at once. The overall masking rate remains the same.\n"
]
},
{
"cell_type": "markdown",
"id": "50254dea",
"metadata": {},
"source": [
"The training is identical -- each masked WordPiece token is predicted independently.\n"
]
},
{
"cell_type": "markdown",
"id": "4b482be9",
"metadata": {},
"source": [
"After pre-training, this model was fine-tuned on the SQuAD dataset with one of our fine-tuning scripts. See below for more information regarding this fine-tuning.\n"
]
},
{
"cell_type": "markdown",
"id": "adfc36af",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "22f554a7",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "eccd3048",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
  {
   "cell_type": "markdown",
   "id": "3d4098e8",
   "metadata": {},
   "source": [
    "- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input, then runs\n",
    "the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
    "recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
    "GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
    "sentence.\n",
    "- Next sentence prediction (NSP): the model concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "1047d1ad",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "7046db0c",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "09659088",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "65769919",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4449cfac",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1e8dcf70",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-uncased-whole-word-masking-finetuned-squad\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
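  {
   "cell_type": "markdown",
   "id": "added-qa-note-8",
   "metadata": {},
   "source": [
    "The cell above only runs the bare encoder on random token ids. Since this checkpoint was fine-tuned on SQuAD, a more representative use is extractive question answering. The following is a minimal sketch, assuming that `AutoTokenizer` and `AutoModelForQuestionAnswering` in your installed PaddleNLP version can load this checkpoint with a span-prediction head; the question and context strings are illustrative only."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "added-qa-code-8",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, AutoModelForQuestionAnswering\n",
    "\n",
    "# Assumption: AutoModelForQuestionAnswering resolves this checkpoint to a BERT model\n",
    "# with the start/end span-prediction head used for SQuAD-style extractive QA.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-large-uncased-whole-word-masking-finetuned-squad\")\n",
    "model = AutoModelForQuestionAnswering.from_pretrained(\"bert-large-uncased-whole-word-masking-finetuned-squad\")\n",
    "model.eval()\n",
    "\n",
    "question = \"Where does Sarah live?\"  # illustrative only\n",
    "context = \"My name is Sarah and I live in London.\"\n",
    "\n",
    "# Encode the question/context pair; token_type_ids mark which segment a token belongs to.\n",
    "encoded = tokenizer(question, context)\n",
    "input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
    "token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
    "\n",
    "# The QA head returns start and end logits over the input tokens.\n",
    "start_logits, end_logits = model(input_ids, token_type_ids=token_type_ids)\n",
    "start = paddle.argmax(start_logits).item()\n",
    "end = paddle.argmax(end_logits).item()\n",
    "\n",
    "# Crude decode: map the predicted token span back to wordpiece tokens.\n",
    "answer_tokens = tokenizer.convert_ids_to_tokens(encoded[\"input_ids\"][start:end + 1])\n",
    "print(\" \".join(answer_tokens))"
   ]
  },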
{
"cell_type": "markdown",
"id": "49471f4b",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "d783c8fc",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/bert-large-uncased-whole-word-masking-finetuned-squad and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: bookcorpus,wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
  description: BERT large model (uncased) whole word masking
  description_en: BERT large model (uncased) whole word masking
  from_repo: https://huggingface.co/bert-large-uncased-whole-word-masking
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: bert-large-uncased-whole-word-masking
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  url: http://arxiv.org/abs/1810.04805v2
Publisher: huggingface
Task:
- sub_tag: 槽位填充
  sub_tag_en: Fill-Mask
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
  {
   "cell_type": "markdown",
   "id": "cf43e770",
   "metadata": {},
   "source": [
    "# BERT large model (uncased) whole word masking\n",
    "\n",
    "You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)."
]
},
{
"cell_type": "markdown",
"id": "af8c3816",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
  {
   "cell_type": "markdown",
   "id": "c103e84b",
   "metadata": {},
   "source": [
    "Unlike other BERT models, this model was trained with a new technique: Whole Word Masking. In this case, all of the tokens corresponding to a word are masked at once. The overall masking rate remains the same.\n"
]
},
{
"cell_type": "markdown",
"id": "19a76368",
"metadata": {},
"source": [
"The training is identical -- each masked WordPiece token is predicted independently.\n"
]
},
{
"cell_type": "markdown",
"id": "67f11a2c",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "778cf97d",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "dddbb307",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
  {
   "cell_type": "markdown",
   "id": "40becad1",
   "metadata": {},
   "source": [
    "- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input, then runs\n",
    "the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
    "recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
    "GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
    "sentence.\n",
    "- Next sentence prediction (NSP): the model concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "3fc265b6",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "65e4a308",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "6d0b86c1",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "dd94b8be",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bc669f99",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4580650d",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-uncased-whole-word-masking\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
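  {
   "cell_type": "markdown",
   "id": "added-mlm-note-9",
   "metadata": {},
   "source": [
    "The cell above only extracts hidden states from random token ids. The following is a minimal masked language modeling sketch, assuming that `AutoTokenizer` and `AutoModelForMaskedLM` in your installed PaddleNLP version can load this checkpoint together with its pretraining MLM head; the sample sentence is illustrative only."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "added-mlm-code-9",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoTokenizer, AutoModelForMaskedLM\n",
    "\n",
    "# Assumption: AutoModelForMaskedLM attaches the pretraining MLM head to this checkpoint.\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"bert-large-uncased-whole-word-masking\")\n",
    "model = AutoModelForMaskedLM.from_pretrained(\"bert-large-uncased-whole-word-masking\")\n",
    "model.eval()\n",
    "\n",
    "# Tokenize an illustrative sentence and mask the word the model should recover.\n",
    "tokens = tokenizer.tokenize(\"The capital of France is Paris.\")\n",
    "tokens[-2] = tokenizer.mask_token  # mask the word Paris\n",
    "token_ids = tokenizer.convert_tokens_to_ids([tokenizer.cls_token] + tokens + [tokenizer.sep_token])\n",
    "input_ids = paddle.to_tensor([token_ids])\n",
    "\n",
    "# prediction_scores has shape [batch_size, sequence_length, vocab_size].\n",
    "prediction_scores = model(input_ids)\n",
    "mask_position = token_ids.index(tokenizer.mask_token_id)\n",
    "predicted_id = paddle.argmax(prediction_scores[0, mask_position]).item()\n",
    "print(tokenizer.convert_ids_to_tokens([predicted_id]))"
   ]
  },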
{
"cell_type": "markdown",
"id": "475fd35d",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
  {
   "cell_type": "markdown",
   "id": "f09b9b09",
   "metadata": {},
   "source": [
    "> The model introduction and model weights originate from https://huggingface.co/bert-large-uncased-whole-word-masking and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
  {
   "cell_type": "markdown",
   "id": "ceefe52d",
   "metadata": {},
   "source": [
    "# BERT large model (uncased) whole word masking\n",
    "\n",
    "You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)."
]
},
{
"cell_type": "markdown",
"id": "14552c09",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
  {
   "cell_type": "markdown",
   "id": "78d1e4a0",
   "metadata": {},
   "source": [
    "Unlike other BERT models, this model was trained with a new technique: Whole Word Masking. In this case, all of the tokens corresponding to a word are masked at once. The overall masking rate remains the same.\n"
]
},
{
"cell_type": "markdown",
"id": "cdbe484a",
"metadata": {},
"source": [
"The training is identical -- each masked WordPiece token is predicted independently.\n"
]
},
{
"cell_type": "markdown",
"id": "fdbba80d",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "aba33624",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "459ca6e6",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
  {
   "cell_type": "markdown",
   "id": "65f2ae1a",
   "metadata": {},
   "source": [
    "- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input, then runs\n",
    "the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
    "recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
    "GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
    "sentence.\n",
    "- Next sentence prediction (NSP): the model concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "86e8d7eb",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "b81821d8",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "3a576172",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "0038f06c",
"metadata": {},
"source": [
"## Intended uses & limitations\n"
]
},
{
"cell_type": "markdown",
"id": "ba8c18de",
"metadata": {},
"source": [
"You can use the raw model for either masked language modeling or next sentence prediction, but it's mostly intended to\n",
"be fine-tuned on a downstream task. See the [model hub](https://huggingface.co/models?filter=bert) to look for\n",
"fine-tuned versions on a task that interests you.\n"
]
},
{
"cell_type": "markdown",
"id": "bb72ad39",
"metadata": {},
"source": [
"Note that this model is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked)\n",
"to make decisions, such as sequence classification, token classification or question answering. For tasks such as text\n",
"generation you should look at model like GPT2.\n"
]
},
{
"cell_type": "markdown",
"id": "54b59ca8",
"metadata": {},
"source": [
"### How to use\n"
]
},
{
"cell_type": "markdown",
"id": "a0ff2a80",
"metadata": {},
"source": [
"You can use this model directly with a pipeline for masked language modeling:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "990ce14a",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7d468ffb",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-uncased-whole-word-masking\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
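{
"cell_type": "markdown",
"id": "7a1c9f02",
"metadata": {},
"source": [
"The cell above only feeds random token ids to the backbone to verify that the weights load. As a minimal sketch of a more realistic call, the cell below tokenizes an example sentence with `AutoTokenizer` (assuming the tokenizer files are published alongside this community checkpoint) and prints the shapes of the outputs returned by the backbone.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7a1c9f03",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-large-uncased-whole-word-masking\")\n",
"model = AutoModel.from_pretrained(\"bert-large-uncased-whole-word-masking\")\n",
"\n",
"# Encode a real sentence instead of random ids (the sentence is only illustrative).\n",
"encoded = tokenizer(\"Paris is the capital of France.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
"\n",
"# The BERT backbone returns (sequence_output, pooled_output).\n",
"sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
"print(sequence_output.shape)  # [1, seq_len, 1024]\n",
"print(pooled_output.shape)    # [1, 1024]"
]
},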
{
"cell_type": "markdown",
"id": "93d6e9e4",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "c9d05272",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/bert-large-uncased-whole-word-masking and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: bookcorpus,wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
name: "bert-large-uncased"
description: "BERT large model (uncased)"
description_en: "BERT large model (uncased)"
icon: ""
from_repo: "https://huggingface.co/bert-large-uncased"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: "bookcorpus,wikipedia"
Publisher: "huggingface"
License: "apache-2.0"
Language: "English"
description: BERT large model (uncased)
description_en: BERT large model (uncased)
from_repo: https://huggingface.co/bert-large-uncased
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: bert-large-uncased
Paper:
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
url: 'http://arxiv.org/abs/1810.04805v2'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- title: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
url: http://arxiv.org/abs/1810.04805v2
Publisher: huggingface
Task:
- sub_tag: 槽位填充
sub_tag_en: Fill-Mask
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "c000df74",
"metadata": {},
"source": [
"# BERT large model (uncased)\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "bd7436a9",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "87c430c2",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "e2004f07",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "ad86c301",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "8f12ab3c",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "3fc80525",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "c31d15b4",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "822f7f40",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "7fcdeb04",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "db4ceaa3",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dc6a0473",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "1156d387",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "9d07ca08",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于 https://huggingface.co/bert-large-uncased ,并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "a4fae520",
"metadata": {},
"source": [
"# BERT large model (uncased)\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "c410d1ae",
"metadata": {},
"source": [
"Pretrained model on English language using a masked language modeling (MLM) objective. It was introduced in\n",
"[this paper](https://arxiv.org/abs/1810.04805) and first released in\n",
"[this repository](https://github.com/google-research/bert). This model is uncased: it does not make a difference\n",
"between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "40166ab8",
"metadata": {},
"source": [
"Disclaimer: The team releasing BERT did not write a model card for this model so this model card has been written by\n",
"the Hugging Face team.\n"
]
},
{
"cell_type": "markdown",
"id": "dacb968e",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "c519206d",
"metadata": {},
"source": [
"BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it\n",
"was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of\n",
"publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it\n",
"was pretrained with two objectives:\n"
]
},
{
"cell_type": "markdown",
"id": "2dd87a78",
"metadata": {},
"source": [
"- Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run\n",
"the entire masked sentence through the model and has to predict the masked words. This is different from traditional\n",
"recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like\n",
"GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of the\n",
"sentence.\n",
"- Next sentence prediction (NSP): the models concatenates two masked sentences as inputs during pretraining. Sometimes\n",
"they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to\n",
"predict if the two sentences were following each other or not.\n"
]
},
{
"cell_type": "markdown",
"id": "507ce60a",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard\n",
"classifier using the features produced by the BERT model as inputs.\n"
]
},
{
"cell_type": "markdown",
"id": "7fb7a8a0",
"metadata": {},
"source": [
"This model has the following configuration:\n"
]
},
{
"cell_type": "markdown",
"id": "ebe2c593",
"metadata": {},
"source": [
"- 24-layer\n",
"- 1024 hidden dimension\n",
"- 16 attention heads\n",
"- 336M parameters.\n"
]
},
{
"cell_type": "markdown",
"id": "547e3cc8",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "669cb05f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "09a4bc02",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"bert-large-uncased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
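{
"cell_type": "markdown",
"id": "b3e7d210",
"metadata": {},
"source": [
"Beyond the random-id check above, a minimal sketch of running real text through the backbone is shown below; it assumes the tokenizer files for this checkpoint resolve under the same community name via `AutoTokenizer`.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b3e7d211",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"bert-large-uncased\")\n",
"model = AutoModel.from_pretrained(\"bert-large-uncased\")\n",
"\n",
"# Tokenize an illustrative sentence and convert the ids to Paddle tensors.\n",
"encoded = tokenizer(\"BERT learns bidirectional representations of English text.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"token_type_ids = paddle.to_tensor([encoded[\"token_type_ids\"]])\n",
"\n",
"# The BERT backbone returns (sequence_output, pooled_output).\n",
"sequence_output, pooled_output = model(input_ids, token_type_ids=token_type_ids)\n",
"print(sequence_output.shape)  # [1, seq_len, 1024]\n",
"print(pooled_output.shape)    # [1, 1024]"
]
},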
{
"cell_type": "markdown",
"id": "3ae36313",
"metadata": {},
"source": [
"```\n",
"@article{DBLP:journals/corr/abs-1810-04805,\n",
"author = {Jacob Devlin and\n",
"Ming{-}Wei Chang and\n",
"Kenton Lee and\n",
"Kristina Toutanova},\n",
"title = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language\n",
"Understanding},\n",
"journal = {CoRR},\n",
"volume = {abs/1810.04805},\n",
"year = {2018},\n",
"url = {http://arxiv.org/abs/1810.04805},\n",
"archivePrefix = {arXiv},\n",
"eprint = {1810.04805},\n",
"timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},\n",
"biburl = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},\n",
"bibsource = {dblp computer science bibliography, https://dblp.org}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "bed31ba3",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/bert-large-uncased and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: wikipedia
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: ''
License: apache-2.0
Model_Info:
name: "distilbert-base-multilingual-cased"
description: "Model Card for DistilBERT base multilingual (cased)"
description_en: "Model Card for DistilBERT base multilingual (cased)"
icon: ""
from_repo: "https://huggingface.co/distilbert-base-multilingual-cased"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: "wikipedia"
Publisher: "huggingface"
License: "apache-2.0"
Language: ""
description: Model Card for DistilBERT base multilingual (cased)
description_en: Model Card for DistilBERT base multilingual (cased)
from_repo: https://huggingface.co/distilbert-base-multilingual-cased
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: distilbert-base-multilingual-cased
Paper:
- title: 'DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter'
url: 'http://arxiv.org/abs/1910.01108v4'
- title: 'Quantifying the Carbon Emissions of Machine Learning'
url: 'http://arxiv.org/abs/1910.09700v2'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- title: 'DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter'
url: http://arxiv.org/abs/1910.01108v4
- title: Quantifying the Carbon Emissions of Machine Learning
url: http://arxiv.org/abs/1910.09700v2
Publisher: huggingface
Task:
- sub_tag: 槽位填充
sub_tag_en: Fill-Mask
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "922fd8e5",
"metadata": {},
"source": [
"# Model Card for DistilBERT base multilingual (cased)\n",
"\n",
"详细内容请看[Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。\n"
]
},
{
"cell_type": "markdown",
"id": "a1024bec",
"metadata": {},
"source": [
"## Model Description\n"
]
},
{
"cell_type": "markdown",
"id": "bcdfe024",
"metadata": {},
"source": [
"This model is a distilled version of the [BERT base multilingual model](https://huggingface.co/bert-base-multilingual-cased/). The code for the distillation process can be found [here](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation). This model is cased: it does make a difference between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "5051aaa6",
"metadata": {},
"source": [
"The model is trained on the concatenation of Wikipedia in 104 different languages listed [here](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages).\n",
"The model has 6 layers, 768 dimension and 12 heads, totalizing 134M parameters (compared to 177M parameters for mBERT-base).\n",
"On average, this model, referred to as DistilmBERT, is twice as fast as mBERT-base.\n"
]
},
{
"cell_type": "markdown",
"id": "cdddc273",
"metadata": {},
"source": [
"We encourage potential users of this model to check out the [BERT base multilingual model card](https://huggingface.co/bert-base-multilingual-cased) to learn more about usage, limitations and potential biases.\n"
]
},
{
"cell_type": "markdown",
"id": "8eebedbf",
"metadata": {},
"source": [
"- **Developed by:** Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf (Hugging Face)\n",
"- **Model type:** Transformer-based language model\n",
"- **Language(s) (NLP):** 104 languages; see full list [here](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages)\n",
"- **License:** Apache 2.0\n",
"- **Related Models:** [BERT base multilingual model](https://huggingface.co/bert-base-multilingual-cased)\n",
"- **Resources for more information:**\n",
"- [GitHub Repository](https://github.com/huggingface/transformers/blob/main/examples/research_projects/distillation/README.md)\n",
"- [Associated Paper](https://arxiv.org/abs/1910.01108)\n"
]
},
{
"cell_type": "markdown",
"id": "e9f48c0b",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f4dde273",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b940cddf",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"distilbert-base-multilingual-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "7ab62874",
"metadata": {},
"source": [
"# Citation\n",
"\n",
"```\n",
"@article{Sanh2019DistilBERTAD,\n",
" title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},\n",
" author={Victor Sanh and Lysandre Debut and Julien Chaumond and Thomas Wolf},\n",
" journal={ArXiv},\n",
" year={2019},\n",
" volume={abs/1910.01108}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "8bdb4ee1",
"metadata": {},
"source": [
"> 此模型介绍及权重来源于 https://huggingface.co/distilbert-base-multilingual-cased ,并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "4260a150",
"metadata": {},
"source": [
"# Model Card for DistilBERT base multilingual (cased)\n",
"\n",
"You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "53f1b1c2",
"metadata": {},
"source": [
"This model is a distilled version of the [BERT base multilingual model](https://huggingface.co/bert-base-multilingual-cased/). The code for the distillation process can be found [here](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation). This model is cased: it does make a difference between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "f417583b",
"metadata": {},
"source": [
"The model is trained on the concatenation of Wikipedia in 104 different languages listed [here](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages).\n",
"The model has 6 layers, 768 dimension and 12 heads, totalizing 134M parameters (compared to 177M parameters for mBERT-base).\n",
"On average, this model, referred to as DistilmBERT, is twice as fast as mBERT-base.\n",
"\n",
"- **Developed by:** Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf (Hugging Face)\n",
"- **Model type:** Transformer-based language model\n",
"- **Language(s) (NLP):** 104 languages; see full list [here](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages)\n",
"- **License:** Apache 2.0\n",
"- **Related Models:** [BERT base multilingual model](https://huggingface.co/bert-base-multilingual-cased)\n",
"- **Resources for more information:**\n",
"- [GitHub Repository](https://github.com/huggingface/transformers/blob/main/examples/research_projects/distillation/README.md)\n",
"- [Associated Paper](https://arxiv.org/abs/1910.01108)\n"
]
},
{
"cell_type": "markdown",
"id": "f47ce9b7",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a1353b5f",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e23a860f",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"distilbert-base-multilingual-cased\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
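{
"cell_type": "markdown",
"id": "c90a44e1",
"metadata": {},
"source": [
"The snippet above uses random ids. The sketch below (assuming the tokenizer files ship with this community checkpoint) shows the multilingual WordPiece tokenization of a real sentence before running it through the backbone.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c90a44e2",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"distilbert-base-multilingual-cased\")\n",
"model = AutoModel.from_pretrained(\"distilbert-base-multilingual-cased\")\n",
"\n",
"# Inspect the multilingual WordPiece tokenization of an illustrative sentence.\n",
"encoded = tokenizer(\"PaddleNLP unterstützt mehrsprachige Modelle.\")\n",
"print(tokenizer.convert_ids_to_tokens(encoded[\"input_ids\"]))\n",
"\n",
"# Run the same ids through the 6-layer student model.\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"print(model(input_ids))"
]
},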
{
"cell_type": "markdown",
"id": "38c30ea4",
"metadata": {},
"source": [
"# Citation\n",
"\n",
"```\n",
"@article{Sanh2019DistilBERTAD,\n",
" title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},\n",
" author={Victor Sanh and Lysandre Debut and Julien Chaumond and Thomas Wolf},\n",
" journal={ArXiv},\n",
" year={2019},\n",
" volume={abs/1910.01108}\n",
"}\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "0ee03d6a",
"metadata": {},
"source": [
"> The model introduction and model weights originate from https://huggingface.co/distilbert-base-multilingual-cased and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: openwebtext
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
name: "distilgpt2"
description: "DistilGPT2"
description_en: "DistilGPT2"
icon: ""
from_repo: "https://huggingface.co/distilgpt2"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: "openwebtext"
Publisher: "huggingface"
License: "apache-2.0"
Language: "English"
description: DistilGPT2
description_en: DistilGPT2
from_repo: https://huggingface.co/distilgpt2
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: distilgpt2
Paper:
- title: 'DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter'
url: 'http://arxiv.org/abs/1910.01108v4'
- title: 'Can Model Compression Improve NLP Fairness'
url: 'http://arxiv.org/abs/2201.08542v1'
- title: 'Mitigating Gender Bias in Distilled Language Models via Counterfactual Role Reversal'
url: 'http://arxiv.org/abs/2203.12574v1'
- title: 'Quantifying the Carbon Emissions of Machine Learning'
url: 'http://arxiv.org/abs/1910.09700v2'
- title: 'Distilling the Knowledge in a Neural Network'
url: 'http://arxiv.org/abs/1503.02531v1'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- title: 'DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter'
url: http://arxiv.org/abs/1910.01108v4
- title: Can Model Compression Improve NLP Fairness
url: http://arxiv.org/abs/2201.08542v1
- title: Mitigating Gender Bias in Distilled Language Models via Counterfactual Role
Reversal
url: http://arxiv.org/abs/2203.12574v1
- title: Quantifying the Carbon Emissions of Machine Learning
url: http://arxiv.org/abs/1910.09700v2
- title: Distilling the Knowledge in a Neural Network
url: http://arxiv.org/abs/1503.02531v1
Publisher: huggingface
Task:
- sub_tag: 文本生成
sub_tag_en: Text Generation
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "72047643",
"metadata": {},
"source": [
"# DistilGPT2\n",
"\n",
"详细内容请看[GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "20c299c9",
"metadata": {},
"source": [
"DistilGPT2 (short for Distilled-GPT2) is an English-language model pre-trained with the supervision of the smallest version of Generative Pre-trained Transformer 2 (GPT-2). Like GPT-2, DistilGPT2 can be used to generate text. Users of this model card should also consider information about the design, training, and limitations of GPT-2.\n"
]
},
{
"cell_type": "markdown",
"id": "c624b3d1",
"metadata": {},
"source": [
"## Model Details\n"
]
},
{
"cell_type": "markdown",
"id": "92002396",
"metadata": {},
"source": [
"- **Developed by:** Hugging Face\n",
"- **Model type:** Transformer-based Language Model\n",
"- **Language:** English\n",
"- **License:** Apache 2.0\n",
"- **Model Description:** DistilGPT2 is an English-language model pre-trained with the supervision of the 124 million parameter version of GPT-2. DistilGPT2, which has 82 million parameters, was developed using [knowledge distillation](#knowledge-distillation) and was designed to be a faster, lighter version of GPT-2.\n",
"- **Resources for more information:** See this repository for more about Distil\\* (a class of compressed models including Distilled-GPT2), [Sanh et al. (2019)](https://arxiv.org/abs/1910.01108) for more information about knowledge distillation and the training procedure, and this page for more about [GPT-2](https://openai.com/blog/better-language-models/).\n"
]
},
{
"cell_type": "markdown",
"id": "a1a84778",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f9c6043d",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a9f0754d",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"distilgpt2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "03d3d465",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{sanh2019distilbert,\n",
"title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},\n",
"author={Sanh, Victor and Debut, Lysandre and Chaumond, Julien and Wolf, Thomas},\n",
"booktitle={NeurIPS EMC^2 Workshop},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "7966636a",
"metadata": {},
"source": [
"## Glossary\n"
]
},
{
"cell_type": "markdown",
"id": "533038ef",
"metadata": {},
"source": [
"-\t<a name=\"knowledge-distillation\">**Knowledge Distillation**</a>: As described in [Sanh et al. (2019)](https://arxiv.org/pdf/1910.01108.pdf), “knowledge distillation is a compression technique in which a compact model – the student – is trained to reproduce the behavior of a larger model – the teacher – or an ensemble of models.” Also see [Bucila et al. (2006)](https://www.cs.cornell.edu/~caruana/compression.kdd06.pdf) and [Hinton et al. (2015)](https://arxiv.org/abs/1503.02531).\n"
]
},
{
"cell_type": "markdown",
"id": "a7ff7cc1",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=distilgpt2\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/distilgpt2](https://huggingface.co/distilgpt2),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "1b34fb8a",
"metadata": {},
"source": [
"# DistilGPT2\n",
"\n",
"You can get more details from [GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)."
]
},
{
"cell_type": "markdown",
"id": "f3ab8949",
"metadata": {},
"source": [
"DistilGPT2 (short for Distilled-GPT2) is an English-language model pre-trained with the supervision of the smallest version of Generative Pre-trained Transformer 2 (GPT-2). Like GPT-2, DistilGPT2 can be used to generate text. Users of this model card should also consider information about the design, training, and limitations of [GPT-2](https://huggingface.co/gpt2).\n"
]
},
{
"cell_type": "markdown",
"id": "c6fbc1da",
"metadata": {},
"source": [
"## Model Details\n"
]
},
{
"cell_type": "markdown",
"id": "e2929e2f",
"metadata": {},
"source": [
"- **Developed by:** Hugging Face\n",
"- **Model type:** Transformer-based Language Model\n",
"- **Language:** English\n",
"- **License:** Apache 2.0\n",
"- **Model Description:** DistilGPT2 is an English-language model pre-trained with the supervision of the 124 million parameter version of GPT-2. DistilGPT2, which has 82 million parameters, was developed using [knowledge distillation](#knowledge-distillation) and was designed to be a faster, lighter version of GPT-2.\n",
"- **Resources for more information:** See [this repository](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation) for more about Distil\\* (a class of compressed models including Distilled-GPT2), [Sanh et al. (2019)](https://arxiv.org/abs/1910.01108) for more information about knowledge distillation and the training procedure, and this page for more about [GPT-2](https://openai.com/blog/better-language-models/).\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5e226406",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "51f32d75",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"distilgpt2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
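{
"cell_type": "markdown",
"id": "d47f81a0",
"metadata": {},
"source": [
"Building on the snippet above, the sketch below tokenizes a short prompt with `AutoTokenizer` (assuming the byte-pair-encoding vocabulary and merges files are published with this community checkpoint) and prints the shape of the hidden states produced by the backbone.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d47f81a1",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"distilgpt2\")\n",
"model = AutoModel.from_pretrained(\"distilgpt2\")\n",
"\n",
"# Tokenize an illustrative prompt with the GPT-2 BPE tokenizer.\n",
"encoded = tokenizer(\"Once upon a time\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"# The backbone returns the last hidden states of the 6-layer student.\n",
"last_hidden_state = model(input_ids)\n",
"print(last_hidden_state.shape)  # expected [1, seq_len, 768]"
]
},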
{
"cell_type": "markdown",
"id": "adb84dc8",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{sanh2019distilbert,\n",
"title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},\n",
"author={Sanh, Victor and Debut, Lysandre and Chaumond, Julien and Wolf, Thomas},\n",
"booktitle={NeurIPS EMC^2 Workshop},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "7d2aaec2",
"metadata": {},
"source": [
"## Glossary\n"
]
},
{
"cell_type": "markdown",
"id": "004026dd",
"metadata": {},
"source": [
"-\t<a name=\"knowledge-distillation\">**Knowledge Distillation**</a>: As described in [Sanh et al. (2019)](https://arxiv.org/pdf/1910.01108.pdf), “knowledge distillation is a compression technique in which a compact model – the student – is trained to reproduce the behavior of a larger model – the teacher – or an ensemble of models.” Also see [Bucila et al. (2006)](https://www.cs.cornell.edu/~caruana/compression.kdd06.pdf) and [Hinton et al. (2015)](https://arxiv.org/abs/1503.02531).\n"
]
},
{
"cell_type": "markdown",
"id": "f8d12799",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=distilgpt2\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/distilgpt2](https://huggingface.co/distilgpt2) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: openwebtext
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: apache-2.0
Model_Info:
name: "distilroberta-base"
description: "Model Card for DistilRoBERTa base"
description_en: "Model Card for DistilRoBERTa base"
icon: ""
from_repo: "https://huggingface.co/distilroberta-base"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Fill-Mask"
sub_tag: "槽位填充"
Example:
Datasets: "openwebtext"
Publisher: "huggingface"
License: "apache-2.0"
Language: "English"
description: Model Card for DistilRoBERTa base
description_en: Model Card for DistilRoBERTa base
from_repo: https://huggingface.co/distilroberta-base
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: distilroberta-base
Paper:
- title: 'DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter'
url: 'http://arxiv.org/abs/1910.01108v4'
- title: 'Quantifying the Carbon Emissions of Machine Learning'
url: 'http://arxiv.org/abs/1910.09700v2'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- title: 'DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter'
url: http://arxiv.org/abs/1910.01108v4
- title: Quantifying the Carbon Emissions of Machine Learning
url: http://arxiv.org/abs/1910.09700v2
Publisher: huggingface
Task:
- sub_tag: 槽位填充
sub_tag_en: Fill-Mask
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "7f49bb4b",
"metadata": {},
"source": [
"# Model Card for DistilRoBERTa base\n"
]
},
{
"cell_type": "markdown",
"id": "88c832ab",
"metadata": {},
"source": [
"## Model Description\n"
]
},
{
"cell_type": "markdown",
"id": "3a2333a1",
"metadata": {},
"source": [
"This model is a distilled version of the RoBERTa-base model. It follows the same training procedure as DistilBERT.\n",
"The code for the distillation process can be found [here](https://github.com/huggingface/transformers/tree/master/examples/distillation).\n",
"This model is case-sensitive: it makes a difference between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "9ac70255",
"metadata": {},
"source": [
"The model has 6 layers, 768 dimension and 12 heads, totalizing 82M parameters (compared to 125M parameters for RoBERTa-base).\n",
"On average DistilRoBERTa is twice as fast as Roberta-base.\n"
]
},
{
"cell_type": "markdown",
"id": "a0757c23",
"metadata": {},
"source": [
"We encourage users of this model card to check out the RoBERTa-base model card to learn more about usage, limitations and potential biases.\n"
]
},
{
"cell_type": "markdown",
"id": "2865466d",
"metadata": {},
"source": [
"- **Developed by:** Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf (Hugging Face)\n",
"- **Model type:** Transformer-based language model\n",
"- **Language(s) (NLP):** English\n",
"- **License:** Apache 2.0\n",
"- **Related Models:** RoBERTa-base model card\n",
"- [Associated Paper](https://arxiv.org/abs/1910.01108)\n"
]
},
{
"cell_type": "markdown",
"id": "a204fad3",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b2e488ed",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "43d7726b",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"distilroberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "e30fb0eb",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=distilroberta-base\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/distilroberta-base](https://huggingface.co/distilroberta-base),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "4bd898ca",
"metadata": {},
"source": [
"# Model Card for DistilRoBERTa base\n"
]
},
{
"cell_type": "markdown",
"id": "7d39a086",
"metadata": {},
"source": [
"## Model Description\n"
]
},
{
"cell_type": "markdown",
"id": "e2043d14",
"metadata": {},
"source": [
"This model is a distilled version of the [RoBERTa-base model](https://huggingface.co/roberta-base). It follows the same training procedure as [DistilBERT](https://huggingface.co/distilbert-base-uncased).\n",
"The code for the distillation process can be found [here](https://github.com/huggingface/transformers/tree/master/examples/distillation).\n",
"This model is case-sensitive: it makes a difference between english and English.\n"
]
},
{
"cell_type": "markdown",
"id": "10aefe84",
"metadata": {},
"source": [
"The model has 6 layers, 768 dimension and 12 heads, totalizing 82M parameters (compared to 125M parameters for RoBERTa-base).\n",
"On average DistilRoBERTa is twice as fast as Roberta-base.\n"
]
},
{
"cell_type": "markdown",
"id": "d7ebd775",
"metadata": {},
"source": [
"We encourage users of this model card to check out the [RoBERTa-base model card](https://huggingface.co/roberta-base) to learn more about usage, limitations and potential biases.\n"
]
},
{
"cell_type": "markdown",
"id": "423d28b1",
"metadata": {},
"source": [
"- **Developed by:** Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf (Hugging Face)\n",
"- **Model type:** Transformer-based language model\n",
"- **Language(s) (NLP):** English\n",
"- **License:** Apache 2.0\n",
"- **Related Models:** [RoBERTa-base model card](https://huggingface.co/roberta-base)\n",
"- **Resources for more information:**\n",
"- [GitHub Repository](https://github.com/huggingface/transformers/blob/main/examples/research_projects/distillation/README.md)\n",
"- [Associated Paper](https://arxiv.org/abs/1910.01108)\n"
]
},
{
"cell_type": "markdown",
"id": "715b4360",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9ad9b1a9",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "94e4d093",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"distilroberta-base\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
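{
"cell_type": "markdown",
"id": "e51b3c70",
"metadata": {},
"source": [
"A minimal sketch with real text is shown below; it assumes the BPE tokenizer files for this checkpoint are available under the same community name. The RoBERTa backbone in PaddleNLP returns a `(sequence_output, pooled_output)` pair.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e51b3c71",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"distilroberta-base\")\n",
"model = AutoModel.from_pretrained(\"distilroberta-base\")\n",
"\n",
"# Encode an illustrative sentence rather than random ids.\n",
"encoded = tokenizer(\"DistilRoBERTa is a distilled version of RoBERTa.\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"sequence_output, pooled_output = model(input_ids)\n",
"print(sequence_output.shape)  # [1, seq_len, 768]\n",
"print(pooled_output.shape)    # [1, 768]"
]
},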
{
"cell_type": "markdown",
"id": "e258a20c",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=distilroberta-base\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/distilroberta-base](https://huggingface.co/distilroberta-base) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info:
name: "gpt2-large"
description: "GPT-2 Large"
description_en: "GPT-2 Large"
icon: ""
from_repo: "https://huggingface.co/gpt2-large"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "huggingface"
License: "mit"
Language: "English"
description: GPT-2 Large
description_en: GPT-2 Large
from_repo: https://huggingface.co/gpt2-large
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: gpt2-large
Paper:
- title: 'Quantifying the Carbon Emissions of Machine Learning'
url: 'http://arxiv.org/abs/1910.09700v2'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- title: Quantifying the Carbon Emissions of Machine Learning
url: http://arxiv.org/abs/1910.09700v2
Publisher: huggingface
Task:
- sub_tag: 文本生成
sub_tag_en: Text Generation
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "32b8730a",
"metadata": {},
"source": [
"# GPT-2 Large\n",
"\n",
"详细内容请看[GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "de66cac3",
"metadata": {},
"source": [
"## Table of Contents\n",
"- [Model Details](#model-details)\n",
"- [How To Get Started With the Model](#how-to-get-started-with-the-model)\n",
"- [Uses](#uses)\n",
"- [Risks, Limitations and Biases](#risks-limitations-and-biases)\n",
"- [Training](#training)\n",
"- [Evaluation](#evaluation)\n",
"- [Environmental Impact](#environmental-impact)\n",
"- [Technical Specifications](#technical-specifications)\n",
"- [Citation Information](#citation-information)\n",
"- [Model Card Authors](#model-card-author)\n"
]
},
{
"cell_type": "markdown",
"id": "8afa58ef",
"metadata": {},
"source": [
"## Model Details\n"
]
},
{
"cell_type": "markdown",
"id": "e4e46496",
"metadata": {},
"source": [
"**Model Description:** GPT-2 Large is the **774M parameter** version of GPT-2, a transformer-based language model created and released by OpenAI. The model is a pretrained model on English language using a causal language modeling (CLM) objective.\n"
]
},
{
"cell_type": "markdown",
"id": "15b8f634",
"metadata": {},
"source": [
"- **Developed by:** OpenAI, see [associated research paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) and [GitHub repo](https://github.com/openai/gpt-2) for model developers.\n",
"- **Model Type:** Transformer-based language model\n",
"- **Language(s):** English\n",
"- **License:** [Modified MIT License](https://github.com/openai/gpt-2/blob/master/LICENSE)\n",
"- **Related Models:** [GPT-2](https://huggingface.co/gpt2), [GPT-Medium](https://huggingface.co/gpt2-medium) and [GPT-XL](https://huggingface.co/gpt2-xl)\n",
"- **Resources for more information:**\n",
"- [Research Paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n",
"- [OpenAI Blog Post](https://openai.com/blog/better-language-models/)\n",
"- [GitHub Repo](https://github.com/openai/gpt-2)\n",
"- [OpenAI Model Card for GPT-2](https://github.com/openai/gpt-2/blob/master/model_card.md)\n",
"- Test the full generation capabilities here: https://transformer.huggingface.co/doc/gpt2-large\n"
]
},
{
"cell_type": "markdown",
"id": "6c2023d9",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b17e6efb",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "33c1f565",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"gpt2-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "8060d283",
"metadata": {},
"source": [
"## Citatioin\n",
"\n",
"```\n",
"@article{radford2019language,\n",
"title={Language models are unsupervised multitask learners},\n",
"author={Radford, Alec and Wu, Jeffrey and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya and others},\n",
"journal={OpenAI blog},\n",
"volume={1},\n",
"number={8},\n",
"pages={9},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "083f0d9c",
"metadata": {},
"source": [
"## Model Card Authors\n"
]
},
{
"cell_type": "markdown",
"id": "f9e4bb43",
"metadata": {},
"source": [
"This model card was written by the Hugging Face team.\n",
"\n",
"> 此模型介绍及权重来源于[https://huggingface.co/gpt2-large](https://huggingface.co/gpt2-large),并转换为飞桨模型格式。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "dc26013b",
"metadata": {},
"source": [
"# GPT-2 Large\n",
"\n",
"You can get more details from [GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)."
]
},
{
"cell_type": "markdown",
"id": "38e29e37",
"metadata": {},
"source": [
"## Table of Contents\n",
"- [Model Details](#model-details)\n",
"- [How To Get Started With the Model](#how-to-get-started-with-the-model)\n",
"- [Uses](#uses)\n",
"- [Risks, Limitations and Biases](#risks-limitations-and-biases)\n",
"- [Training](#training)\n",
"- [Evaluation](#evaluation)\n",
"- [Environmental Impact](#environmental-impact)\n",
"- [Technical Specifications](#technical-specifications)\n",
"- [Citation Information](#citation-information)\n",
"- [Model Card Authors](#model-card-author)\n"
]
},
{
"cell_type": "markdown",
"id": "590c3fbd",
"metadata": {},
"source": [
"## Model Details\n"
]
},
{
"cell_type": "markdown",
"id": "1a2cd621",
"metadata": {},
"source": [
"**Model Description:** GPT-2 Large is the **774M parameter** version of GPT-2, a transformer-based language model created and released by OpenAI. The model is a pretrained model on English language using a causal language modeling (CLM) objective.\n"
]
},
{
"cell_type": "markdown",
"id": "0155f43f",
"metadata": {},
"source": [
"- **Developed by:** OpenAI, see [associated research paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) and [GitHub repo](https://github.com/openai/gpt-2) for model developers.\n",
"- **Model Type:** Transformer-based language model\n",
"- **Language(s):** English\n",
"- **License:** [Modified MIT License](https://github.com/openai/gpt-2/blob/master/LICENSE)\n",
"- **Related Models:** https://huggingface.co/gpt2, GPT-Medium and GPT-XL\n",
"- **Resources for more information:**\n",
"- [Research Paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n",
"- [OpenAI Blog Post](https://openai.com/blog/better-language-models/)\n",
"- [GitHub Repo](https://github.com/openai/gpt-2)\n",
"- [OpenAI Model Card for GPT-2](https://github.com/openai/gpt-2/blob/master/model_card.md)\n"
]
},
{
"cell_type": "markdown",
"id": "18e2772d",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "30207821",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2ae65fe6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"gpt2-large\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
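{
"cell_type": "markdown",
"id": "f62c9d30",
"metadata": {},
"source": [
"Beyond the random-id check above, the following minimal sketch encodes a prompt with `AutoTokenizer` (assuming the vocabulary and merges files ship with this community checkpoint) and prints the shape of the hidden states; GPT-2 Large uses a hidden size of 1280.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f62c9d31",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"gpt2-large\")\n",
"model = AutoModel.from_pretrained(\"gpt2-large\")\n",
"\n",
"# Tokenize an illustrative prompt with the GPT-2 BPE tokenizer.\n",
"encoded = tokenizer(\"Language models are unsupervised multitask learners\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"# The backbone returns the last hidden states.\n",
"last_hidden_state = model(input_ids)\n",
"print(last_hidden_state.shape)  # expected [1, seq_len, 1280]"
]
},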
{
"cell_type": "markdown",
"id": "e8b7c92b",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@article{radford2019language,\n",
"title={Language models are unsupervised multitask learners},\n",
"author={Radford, Alec and Wu, Jeffrey and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya and others},\n",
"journal={OpenAI blog},\n",
"volume={1},\n",
"number={8},\n",
"pages={9},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "7cded70d",
"metadata": {},
"source": [
"## Model Card Authors\n"
]
},
{
"cell_type": "markdown",
"id": "ff9ab2d4",
"metadata": {},
"source": [
"This model card was written by the Hugging Face team.\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/gpt2-large](https://huggingface.co/gpt2-large) and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info:
name: "gpt2-medium"
description: "GPT-2 Medium"
description_en: "GPT-2 Medium"
icon: ""
from_repo: "https://huggingface.co/gpt2-medium"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Text Generation"
sub_tag: "文本生成"
Example:
Datasets: ""
Publisher: "huggingface"
License: "mit"
Language: "English"
description: GPT-2 Medium
description_en: GPT-2 Medium
from_repo: https://huggingface.co/gpt2-medium
icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
name: gpt2-medium
Paper:
- title: 'Quantifying the Carbon Emissions of Machine Learning'
url: 'http://arxiv.org/abs/1910.09700v2'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file
- title: Quantifying the Carbon Emissions of Machine Learning
url: http://arxiv.org/abs/1910.09700v2
Publisher: huggingface
Task:
- sub_tag: 文本生成
sub_tag_en: Text Generation
tag: 自然语言处理
tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "25324e9c",
"metadata": {},
"source": [
"# GPT-2 Medium\n",
"\n",
"详细内容请看[GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "806177e3",
"metadata": {},
"source": [
"## Model Details\n"
]
},
{
"cell_type": "markdown",
"id": "dbcaecb0",
"metadata": {},
"source": [
"**Model Description:** GPT-2 Medium is the **355M parameter** version of GPT-2, a transformer-based language model created and released by OpenAI. The model is a pretrained model on English language using a causal language modeling (CLM) objective.\n"
]
},
{
"cell_type": "markdown",
"id": "ab73e9f0",
"metadata": {},
"source": [
"- **Developed by:** OpenAI, see [associated research paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) and [GitHub repo](https://github.com/openai/gpt-2) for model developers.\n",
"- **Model Type:** Transformer-based language model\n",
"- **Language(s):** English\n",
"- **License:** [Modified MIT License](https://github.com/openai/gpt-2/blob/master/LICENSE)\n",
"- **Related Models:** [GPT2](https://huggingface.co/gpt2), [GPT2-Large](https://huggingface.co/gpt2-large) and [GPT2-XL](https://huggingface.co/gpt2-xl)\n",
"- **Resources for more information:**\n",
"- [Research Paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n",
"- [OpenAI Blog Post](https://openai.com/blog/better-language-models/)\n",
"- [GitHub Repo](https://github.com/openai/gpt-2)\n",
"- [OpenAI Model Card for GPT-2](https://github.com/openai/gpt-2/blob/master/model_card.md)\n",
"- Test the full generation capabilities here: https://transformer.huggingface.co/doc/gpt2-large\n"
]
},
{
"cell_type": "markdown",
"id": "70c3fd36",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1bae5ee0",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "11b32577",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"gpt2-medium\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "08f90ea0",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@article{radford2019language,\n",
"title={Language models are unsupervised multitask learners},\n",
"author={Radford, Alec and Wu, Jeffrey and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya and others},\n",
"journal={OpenAI blog},\n",
"volume={1},\n",
"number={8},\n",
"pages={9},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "64d79312",
"metadata": {},
"source": [
"## Model Card Authors\n"
]
},
{
"cell_type": "markdown",
"id": "d14dd2ac",
"metadata": {},
"source": [
"This model card was written by the Hugging Face team.\n",
"\n",
"> 此模型介绍及权重来源于 https://huggingface.co/gpt2-medium ,并转换为飞桨模型格式。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "46995787",
"metadata": {},
"source": [
"# GPT-2 Medium\n",
"\n",
"You can get more details from [GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)."
]
},
{
"cell_type": "markdown",
"id": "f695ad73",
"metadata": {},
"source": [
"## Model Details\n"
]
},
{
"cell_type": "markdown",
"id": "5a8170d9",
"metadata": {},
"source": [
"**Model Description:** GPT-2 Medium is the **355M parameter** version of GPT-2, a transformer-based language model created and released by OpenAI. The model is a pretrained model on English language using a causal language modeling (CLM) objective.\n"
]
},
{
"cell_type": "markdown",
"id": "1d0dc244",
"metadata": {},
"source": [
"- **Developed by:** OpenAI, see [associated research paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) and [GitHub repo](https://github.com/openai/gpt-2) for model developers.\n",
"- **Model Type:** Transformer-based language model\n",
"- **Language(s):** English\n",
"- **License:** [Modified MIT License](https://github.com/openai/gpt-2/blob/master/LICENSE)\n",
"- **Related Models:** [GPT2](https://huggingface.co/gpt2), [GPT2-Large](https://huggingface.co/gpt2-large) and [GPT2-XL](https://huggingface.co/gpt2-xl)\n",
"- **Resources for more information:**\n",
"- [Research Paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n",
"- [OpenAI Blog Post](https://openai.com/blog/better-language-models/)\n",
"- [GitHub Repo](https://github.com/openai/gpt-2)\n",
"- [OpenAI Model Card for GPT-2](https://github.com/openai/gpt-2/blob/master/model_card.md)\n",
"- Test the full generation capabilities here: https://transformer.huggingface.co/doc/gpt2-large\n"
]
},
{
"cell_type": "markdown",
"id": "adc5a3f9",
"metadata": {},
"source": [
"## How to Get Started with the Model\n"
]
},
{
"cell_type": "markdown",
"id": "7566eafd",
"metadata": {},
"source": [
"Use the code below to get started with the model. \n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ab4c71ee",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b0167528",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"gpt2-medium\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
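{
"cell_type": "markdown",
"id": "a83e5f40",
"metadata": {},
"source": [
"As a complement to the cell above, the sketch below feeds a tokenized prompt to the backbone (assuming the tokenizer files resolve for this community checkpoint); GPT-2 Medium uses a hidden size of 1024.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a83e5f41",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"gpt2-medium\")\n",
"model = AutoModel.from_pretrained(\"gpt2-medium\")\n",
"\n",
"# Tokenize an illustrative prompt and count its subword tokens.\n",
"encoded = tokenizer(\"GPT-2 Medium is the 355M parameter version of GPT-2.\")\n",
"print(len(encoded[\"input_ids\"]))\n",
"\n",
"# The backbone returns the last hidden states.\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"last_hidden_state = model(input_ids)\n",
"print(last_hidden_state.shape)  # expected [1, seq_len, 1024]"
]
},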
{
"cell_type": "markdown",
"id": "52cdcf9e",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@article{radford2019language,\n",
"title={Language models are unsupervised multitask learners},\n",
"author={Radford, Alec and Wu, Jeffrey and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya and others},\n",
"journal={OpenAI blog},\n",
"volume={1},\n",
"number={8},\n",
"pages={9},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "eb327c10",
"metadata": {},
"source": [
"## Model Card Authors\n"
]
},
{
"cell_type": "markdown",
"id": "50fb7de8",
"metadata": {},
"source": [
 The model introduction and model weights originate from https://huggingface.co/gpt2-medium">
"This model card was written by the Hugging Face team.\n",
"\n",
"> The model introduction and model weights originate from https://huggingface.co/gpt2-medium and were converted to PaddlePaddle format for ease of use in PaddleNLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: ''
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: English
License: mit
Model_Info:
  description: GPT-2
  description_en: GPT-2
  from_repo: https://huggingface.co/gpt2
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: gpt2
Paper: null
Publisher: huggingface
Task:
- sub_tag: 文本生成
  sub_tag_en: Text Generation
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "a4cd103f",
"metadata": {},
"source": [
"# GPT-2\n",
"\n",
"详细内容请看[GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)。"
]
},
{
"cell_type": "markdown",
"id": "e10dfe6d",
"metadata": {},
"source": [
"Pretrained model on English language using a causal language modeling (CLM) objective. It was introduced in\n",
"[this paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n",
"and first released at [this page](https://openai.com/blog/better-language-models/).\n"
]
},
{
"cell_type": "markdown",
"id": "d1b13043",
"metadata": {},
"source": [
"Disclaimer: The team releasing GPT-2 also wrote a\n",
"[model card](https://github.com/openai/gpt-2/blob/master/model_card.md) for their model. Content from this model card\n",
"has been written by the Hugging Face team to complete the information they provided and give specific examples of bias.\n"
]
},
{
"cell_type": "markdown",
"id": "016271a5",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "e3a53155",
"metadata": {},
"source": [
"GPT-2 is a transformers model pretrained on a very large corpus of English data in a self-supervised fashion. This\n",
"means it was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots\n",
"of publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely,\n",
"it was trained to guess the next word in sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "6836ad17",
"metadata": {},
"source": [
"More precisely, inputs are sequences of continuous text of a certain length and the targets are the same sequence,\n",
"shifted one token (word or piece of word) to the right. The model uses internally a mask-mechanism to make sure the\n",
"predictions for the token `i` only uses the inputs from `1` to `i` but not the future tokens.\n"
]
},
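{
"cell_type": "markdown",
"id": "a1b2c3d4",
"metadata": {},
"source": [
"As a minimal, illustrative sketch of the input/target pairing described above (not part of the original card; the token ids below are made up), the next cell builds a causal-LM training pair by shifting a toy sequence one position to the right.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a1b2c3d5",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"\n",
"# Toy sequence of token ids (values are arbitrary and purely illustrative)\n",
"token_ids = paddle.to_tensor([[11, 42, 7, 99, 23, 5]])\n",
"\n",
"inputs = token_ids[:, :-1]   # the model reads positions 0 .. n-2\n",
"targets = token_ids[:, 1:]   # and learns to predict positions 1 .. n-1 (one token to the right)\n",
"\n",
"print(\"inputs :\", inputs.numpy())\n",
"print(\"targets:\", targets.numpy())"
]
},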
{
"cell_type": "markdown",
"id": "26946ce6",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks. The model is best at what it was pretrained for however, which is generating texts from a\n",
"prompt.\n"
]
},
{
"cell_type": "markdown",
"id": "571b41cf",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a6233e8e",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2e906136",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"gpt2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "78f26b7f",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@article{radford2019language,\n",
"title={Language Models are Unsupervised Multitask Learners},\n",
"author={Radford, Alec and Wu, Jeff and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "2f646c57",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=gpt2\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
 此模型介绍及权重来源于[https">
"> The model introduction and model weights originate from [https://huggingface.co/gpt2](https://huggingface.co/gpt2) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"vscode": {
"interpreter": {
"hash": "606ea184b8fed3419d714b545dc1784fad6c99d0cc940b6b9d787dccf225faa5"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "2d373572",
"metadata": {},
"source": [
"# GPT-2\n",
"\n",
"You can get more details from [GPT2 in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/gpt/README.md)."
]
},
{
"cell_type": "markdown",
"id": "00be5831",
"metadata": {},
"source": [
"Test the whole generation capabilities here: https://transformer.huggingface.co/doc/gpt2-large\n"
]
},
{
"cell_type": "markdown",
"id": "b5857cc2",
"metadata": {},
"source": [
"Pretrained model on English language using a causal language modeling (CLM) objective. It was introduced in\n",
"[this paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n",
"and first released at [this page](https://openai.com/blog/better-language-models/).\n"
]
},
{
"cell_type": "markdown",
"id": "b0abac76",
"metadata": {},
"source": [
"Disclaimer: The team releasing GPT-2 also wrote a\n",
"[model card](https://github.com/openai/gpt-2/blob/master/model_card.md) for their model. Content from this model card\n",
"has been written by the Hugging Face team to complete the information they provided and give specific examples of bias.\n"
]
},
{
"cell_type": "markdown",
"id": "fa2c7f4b",
"metadata": {},
"source": [
"## Model description\n"
]
},
{
"cell_type": "markdown",
"id": "294521bd",
"metadata": {},
"source": [
"GPT-2 is a transformers model pretrained on a very large corpus of English data in a self-supervised fashion. This\n",
"means it was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots\n",
"of publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely,\n",
"it was trained to guess the next word in sentences.\n"
]
},
{
"cell_type": "markdown",
"id": "b1204c32",
"metadata": {},
"source": [
"More precisely, inputs are sequences of continuous text of a certain length and the targets are the same sequence,\n",
"shifted one token (word or piece of word) to the right. The model uses internally a mask-mechanism to make sure the\n",
"predictions for the token `i` only uses the inputs from `1` to `i` but not the future tokens.\n"
]
},
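{
"cell_type": "markdown",
"id": "b2c3d4e5",
"metadata": {},
"source": [
"The mask mechanism mentioned above can be visualised with a standard lower-triangular (causal) attention mask. The sketch below is illustrative only and is not taken from the model's internal implementation.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b2c3d4e6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"\n",
"seq_len = 5\n",
"# Row i has ones in columns 0..i: position i may attend to itself and earlier positions only\n",
"causal_mask = paddle.tril(paddle.ones([seq_len, seq_len]))\n",
"print(causal_mask.numpy())"
]
},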
{
"cell_type": "markdown",
"id": "a019cc9e",
"metadata": {},
"source": [
"This way, the model learns an inner representation of the English language that can then be used to extract features\n",
"useful for downstream tasks. The model is best at what it was pretrained for however, which is generating texts from a\n",
"prompt.\n"
]
},
{
"cell_type": "markdown",
"id": "54ae8500",
"metadata": {},
"source": [
"## How to use"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d33fddda",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d0e160c6",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"gpt2\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
{
"cell_type": "markdown",
"id": "fcb8a843",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@article{radford2019language,\n",
"title={Language Models are Unsupervised Multitask Learners},\n",
"author={Radford, Alec and Wu, Jeff and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya},\n",
"year={2019}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "513848f8",
"metadata": {},
"source": [
"<a href=\"https://huggingface.co/exbert/?model=gpt2\">\n",
"<img width=\"300px\" src=\"https://cdn-media.huggingface.co/exbert/button.png\">\n",
"</a>\n",
"\n",
"\n",
"> The model introduction and model weights originate from [https://huggingface.co/gpt2](https://huggingface.co/gpt2) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Datasets: indonlu
Example: null
IfOnlineDemo: 0
IfTraining: 0
Language: Indonesian
License: mit
Model_Info:
  description: IndoBERT Base Model (phase1 - uncased)
  description_en: IndoBERT Base Model (phase1 - uncased)
  from_repo: https://huggingface.co/indobenchmark/indobert-base-p1
  icon: https://paddlenlp.bj.bcebos.com/models/community/transformer-layer.png
  name: indobenchmark/indobert-base-p1
Paper:
- title: 'IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding'
  url: http://arxiv.org/abs/2009.05387v3
Publisher: indobenchmark
Task:
- sub_tag: 特征抽取
  sub_tag_en: Feature Extraction
  tag: 自然语言处理
  tag_en: Natural Language Processing
{
"cells": [
{
"cell_type": "markdown",
"id": "3f5a12e4",
"metadata": {},
"source": [
"# IndoBERT Base Model (phase1 - uncased)\n"
]
},
{
"cell_type": "markdown",
"id": "e2fcac01",
"metadata": {},
"source": [
"[IndoBERT](https://arxiv.org/abs/2009.05387) is a state-of-the-art language model for Indonesian based on the BERT model. The pretrained model is trained using a masked language modeling (MLM) objective and next sentence prediction (NSP) objective.\n"
]
},
{
"cell_type": "markdown",
"id": "6a9d6a02",
"metadata": {},
"source": [
"## All Pre-trained Models\n"
]
},
{
"cell_type": "markdown",
"id": "3020975b",
"metadata": {},
"source": [
"| Model | #params | Arch. | Training data |\n",
"|--------------------------------|--------------------------------|-------|-----------------------------------|\n",
"| `indobenchmark/indobert-base-p1` | 124.5M | Base | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-base-p2` | 124.5M | Base | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-large-p1` | 335.2M | Large | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-large-p2` | 335.2M | Large | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-lite-base-p1` | 11.7M | Base | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-lite-base-p2` | 11.7M | Base | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-lite-large-p1` | 17.7M | Large | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-lite-large-p2` | 17.7M | Large | Indo4B (23.43 GB of text) |\n"
]
},
{
"cell_type": "markdown",
"id": "d0e3771a",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f1f38760",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a11bc38f",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"indobenchmark/indobert-base-p1\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
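{
"cell_type": "markdown",
"id": "e5f6a7b8",
"metadata": {},
"source": [
"As an illustrative extension of the feature-extraction use case (not part of the original card), the cell below compares two Indonesian sentences by the cosine similarity of their pooled outputs. It assumes the tokenizer assets are available for this community model and that the model returns `(sequence_output, pooled_output)`; the example sentences are arbitrary.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e5f6a7b9",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"import paddle.nn.functional as F\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"indobenchmark/indobert-base-p1\")\n",
"model = AutoModel.from_pretrained(\"indobenchmark/indobert-base-p1\")\n",
"\n",
"def pooled_embedding(text):\n",
"    # Encode a sentence and return its pooled [CLS]-based representation\n",
"    input_ids = paddle.to_tensor([tokenizer(text)[\"input_ids\"]])\n",
"    _, pooled_output = model(input_ids)\n",
"    return pooled_output\n",
"\n",
"emb_a = pooled_embedding(\"aku suka membaca buku\")\n",
"emb_b = pooled_embedding(\"saya senang membaca novel\")\n",
"print(F.cosine_similarity(emb_a, emb_b))"
]
},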
{
"cell_type": "markdown",
"id": "d1fe4366",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{wilie2020indonlu,\n",
"title={IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding},\n",
"author={Bryan Wilie and Karissa Vincentio and Genta Indra Winata and Samuel Cahyawijaya and X. Li and Zhi Yuan Lim and S. Soleman and R. Mahendra and Pascale Fung and Syafri Bahar and A. Purwarianti},\n",
"booktitle={Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing},\n",
"year={2020}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "95f83dc9",
"metadata": {},
"source": [
 此模型介绍及权重来源于[https">
"> The model introduction and model weights originate from [https://huggingface.co/indobenchmark/indobert-base-p1](https://huggingface.co/indobenchmark/indobert-base-p1) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "markdown",
"id": "d6793868",
"metadata": {},
"source": [
"# IndoBERT Base Model (phase1 - uncased)\n"
]
},
{
"cell_type": "markdown",
"id": "48b35590",
"metadata": {},
"source": [
"[IndoBERT](https://arxiv.org/abs/2009.05387) is a state-of-the-art language model for Indonesian based on the BERT model. The pretrained model is trained using a masked language modeling (MLM) objective and next sentence prediction (NSP) objective.\n"
]
},
{
"cell_type": "markdown",
"id": "e5dc323c",
"metadata": {},
"source": [
"## All Pre-trained Models\n"
]
},
{
"cell_type": "markdown",
"id": "7db5d6e5",
"metadata": {},
"source": [
"| Model | #params | Arch. | Training data |\n",
"|--------------------------------|--------------------------------|-------|-----------------------------------|\n",
"| `indobenchmark/indobert-base-p1` | 124.5M | Base | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-base-p2` | 124.5M | Base | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-large-p1` | 335.2M | Large | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-large-p2` | 335.2M | Large | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-lite-base-p1` | 11.7M | Base | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-lite-base-p2` | 11.7M | Base | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-lite-large-p1` | 17.7M | Large | Indo4B (23.43 GB of text) |\n",
"| `indobenchmark/indobert-lite-large-p2` | 17.7M | Large | Indo4B (23.43 GB of text) |\n"
]
},
{
"cell_type": "markdown",
"id": "fc8827fd",
"metadata": {},
"source": [
"## How to use\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e5b6e205",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade paddlenlp"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6701163d",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel\n",
"\n",
"model = AutoModel.from_pretrained(\"indobenchmark/indobert-base-p1\")\n",
"input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
"print(model(input_ids))"
]
},
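{
"cell_type": "markdown",
"id": "d4e5f6a7",
"metadata": {},
"source": [
"Beyond the random-id smoke test above, the cell below sketches feature extraction from a real Indonesian sentence. It is illustrative only: it assumes the community files for `indobenchmark/indobert-base-p1` include the tokenizer assets and that the forward pass returns the usual `(sequence_output, pooled_output)` pair; adjust for your PaddleNLP version if needed.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4e5f6a8",
"metadata": {},
"outputs": [],
"source": [
"import paddle\n",
"from paddlenlp.transformers import AutoModel, AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"indobenchmark/indobert-base-p1\")\n",
"model = AutoModel.from_pretrained(\"indobenchmark/indobert-base-p1\")\n",
"\n",
"# Encode an Indonesian sentence and wrap the ids in a batch-of-one tensor\n",
"encoded = tokenizer(\"aku suka membaca buku\")\n",
"input_ids = paddle.to_tensor([encoded[\"input_ids\"]])\n",
"\n",
"# BERT-style models in PaddleNLP typically return (sequence_output, pooled_output)\n",
"sequence_output, pooled_output = model(input_ids)\n",
"print(sequence_output.shape, pooled_output.shape)"
]
},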
{
"cell_type": "markdown",
"id": "fb28cf5b",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"```\n",
"@inproceedings{wilie2020indonlu,\n",
"title={IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding},\n",
"author={Bryan Wilie and Karissa Vincentio and Genta Indra Winata and Samuel Cahyawijaya and X. Li and Zhi Yuan Lim and S. Soleman and R. Mahendra and Pascale Fung and Syafri Bahar and A. Purwarianti},\n",
"booktitle={Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing},\n",
"year={2020}\n",
"}\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "e155d1ce",
"metadata": {},
"source": [
"> The model introduction and model weights originate from [https://huggingface.co/indobenchmark/indobert-base-p1](https://huggingface.co/indobenchmark/indobert-base-p1) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
# Model List
## johngiorgi/declutr-base
| Model | Description | Model Size | Download |
| --- | --- | --- | --- |
|johngiorgi/declutr-base| | 625.22MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/vocab.txt) |
You can also download the corresponding model weights with the `paddlenlp` CLI tool, as follows:
* Install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* Download the model via the command line
```shell
paddlenlp download --cache-dir ./pretrained_models johngiorgi/declutr-base
```
If you have any problems downloading, you can open an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP).
\ No newline at end of file
# model list
## johngiorgi/declutr-base
| model | description | model_size | download |
| --- | --- | --- | --- |
|johngiorgi/declutr-base| | 625.22MB | [merges.txt](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/merges.txt)<br>[model_config.json](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/model_config.json)<br>[model_state.pdparams](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/model_state.pdparams)<br>[tokenizer_config.json](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/tokenizer_config.json)<br>[vocab.json](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/vocab.json)<br>[vocab.txt](https://bj.bcebos.com/paddlenlp/models/community/johngiorgi/declutr-base/vocab.txt) |
Alternatively, you can download all of the model files with the following steps:
* install paddlenlp
```shell
pip install --upgrade paddlenlp
```
* download model with cli tool
```shell
paddlenlp download --cache-dir ./pretrained_models johngiorgi/declutr-base
```
If you have any problems, you can post an issue on [PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP) to get support.
Model_Info:
name: "johngiorgi/declutr-base"
description: "DeCLUTR-base"
description_en: "DeCLUTR-base"
icon: ""
from_repo: "https://huggingface.co/johngiorgi/declutr-base"
Task:
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Sentence Similarity"
sub_tag: "句子相似度"
- tag_en: "Natural Language Processing"
tag: "自然语言处理"
sub_tag_en: "Feature Extraction"
sub_tag: "特征抽取"
Example:
Datasets: "openwebtext"
Publisher: "johngiorgi"
License: "apache-2.0"
Language: "English"
Paper:
- title: 'DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations'
url: 'http://arxiv.org/abs/2006.03659v4'
IfTraining: 0
IfOnlineDemo: 0
\ No newline at end of file