{ "cells": [ { "cell_type": "markdown", "id": "b0541e6a", "metadata": {}, "source": [ "# roberta-large-ner-english: model fine-tuned from roberta-large for the NER task\n" ] }, { "cell_type": "markdown", "id": "c85540d7", "metadata": {}, "source": [ "## Introduction\n" ] }, { "cell_type": "markdown", "id": "c2e2ebde", "metadata": {}, "source": [ "roberta-large-ner-english is an English NER model that was fine-tuned from roberta-large on the CoNLL-2003 dataset.\n", "The model was validated on email/chat data and outperformed other models on this type of data specifically.\n", "In particular, the model seems to work better on entities that don't start with an uppercase letter.\n" ] }, { "cell_type": "markdown", "id": "4f6d5dbe", "metadata": {}, "source": [ "## How to use" ] }, { "cell_type": "code", "execution_count": null, "id": "a159cf92", "metadata": {}, "outputs": [], "source": [ "!pip install --upgrade paddlenlp" ] }, { "cell_type": "code", "execution_count": null, "id": "daa60299", "metadata": {}, "outputs": [], "source": [ "import paddle\n", "from paddlenlp.transformers import AutoModel\n", "\n", "# Load the converted PaddlePaddle weights by model name\n", "model = AutoModel.from_pretrained(\"Jean-Baptiste/roberta-large-ner-english\")\n", "# Dummy token IDs stand in for tokenized text; this only checks that the model loads and runs\n", "input_ids = paddle.randint(100, 200, shape=[1, 20])\n", "print(model(input_ids))" ] }, { "cell_type": "markdown", "id": "2a66154e", "metadata": {}, "source": [ "For those interested, here is a short article on how I used the results of this model to train an LSTM model for signature detection in emails:\n", "https://medium.com/@jean-baptiste.polle/lstm-model-for-email-signature-detection-8e990384fefa\n", "\n", "> The model introduction and model weights originate from https://huggingface.co/Jean-Baptiste/roberta-large-ner-english and were converted to PaddlePaddle format for ease of use in PaddleNLP." 
] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.13" } }, "nbformat": 4, "nbformat_minor": 5 }