{ "cells": [ { "cell_type": "markdown", "id": "28a330b8", "metadata": {}, "source": [ "# bert-base-romanian-cased-v1\n" ] }, { "cell_type": "markdown", "id": "36f0d74f", "metadata": {}, "source": [ "The BERT **base**, **cased** model for Romanian, trained on a 15GB corpus." ] }, { "cell_type": "markdown", "id": "0104e14e", "metadata": {}, "source": [ "## How to use" ] }, { "cell_type": "code", "execution_count": null, "id": "b4ca4271", "metadata": {}, "outputs": [], "source": [ "!pip install --upgrade paddlenlp" ] }, { "cell_type": "code", "execution_count": null, "id": "9f3ca553", "metadata": {}, "outputs": [], "source": [ "import paddle\n", "from paddlenlp.transformers import AutoModel\n", "\n", "# Load the pretrained Romanian BERT weights (converted to PaddlePaddle format)\n", "model = AutoModel.from_pretrained(\"dumitrescustefan/bert-base-romanian-cased-v1\")\n", "# Placeholder input: random token IDs in [100, 200), shape [batch=1, seq_len=20]\n", "input_ids = paddle.randint(100, 200, shape=[1, 20])\n", "# Run a forward pass and print the model outputs\n", "print(model(input_ids))" ] }, { "cell_type": "markdown", "id": "51754d3f", "metadata": {}, "source": [ "## Citation\n", "\n", "```\n", "@inproceedings{dumitrescu-etal-2020-birth,\n", "title = \"The birth of {R}omanian {BERT}\",\n", "author = \"Dumitrescu, Stefan and\n", "Avram, Andrei-Marius and\n", "Pyysalo, Sampo\",\n", "booktitle = \"Findings of the Association for Computational Linguistics: EMNLP 2020\",\n", "month = nov,\n", "year = \"2020\",\n", "address = \"Online\",\n", "publisher = \"Association for Computational Linguistics\",\n", "url = \"https://aclanthology.org/2020.findings-emnlp.387\",\n", "doi = \"10.18653/v1/2020.findings-emnlp.387\",\n", "pages = \"4324--4328\",\n", "}\n", "```" ] }, { "cell_type": "markdown", "id": "2143146f", "metadata": {}, "source": [ "#### Acknowledgements\n" ] }, { "cell_type": "markdown", "id": "d983ac22", "metadata": {}, "source": [ "- We'd like to thank [Sampo Pyysalo](https://github.com/spyysalo) from TurkuNLP for helping us out with the compute needed to pretrain the v1.0 BERT models. He's awesome!\n", "\n", "> The model introduction and model weights originate from [https://huggingface.co/dumitrescustefan/bert-base-romanian-cased-v1](https://huggingface.co/dumitrescustefan/bert-base-romanian-cased-v1) and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.13" } }, "nbformat": 4, "nbformat_minor": 5 }