smartid-poc-main/LICENSE

BSD 3-Clause License
Copyright (c) 2023, ETSI
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
smartid-poc-main/POC_SmartID_v3.ipynb

{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"copyright": "Copyright ETSI 2023. This file is licensed under the BSD 3-Clause License"
},
"source": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "IJ-AEpS5jUI2"
},
"source": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "BMMkANxi-LAJ"
},
"source": [
"\n",
"## a. Installing the packages and importing the modules\n",
"---\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "nXQKofdtai9B",
"outputId": "738c9b61-cd7e-4d5b-ff96-1761d713611c"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Package Version\n",
"---------------------------- -----------\n",
"absl-py 1.3.0\n",
"aiohttp 3.8.3\n",
"aiosignal 1.3.1\n",
"altair 4.2.0\n",
"anyio 3.6.2\n",
"argon2-cffi 21.3.0\n",
"argon2-cffi-bindings 21.2.0\n",
"arrow 1.2.3\n",
"asttokens 2.2.1\n",
"astunparse 1.6.3\n",
"async-timeout 4.0.2\n",
"attrs 22.1.0\n",
"backcall 0.2.0\n",
"beautifulsoup4 4.11.1\n",
"bleach 5.0.1\n",
"cachetools 5.2.0\n",
"certifi 2022.12.7\n",
"cffi 1.15.1\n",
"charset-normalizer 2.1.1\n",
"click 8.1.3\n",
"comm 0.1.2\n",
"contourpy 1.0.6\n",
"cuda-python 12.0.0\n",
"cycler 0.11.0\n",
"Cython 0.29.32\n",
"dbus-python 1.2.18\n",
"debugpy 1.6.4\n",
"decorator 5.1.1\n",
"defusedxml 0.7.1\n",
"entrypoints 0.4\n",
"executing 1.2.0\n",
"fastapi 0.88.0\n",
"fastjsonschema 2.16.2\n",
"ffmpy 0.3.0\n",
"filelock 3.8.2\n",
"flatbuffers 1.12\n",
"fonttools 4.38.0\n",
"fqdn 1.5.1\n",
"frozenlist 1.3.3\n",
"fsspec 2022.11.0\n",
"gast 0.4.0\n",
"google-auth 2.15.0\n",
"google-auth-oauthlib 0.4.6\n",
"google-pasta 0.2.0\n",
"gradio 3.16.0\n",
"grpcio 1.51.1\n",
"gyp 0.1\n",
"h11 0.14.0\n",
"h5py 3.7.0\n",
"httpcore 0.16.3\n",
"httpx 0.23.3\n",
"huggingface-hub 0.11.1\n",
"idna 3.4\n",
"ipykernel 6.19.2\n",
"ipython 8.7.0\n",
"ipython-genutils 0.2.0\n",
"ipywidgets 8.0.3\n",
"isoduration 20.11.0\n",
"jedi 0.18.2\n",
"Jinja2 3.1.2\n",
"joblib 1.2.0\n",
"jsonpointer 2.3\n",
"jsonschema 4.17.3\n",
"jupyter 1.0.0\n",
"jupyter_client 7.4.8\n",
"jupyter-console 6.4.4\n",
"jupyter_core 5.1.0\n",
"jupyter-events 0.5.0\n",
"jupyter_server 2.0.1\n",
"jupyter_server_terminals 0.4.2\n",
"jupyterlab-pygments 0.2.2\n",
"jupyterlab-widgets 3.0.4\n",
"keras 2.9.0\n",
"Keras-Preprocessing 1.1.2\n",
"kiwisolver 1.4.4\n",
"libclang 14.0.6\n",
"linkify-it-py 1.0.3\n",
"Markdown 3.4.1\n",
"markdown-it-py 2.1.0\n",
"MarkupSafe 2.1.1\n",
"matplotlib 3.6.2\n",
"matplotlib-inline 0.1.6\n",
"mdit-py-plugins 0.3.3\n",
"mdurl 0.1.2\n",
"mistune 2.0.4\n",
"multidict 6.0.4\n",
"nbclassic 0.4.8\n",
"nbclient 0.7.2\n",
"nbconvert 7.2.6\n",
"nbformat 5.7.0\n",
"nest-asyncio 1.5.6\n",
"notebook 6.5.2\n",
"notebook_shim 0.2.2\n",
"numpy 1.23.5\n",
"nvidia-cublas-cu11 11.10.3.66\n",
"nvidia-cuda-nvrtc-cu11 11.7.99\n",
"nvidia-cuda-runtime-cu11 11.7.99\n",
"nvidia-cudnn-cu11 8.5.0.96\n",
"oauthlib 3.2.2\n",
"opt-einsum 3.3.0\n",
"orjson 3.8.4\n",
"packaging 22.0\n",
"pandas 1.5.2\n",
"pandocfilters 1.5.0\n",
"parso 0.8.3\n",
"pexpect 4.8.0\n",
"pickle5 0.0.11\n",
"pickleshare 0.7.5\n",
"Pillow 9.3.0\n",
"pip 22.2\n",
"platformdirs 2.6.0\n",
"prometheus-client 0.15.0\n",
"prompt-toolkit 3.0.36\n",
"protobuf 3.19.6\n",
"psutil 5.9.4\n",
"ptyprocess 0.7.0\n",
"pure-eval 0.2.2\n",
"pyasn1 0.4.8\n",
"pyasn1-modules 0.2.8\n",
"pycparser 2.21\n",
"pycryptodome 3.16.0\n",
"pydantic 1.10.4\n",
"pydub 0.25.1\n",
"Pygments 2.13.0\n",
"PyGObject 3.42.2\n",
"pyparsing 3.0.9\n",
"pyrsistent 0.19.2\n",
"python-dateutil 2.8.2\n",
"python-json-logger 2.0.4\n",
"python-multipart 0.0.5\n",
"pytz 2022.6\n",
"PyYAML 6.0\n",
"pyzmq 24.0.1\n",
"qtconsole 5.4.0\n",
"QtPy 2.3.0\n",
"regex 2022.10.31\n",
"requests 2.28.1\n",
"requests-oauthlib 1.3.1\n",
"rfc3339-validator 0.1.4\n",
"rfc3986 1.5.0\n",
"rfc3986-validator 0.1.1\n",
"rsa 4.9\n",
"scikit-learn 1.2.0\n",
"scipy 1.9.3\n",
"Send2Trash 1.8.0\n",
"sentencepiece 0.1.97\n",
"setuptools 59.6.0\n",
"six 1.16.0\n",
"sniffio 1.3.0\n",
"soupsieve 2.3.2.post1\n",
"stack-data 0.6.2\n",
"starlette 0.22.0\n",
"tensorboard 2.9.1\n",
"tensorboard-data-server 0.6.1\n",
"tensorboard-plugin-wit 1.8.1\n",
"tensorflow 2.9.2\n",
"tensorflow-estimator 2.9.0\n",
"tensorflow-hub 0.12.0\n",
"tensorflow-io-gcs-filesystem 0.28.0\n",
"tensorflow-text 2.9.0\n",
"termcolor 2.1.1\n",
"terminado 0.17.1\n",
"threadpoolctl 3.1.0\n",
"tinycss2 1.2.1\n",
"tokenizers 0.13.2\n",
"toolz 0.12.0\n",
"torch 1.13.0\n",
"tornado 6.2\n",
"tqdm 4.64.1\n",
"traitlets 5.7.1\n",
"transformers 4.25.1\n",
"typing_extensions 4.4.0\n",
"uc-micro-py 1.0.1\n",
"uri-template 1.2.0\n",
"urllib3 1.26.13\n",
"uvicorn 0.20.0\n",
"wcwidth 0.2.5\n",
"webcolors 1.12\n",
"webencodings 0.5.1\n",
"websocket-client 1.4.2\n",
"websockets 10.4\n",
"Werkzeug 2.2.2\n",
"wheel 0.37.1\n",
"widgetsnbextension 4.0.4\n",
"wrapt 1.14.1\n",
"xlrd 1.2.0\n",
"yarl 1.8.2\n"
]
}
],
"source": [
"!pip list"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "ocfsWcldPu_X"
},
"outputs": [],
"source": [
"%%capture\n",
"!pip install transformers\n",
"!pip install sentencepiece\n",
"!pip install -U tensorflow==2.9.2"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"id": "156-of7owSBi"
},
"outputs": [],
"source": [
"%%capture\n",
"!pip install tensorflow_text==2.9.0\n",
"!pip install xlrd==1.2.0"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"id": "UhZwl_ki2ZsS"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2023-01-13 13:34:20.293788: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory\n",
"2023-01-13 13:34:20.293832: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.\n"
]
}
],
"source": [
"import pandas as pd\n",
"import tensorflow as tf\n",
"import tensorflow_hub as hub\n",
"import tensorflow_text as text\n",
"import pickle\n",
"from sklearn.model_selection import train_test_split"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cDZ_-EaGLiQQ"
},
"source": [
"## b. Definition of the classifier based on the pre-trained camembert-base-xnli model"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"id": "wiTc3ZJ-JRiG"
},
"outputs": [],
"source": [
"# Zero-shot classification pipeline; section d.1 uses it to assign a template label to a task description\n",
"from transformers import pipeline\n",
"\n",
"classifier = pipeline(\"zero-shot-classification\", model=\"BaptisteDoyen/camembert-base-xnli\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7ZqAhaR5_V7r"
},
"source": [
"## c. Building the data set"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"id": "RKib_9V6hRMY"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/tmp/ipykernel_12/2976529013.py:223: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.\n",
" df = df.append(pd.Series(data=d), ignore_index=True)\n"
]
}
],
"source": [
"e1 = pd.Series({'task':'medical appointment booking',\n",
" 'location':'office',\n",
" 'internal equipment': 'smartphone or computer',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services': 'web, health platform services',\n",
" 'information':'health insurance',\n",
" 'templates':'health'})\n",
"\n",
"e2 = pd.Series({'task':'book medical appointment',\n",
" 'location':'house',\n",
" 'internal equipment': 'smartphone or computer',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services': 'web, health platform services',\n",
" 'information':'health insurance',\n",
" 'templates':'health'})\n",
"\n",
"e3 = pd.Series({'task':'medical appointment booking',\n",
" 'location':\"outside\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':'',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services': 'web, health platform services',\n",
" 'information':'health insurance',\n",
" 'templates':'health'})\n",
"\n",
"e4 = pd.Series({'task':'Go to doctor',\n",
" 'location':\"outside\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':'set location',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':'weather, get location',\n",
" 'information':'travel ticket',\n",
" 'templates':'displacement'})\n",
"\n",
"e5 = pd.Series({'task':'Consultation by doctor',\n",
" 'location':\"at destination\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'',\n",
" 'internal services':'',\n",
" 'external equipment': '',\n",
" 'external networks':'',\n",
" 'external services':'',\n",
" 'information': 'health insurance, mutuals',\n",
" 'templates':'health'})\n",
"\n",
"e6 = pd.Series({'task':'Teleconsultation',\n",
" 'location':\"house\",\n",
" 'internal equipment': 'smartphone or computer',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'Consultation reminder notification',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':'health platform services',\n",
" 'information': 'health insurance, mutuals',\n",
" 'templates':'health'})\n",
"\n",
"e7 = pd.Series({'task':'Consultation fee payment',\n",
" 'location':\"house\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'notification for payment validation',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':'banking services',\n",
" 'information':'credit card',\n",
" 'templates':'finance'})\n",
"\n",
"e8 = pd.Series({'task':'Consultation fee payment',\n",
" 'location':\"at destination\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'',\n",
" 'internal services':'',\n",
" 'external equipment': '',\n",
" 'external networks':'',\n",
" 'external services':'banking services',\n",
" 'information': 'credit card, health insurance, mutual funds',\n",
" 'templates':'finance'})\n",
"\n",
"e9 = pd.Series({'task':'Purchase of prescribed drugs',\n",
" 'location':\"outside\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':'set location',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':\"get location, banking services\",\n",
" 'information': 'credit card, prescription, health insurance, mutual funds',\n",
" 'templates':'finance, health'})\n",
"\t\t\t\t \n",
"e10 = pd.Series({'task':'Do medical exams',\n",
" 'location':\"outside\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':'set location, weather',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':\"get location, weather\",\n",
" 'information': 'transport ticket, prescription, health insurance, mutual funds',\n",
" 'templates':'movement, health'})\n",
"\n",
"e11 = pd.Series({'task':\"Payment for gym membership\",\n",
" 'location':\"house\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services': \"web, sports platform services, banking services\",\n",
" 'information':'credit card',\n",
" 'templates':'sports'})\n",
"\n",
"e12 = pd.Series({'task':\"Gym membership payment\",\n",
" 'location':\"office\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services': \"web, sports platform services, banking services\",\n",
" 'information':'credit card',\n",
" 'templates':'sports'})\n",
"\n",
"e13 = pd.Series({'task':\"Payment for gym membership\",\n",
" 'location':\"outside\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':'',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services': \"web, sports platform services, banking services\",\n",
" 'information': 'credit card, transport ticket',\n",
" 'templates':'sport, travel'})\n",
"\n",
"e14 = pd.Series({'task':\"Finding the nearest gym\",\n",
" 'location':\"outside\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':'set location',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':\"web, get location, weather\",\n",
" 'information':'',\n",
" 'templates':'sports'})\n",
"\n",
"e15 = pd.Series({'task':\"Finding the nearest gym\",\n",
" 'location':\"house\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'set location',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':\"web, get location, weather\",\n",
" 'information':'',\n",
" 'templates':'sports'})\n",
"\n",
"e16 = pd.Series({'task':\"Finding the nearest gym\",\n",
" 'location':\"office\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'set location',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':\"web, get location, weather\",\n",
" 'information':'',\n",
" 'templates':'sports'})\n",
"\n",
"e17 = pd.Series({'task':\"Choice of physical activity exercises to do\",\n",
" 'location':\"house\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'notepad',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services': \"web, sports platform services\",\n",
" 'information':'',\n",
" 'templates':'sports'})\n",
"\n",
"e18 = pd.Series({'task':\"On the way to the gym\",\n",
" 'location':\"outside\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':'set location',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external network':'internet',\n",
" 'external services':\"get location\",\n",
" 'information':'travel ticket',\n",
" 'templates':'displacement'})\n",
"\n",
"e19 = pd.Series({'task':\"Outdoor sports session\",\n",
" 'location':\"outside\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':\"set location, set information (training statistics), get information\",\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':\"get location, music, exercise programs\",\n",
" 'information':'',\n",
" 'templates':'sport, health, travel'})\n",
"\n",
"e20 = pd.Series({'task':\"Indoor sports session\",\n",
" 'location':\"at destination\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':\"set information (training statistics), get information\",\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':\"music, exercise programs\",\n",
" 'information':'club membership card',\n",
" 'templates':'sport,health'})\n",
"\n",
"df = pd.DataFrame([e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,e11,e12,e13,e14,e15,e16,e17,e18,e19,e20])\n",
"d = {'task':'Make a covid test','location':'outside','internal equipment':'smartphone','internal networks':'3GPP','internal services':'set location','external equipment':'equipment participating in the session', 'external networks': 'internet', 'external services': 'get location, test platform services', 'information': 'transport ticket, health insurance, health insurance', 'templates': 'travel, health' }\n",
"df = df.append(pd.Series(data=d), ignore_index=True)\t\t\t\t "
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"id": "_vAyErGXhkws",
"outputId": "76023cdc-c14d-4db1-c224-1f2a81b09207"
},
"outputs": [
{
"data": {
"text/plain": [
" task location \\\n",
"0 medical appointment booking office \n",
"1 book medical appointment house \n",
"2 medical appointment booking outside \n",
"3 Go to doctor outside \n",
"4 Consultation by doctor at destination \n",
"5 Teleconsultation house \n",
"6 Consultation fee payment house \n",
"7 Consultation fee payment at destination \n",
"8 Purchase of prescribed drugs outside \n",
"9 Do medical exams outside \n",
"10 Payment for gym membership house \n",
"11 Gym membership payment office \n",
"12 Payment for gym membership outside \n",
"13 Finding the nearest gym outside \n",
"14 Finding the nearest gym house \n",
"15 Finding the nearest gym office \n",
"16 Choice of physical activity exercises to do house \n",
"17 On the way to the gym outside \n",
"18 Outdoor sports session outside \n",
"19 Indoor sports session at destination \n",
"20 Make a covid test outside \n",
"\n",
" internal equipment internal networks \\\n",
"0 smartphone or computer 802.11 \n",
"1 smartphone or computer 802.11 \n",
"2 smartphone 3GPP \n",
"3 smartphone 3GPP \n",
"4 smartphone \n",
"5 smartphone or computer 802.11 \n",
"6 smartphone 802.11 \n",
"7 smartphone \n",
"8 smartphone 3GPP \n",
"9 smartphone 3GPP \n",
"10 smartphone 802.11 \n",
"11 smartphone 802.11 \n",
"12 smartphone 3GPP \n",
"13 smartphone 3GPP \n",
"14 smartphone 802.11 \n",
"15 smartphone 802.11 \n",
"16 smartphone 802.11 \n",
"17 smartphone 3GPP \n",
"18 smartphone 3GPP \n",
"19 smartphone 3GPP \n",
"20 smartphone 3GPP \n",
"\n",
" internal services \\\n",
"0 \n",
"1 \n",
"2 \n",
"3 set location \n",
"4 \n",
"5 Consultation reminder notification \n",
"6 notification for payment validation \n",
"7 \n",
"8 set location \n",
"9 set location, weather \n",
"10 \n",
"11 \n",
"12 \n",
"13 set location \n",
"14 set location \n",
"15 set location \n",
"16 notepad \n",
"17 set location \n",
"18 set location, set information (training statis... \n",
"19 set information (training statistics), get inf... \n",
"20 set location \n",
"\n",
" external equipment external networks \\\n",
"0 equipment participating in the session internet \n",
"1 equipment participating in the session internet \n",
"2 equipment participating in the session internet \n",
"3 equipment participating in the session internet \n",
"4 \n",
"5 equipment participating in the session internet \n",
"6 equipment participating in the session internet \n",
"7 \n",
"8 equipment participating in the session internet \n",
"9 equipment participating in the session internet \n",
"10 equipment participating in the session internet \n",
"11 equipment participating in the session internet \n",
"12 equipment participating in the session internet \n",
"13 equipment participating in the session internet \n",
"14 equipment participating in the session internet \n",
"15 equipment participating in the session internet \n",
"16 equipment participating in the session internet \n",
"17 equipment participating in the session NaN \n",
"18 equipment participating in the session internet \n",
"19 equipment participating in the session internet \n",
"20 equipment participating in the session internet \n",
"\n",
" external services \\\n",
"0 web, health platform services \n",
"1 web, health platform services \n",
"2 web, health platform services \n",
"3 weather, get location \n",
"4 \n",
"5 health platform services \n",
"6 banking services \n",
"7 banking services \n",
"8 get location, banking services \n",
"9 get location, weather \n",
"10 web, sports platform services, banking services \n",
"11 web, sports platform services, banking services \n",
"12 web, sports platform services, banking services \n",
"13 web, get location, weather \n",
"14 web, get location, weather \n",
"15 web, get location, weather \n",
"16 web, sports platform services \n",
"17 get location \n",
"18 get location, music, exercise programs \n",
"19 music, exercise programs \n",
"20 get location, test platform services \n",
"\n",
" information templates \\\n",
"0 health insurance health \n",
"1 health insurance health \n",
"2 health insurance health \n",
"3 travel ticket displacement \n",
"4 health insurance, mutuals health \n",
"5 health insurance, mutuals health \n",
"6 credit card finance \n",
"7 credit card, health insurance, mutual funds finance \n",
"8 credit card, prescription, health insurance, m... finance, health \n",
"9 transport ticket, prescription, health insuran... movement, health \n",
"10 credit card sports \n",
"11 credit card sports \n",
"12 credit card, transport ticket sport, travel \n",
"13 sports \n",
"14 sports \n",
"15 sports \n",
"16 sports \n",
"17 travel ticket displacement \n",
"18 sport, health, travel \n",
"19 club membership card sport,health \n",
"20 transport ticket, health insurance, health ins... travel, health \n",
"\n",
" external network \n",
"0 NaN \n",
"1 NaN \n",
"2 NaN \n",
"3 NaN \n",
"4 NaN \n",
"5 NaN \n",
"6 NaN \n",
"7 NaN \n",
"8 NaN \n",
"9 NaN \n",
"10 NaN \n",
"11 NaN \n",
"12 NaN \n",
"13 NaN \n",
"14 NaN \n",
"15 NaN \n",
"16 NaN \n",
"17 internet \n",
"18 NaN \n",
"19 NaN \n",
"20 NaN "
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-a_d-du3Ajud"
},
"source": [
"## d. Building the models"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jR-P1NzJCm2E"
},
"source": [
"### *d.1. Model 1 for classification of the activity*"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "zcHeC1ZMh4CM",
"outputId": "d5b81526-8b12-43ac-c502-a50e40ad1e2c"
},
"outputs": [
{
"data": {
"text/plain": [
"{'sequence': 'Purchase of prescribed drugs',\n",
" 'labels': ['health', 'travel', 'finance', 'sport'],\n",
" 'scores': [0.43092602491378784,\n",
" 0.3009793162345886,\n",
" 0.17015446722507477,\n",
" 0.09794023633003235]}"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sequence = \"Purchase of prescribed drugs\"\n",
"candidate_labels = [\"sport\",\"health\",\"travel\", \"finance\"]\n",
"hypothesis_template = \"This text is about {}.\"\n",
"\n",
"classifier(sequence, candidate_labels, hypothesis_template=hypothesis_template)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dOjRSgRODnLW"
},
"source": [
"### *d.2. Model 2 to calculate the similarity between the data and the prediction of resources*"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"id": "dCpH-dneQpr8"
},
"outputs": [],
"source": [
"# Import the tokenizer and model classes used to pre-process the text and load the NLI model\n",
"\n",
"from transformers import AutoTokenizer, AutoModelForSequenceClassification"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"id": "ltDFZt1IQFFX"
},
"outputs": [],
"source": [
"# Load the pre-trained NLI model and its tokenizer\n",
"\n",
"nli_model = AutoModelForSequenceClassification.from_pretrained(\"BaptisteDoyen/camembert-base-xnli\")\n",
"tokenizer = AutoTokenizer.from_pretrained(\"BaptisteDoyen/camembert-base-xnli\") \n"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "-egIY0HSQ-RR",
"outputId": "99e6fc26-9e34-4998-a150-8e7e9ed3f2f4"
},
"outputs": [
{
"data": {
"text/plain": [
"70.14259099960327"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Building the model which calculates the similarity between the reference phrase and the candidate phrase\n",
"\n",
"premise = \"Choice of physical activity exercises to do\"\n",
"hypothesis = \"training program composition\"\n",
"# tokenize and run through model\n",
"x = tokenizer.encode(premise, hypothesis, return_tensors='pt')\n",
"logits = nli_model(x)[0]\n",
"# we throw away \"neutral\" (dim 1) and take the probability of\n",
"# \"entailment\" (0) as the probability of the label being true \n",
"entail_contradiction_logits = logits[:,::2]\n",
"probs = entail_contradiction_logits.softmax(dim=1)\n",
"prob_label_is_true = probs[:,0]\n",
"prob_label_is_true[0].tolist() * 100"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"id": "_BV3Nc5SXSIL"
},
"outputs": [],
"source": [
"# Return the confidence (in percent) of the best similarity match, together with the task phrase considered to be the reference phrase\n",
"\n",
"def det_tache(entry, data):\n",
"    best_score = 0\n",
"    tache = None  # stays None if no task scores above 0\n",
"    for elt in data['task'].tolist():\n",
"        x = tokenizer.encode(elt, entry, return_tensors='pt')\n",
"        logits = nli_model(x)[0]\n",
"        # we throw away \"neutral\" (dim 1) and take the probability of\n",
"        # \"entailment\" (0) as the probability of the label being true\n",
"        entail_contradiction_logits = logits[:,::2]\n",
"        probs = entail_contradiction_logits.softmax(dim=1)\n",
"        prob_label_is_true = probs[:,0]\n",
"        tmp = prob_label_is_true[0].tolist() * 100\n",
"        if tmp > best_score:\n",
"            best_score = tmp\n",
"            tache = elt\n",
"    return best_score, tache"
]
},
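{
"cell_type": "markdown",
"metadata": {},
"source": [
"The scoring step used in `det_tache` can be tried without downloading the model. The cell below is a minimal sketch on made-up logits. It assumes the same label order as the comments above (index 0 entailment, 1 neutral, 2 contradiction), and the helper name `entailment_probability` is hypothetical, not part of the original notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import torch\n",
"\n",
"def entailment_probability(logits):\n",
"    # Keep columns 0 (entailment) and 2 (contradiction), dropping neutral\n",
"    entail_contradiction_logits = logits[:, ::2]\n",
"    # Softmax over the two remaining classes\n",
"    probs = entail_contradiction_logits.softmax(dim=1)\n",
"    # Probability of entailment for the first (only) pair, as a Python float\n",
"    return probs[0, 0].item()\n",
"\n",
"# Made-up logits for illustration only: [entailment, neutral, contradiction]\n",
"example_logits = torch.tensor([[2.0, 0.5, -1.0]])\n",
"print(entailment_probability(example_logits) * 100)"
]
},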
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "TxwnxKicZvAZ",
"outputId": "207e60a0-9141-4051-f077-7d6ab0481ad0"
},
"outputs": [
{
"data": {
"text/plain": [
"(93.58221888542175, 'Finding the nearest gym')"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Call the function det_tache\n",
"\n",
"det_tache(\"Composition of the training program\", df)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZcAuLwraEHTv"
},
"source": [
"# e. Prediction of the resources to be used"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"id": "l2IGEcobbN4T"
},
"outputs": [],
"source": [
"# Function written for test purposes. To be improved.\n",
"# Note: it reads columns named 'equipements', 'networks' and 'services', which the\n",
"# DataFrame built in section c does not appear to have; see det_ressources1 below.\n",
"\n",
"def det_ressources(entry, loc, data=df):\n",
"    lignes = data.loc[data['task'] == det_tache(entry, data)[1]]\n",
"    ressources = lignes.loc[lignes['location'] == loc]\n",
"    return [ressources.equipements.values[0], ressources['networks'].values[0], ressources['services'].values[0]]\n"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"id": "HJhaV7aau9Mg"
},
"outputs": [],
"source": [
"# Function to determine the resources to be used to perform a task. It also calls det_tache to find the most similar reference task\n",
"\n",
"def det_ressources1(entry, loc):\n",
"    lignes = df.loc[df['task'] == det_tache(entry, df)[1]]\n",
"    ressources = lignes.loc[lignes['location'] == loc]\n",
"    columns = ['internal equipment', 'internal networks', 'internal services',\n",
"               'external equipment', 'external networks', 'external services']\n",
"    return [ressources[c].values[0] for c in columns]\n"
]
},
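{
"cell_type": "markdown",
"metadata": {},
"source": [
"`det_ressources1` indexes `.values[0]` directly, which raises an `IndexError` when no row matches both the task and the location. The cell below is a minimal sketch of a more defensive lookup on a toy DataFrame. The helper name `lookup_resources` and the toy row are assumptions for illustration, not part of the original notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"def lookup_resources(data, task, loc, columns):\n",
"    # Select the rows matching both the task and the location\n",
"    rows = data[(data['task'] == task) & (data['location'] == loc)]\n",
"    if rows.empty:\n",
"        return None  # no matching row for this task/location pair\n",
"    # Return the requested columns of the first matching row\n",
"    return rows.iloc[0][columns].tolist()\n",
"\n",
"# Toy data for illustration only\n",
"toy = pd.DataFrame([\n",
"    {'task': 'Teleconsultation', 'location': 'house', 'internal networks': '802.11', 'external networks': 'internet'},\n",
"])\n",
"wanted = ['internal networks', 'external networks']\n",
"print(lookup_resources(toy, 'Teleconsultation', 'house', wanted))\n",
"print(lookup_resources(toy, 'Teleconsultation', 'office', wanted))"
]
},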
{
"cell_type": "markdown",
"metadata": {
"id": "Tdl0spGSil6K"
},
"source": []
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "PFp4lo43moy4",
"outputId": "9580e1fe-3f42-4aeb-9644-adf6c61a4e7a"
},
"outputs": [
{
"data": {
"text/plain": [
"['smartphone',\n",
" '802.11',\n",
" 'set location',\n",
" 'equipment participating in the session',\n",
" 'internet',\n",
" 'web, get location, weather']"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Example of resource prediction\n",
"\n",
"det_ressources1(\"Composition of the training program\", \"house\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "H3ZRKp5PiDpo"
},
"source": [
"# f. Interface to test the models"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"id": "065aB5GqsueC"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
"To disable this warning, you can either:\n",
"\t- Avoid using `tokenizers` before the fork if possible\n",
"\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n"
]
}
],
"source": [
"%%capture\n",
"!pip install gradio"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"id": "IrnOmguss6si"
},
"outputs": [],
"source": [
"import gradio as gr"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 617
},
"id": "YEQoa4jei9bG",
"outputId": "dbe91889-5aaa-417b-cc8a-fd35b381e2f6"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running on local URL: http://127.0.0.1:7860\n"
]
}
],
"source": [
"# Code to launch the Gradio test environment for model 1\n",
"\n",
"def zeroShotClassification(text_input, candidate_labels):\n",
"    # Split the comma-separated labels and trim surrounding spaces\n",
"    labels = [label.strip(' ') for label in candidate_labels.split(',')]\n",
"    prediction = classifier(text_input, labels)\n",
"    # Map each label to its score, as expected by gr.Label\n",
"    output = dict(zip(prediction['labels'], prediction['scores']))\n",
"    return output\n",
"\n",
"demo = gr.Interface(fn=zeroShotClassification, inputs=[gr.Textbox(label=\"Task\"), gr.Textbox(label=\"Candidate templates (classes)\")], outputs=gr.Label(label=\"Classification\"))\n",
"demo.launch(share=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 617
},
"id": "mBsHM4WutJVZ",
"outputId": "2bb6d4e9-ba04-422f-cb6f-cb43028f451c"
},
"outputs": [],
"source": [
"# Code to launch the Gradio test environment for model 2\n",
"\n",
"demo = gr.Interface(fn=det_ressources1, inputs=['text', 'text'], outputs=\"text\")\n",
"demo.launch(share=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "bJzlHmJoPlJ3"
},
"outputs": [],
"source": []
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"provenance": []
},
"gpuClass": "standard",
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.7"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
smartid-poc-main/POC_SmartID_v4.ipynb 0000664 0000000 0000000 00000375003 14417466726 0017733 0 ustar 00root root 0000000 0000000 {
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "BMMkANxi-LAJ"
},
"source": [
"\n",
"# a. Installing the packages and importing the modules\n",
"---\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "ocfsWcldPu_X"
},
"outputs": [],
"source": [
"%%capture\n",
"!pip install transformers\n",
"!pip install sentencepiece\n",
"!pip install -U tensorflow==2.11.1"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "156-of7owSBi"
},
"outputs": [],
"source": [
"%%capture\n",
"!pip install -U tensorflow_text==2.11.0\n",
"!pip install xlrd==1.2.0"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "UhZwl_ki2ZsS"
},
"outputs": [],
"source": [
"import pandas as pd\n",
"import tensorflow as tf\n",
"import tensorflow_hub as hub\n",
"import tensorflow_text as text\n",
"import pickle\n",
"from sklearn.model_selection import train_test_split"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cDZ_-EaGLiQQ"
},
"source": [
"## b. Definition of the classifier based on the pre-trained Camembert-base-xnli model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "wiTc3ZJ-JRiG",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 177,
"referenced_widgets": [
"cff8c7e6b2f942e0a9df876e0bcd09bf",
"9aa98415ca874172b0bab2eacf12b553",
"1c78a79f4e284b37933c65c4caec0d6f",
"5e00bb0a1b47470a880a970cb3c5c7c5",
"d6ea4bcbd982430e96228e58911d4e5d",
"db52af7763d6410d927a5a5d8c81b52b",
"92065aa8dd5547508b244541f4b4953d",
"049cbb029db84ed6bc0fb36570c1b374",
"2be056a4cf9042a586b426a74402ab72",
"f118b5824c2b4f87bb6d65aef39ba356",
"68598a9db6b64f6b9f5989c0620c2b5c",
"1cce055c7ea148f9b696f9464e88d34b",
"c4506b58ebec4dbbbcef3a48e1122f13",
"52371fc7667341ca98465403cd9e0fed",
"897f253c09ca4ce7a46105739fd0552a",
"df66fae6114b46d2a9b0c379d78436e6",
"2b93d8268de148b5abce28c36c992b46",
"1407195508674d558426988738aacfc8",
"c8c1cc40816b4c81beb3ddbf52379abb",
"ec56a3cdc5cf41b8aeaf9f397332e18f",
"094131684bd64185b29127cfb2b291e2",
"05da292c7a36435faaec7c3957190b57",
"894b89467a9e4730826d69d38da09fcd",
"4246fb20634e4747bc022d91da11fd77",
"2fbd4f2642274418b90bbaef20d5665c",
"aee5ccde9fa544c78961d0e969715656",
"19963506dbcc4ae08384c30f3a800f12",
"7cd613cf081a46d5a768fd181bc8f3fc",
"41367350a5864304b8560bfd5d1b33e4",
"b146924bdcfc4272bd32554e9c62b089",
"9a8a057b18b34e0685b898d325dd156e",
"2b0d8ebb0cb04ce78eb7a86a0ca80cc1",
"592c81348d0f4a4c8d81a0370e0f4a0c",
"65b9d31511094cf7bb3b01dce95997f8",
"91c893d270584852864cd529ae751b62",
"ab50e757329d47bbb9d171c33e730716",
"e011343f8b444515b9b58140ec87fb53",
"f7ee9ab382034684b208940e7eb4f5c9",
"cb4cf6c5d75349dea57a90025770cf45",
"3f8743aba17d49c09c4f02e2789cc11a",
"131fe709136945ccb5d431584dd39f24",
"13ceaf20d8f042f890ec554f0ba9b3ce",
"78dbae4369d043babfacf2fcbad8a8de",
"291bd3ce8516465a86a590a0d30f6b29",
"5ba02140610a4e0684744c855a52410b",
"881c9fb5fb1942338e95c11f9a38819a",
"da25afb7ff104431b25e9211b7af68b8",
"3805233a5eb7431f86a3d453f3038ee8",
"f514de3b6d864fdabdff7cd992054323",
"05537e9a24a542f8b13736da09f86dfd",
"3873d640bc354a18b40b2d2e372c9a78",
"982afee1fe73438fa0e6b685c5afe552",
"4f9561538c5e42498b8bf621b52d4016",
"be068e7eca114654a8507f904924b39a",
"3762d5212c4e43c99a77c0d1b171ed46"
]
},
"outputId": "e4aea031-80d8-488c-fa55-8c0fe698fac5"
},
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading (…)lve/main/config.json: 0%| | 0.00/882 [00:00, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "cff8c7e6b2f942e0a9df876e0bcd09bf"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading pytorch_model.bin: 0%| | 0.00/443M [00:00, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "1cce055c7ea148f9b696f9464e88d34b"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading (…)okenizer_config.json: 0%| | 0.00/433 [00:00, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "894b89467a9e4730826d69d38da09fcd"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading (…)tencepiece.bpe.model: 0%| | 0.00/811k [00:00, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "65b9d31511094cf7bb3b01dce95997f8"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading (…)cial_tokens_map.json: 0%| | 0.00/299 [00:00, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "5ba02140610a4e0684744c855a52410b"
}
},
"metadata": {}
}
],
"source": [
"# Zero-shot classification pipeline (Model 1): scores a task description against a list of candidate activity labels\n",
"from transformers import pipeline\n",
"\n",
"classifier = pipeline(\"zero-shot-classification\", model=\"BaptisteDoyen/camembert-base-xnli\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7ZqAhaR5_V7r"
},
"source": [
"## c. Building the data set"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "RKib_9V6hRMY",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "eda02094-d948-49a4-b453-26fc17bdf114"
},
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
":223: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.\n",
" df = df.append(pd.Series(data=d), ignore_index=True)\n"
]
}
],
"source": [
"e1 = pd.Series({'task':'medical appointment booking',\n",
" 'location':'office',\n",
" 'internal equipment': 'smartphone or computer',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services': 'web, health platform services',\n",
" 'information':'health insurance',\n",
" 'templates':'health'})\n",
"\n",
"e2 = pd.Series({'task':'book medical appointment',\n",
" 'location':'house',\n",
" 'internal equipment': 'smartphone or computer',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services': 'web, health platform services',\n",
" 'information':'health insurance',\n",
" 'templates':'health'})\n",
"\n",
"e3 = pd.Series({'task':'medical appointment booking',\n",
" 'location':\"outside\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':'',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services': 'web, health platform services',\n",
" 'information':'health insurance',\n",
" 'templates':'health'})\n",
"\n",
"e4 = pd.Series({'task':'Go to doctor',\n",
" 'location':\"outside\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':'set location',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':'weather, get location',\n",
" 'information':'travel ticket',\n",
" 'templates':'displacement'})\n",
"\n",
"e5 = pd.Series({'task':'Consultation by doctor',\n",
" 'location':\"at destination\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'',\n",
" 'internal services':'',\n",
" 'external equipment': '',\n",
" 'external networks':'',\n",
" 'external services':'',\n",
" 'information': 'health insurance, mutuals',\n",
" 'templates':'health'})\n",
"\n",
"e6 = pd.Series({'task':'Teleconsultation',\n",
" 'location':\"house\",\n",
" 'internal equipment': 'smartphone or computer',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'Consultation reminder notification',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':'health platform services',\n",
" 'information': 'health insurance, mutuals',\n",
" 'templates':'health'})\n",
"\n",
"e7 = pd.Series({'task':'Consultation fee payment',\n",
" 'location':\"house\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'notification for payment validation',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':'banking services',\n",
" 'information':'credit card',\n",
" 'templates':'finance'})\n",
"\n",
"e8 = pd.Series({'task':'Consultation fee payment',\n",
" 'location':\"at destination\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'',\n",
" 'internal services':'',\n",
" 'external equipment': '',\n",
" 'external networks':'',\n",
" 'external services':'banking services',\n",
" 'information': 'credit card, health insurance, mutual funds',\n",
" 'templates':'finance'})\n",
"\n",
"e9 = pd.Series({'task':'Purchase of prescribed drugs',\n",
" 'location':\"outside\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':'set location',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':\"get location, banking services\",\n",
" 'information': 'credit card, prescription, health insurance, mutual funds',\n",
" 'templates':'finance, health'})\n",
"\t\t\t\t \n",
"e10 = pd.Series({'task':'Do medical exams',\n",
" 'location':\"outside\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':'set location, weather',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':\"get location, weather\",\n",
" 'information': 'transport ticket, prescription, health insurance, mutual funds',\n",
" 'templates':'movement, health'})\n",
"\n",
"e11 = pd.Series({'task':\"Payment for gym membership\",\n",
" 'location':\"house\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services': \"web, sports platform services, banking services\",\n",
" 'information':'credit card',\n",
" 'templates':'sports'})\n",
"\n",
"e12 = pd.Series({'task':\"Gym membership payment\",\n",
" 'location':\"office\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services': \"web, sports platform services, banking services\",\n",
" 'information':'credit card',\n",
" 'templates':'sports'})\n",
"\n",
"e13 = pd.Series({'task':\"Payment for gym membership\",\n",
" 'location':\"outside\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':'',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services': \"web, sports platform services, banking services\",\n",
" 'information': 'credit card, transport ticket',\n",
" 'templates':'sport, travel'})\n",
"\n",
"e14 = pd.Series({'task':\"Finding the nearest gym\",\n",
" 'location':\"outside\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':'set location',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':\"web, get location, weather\",\n",
" 'information':'',\n",
" 'templates':'sports'})\n",
"\n",
"e15 = pd.Series({'task':\"Finding the nearest gym\",\n",
" 'location':\"house\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'set location',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':\"web, get location, weather\",\n",
" 'information':'',\n",
" 'templates':'sports'})\n",
"\n",
"e16 = pd.Series({'task':\"Finding the nearest gym\",\n",
" 'location':\"office\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'set location',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':\"web, get location, weather\",\n",
" 'information':'',\n",
" 'templates':'sports'})\n",
"\n",
"e17 = pd.Series({'task':\"Choice of physical activity exercises to do\",\n",
" 'location':\"house\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'802.11',\n",
" 'internal services':'notepad',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services': \"web, sports platform services\",\n",
" 'information':'',\n",
" 'templates':'sports'})\n",
"\n",
"e18 = pd.Series({'task':\"On the way to the gym\",\n",
" 'location':\"outside\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':'set location',\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external network':'internet',\n",
" 'external services':\"get location\",\n",
" 'information':'travel ticket',\n",
" 'templates':'displacement'})\n",
"\n",
"e19 = pd.Series({'task':\"Outdoor sports session\",\n",
" 'location':\"outside\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':\"set location, set information (training statistics), get information\",\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':\"get location, music, exercise programs\",\n",
" 'information':'',\n",
" 'templates':'sport, health, travel'})\n",
"\n",
"e20 = pd.Series({'task':\"Indoor sports session\",\n",
" 'location':\"at destination\",\n",
" 'internal equipment': 'smartphone',\n",
" 'internal networks':'3GPP',\n",
" 'internal services':\"set information (training statistics), get information\",\n",
" 'external equipment': 'equipment participating in the session',\n",
" 'external networks':'internet',\n",
" 'external services':\"music, exercise programs\",\n",
" 'information':'club membership card',\n",
" 'templates':'sport,health'})\n",
"\n",
"df = pd.DataFrame([e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,e11,e12,e13,e14,e15,e16,e17,e18,e19,e20])\n",
"d = {'task':'Make a covid test','location':'outside','internal equipment':'smartphone','internal networks':'3GPP','internal services':'set location','external equipment':'equipment participating in the session', 'external networks': 'internet', 'external services': 'get location, test platform services', 'information': 'transport ticket, health insurance, health insurance', 'templates': 'travel, health' }\n",
"df = df.append(pd.Series(data=d), ignore_index=True)\t\t\t\t "
]
},
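{
"cell_type": "markdown",
"metadata": {},
"source": [
"When rows are built from dictionaries, a misspelled key silently creates an extra column filled with NaN. The cell below is a minimal sketch for spotting such keys on a toy DataFrame. The expected column list is taken from the rows above, and the helper name `find_unexpected_columns` is hypothetical."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"EXPECTED_COLUMNS = ['task', 'location', 'internal equipment', 'internal networks',\n",
"                    'internal services', 'external equipment', 'external networks',\n",
"                    'external services', 'information', 'templates']\n",
"\n",
"def find_unexpected_columns(data, expected=EXPECTED_COLUMNS):\n",
"    # Columns present in the DataFrame but absent from the expected schema\n",
"    return [c for c in data.columns if c not in expected]\n",
"\n",
"# Toy rows for illustration only; the second one uses a misspelled key\n",
"toy = pd.DataFrame([\n",
"    {'task': 'a', 'external networks': 'internet'},\n",
"    {'task': 'b', 'external network': 'internet'},\n",
"])\n",
"print(find_unexpected_columns(toy))"
]
},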
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"id": "_vAyErGXhkws",
"outputId": "a611fd3d-d859-46de-b9d7-936bfb997cd5"
},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" task location \\\n",
"0 medical appointment booking office \n",
"1 book medical appointment house \n",
"2 medical appointment booking outside \n",
"3 Go to doctor outside \n",
"4 Consultation by doctor at destination \n",
"5 Teleconsultation house \n",
"6 Consultation fee payment house \n",
"7 Consultation fee payment at destination \n",
"8 Purchase of prescribed drugs outside \n",
"9 Do medical exams outside \n",
"10 Payment for gym membership house \n",
"11 Gym membership payment office \n",
"12 Payment for gym membership outside \n",
"13 Finding the nearest gym outside \n",
"14 Finding the nearest gym house \n",
"15 Finding the nearest gym office \n",
"16 Choice of physical activity exercises to do house \n",
"17 On the way to the gym outside \n",
"18 Outdoor sports session outside \n",
"19 Indoor sports session at destination \n",
"20 Make a covid test outside \n",
"\n",
" internal equipment internal networks \\\n",
"0 smartphone or computer 802.11 \n",
"1 smartphone or computer 802.11 \n",
"2 smartphone 3GPP \n",
"3 smartphone 3GPP \n",
"4 smartphone \n",
"5 smartphone or computer 802.11 \n",
"6 smartphone 802.11 \n",
"7 smartphone \n",
"8 smartphone 3GPP \n",
"9 smartphone 3GPP \n",
"10 smartphone 802.11 \n",
"11 smartphone 802.11 \n",
"12 smartphone 3GPP \n",
"13 smartphone 3GPP \n",
"14 smartphone 802.11 \n",
"15 smartphone 802.11 \n",
"16 smartphone 802.11 \n",
"17 smartphone 3GPP \n",
"18 smartphone 3GPP \n",
"19 smartphone 3GPP \n",
"20 smartphone 3GPP \n",
"\n",
" internal services \\\n",
"0 \n",
"1 \n",
"2 \n",
"3 set location \n",
"4 \n",
"5 Consultation reminder notification \n",
"6 notification for payment validation \n",
"7 \n",
"8 set location \n",
"9 set location, weather \n",
"10 \n",
"11 \n",
"12 \n",
"13 set location \n",
"14 set location \n",
"15 set location \n",
"16 notepad \n",
"17 set location \n",
"18 set location, set information (training statis... \n",
"19 set information (training statistics), get inf... \n",
"20 set location \n",
"\n",
" external equipment external networks \\\n",
"0 equipment participating in the session internet \n",
"1 equipment participating in the session internet \n",
"2 equipment participating in the session internet \n",
"3 equipment participating in the session internet \n",
"4 \n",
"5 equipment participating in the session internet \n",
"6 equipment participating in the session internet \n",
"7 \n",
"8 equipment participating in the session internet \n",
"9 equipment participating in the session internet \n",
"10 equipment participating in the session internet \n",
"11 equipment participating in the session internet \n",
"12 equipment participating in the session internet \n",
"13 equipment participating in the session internet \n",
"14 equipment participating in the session internet \n",
"15 equipment participating in the session internet \n",
"16 equipment participating in the session internet \n",
"17 equipment participating in the session NaN \n",
"18 equipment participating in the session internet \n",
"19 equipment participating in the session internet \n",
"20 equipment participating in the session internet \n",
"\n",
" external services \\\n",
"0 web, health platform services \n",
"1 web, health platform services \n",
"2 web, health platform services \n",
"3 weather, get location \n",
"4 \n",
"5 health platform services \n",
"6 banking services \n",
"7 banking services \n",
"8 get location, banking services \n",
"9 get location, weather \n",
"10 web, sports platform services, banking services \n",
"11 web, sports platform services, banking services \n",
"12 web, sports platform services, banking services \n",
"13 web, get location, weather \n",
"14 web, get location, weather \n",
"15 web, get location, weather \n",
"16 web, sports platform services \n",
"17 get location \n",
"18 get location, music, exercise programs \n",
"19 music, exercise programs \n",
"20 get location, test platform services \n",
"\n",
" information templates \\\n",
"0 health insurance health \n",
"1 health insurance health \n",
"2 health insurance health \n",
"3 travel ticket displacement \n",
"4 health insurance, mutuals health \n",
"5 health insurance, mutuals health \n",
"6 credit card finance \n",
"7 credit card, health insurance, mutual funds finance \n",
"8 credit card, prescription, health insurance, m... finance, health \n",
"9 transport ticket, prescription, health insuran... movement, health \n",
"10 credit card sports \n",
"11 credit card sports \n",
"12 credit card, transport ticket sport, travel \n",
"13 sports \n",
"14 sports \n",
"15 sports \n",
"16 sports \n",
"17 travel ticket displacement \n",
"18 sport, health, travel \n",
"19 club membership card sport,health \n",
"20 transport ticket, health insurance, health ins... travel, health \n",
"\n",
" external network \n",
"0 NaN \n",
"1 NaN \n",
"2 NaN \n",
"3 NaN \n",
"4 NaN \n",
"5 NaN \n",
"6 NaN \n",
"7 NaN \n",
"8 NaN \n",
"9 NaN \n",
"10 NaN \n",
"11 NaN \n",
"12 NaN \n",
"13 NaN \n",
"14 NaN \n",
"15 NaN \n",
"16 NaN \n",
"17 internet \n",
"18 NaN \n",
"19 NaN \n",
"20 NaN "
],
"text/html": [
"\n",
" \n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" task | \n",
" location | \n",
" internal equipment | \n",
" internal networks | \n",
" internal services | \n",
" external equipment | \n",
" external networks | \n",
" external services | \n",
" information | \n",
" templates | \n",
" external network | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" medical appointment booking | \n",
" office | \n",
" smartphone or computer | \n",
" 802.11 | \n",
" | \n",
" equipment participating in the session | \n",
" internet | \n",
" web, health platform services | \n",
" health insurance | \n",
" health | \n",
" NaN | \n",
"
\n",
" \n",
" 1 | \n",
" book medical appointment | \n",
" house | \n",
" smartphone or computer | \n",
" 802.11 | \n",
" | \n",
" equipment participating in the session | \n",
" internet | \n",
" web, health platform services | \n",
" health insurance | \n",
" health | \n",
" NaN | \n",
"
\n",
" \n",
" 2 | \n",
" medical appointment booking | \n",
" outside | \n",
" smartphone | \n",
" 3GPP | \n",
" | \n",
" equipment participating in the session | \n",
" internet | \n",
" web, health platform services | \n",
" health insurance | \n",
" health | \n",
" NaN | \n",
"
\n",
" \n",
" 3 | \n",
" Go to doctor | \n",
" outside | \n",
" smartphone | \n",
" 3GPP | \n",
" set location | \n",
" equipment participating in the session | \n",
" internet | \n",
" weather, get location | \n",
" travel ticket | \n",
" displacement | \n",
" NaN | \n",
"
\n",
" \n",
" 4 | \n",
" Consultation by doctor | \n",
" at destination | \n",
" smartphone | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" health insurance, mutuals | \n",
" health | \n",
" NaN | \n",
"
\n",
"    <tr>\n",
"      <th>5</th>\n",
"      <td>Teleconsultation</td>\n",
"      <td>house</td>\n",
"      <td>smartphone or computer</td>\n",
"      <td>802.11</td>\n",
"      <td>Consultation reminder notification</td>\n",
"      <td>equipment participating in the session</td>\n",
"      <td>internet</td>\n",
"      <td>health platform services</td>\n",
"      <td>health insurance, mutuals</td>\n",
"      <td>health</td>\n",
"      <td>NaN</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>6</th>\n",
"      <td>Consultation fee payment</td>\n",
"      <td>house</td>\n",
"      <td>smartphone</td>\n",
"      <td>802.11</td>\n",
"      <td>notification for payment validation</td>\n",
"      <td>equipment participating in the session</td>\n",
"      <td>internet</td>\n",
"      <td>banking services</td>\n",
"      <td>credit card</td>\n",
"      <td>finance</td>\n",
"      <td>NaN</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>7</th>\n",
"      <td>Consultation fee payment</td>\n",
"      <td>at destination</td>\n",
"      <td>smartphone</td>\n",
"      <td></td>\n",
"      <td></td>\n",
"      <td></td>\n",
"      <td></td>\n",
"      <td>banking services</td>\n",
"      <td>credit card, health insurance, mutual funds</td>\n",
"      <td>finance</td>\n",
"      <td>NaN</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>8</th>\n",
"      <td>Purchase of prescribed drugs</td>\n",
"      <td>outside</td>\n",
"      <td>smartphone</td>\n",
"      <td>3GPP</td>\n",
"      <td>set location</td>\n",
"      <td>equipment participating in the session</td>\n",
"      <td>internet</td>\n",
"      <td>get location, banking services</td>\n",
"      <td>credit card, prescription, health insurance, m...</td>\n",
"      <td>finance, health</td>\n",
"      <td>NaN</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>9</th>\n",
"      <td>Do medical exams</td>\n",
"      <td>outside</td>\n",
"      <td>smartphone</td>\n",
"      <td>3GPP</td>\n",
"      <td>set location, weather</td>\n",
"      <td>equipment participating in the session</td>\n",
"      <td>internet</td>\n",
"      <td>get location, weather</td>\n",
"      <td>transport ticket, prescription, health insuran...</td>\n",
"      <td>movement, health</td>\n",
"      <td>NaN</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>10</th>\n",
"      <td>Payment for gym membership</td>\n",
"      <td>house</td>\n",
"      <td>smartphone</td>\n",
"      <td>802.11</td>\n",
"      <td></td>\n",
"      <td>equipment participating in the session</td>\n",
"      <td>internet</td>\n",
"      <td>web, sports platform services, banking services</td>\n",
"      <td>credit card</td>\n",
"      <td>sports</td>\n",
"      <td>NaN</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>11</th>\n",
"      <td>Gym membership payment</td>\n",
"      <td>office</td>\n",
"      <td>smartphone</td>\n",
"      <td>802.11</td>\n",
"      <td></td>\n",
"      <td>equipment participating in the session</td>\n",
"      <td>internet</td>\n",
"      <td>web, sports platform services, banking services</td>\n",
"      <td>credit card</td>\n",
"      <td>sports</td>\n",
"      <td>NaN</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>12</th>\n",
"      <td>Payment for gym membership</td>\n",
"      <td>outside</td>\n",
"      <td>smartphone</td>\n",
"      <td>3GPP</td>\n",
"      <td></td>\n",
"      <td>equipment participating in the session</td>\n",
"      <td>internet</td>\n",
"      <td>web, sports platform services, banking services</td>\n",
"      <td>credit card, transport ticket</td>\n",
"      <td>sport, travel</td>\n",
"      <td>NaN</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>13</th>\n",
"      <td>Finding the nearest gym</td>\n",
"      <td>outside</td>\n",
"      <td>smartphone</td>\n",
"      <td>3GPP</td>\n",
"      <td>set location</td>\n",
"      <td>equipment participating in the session</td>\n",
"      <td>internet</td>\n",
"      <td>web, get location, weather</td>\n",
"      <td></td>\n",
"      <td>sports</td>\n",
"      <td>NaN</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>14</th>\n",
"      <td>Finding the nearest gym</td>\n",
"      <td>house</td>\n",
"      <td>smartphone</td>\n",
"      <td>802.11</td>\n",
"      <td>set location</td>\n",
"      <td>equipment participating in the session</td>\n",
"      <td>internet</td>\n",
"      <td>web, get location, weather</td>\n",
"      <td></td>\n",
"      <td>sports</td>\n",
"      <td>NaN</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>15</th>\n",
"      <td>Finding the nearest gym</td>\n",
"      <td>office</td>\n",
"      <td>smartphone</td>\n",
"      <td>802.11</td>\n",
"      <td>set location</td>\n",
"      <td>equipment participating in the session</td>\n",
"      <td>internet</td>\n",
"      <td>web, get location, weather</td>\n",
"      <td></td>\n",
"      <td>sports</td>\n",
"      <td>NaN</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>16</th>\n",
"      <td>Choice of physical activity exercises to do</td>\n",
"      <td>house</td>\n",
"      <td>smartphone</td>\n",
"      <td>802.11</td>\n",
"      <td>notepad</td>\n",
"      <td>equipment participating in the session</td>\n",
"      <td>internet</td>\n",
"      <td>web, sports platform services</td>\n",
"      <td></td>\n",
"      <td>sports</td>\n",
"      <td>NaN</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>17</th>\n",
"      <td>On the way to the gym</td>\n",
"      <td>outside</td>\n",
"      <td>smartphone</td>\n",
"      <td>3GPP</td>\n",
"      <td>set location</td>\n",
"      <td>equipment participating in the session</td>\n",
"      <td>NaN</td>\n",
"      <td>get location</td>\n",
"      <td>travel ticket</td>\n",
"      <td>displacement</td>\n",
"      <td>internet</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>18</th>\n",
"      <td>Outdoor sports session</td>\n",
"      <td>outside</td>\n",
"      <td>smartphone</td>\n",
"      <td>3GPP</td>\n",
"      <td>set location, set information (training statis...</td>\n",
"      <td>equipment participating in the session</td>\n",
"      <td>internet</td>\n",
"      <td>get location, music, exercise programs</td>\n",
"      <td></td>\n",
"      <td>sport, health, travel</td>\n",
"      <td>NaN</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>19</th>\n",
"      <td>Indoor sports session</td>\n",
"      <td>at destination</td>\n",
"      <td>smartphone</td>\n",
"      <td>3GPP</td>\n",
"      <td>set information (training statistics), get inf...</td>\n",
"      <td>equipment participating in the session</td>\n",
"      <td>internet</td>\n",
"      <td>music, exercise programs</td>\n",
"      <td>club membership card</td>\n",
"      <td>sport,health</td>\n",
"      <td>NaN</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>20</th>\n",
"      <td>Make a covid test</td>\n",
"      <td>outside</td>\n",
"      <td>smartphone</td>\n",
"      <td>3GPP</td>\n",
"      <td>set location</td>\n",
"      <td>equipment participating in the session</td>\n",
"      <td>internet</td>\n",
"      <td>get location, test platform services</td>\n",
"      <td>transport ticket, health insurance, health ins...</td>\n",
"      <td>travel, health</td>\n",
"      <td>NaN</td>\n",
"    </tr>\n",
" \n",
"
\n",
"
\n",
"
\n",
" \n",
" \n",
"\n",
" \n",
"
\n",
"
\n",
" "
]
},
"metadata": {},
"execution_count": 6
}
],
"source": [
"df"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-a_d-du3Ajud"
},
"source": [
"## d. Building the models"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jR-P1NzJCm2E"
},
"source": [
"# *d.1. Model 1 for classification of the activity*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "zcHeC1ZMh4CM",
"outputId": "f4a7fc2d-09d1-4ed8-ce36-8c661e8a9dbf"
},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'sequence': 'Purchase of prescribed drugs',\n",
" 'labels': ['health', 'travel', 'finance', 'sport'],\n",
" 'scores': [0.4309265613555908,\n",
" 0.30097901821136475,\n",
" 0.17015425860881805,\n",
" 0.0979401245713234]}"
]
},
"metadata": {},
"execution_count": 7
}
],
"source": [
"sequence = \"Purchase of prescribed drugs\"\n",
"candidate_labels = [\"sport\",\"health\",\"travel\", \"finance\"]\n",
"hypothesis_template = \"This text is about {}.\"\n",
"\n",
"classifier(sequence, candidate_labels, hypothesis_template=hypothesis_template)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dOjRSgRODnLW"
},
"source": [
"# *d.2. Model 2 to calculate the similarity between the data and the prediction of resources*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "dCpH-dneQpr8"
},
"outputs": [],
"source": [
"# Import the tokenizer and model classes needed to load the pre-trained NLI model\n",
"\n",
"from transformers import AutoTokenizer, AutoModelForSequenceClassification"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "ltDFZt1IQFFX"
},
"outputs": [],
"source": [
"# Load the pre-trained NLI model and its tokenizer\n",
"\n",
"nli_model = AutoModelForSequenceClassification.from_pretrained(\"BaptisteDoyen/camembert-base-xnli\")\n",
"tokenizer = AutoTokenizer.from_pretrained(\"BaptisteDoyen/camembert-base-xnli\") \n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "-egIY0HSQ-RR",
"outputId": "3321d551-55fa-41b6-ae72-b5a488b61a44"
},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"70.14256715774536"
]
},
"metadata": {},
"execution_count": 10
}
],
"source": [
"# Compute the similarity between the reference phrase (premise) and the candidate phrase (hypothesis)\n",
"\n",
"premise = \"Choice of physical activity exercises to do\"\n",
"hypothesis = \"training program composition\"\n",
"# tokenize and run through model\n",
"x = tokenizer.encode(premise, hypothesis, return_tensors='pt')\n",
"logits = nli_model(x)[0]\n",
"# we throw away \"neutral\" (dim 1) and take the probability of\n",
"# \"entailment\" (0) as the probability of the label being true \n",
"entail_contradiction_logits = logits[:,::2]\n",
"probs = entail_contradiction_logits.softmax(dim=1)\n",
"prob_label_is_true = probs[:,0]\n",
"prob_label_is_true[0].tolist() * 100"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "_BV3Nc5SXSIL"
},
"outputs": [],
"source": [
"# Return the confidence level (in percent) of the best similarity match, together with the task phrase considered to be the reference phrase\n",
"\n",
"def det_tache(entry, data):\n",
"    best_score = 0  # avoids shadowing the built-in max()\n",
"    tache = None    # stays None if no task scores above 0\n",
"    for elt in data['task'].tolist():\n",
"        x = tokenizer.encode(elt, entry, return_tensors='pt')\n",
"        logits = nli_model(x)[0]\n",
"        # we throw away \"neutral\" (dim 1) and take the probability of\n",
"        # \"entailment\" (0) as the probability of the label being true\n",
"        entail_contradiction_logits = logits[:,::2]\n",
"        probs = entail_contradiction_logits.softmax(dim=1)\n",
"        prob_label_is_true = probs[:,0]\n",
"        tmp = prob_label_is_true[0].tolist() * 100\n",
"        if tmp > best_score:\n",
"            best_score = tmp\n",
"            tache = elt\n",
"    return best_score, tache"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "TxwnxKicZvAZ",
"outputId": "b68c1279-d032-41d8-e27c-6fa973c7889a"
},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"(93.58223080635071, 'Finding the nearest gym')"
]
},
"metadata": {},
"execution_count": 12
}
],
"source": [
"#Call the function det_tache\n",
"\n",
"det_tache(\"Composition of the training program\", df)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZcAuLwraEHTv"
},
"source": [
"# e. Prediction of the resources to be used"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "l2IGEcobbN4T"
},
"outputs": [],
"source": [
"# Function written for test purposes. To be improved.\n",
"\n",
"def det_ressources(entry, loc, data=df):\n",
"\n",
"    lignes = data.loc[data['task'] == det_tache(entry, data)[1]]\n",
"    ressources = lignes.loc[lignes['location'] == loc]\n",
"    return [ressources.equipements.values[0], ressources['networks'].values[0], ressources['services'].values[0]]\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "HJhaV7aau9Mg"
},
"outputs": [],
"source": [
"# Determine the resources to be used to perform a task. Calls det_tache to find the most similar known task phrase first.\n",
"\n",
"def det_ressources1(entry, loc):\n",
"\n",
"    lignes = df.loc[df['task'] == det_tache(entry, df)[1]]\n",
"    ressources = lignes.loc[lignes['location'] == loc]\n",
"    columns = ['internal equipment', 'internal networks', 'internal services',\n",
"               'external equipment', 'external networks', 'external services']\n",
"    return [ressources[col].values[0] for col in columns]\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "PFp4lo43moy4",
"outputId": "fe22d996-5831-4471-e8e1-75543acbe607"
},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['smartphone',\n",
" '802.11',\n",
" 'set location',\n",
" 'equipment participating in the session',\n",
" 'internet',\n",
" 'web, get location, weather']"
]
},
"metadata": {},
"execution_count": 15
}
],
"source": [
"# Example of resource prediction\n",
"\n",
"det_ressources1(\"Composition of the training program\", \"house\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "H3ZRKp5PiDpo"
},
"source": [
"# f. Interface to test the models"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "065aB5GqsueC"
},
"outputs": [],
"source": [
"%%capture\n",
"!pip install gradio"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "IrnOmguss6si"
},
"outputs": [],
"source": [
"import gradio as gr"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 592
},
"id": "YEQoa4jei9bG",
"outputId": "6b5bfe92-9f99-4628-928c-c0cecbb99c02"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Colab notebook detected. To show errors in colab notebook, set debug=True in launch()\n",
"Running on public URL: https://fa7e6ed763bf430732.gradio.live\n",
"\n",
"This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces\n"
]
},
{
"output_type": "display_data",
"data": {
"text/plain": [
""
],
"text/html": [
""
]
},
"metadata": {}
},
{
"output_type": "execute_result",
"data": {
"text/plain": []
},
"metadata": {},
"execution_count": 18
}
],
"source": [
"# Code to launch the Gradio test environment for model 1\n",
"\n",
"def zeroShotClassification(text_input, candidate_labels):\n",
"    # Split the comma-separated label string into a clean list of labels\n",
"    labels = [label.strip(' ') for label in candidate_labels.split(',')]\n",
"    output = {}\n",
"    prediction = classifier(text_input, labels)\n",
"    # Map each predicted label to its score, as expected by gr.Label\n",
"    for i in range(len(prediction['labels'])):\n",
"        output[prediction['labels'][i]] = prediction['scores'][i]\n",
"    return output\n",
"\n",
"demo = gr.Interface(fn=zeroShotClassification, inputs=[gr.Textbox(label=\"Task\"), gr.Textbox(label=\"Candidate labels (comma-separated)\")], outputs=gr.Label(label=\"Classification\"))\n",
"demo.launch(share=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 592
},
"id": "mBsHM4WutJVZ",
"outputId": "9647bb41-7e05-4fa6-a028-c18ef81241e8"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Colab notebook detected. To show errors in colab notebook, set debug=True in launch()\n",
"Running on public URL: https://c4eacdcc4910b56eb2.gradio.live\n",
"\n",
"This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces\n"
]
},
{
"output_type": "display_data",
"data": {
"text/plain": [
""
],
"text/html": [
""
]
},
"metadata": {}
},
{
"output_type": "execute_result",
"data": {
"text/plain": []
},
"metadata": {},
"execution_count": 19
}
],
"source": [
"# Code to launch the Gradio test environment for model 2\n",
"\n",
"demo = gr.Interface(fn=det_ressources1, inputs=['text', 'text'], outputs=\"text\")\n",
"demo.launch(share=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "bJzlHmJoPlJ3"
},
"outputs": [],
"source": []
}
],
"metadata": {
"colab": {
"provenance": []
},
"gpuClass": "standard",
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.7"
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"cff8c7e6b2f942e0a9df876e0bcd09bf": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_9aa98415ca874172b0bab2eacf12b553",
"IPY_MODEL_1c78a79f4e284b37933c65c4caec0d6f",
"IPY_MODEL_5e00bb0a1b47470a880a970cb3c5c7c5"
],
"layout": "IPY_MODEL_d6ea4bcbd982430e96228e58911d4e5d"
}
},
"9aa98415ca874172b0bab2eacf12b553": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_db52af7763d6410d927a5a5d8c81b52b",
"placeholder": "",
"style": "IPY_MODEL_92065aa8dd5547508b244541f4b4953d",
"value": "Downloading (…)lve/main/config.json: 100%"
}
},
"1c78a79f4e284b37933c65c4caec0d6f": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_049cbb029db84ed6bc0fb36570c1b374",
"max": 882,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_2be056a4cf9042a586b426a74402ab72",
"value": 882
}
},
"5e00bb0a1b47470a880a970cb3c5c7c5": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_f118b5824c2b4f87bb6d65aef39ba356",
"placeholder": "",
"style": "IPY_MODEL_68598a9db6b64f6b9f5989c0620c2b5c",
"value": " 882/882 [00:00<00:00, 7.42kB/s]"
}
},
"d6ea4bcbd982430e96228e58911d4e5d": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"db52af7763d6410d927a5a5d8c81b52b": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"92065aa8dd5547508b244541f4b4953d": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"049cbb029db84ed6bc0fb36570c1b374": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"2be056a4cf9042a586b426a74402ab72": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"f118b5824c2b4f87bb6d65aef39ba356": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"68598a9db6b64f6b9f5989c0620c2b5c": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"1cce055c7ea148f9b696f9464e88d34b": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_c4506b58ebec4dbbbcef3a48e1122f13",
"IPY_MODEL_52371fc7667341ca98465403cd9e0fed",
"IPY_MODEL_897f253c09ca4ce7a46105739fd0552a"
],
"layout": "IPY_MODEL_df66fae6114b46d2a9b0c379d78436e6"
}
},
"c4506b58ebec4dbbbcef3a48e1122f13": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_2b93d8268de148b5abce28c36c992b46",
"placeholder": "",
"style": "IPY_MODEL_1407195508674d558426988738aacfc8",
"value": "Downloading pytorch_model.bin: 100%"
}
},
"52371fc7667341ca98465403cd9e0fed": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_c8c1cc40816b4c81beb3ddbf52379abb",
"max": 442587593,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_ec56a3cdc5cf41b8aeaf9f397332e18f",
"value": 442587593
}
},
"897f253c09ca4ce7a46105739fd0552a": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_094131684bd64185b29127cfb2b291e2",
"placeholder": "",
"style": "IPY_MODEL_05da292c7a36435faaec7c3957190b57",
"value": " 443M/443M [00:02<00:00, 190MB/s]"
}
},
"df66fae6114b46d2a9b0c379d78436e6": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"2b93d8268de148b5abce28c36c992b46": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"1407195508674d558426988738aacfc8": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"c8c1cc40816b4c81beb3ddbf52379abb": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"ec56a3cdc5cf41b8aeaf9f397332e18f": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"094131684bd64185b29127cfb2b291e2": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"05da292c7a36435faaec7c3957190b57": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"894b89467a9e4730826d69d38da09fcd": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_4246fb20634e4747bc022d91da11fd77",
"IPY_MODEL_2fbd4f2642274418b90bbaef20d5665c",
"IPY_MODEL_aee5ccde9fa544c78961d0e969715656"
],
"layout": "IPY_MODEL_19963506dbcc4ae08384c30f3a800f12"
}
},
"4246fb20634e4747bc022d91da11fd77": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_7cd613cf081a46d5a768fd181bc8f3fc",
"placeholder": "",
"style": "IPY_MODEL_41367350a5864304b8560bfd5d1b33e4",
"value": "Downloading (…)okenizer_config.json: 100%"
}
},
"2fbd4f2642274418b90bbaef20d5665c": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_b146924bdcfc4272bd32554e9c62b089",
"max": 433,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_9a8a057b18b34e0685b898d325dd156e",
"value": 433
}
},
"aee5ccde9fa544c78961d0e969715656": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_2b0d8ebb0cb04ce78eb7a86a0ca80cc1",
"placeholder": "",
"style": "IPY_MODEL_592c81348d0f4a4c8d81a0370e0f4a0c",
"value": " 433/433 [00:00<00:00, 10.8kB/s]"
}
},
"19963506dbcc4ae08384c30f3a800f12": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"7cd613cf081a46d5a768fd181bc8f3fc": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"41367350a5864304b8560bfd5d1b33e4": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"b146924bdcfc4272bd32554e9c62b089": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"9a8a057b18b34e0685b898d325dd156e": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"2b0d8ebb0cb04ce78eb7a86a0ca80cc1": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"592c81348d0f4a4c8d81a0370e0f4a0c": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"65b9d31511094cf7bb3b01dce95997f8": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_91c893d270584852864cd529ae751b62",
"IPY_MODEL_ab50e757329d47bbb9d171c33e730716",
"IPY_MODEL_e011343f8b444515b9b58140ec87fb53"
],
"layout": "IPY_MODEL_f7ee9ab382034684b208940e7eb4f5c9"
}
},
"91c893d270584852864cd529ae751b62": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_cb4cf6c5d75349dea57a90025770cf45",
"placeholder": "",
"style": "IPY_MODEL_3f8743aba17d49c09c4f02e2789cc11a",
"value": "Downloading (…)tencepiece.bpe.model: 100%"
}
},
"ab50e757329d47bbb9d171c33e730716": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_131fe709136945ccb5d431584dd39f24",
"max": 810912,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_13ceaf20d8f042f890ec554f0ba9b3ce",
"value": 810912
}
},
"e011343f8b444515b9b58140ec87fb53": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_78dbae4369d043babfacf2fcbad8a8de",
"placeholder": "",
"style": "IPY_MODEL_291bd3ce8516465a86a590a0d30f6b29",
"value": " 811k/811k [00:00<00:00, 4.97MB/s]"
}
},
"f7ee9ab382034684b208940e7eb4f5c9": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"cb4cf6c5d75349dea57a90025770cf45": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"3f8743aba17d49c09c4f02e2789cc11a": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"131fe709136945ccb5d431584dd39f24": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"13ceaf20d8f042f890ec554f0ba9b3ce": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"78dbae4369d043babfacf2fcbad8a8de": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"291bd3ce8516465a86a590a0d30f6b29": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"5ba02140610a4e0684744c855a52410b": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_881c9fb5fb1942338e95c11f9a38819a",
"IPY_MODEL_da25afb7ff104431b25e9211b7af68b8",
"IPY_MODEL_3805233a5eb7431f86a3d453f3038ee8"
],
"layout": "IPY_MODEL_f514de3b6d864fdabdff7cd992054323"
}
},
"881c9fb5fb1942338e95c11f9a38819a": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_05537e9a24a542f8b13736da09f86dfd",
"placeholder": "",
"style": "IPY_MODEL_3873d640bc354a18b40b2d2e372c9a78",
"value": "Downloading (…)cial_tokens_map.json: 100%"
}
},
"da25afb7ff104431b25e9211b7af68b8": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_982afee1fe73438fa0e6b685c5afe552",
"max": 299,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_4f9561538c5e42498b8bf621b52d4016",
"value": 299
}
},
"3805233a5eb7431f86a3d453f3038ee8": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_be068e7eca114654a8507f904924b39a",
"placeholder": "",
"style": "IPY_MODEL_3762d5212c4e43c99a77c0d1b171ed46",
"value": " 299/299 [00:00<00:00, 4.47kB/s]"
}
},
"f514de3b6d864fdabdff7cd992054323": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"05537e9a24a542f8b13736da09f86dfd": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"3873d640bc354a18b40b2d2e372c9a78": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"982afee1fe73438fa0e6b685c5afe552": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"4f9561538c5e42498b8bf621b52d4016": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"be068e7eca114654a8507f904924b39a": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"3762d5212c4e43c99a77c0d1b171ed46": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
}
}
}
},
"nbformat": 4,
"nbformat_minor": 0
}
smartid-poc-main/README.md
# Smart Identity Proof of Concept
## About the Smart Identity Proof of Concept
The Smart Identity Proof of Concept was produced by the ETSI Special Committee USER Group and is described in ETSI TR 103 875-2. It is intended to demonstrate the feasibility of the Smart Identity as defined in ETSI TR 103 875-1.
For a specific use case (e-health), it defines the Smart Identity (ID) and provides an associated Proof of Concept (PoC).
## Getting started
The Smart Identity Proof of Concept runs as a Google Colaboratory notebook (https://colab.research.google.com/).
Upload the `POC_SmartID_v3.ipynb` file from this repository to Google Colaboratory and execute the PoC from there.
## Further details
The Smart Identity Proof of Concept is documented in ETSI TR 103 875-2.
The AI models for Smart ID were built on CamemBERT™, a pre-trained neural network model based on the Transformer architecture.
The Camembert™-Base-XNLI pre-trained zero-shot transfer learning model was used because classical machine learning algorithms did not give accurate results when trained on the dataset.
Camembert™-Base-XNLI is a transformer-based natural language processing model, used here from Python®. It was trained on XNLI (Cross-lingual Natural Language Inference), a dataset published by Facebook. It is mainly used to determine the probability that a text belongs to a predefined class.
The following tools were used to implement the Camembert-Base-XNLI model for data entry and resource prediction:
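The way an NLI model yields class probabilities can be sketched as follows. This is a minimal illustration, not the notebook's actual code: each candidate class is turned into a hypothesis, the model scores how strongly the text entails it, and a softmax over those entailment scores gives one probability per class. The Hugging Face model identifier `BaptisteDoyen/camembert-base-xnli` and the French hypothesis template are assumptions (the README does not name them), and the helper names are hypothetical.

```python
import math
from typing import Dict, List


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(labels: List[str], entailment_logits: List[float]) -> Dict[str, float]:
    """Map each candidate class to a probability, most likely first.

    `entailment_logits[i]` is the model's entailment score for the
    hypothesis built from `labels[i]`.
    """
    probs = softmax(entailment_logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return dict(ranked)


def classify_with_camembert(text: str, labels: List[str]) -> Dict[str, float]:
    """Zero-shot classification through the Transformers pipeline.

    Assumption: the model is published on the Hugging Face Hub under the
    identifier below. Calling this function downloads the model.
    """
    from transformers import pipeline  # imported lazily, heavy dependency

    classifier = pipeline(
        "zero-shot-classification",
        model="BaptisteDoyen/camembert-base-xnli",  # assumed identifier
    )
    result = classifier(text, labels, hypothesis_template="Ce texte parle de {}.")
    return dict(zip(result["labels"], result["scores"]))
```

`rank_labels` only reproduces the final scoring step by hand; `classify_with_camembert` delegates the whole procedure to the library.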
* Python® 3.7
* Transformers 4.24.0
  Library for downloading and training pre-trained natural language processing models.
* Tensorflow®-Text 2.9.0
  TensorFlow® library for text pre-processing operations.
* Pandas™ 1.3.5
  Library for managing datasets using dataframes.
* Google Colab® 1.0.0
  A cloud service offered by Google®, based on Jupyter Notebook, that allows ML models to be trained directly online without installing anything.
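Because the PoC was developed against specific versions, it may help to compare the runtime against them before running the notebook. The sketch below is one possible check, not part of the PoC itself. The pip distribution names are assumptions (the README gives only display names), the Gradio version is the one this README states for the web interfaces, and `check_versions` is a hypothetical helper.

```python
try:
    from importlib import metadata  # Python 3.8 and later
except ImportError:  # Python 3.7 needs the importlib_metadata backport
    import importlib_metadata as metadata

# Versions stated in this README; distribution names are assumed.
PINNED = {
    "transformers": "4.24.0",
    "tensorflow-text": "2.9.0",
    "pandas": "1.3.5",
    "gradio": "3.12.1",
}


def check_versions(pinned=PINNED, get_version=metadata.version):
    """Return {name: (wanted, installed, matches)} for each pinned package.

    `installed` is None when the package is not installed.
    """
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed, installed == wanted)
    return report
```

A notebook cell could print the entries of `check_versions()` whose last field is `False` and install the stated versions for those packages.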
To better visualize the results of the main model, web interfaces were developed with Gradio version 3.12.1.
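A Gradio interface of this kind could look roughly like the sketch below. It is an illustration under stated assumptions, not the PoC's interface: `predict` is a hypothetical keyword-counting stand-in for the real model, and the class names and keywords are invented for the example.

```python
from typing import Dict

# Hypothetical classes and keywords, used only to make the sketch runnable.
KEYWORDS = {
    "cardiology": ["heart", "chest"],
    "dermatology": ["skin", "rash"],
}


def predict(text: str) -> Dict[str, float]:
    """Stand-in for the model: score each class by keyword frequency."""
    words = text.lower().split()
    counts = {
        label: sum(words.count(keyword) for keyword in keywords)
        for label, keywords in KEYWORDS.items()
    }
    total = sum(counts.values())
    if total == 0:
        # No keyword found: fall back to a uniform distribution.
        return {label: 1.0 / len(counts) for label in counts}
    return {label: count / total for label, count in counts.items()}


def build_demo():
    """Wrap `predict` in a text-in, label-out Gradio interface."""
    import gradio as gr  # imported lazily so `predict` works without Gradio

    return gr.Interface(
        fn=predict,
        inputs=gr.Textbox(lines=3, label="Text"),
        outputs=gr.Label(label="Predicted class"),
    )
```

In a Colab cell, `build_demo().launch()` would start the interface. The `Label` output accepts a dictionary mapping class names to confidences, which is why `predict` returns that shape.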