Getting Started
Prerequisites
A recent version of Git (e.g. ^2.37)
Note: To install Poetry, consult the official Poetry documentation. Install Poetry system-wide (not in a virtualenv).
This tool uses Git LFS to handle large files. Before using it, you need to install and set up Git LFS on your local machine; see https://git-lfs.com/ for installation instructions.
After Git LFS is set up, follow these steps to clone the repository:
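A typical sequence looks like this (the repository URL is a placeholder):

```shell
# Install the Git LFS hooks for your user account (one-time setup)
git lfs install

# Clone the repository; LFS-tracked files are downloaded automatically
git clone <repository-url>
```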
If you already cloned the repository without Git LFS, run:
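In that case, the LFS-tracked files can usually be fetched after the fact:

```shell
# Set up the LFS hooks, then download the LFS-tracked files for the existing clone
git lfs install
git lfs pull
```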
Set up virtualenv
In the root directory of the backend project (so, the same directory as this README file), run the following commands:
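Assuming the project manages its dependencies with Poetry, the setup is typically:

```shell
# Create the project virtualenv and install the pinned dependencies
poetry install
```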
Note: Install the dependencies for the training using:
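Poetry supports optional dependency groups; assuming the training dependencies live in such a group (the group name `train` below is a guess, not confirmed by the project), they can be installed with:

```shell
# Install the optional training dependency group (group name is an assumption)
poetry install --with train
```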
Note: Before running any tasks, activate the virtual environment so that the installed dependencies are available:
To deactivate the virtual environment, run:
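With Poetry, activation and deactivation can be sketched as follows:

```shell
# Activate the Poetry-managed virtualenv in the current shell
source "$(poetry env info --path)/bin/activate"

# ... run your tasks ...

# Leave the virtualenv again
deactivate
```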
Start the Python interpreter and download the NLTK punkt package, which is used by the sentence tokenizer. You only need to download it once.
The tool uses the following environment variable:
ATTENTION: The .env file should be kept secure and not shared with others as it contains sensitive information.
The inference pipeline extracts occupations and skills from a job description and matches them to the most similar entities in the ESCO taxonomy.
Then, start the Python interpreter in the root directory and run the following commands:
Load the EntityLinker class, create an instance of it, and perform inference on any text with the following code:
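A sketch of the flow; the import path and method name below are assumptions, not the project's confirmed API:

```python
# Hypothetical import path -- check the project's package layout
from inference.linker import EntityLinker

linker = EntityLinker()

# Extract occupations/skills and match them to ESCO entities
results = linker.link_text("We are hiring a data analyst with strong SQL skills.")
print(results)
```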
After running the commands above, the extracted occupations and skills are printed together with their matched ESCO entities.
You can use the French version of the Entity Linker with the following code:
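For example (the import path and language parameter name below are assumptions):

```python
# Hypothetical import path and parameter name
from inference.linker import EntityLinker

linker = EntityLinker(language="fr")
results = linker.link_text("Nous recherchons un développeur Python.")
print(results)
```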
The output follows the same structure as the English example.
Load the Evaluator class and print the results:
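A sketch, assuming an Evaluator with a similar interface (the import path and method name are guesses):

```python
# Hypothetical import path and API
from evaluation import Evaluator

evaluator = Evaluator()
results = evaluator.evaluate()
print(results)
```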
This class inherits from the EntityLinker, with the main difference being the entity_type flag.
At least 4 GB of CPU/GPU RAM is required.
The code runs on GPU if available. Ensure your machine has CUDA installed if running on GPU.
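To check whether a CUDA device is visible (this assumes the project uses PyTorch, which is not confirmed by this README):

```python
import torch

# Report which device inference will run on
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Running on: {device}")
```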
HF_TOKEN: To use the project, you need access to the HuggingFace 🤗 entity extraction model. Contact the administrators at tabiya@benisis.de for access. From there, you must create a read access token to use the model; you can find or create your read access token in your HuggingFace account settings. The backend supports the use of a .env file to set the environment variable. Create a .env file in the root directory of the backend project and set the environment variable as follows:
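For example (replace the placeholder with your own token):

```shell
# .env -- keep this file out of version control
HF_TOKEN=<your-huggingface-read-token>
```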
First, activate the virtual environment as explained above.
If you want to run evaluations on custom datasets, you will need to modify the _load_dataset function in the evaluation.py file. Please refer to the original evaluation datasets described above. If you run into any trouble, please open an issue in the project's issue tracker.