Getting Started
Installation
Prerequisites
A recent version of Git (e.g. ^2.37)
Note: To install Poetry, consult the Poetry documentation.
Note: Install Poetry system-wide (not in a virtualenv).
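As a quick way to confirm the Git prerequisite, the following is a minimal sketch that parses the output of `git --version` and compares it with the 2.37 example given above. The helper names `parse_git_version` and `git_is_recent_enough` are hypothetical and not part of this project, and the check only enforces the lower bound, not the upper bound that a caret range also implies.

```python
import re
import subprocess

MIN_GIT = (2, 37)  # lower bound taken from the "^2.37" example in the prerequisites


def parse_git_version(output: str) -> tuple[int, ...]:
    """Extract the numeric version from `git --version` output."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError(f"no version number found in {output!r}")
    return tuple(int(part) for part in match.groups() if part is not None)


def git_is_recent_enough(minimum: tuple[int, int] = MIN_GIT) -> bool:
    """Return True if the git found on PATH is at least `minimum`."""
    try:
        result = subprocess.run(
            ["git", "--version"], capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return False  # git is missing or failed to run
    # Compare only major and minor, e.g. (2, 39) >= (2, 37)
    return parse_git_version(result.stdout)[:2] >= minimum


if __name__ == "__main__":
    print("git OK" if git_is_recent_enough() else "git missing or older than 2.37")
```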
Using Git LFS
This tool uses Git LFS for handling large files. Before using it you need to install and set up Git LFS on your local machine. See https://git-lfs.com/ for installation instructions.
After Git LFS is set up, follow these steps to clone the repository:
If you already cloned the repository without Git LFS, run:
Install the dependencies
Set up virtualenv
In the root directory of the backend project (the same directory as this README file), run the following commands:
Note: Install the dependencies for the training using:
Note: Before running any tasks, activate the virtual environment so that the installed dependencies are available:
To deactivate the virtual environment, run:
Start Python and download the NLTK punkt package, which the sentence tokenizer requires. You only need to download punkt once.
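The download step can be sketched as follows. This is a minimal example rather than the project's own script: the wrapper name `ensure_punkt` is hypothetical, while `nltk.data.find` and `nltk.download` are standard NLTK calls. The wrapper skips the download when the data is already present, which matches the "only once" note above.

```python
def ensure_punkt() -> bool:
    """Download the NLTK punkt tokenizer data unless it is already present."""
    try:
        import nltk
    except ImportError:
        return False  # NLTK is not installed in this environment
    try:
        nltk.data.find("tokenizers/punkt")
        return True  # already downloaded, nothing to do
    except LookupError:
        # nltk.download returns True on success and False on failure
        return bool(nltk.download("punkt", quiet=True))


if __name__ == "__main__":
    print("punkt available" if ensure_punkt() else "punkt could not be installed")
```

Run it inside the activated virtual environment so that the data is downloaded with the same NLTK installation the tool uses.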
Environment Variable & Configuration
The tool uses the following environment variable:
HF_TOKEN: To use the project, you need access to the HuggingFace 🤗 entity extraction model. Contact the administrators via [tabiya@benisis.de] to request access. Once access is granted, you must create a read access token to use the model. Find or create your read access token here. The backend supports the use of a .env file to set the environment variable. Create a .env file in the root directory of the backend project and set the environment variables as follows:
ATTENTION: The .env file should be kept secure and not shared with others as it contains sensitive information.
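To check that the token is picked up, the sketch below reads a .env file containing a line of the form `HF_TOKEN=<your read access token>` and exports it to the process environment. It uses only the standard library. How the backend itself loads the file is not stated here, so the function name `load_env_file` and its parsing rules are assumptions made for illustration.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines and export them to os.environ."""
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded  # no file, nothing to load
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("'\"")
        # Variables already set in the shell take precedence over the file
        os.environ.setdefault(key, value)
        loaded[key] = value
    return loaded


if __name__ == "__main__":
    load_env_file()
    print("HF_TOKEN is set" if os.environ.get("HF_TOKEN") else "HF_TOKEN is missing")
```

The script prints only whether the variable is set and never the token itself, in keeping with the warning above.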
QuickStart Guide
Inference Pipeline
The inference pipeline extracts occupations and skills from a job description and matches them to the most similar entities in the ESCO taxonomy.
Usage
First, activate the virtual environment as explained in the Installation section above.
Then, start the Python interpreter in the root directory and run the following commands:
Load the EntityLinker class and create an instance of the class, then perform inference on any text with the following code:
After running the commands above, you should see the following output:
French version
You can use the French version of the Entity Linker using the following code:
You should see the following output:
Running the evaluation tests
Load the Evaluator class and print the results:
This class inherits from the EntityLinker, with the main difference being the 'entity_type' flag.
Minimum Hardware
4 GB CPU/GPU RAM
The code runs on GPU if available. Ensure your machine has CUDA installed if running on GPU.
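To see which device would be used on your machine, here is a small sketch. It assumes the underlying model runs on PyTorch, which this README does not state explicitly, and the helper name `pick_device` is hypothetical. `torch.cuda.is_available()` is the standard PyTorch call for detecting a usable CUDA GPU.

```python
def pick_device() -> str:
    """Return "cuda" if PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: the model backend is PyTorch
    except ImportError:
        return "cpu"  # PyTorch not installed; nothing to probe
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Inference would run on: {pick_device()}")
```

If this reports "cpu" on a machine with an NVIDIA GPU, the CUDA installation or the installed PyTorch build is the first thing to check.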