# Advanced Topics

## `inference/linker.py`

### class EntityLinker

Creates a pipeline of an entity recognition transformer and a sentence transformer for embedding text.

#### Initialization Parameters

entity\_model : str, default='tabiya/roberta-base-job-ner' Path to a pre-trained `AutoModelForTokenClassification` model or an `AutoModelCrfForNer` model. This model is used for entity recognition within the input text.

similarity\_model : str, default='all-MiniLM-L6-v2' Path or name of a sentence transformer model used for embedding text. It computes the embeddings of the extracted entities and of the reference sets. The model 'all-mpnet-base-v2' is also available, but it is not in the cache, so use it with `from_cache=False` at least the first time.

crf : bool, default=False A flag to indicate whether to use an `AutoModelCrfForNer` model instead of a standard `AutoModelForTokenClassification`. `CRF` (Conditional Random Field) models are used when the task requires sequential predictions with dependencies between the outputs.

evaluation\_mode : bool, default=False If set to `True`, the linker will return the cosine similarity scores between the embeddings. This mode is useful for evaluating the quality of the linkages.

k : int, default=32 Specifies the number of items to retrieve from the reference sets. This parameter limits the number of top matches to consider when linking entities.

from\_cache : bool, default=True If set to `True`, the precomputed embeddings are loaded from cache to save time. If set to `False`, the embeddings are computed on-the-fly, which requires GPU access for efficiency and can be time-consuming.

output\_format : str, default='occupation' Specifies the format of the output for occupations, either `occupation`, `preffered_label`, `esco_code`, `uuid` or `all` to get all the columns. The `uuid` is also available for the skills.
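The roles of `k` and `evaluation_mode` can be illustrated with a minimal, standard-library sketch of top-k retrieval by cosine similarity. This is not the linker's actual implementation: the function names, the toy three-dimensional vectors, and the labels are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_matches(
    query: Sequence[float],
    reference: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k reference labels most similar to the query, with scores."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in reference.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (hypothetical); real sentence embeddings have hundreds of dimensions.
reference_embeddings = {
    "chef": [0.9, 0.1, 0.0],
    "cook": [0.8, 0.3, 0.1],
    "accountant": [0.0, 0.2, 0.9],
}
entity_embedding = [1.0, 0.2, 0.0]

matches = top_k_matches(entity_embedding, reference_embeddings, k=2)
for label, score in matches:
    print(f"{label}: {score:.3f}")
```

Keeping the scores alongside the labels mirrors what `evaluation_mode=True` is described as returning, while a plain list of labels corresponds to the default mode.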

#### Calling Parameters

text : str An arbitrary job vacancy-related string.

linking : bool, default=True Whether the model links the extracted entities to the taxonomy.
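A minimal usage sketch based on the parameters documented above. The import path `inference.linker` is an assumption derived from the file name, the example sentence is made up, and the structure of the returned value is not documented here, so the result is simply printed. The guard around the import lets the snippet run even where the repository is not on the Python path.

```python
# Keyword arguments taken from the documented initialization parameters.
# Only k is changed from its documented default (32) to keep the output short.
linker_kwargs = {
    "entity_model": "tabiya/roberta-base-job-ner",
    "similarity_model": "all-MiniLM-L6-v2",
    "crf": False,
    "evaluation_mode": False,
    "k": 5,
    "from_cache": True,
    "output_format": "occupation",
}

# Hypothetical input text.
text = "We are looking for a Head Chef who can plan menus."

try:
    # Assumed import path, inferred from the file name inference/linker.py.
    from inference.linker import EntityLinker
except ImportError:
    EntityLinker = None

if EntityLinker is not None:
    pipeline = EntityLinker(**linker_kwargs)
    # linking=True is the documented default; it is spelled out for clarity.
    extracted = pipeline(text, linking=True)
    print(extracted)
else:
    print("EntityLinker is not importable here; skipping the call.")
```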

### class FrenchEntityLinker

French version of the entity linker. To use it, first rewrite the reference databases to the French version of ESCO.

## `inference/evaluator.py`

### class Evaluator(EntityLinker)

Evaluator class that inherits from `EntityLinker`. It builds the queries, corpus, inverted corpus, and relevant docs for the [InformationRetrievalEvaluator](https://github.com/UKPLab/sentence-transformers/blob/master/sentence_transformers/evaluation/InformationRetrievalEvaluator.py), performs entity linking, and computes the information retrieval metrics.

#### Initialization Parameters

entity\_type : str Occupation, Skill, or Qualification. Determines which evaluation set is used.
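To make the evaluator's inputs concrete, here is a small standard-library sketch of the kind of data structures an information retrieval evaluation works with, plus two common metrics. The identifiers, the toy data, and the reading of "inverted corpus" as a text-to-id lookup are assumptions; the real `Evaluator` delegates metric computation to `InformationRetrievalEvaluator`.

```python
from typing import Dict, List, Set

# Query id -> query text, and document id -> document text (toy data).
queries: Dict[str, str] = {"q1": "head chef", "q2": "bookkeeper"}
corpus: Dict[str, str] = {"d1": "chef", "d2": "cook", "d3": "accountant"}

# Assumed meaning of "inverted corpus": look up a document id by its text.
inverted_corpus: Dict[str, str] = {doc: doc_id for doc_id, doc in corpus.items()}

# Query id -> set of document ids considered correct for that query.
relevant_docs: Dict[str, Set[str]] = {"q1": {"d1"}, "q2": {"d3"}}

# Hypothetical ranked retrieval results, best match first.
ranked: Dict[str, List[str]] = {
    "q1": ["d2", "d1", "d3"],
    "q2": ["d3", "d1", "d2"],
}


def accuracy_at_k(ranked: Dict[str, List[str]], relevant: Dict[str, Set[str]], k: int) -> float:
    """Fraction of queries with at least one relevant document in the top k."""
    hits = sum(1 for qid, docs in ranked.items() if relevant[qid] & set(docs[:k]))
    return hits / len(ranked)


def mrr_at_k(ranked: Dict[str, List[str]], relevant: Dict[str, Set[str]], k: int) -> float:
    """Mean reciprocal rank of the first relevant document within the top k."""
    total = 0.0
    for qid, docs in ranked.items():
        for rank, doc_id in enumerate(docs[:k], start=1):
            if doc_id in relevant[qid]:
                total += 1.0 / rank
                break
    return total / len(ranked)


print("accuracy@1:", accuracy_at_k(ranked, relevant_docs, 1))
print("accuracy@3:", accuracy_at_k(ranked, relevant_docs, 3))
print("MRR@3:", mrr_at_k(ranked, relevant_docs, 3))
```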

## `util/transformersCRF.py`

### class CRF(nn.Module)

Based on the implementation in [lonePatient/BERT-NER-Pytorch](https://github.com/lonePatient/BERT-NER-Pytorch/tree/master).

A class that creates a linear Conditional Random Field model.
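Assuming "linear" here refers to a linear-chain CRF, prediction amounts to finding the tag sequence with the highest total score, where the score combines per-token emission scores with tag-to-tag transition scores. The sketch below shows Viterbi decoding in plain Python. It is an illustration of the general technique, not the `CRF` class itself, and the tag set and all scores are made up.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
    start: Sequence[float],
    end: Sequence[float],
) -> List[int]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][j]   score of tag j at position t
    transitions[i][j] score of moving from tag i to tag j
    start[j], end[j]  scores for starting or ending in tag j
    """
    num_tags = len(start)
    score = [start[tag] + emissions[0][tag] for tag in range(num_tags)]
    backpointers: List[List[int]] = []

    for step in emissions[1:]:
        next_score: List[float] = []
        pointers: List[int] = []
        for cur in range(num_tags):
            # Best previous tag for ending at `cur` on this step.
            best_prev = max(range(num_tags), key=lambda prev: score[prev] + transitions[prev][cur])
            next_score.append(score[best_prev] + transitions[best_prev][cur] + step[cur])
            pointers.append(best_prev)
        score = next_score
        backpointers.append(pointers)

    final = [score[tag] + end[tag] for tag in range(num_tags)]
    best_last = max(range(num_tags), key=lambda tag: final[tag])

    # Follow the backpointers from the last position to the first.
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy tag set: 0 = O, 1 = B, 2 = I (all scores are hypothetical).
tags = ["O", "B", "I"]
emissions = [
    [2.0, 1.0, 0.0],
    [0.0, 0.5, 1.0],
    [1.5, 0.0, 0.5],
]
transitions = [
    [0.0, 0.0, -10.0],  # O -> I is strongly penalized
    [0.0, 0.0, 1.0],    # B -> I is encouraged
    [0.0, 0.0, 0.0],
]
start = [0.0, 0.0, -10.0]  # a sequence should not start with I
end = [0.0, 0.0, 0.0]

best_path = viterbi_decode(emissions, transitions, start, end)
print([tags[i] for i in best_path])
```

Picking the best tag per token independently would give `O I O` here, which is not a valid BIO sequence. The transition scores steer the decoder to `B I O` instead, which is the kind of dependency between outputs that a CRF layer is meant to capture.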

### class AutoModelForCrfPretrainedConfig(PretrainedConfig)

Configuration class that inherits from the HuggingFace [PretrainedConfig](https://huggingface.co/docs/transformers/en/main_classes/configuration#transformers.PretrainedConfig) class.

### class AutoModelCrfForNer(PreTrainedModel)

A general class that inherits from the HuggingFace [PreTrainedModel](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel) class. The model\_type is detected automatically.

model\_type: str Possible options include `BertCrfForNer`, `RobertaCrfForNer` and `DebertaCrfForNer`.

### class BERT\_CRF\_Config(PretrainedConfig)

Custom class used for configuring BERT for CRF.

### class BertCrfForNer(PreTrainedModel)

BERT-based CRF model that inherits from [PreTrainedModel HuggingFace](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel) class.

The initialization parameters are the same as for [PreTrainedModel HuggingFace](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel).

#### Forward Parameters

Same as [PreTrainedModel HuggingFace](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel) except for

`special_tokens_mask` default: None. We use this option from HuggingFace as a small hack to implement the special\_mask needed for CRF.
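The source does not spell out what the mask contains, so the following is only one plausible reading. In HuggingFace tokenizers, a special tokens mask uses 1 for special tokens and 0 for regular sequence tokens. A CRF layer typically needs the opposite: a mask that is 1 on the positions it should score. The sketch below builds both with plain Python. The token list, the set of special tokens, and the inversion step are assumptions for illustration.

```python
from typing import List, Set

# RoBERTa-style special tokens, used here purely as an example.
SPECIAL_TOKENS: Set[str] = {"<s>", "</s>", "<pad>"}


def build_special_tokens_mask(tokens: List[str]) -> List[int]:
    """HuggingFace convention: 1 marks a special token, 0 a sequence token."""
    return [1 if token in SPECIAL_TOKENS else 0 for token in tokens]


def invert_mask(mask: List[int]) -> List[int]:
    """Assumed CRF-side mask: 1 marks the positions to be scored."""
    return [1 - flag for flag in mask]


# Simplified tokens (real sub-word tokenization would look different).
tokens = ["<s>", "Head", "Chef", "</s>", "<pad>", "<pad>"]

special_mask = build_special_tokens_mask(tokens)
crf_mask = invert_mask(special_mask)

print(special_mask)
print(crf_mask)
```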

### class ROBERTA\_CRF\_Config(PretrainedConfig)

Custom class used for configuring RoBERTa for CRF.

### class RobertaCrfForNer(PreTrainedModel)

RoBERTa-based CRF model that inherits from [PreTrainedModel HuggingFace](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel) class.

The initialization parameters are the same as for [PreTrainedModel HuggingFace](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel).

#### Forward Parameters

Same as [PreTrainedModel HuggingFace](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel) except for

`special_tokens_mask` default: None. We use this option from HuggingFace as a small hack to implement the special\_mask needed for CRF.

### class DEBERTA\_CRF\_Config(PretrainedConfig)

Custom class used for configuring DeBERTa for CRF.

### class DebertaCrfForNer(PreTrainedModel)

DeBERTa-based CRF model that inherits from [PreTrainedModel HuggingFace](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel) class.

The initialization parameters are the same as for [PreTrainedModel HuggingFace](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel).

#### Forward Parameters

Same as [PreTrainedModel HuggingFace](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel) except for

`special_tokens_mask` default: None. We use this option from HuggingFace as a small hack to implement the special\_mask needed for CRF.

## `util/utilfunctions.py`

### class Config

Configuration class for the [training hyperparameters](/our-tech-stack/livelihoods-classifier/training.md#train-an-entity-extraction-model).

### class CPU\_Unpickler

A class that loads pickled tensors onto the CPU.
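A common way to do this, shown here as a sketch rather than as the repository's code, is to subclass `pickle.Unpickler` and intercept the function PyTorch uses to restore tensor storages, so that they are mapped to the CPU even if they were saved from a GPU. The class name `CpuUnpicklerSketch` and the helper `load_on_cpu` are hypothetical.

```python
import io
import pickle


class CpuUnpicklerSketch(pickle.Unpickler):
    """Hypothetical stand-in for CPU_Unpickler; not the repository's code."""

    def find_class(self, module, name):
        # PyTorch pickles tensor storages via torch.storage._load_from_bytes.
        # Replacing it lets us force map_location="cpu" when restoring them.
        if module == "torch.storage" and name == "_load_from_bytes":
            import torch  # imported lazily so plain pickles work without PyTorch

            return lambda raw: torch.load(io.BytesIO(raw), map_location="cpu")
        return super().find_class(module, name)


def load_on_cpu(buffer):
    """Unpickle a binary file-like object, mapping any tensors to the CPU."""
    return CpuUnpicklerSketch(buffer).load()


# Demonstration with a plain Python object; tensors would follow the same path.
payload = pickle.dumps({"labels": ["chef", "cook"], "k": 32})
restored = load_on_cpu(io.BytesIO(payload))
print(restored)
```

With a real file, the call would look like `load_on_cpu(open(path, "rb"))`, where `path` points to a pickle that contains tensors.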

