Tabiya Documentation

Advanced Topics

This page gives further details about the classes and functions in the GitHub repository.

inference/linker.py

class EntityLinker

Creates a pipeline of an entity recognition transformer and a sentence transformer for embedding text.

Initialization Parameters

entity_model : str, default='tabiya/roberta-base-job-ner' Path to a pre-trained AutoModelForTokenClassification model or an AutoModelCrfForNer model. This model is used for entity recognition within the input text.

similarity_model : str, default='all-MiniLM-L6-v2' Path or name of a sentence transformer model used for embedding text. The sentence transformer computes embeddings for the extracted entities and for the reference sets. The model 'all-mpnet-base-v2' is also available but is not in the cache, so pass from_cache=False at least the first time it is used.

crf : bool, default=False A flag to indicate whether to use an AutoModelCrfForNer model instead of a standard AutoModelForTokenClassification. CRF (Conditional Random Field) models are used when the task requires sequential predictions with dependencies between the outputs.

evaluation_mode : bool, default=False If set to True, the linker will return the cosine similarity scores between the embeddings. This mode is useful for evaluating the quality of the linkages.

k : int, default=32 Specifies the number of items to retrieve from the reference sets. This parameter limits the number of top matches to consider when linking entities.

from_cache : bool, default=True If set to True, the precomputed embeddings are loaded from cache to save time. If set to False, the embeddings are computed on-the-fly, which requires GPU access for efficiency and can be time-consuming.

output_format : str, default='occupation' Specifies the output format for occupations: one of occupation, preffered_label, esco_code, uuid, or all to return all the columns. The uuid option is also available for skills.
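To make k and evaluation_mode more concrete, here is a minimal sketch of top-k retrieval by cosine similarity. It is not the repository's implementation: the toy vectors, the labels, and the helpers cosine_similarity and top_k are hypothetical, and the actual linker operates on sentence transformer embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, reference, k=32, with_scores=False):
    """Return the k reference labels most similar to `query`."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in reference.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    best = scored[:k]
    return best if with_scores else [label for label, _ in best]


# Toy 3-dimensional "embeddings" (hypothetical values, for illustration only).
reference = {
    "chef": [0.9, 0.1, 0.0],
    "baker": [0.7, 0.3, 0.1],
    "software developer": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]

labels = top_k(query, reference, k=2)
scored = top_k(query, reference, k=2, with_scores=True)
print(labels)
print(scored)
```

With with_scores=True the similarity scores are returned alongside the labels, which is the kind of output evaluation_mode is described as providing.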

Calling Parameters

text : str An arbitrary job vacancy-related string.

linking : bool, default=True Specifies whether the model links the extracted entities to the taxonomy.
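The parameters above can be combined as in the following sketch. The defaults are copied from the parameter list; the import path inference.linker is an assumption based on the file location inference/linker.py, and the helpers linker_kwargs, build_linker, and link_text are hypothetical names introduced only for this example.

```python
# Constructor defaults, copied from the parameter list above.
LINKER_DEFAULTS = {
    "entity_model": "tabiya/roberta-base-job-ner",
    "similarity_model": "all-MiniLM-L6-v2",
    "crf": False,
    "evaluation_mode": False,
    "k": 32,
    "from_cache": True,
    "output_format": "occupation",
}


def linker_kwargs(**overrides):
    """Merge overrides into the documented defaults, rejecting unknown names."""
    unknown = sorted(set(overrides) - set(LINKER_DEFAULTS))
    if unknown:
        raise TypeError("unknown EntityLinker parameter(s): " + ", ".join(unknown))
    return {**LINKER_DEFAULTS, **overrides}


def build_linker(**overrides):
    """Create an EntityLinker (requires the repository and its dependencies)."""
    # Assumption: the repository root is on sys.path, so that
    # inference/linker.py is importable as inference.linker.
    from inference.linker import EntityLinker
    return EntityLinker(**linker_kwargs(**overrides))


def link_text(text, **overrides):
    """Build a linker and call it on one job-vacancy-related string."""
    pipeline = build_linker(**overrides)
    return pipeline(text, linking=True)


# 'all-mpnet-base-v2' is not in the cache, so it is paired with from_cache=False.
kwargs = linker_kwargs(similarity_model="all-mpnet-base-v2", from_cache=False, k=5)
print(kwargs)
```

With the repository available, a call such as link_text("We are looking for a Head Chef.", k=5) would build the pipeline and run it on the text; the shape of the result depends on output_format.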

class FrenchEntityLinker

French version of the entity linker. To use it, the reference databases must first be rewritten to the French version of ESCO.

inference/evaluator.py

class Evaluator(EntityLinker)

Initialization Parameters

entity_type : str One of Occupation, Skill, or Qualification. Determines which evaluation set is used.
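The Evaluator is described as computing information retrieval metrics over queries, a corpus, and relevant documents. As a rough sketch of what such metrics look like, the snippet below computes recall@k and mean reciprocal rank on made-up data. The function names, identifiers, and data are hypothetical and do not come from the repository.

```python
def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant documents found in the top-k results of one query."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked, relevant, k):
    """1 / rank of the first relevant document in the top-k results, else 0."""
    for rank, doc_id in enumerate(ranked[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(results, relevant_docs, k):
    """Average both metrics over all queries that have relevance judgments."""
    queries = list(relevant_docs)
    recall = sum(recall_at_k(results.get(q, []), relevant_docs[q], k) for q in queries)
    mrr = sum(reciprocal_rank(results.get(q, []), relevant_docs[q], k) for q in queries)
    return {"recall@k": recall / len(queries), "mrr@k": mrr / len(queries)}


# Hypothetical judgments: query id -> set of relevant taxonomy entries.
relevant_docs = {
    "q1": {"occ_chef"},
    "q2": {"occ_nurse", "occ_midwife"},
}
# Hypothetical ranked output of a linker for each query.
results = {
    "q1": ["occ_baker", "occ_chef", "occ_waiter"],
    "q2": ["occ_nurse", "occ_doctor", "occ_midwife"],
}

metrics = evaluate(results, relevant_docs, k=2)
print(metrics)
```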

util/transformersCRF.py

class CRF(nn.Module)

A class that creates a linear-chain Conditional Random Field (CRF) model.
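Decoding in a linear-chain CRF is typically done with the Viterbi algorithm, which finds the highest-scoring tag sequence given per-token emission scores and tag-to-tag transition scores. The sketch below is a plain Python illustration with made-up scores and a hypothetical viterbi_decode helper; it is not the code of the CRF class, which is an nn.Module.

```python
def viterbi_decode(emissions, transitions, start, end):
    """Highest-scoring tag sequence under a linear-chain CRF.

    emissions[t][j]   : score of tag j at position t
    transitions[i][j] : score of moving from tag i to tag j
    start[j], end[j]  : scores for starting / ending in tag j
    """
    num_tags = len(start)
    score = [start[j] + emissions[0][j] for j in range(num_tags)]
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_tags):
            # Best previous tag for landing on tag j at position t.
            best_prev = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j] + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    final = [score[j] + end[j] for j in range(num_tags)]
    best_last = max(range(num_tags), key=lambda j: final[j])
    # Follow the backpointers from the last position to the first.
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, final[best_last]


tags = ["O", "B", "I"]
# Made-up scores; the O -> I transition and starting in I are heavily penalized.
emissions = [[2.0, 1.5, 0.0], [0.0, 1.0, 1.5], [0.5, 0.0, 1.0]]
transitions = [
    [0.5, 0.0, -10.0],  # from O
    [0.0, -2.0, 1.0],   # from B
    [0.0, -1.0, 0.5],   # from I
]
start = [0.0, 0.0, -10.0]
end = [0.0, 0.0, 0.0]

path, best_score = viterbi_decode(emissions, transitions, start, end)
greedy = [max(range(len(tags)), key=lambda j: row[j]) for row in emissions]
print([tags[i] for i in greedy])  # per-token argmax, ignores transitions
print([tags[i] for i in path], best_score)
```

In this toy case the per-token argmax yields O, I, I, which starts an entity with an I tag, while Viterbi decoding returns B, I, I because the transition scores penalize moving from O to I.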

class AutoModelForCrfPretrainedConfig(PretrainedConfig)

class AutoModelCrfForNer(PreTrainedModel)

model_type: str Possible options include BertCrfForNer, RobertaCrfForNer and DebertaCrfForNer.

class BERT_CRF_Config(PretrainedConfig)

Custom class used for configuring BERT for CRF.

class BertCrfForNer(PreTrainedModel)

Forward Parameters

special_tokens_mask : default=None We use this option from HuggingFace as a small hack to implement the special_mask needed for CRF.
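The idea of masking special tokens can be sketched as follows. This assumes the HuggingFace tokenizer convention, in which special_tokens_mask holds 1 for special tokens (such as [CLS], [SEP], and padding) and 0 for ordinary tokens. How the CRF models consume the mask internally is not shown here, and the helper names, token list, and tag names are hypothetical.

```python
SPECIAL_TOKENS = {"[CLS]", "[SEP]", "[PAD]"}


def build_special_tokens_mask(tokens):
    """1 marks a special token, 0 marks an ordinary token."""
    return [1 if tok in SPECIAL_TOKENS else 0 for tok in tokens]


def keep_real_tokens(values, mask):
    """Drop the entries of `values` that sit at special-token positions."""
    return [value for value, flag in zip(values, mask) if flag == 0]


tokens = ["[CLS]", "head", "chef", "wanted", "[SEP]", "[PAD]"]
# Hypothetical tag names, one per token.
labels = ["O", "B-Occupation", "I-Occupation", "O", "O", "O"]

mask = build_special_tokens_mask(tokens)
real_tokens = keep_real_tokens(tokens, mask)
real_labels = keep_real_tokens(labels, mask)
print(mask)
print(list(zip(real_tokens, real_labels)))
```

Only the positions with a 0 in the mask would take part in sequence scoring, so the special tokens do not influence the tag transitions.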

class ROBERTA_CRF_Config(PretrainedConfig)

Custom class used for configuring RoBERTa for CRF.

class RobertaCrfForNer(PreTrainedModel)

Forward Parameters

special_tokens_mask : default=None We use this option from HuggingFace as a small hack to implement the special_mask needed for CRF.

class DEBERTA_CRF_Config(PretrainedConfig)

Custom class used for configuring DeBERTa for CRF.

class DebertaCrfForNer(PreTrainedModel)

Forward Parameters

special_tokens_mask : default=None We use this option from HuggingFace as a small hack to implement the special_mask needed for CRF.

util/utilfunctions.py

class Config

class CPU_Unpickler

A class that loads the tensors in the CPU.
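A common way to control how pickled objects are rebuilt is to subclass pickle.Unpickler and override find_class, so that a callable referenced in the pickle is replaced at load time. For tensors, the replacement would typically be a loader that calls torch.load with map_location='cpu'. Whether CPU_Unpickler is written exactly this way is an assumption. The sketch below shows the same hook using only the standard library: the RedirectingUnpickler class is hypothetical, and Decimal values stand in for objects that should be rebuilt in a different form.

```python
import io
import pickle
from decimal import Decimal


class RedirectingUnpickler(pickle.Unpickler):
    """Unpickler that swaps selected globals for replacements while loading."""

    def __init__(self, file, redirects):
        super().__init__(file)
        # Maps (module, name) as stored in the pickle to a replacement callable.
        self._redirects = redirects

    def find_class(self, module, name):
        if (module, name) in self._redirects:
            return self._redirects[(module, name)]
        return super().find_class(module, name)


payload = pickle.dumps([Decimal("1.5"), Decimal("2")])

# Default behavior: the values come back as Decimal objects.
plain = pickle.loads(payload)

# Redirected: every reference to decimal.Decimal is rebuilt with float instead.
redirected = RedirectingUnpickler(
    io.BytesIO(payload), {("decimal", "Decimal"): float}
).load()

print(plain)
print(redirected)
```

The data on disk is unchanged in both cases; only the object that reconstructs it differs, which is what makes this hook suitable for loading GPU-saved tensors on a machine without a GPU.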

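The classes above are further described as follows.

class Evaluator(EntityLinker): Evaluator class that inherits from EntityLinker. It computes the queries, corpus, inverted corpus and relevant docs for the InformationRetrievalEvaluator, performs entity linking and computes the information retrieval metrics.

class CRF(nn.Module): Implemented from an external reference, linked in the original page.

class AutoModelForCrfPretrainedConfig(PretrainedConfig): Configuration class that inherits from the HuggingFace PretrainedConfig class.

class AutoModelCrfForNer(PreTrainedModel): A general class that inherits from the HuggingFace PreTrainedModel class. The model_type is detected automatically.

class BertCrfForNer(PreTrainedModel): BERT-based CRF model that inherits from the HuggingFace PreTrainedModel class. Same as PreTrainedModel. Its forward parameters are the same as those of PreTrainedModel, except for special_tokens_mask.

class RobertaCrfForNer(PreTrainedModel): RoBERTa-based CRF model that inherits from the HuggingFace PreTrainedModel class. Same as PreTrainedModel. Its forward parameters are the same as those of PreTrainedModel, except for special_tokens_mask.

class DebertaCrfForNer(PreTrainedModel): DeBERTa-based CRF model that inherits from the HuggingFace PreTrainedModel class. Same as PreTrainedModel. Its forward parameters are the same as those of PreTrainedModel, except for special_tokens_mask.

class Config: Configuration class for the training hyperparameters.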