scandeval.benchmark_modules.base
source module scandeval.benchmark_modules.base
Abstract benchmark module class that the model classes inherit from.
Classes
- BenchmarkModule — Abstract class for a benchmark module.
source class BenchmarkModule(model_config: ModelConfig, dataset_config: DatasetConfig, benchmark_config: BenchmarkConfig)
Bases: ABC
Abstract class for a benchmark module.
Initialise the benchmark module.
Attributes
- model_config — The model configuration.
- dataset_config — The dataset configuration.
- benchmark_config — The benchmark configuration.
- buffer : dict[str, t.Any] — A buffer to store temporary data.
- generative_type : GenerativeType | None — The generative type of the model, or None if the model is not generative.
- data_collator : c.Callable[[list[t.Any]], dict[str, t.Any]] — The data collator used to prepare samples during finetuning.
- compute_metrics : ComputeMetricsFunction — The function used to compute the metrics.
- extract_labels_from_generation : ExtractLabelsFunction — The function used to extract the labels from the generated output.
- trainer_class : t.Type[Trainer] — The Trainer class to use for finetuning.
Parameters
- model_config : ModelConfig — The model configuration.
- dataset_config : DatasetConfig — The dataset configuration.
- benchmark_config : BenchmarkConfig — The benchmark configuration.
Methods
- get_pytorch_module — Get the underlying PyTorch module.
- get_tokenizer — Get the underlying tokenizer.
- num_params — The number of parameters in the model.
- vocab_size — The vocabulary size of the model.
- model_max_length — The maximum sequence length of the model, in tokens.
- prepare_datasets — Prepare the datasets for the model.
- prepare_dataset — Prepare the dataset for the model.
- generate — Generate outputs from the model.
- model_exists — Check if a model exists.
- get_model_config — Fetch the model configuration.
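To make the shape of the interface concrete, below is a minimal, hypothetical sketch of a subclass wrapping a Hugging Face encoder. The class name, the use of `AutoModelForSequenceClassification`, and the `model_config.model_id` attribute access are assumptions of the sketch, and a real subclass must also implement the remaining abstract members documented below (data collator, metrics, dataset preparation, and so on).

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

from scandeval.benchmark_modules.base import BenchmarkModule


class ToyEncoderModule(BenchmarkModule):
    """Hypothetical benchmark module wrapping a Hugging Face encoder."""

    def __init__(self, model_config, dataset_config, benchmark_config) -> None:
        super().__init__(model_config, dataset_config, benchmark_config)
        # Load the wrapped model and tokenizer once, up front. The `model_id`
        # attribute on the model configuration is assumed for this sketch.
        self._model = AutoModelForSequenceClassification.from_pretrained(
            model_config.model_id
        )
        self._tokenizer = AutoTokenizer.from_pretrained(model_config.model_id)

    def get_pytorch_module(self):
        return self._model

    def get_tokenizer(self):
        return self._tokenizer

    def num_params(self) -> int:
        # Total number of elements across all parameter tensors.
        return sum(p.numel() for p in self._model.parameters())

    def vocab_size(self) -> int:
        return len(self._tokenizer)

    def model_max_length(self) -> int:
        return self._tokenizer.model_max_length

    # The remaining abstract members (data_collator, compute_metrics,
    # extract_labels_from_generation, prepare_dataset, generate, model_exists,
    # get_model_config, ...) are omitted here for brevity.
```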
source method BenchmarkModule.get_pytorch_module() → nn.Module
Get the underlying PyTorch module.
Returns
- nn.Module — The PyTorch module.
Raises
- NotImplementedError
source method BenchmarkModule.get_tokenizer() → PreTrainedTokenizer
Get the underlying tokenizer.
Returns
- PreTrainedTokenizer — The tokenizer.
Raises
- NotImplementedError
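For example, given an already-constructed concrete module instance (called `module` below, purely for illustration), the two accessors can be combined to inspect the wrapped model:

```python
# `module` is assumed to be an instance of a concrete BenchmarkModule subclass.
pytorch_module = module.get_pytorch_module()
tokenizer = module.get_tokenizer()

print(type(pytorch_module).__name__)         # class of the underlying nn.Module
print(tokenizer("Hello, world!").input_ids)  # token IDs for a sample string
```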
source method BenchmarkModule.num_params() → int
The number of parameters in the model.
Returns
- int — The number of parameters in the model.
source property BenchmarkModule.generative_type: GenerativeType | None
Get the generative type of the model.
Returns
- GenerativeType | None — The generative type of the model, or None if the model is not generative.
source method BenchmarkModule.vocab_size() → int
The vocabulary size of the model.
Returns
- int — The vocabulary size of the model.
source method BenchmarkModule.model_max_length() → int
The maximum sequence length of the model, in tokens.
Returns
- int — The maximum sequence length of the model, in tokens.
source property BenchmarkModule.data_collator: c.Callable[[list[t.Any]], dict[str, t.Any]]
The data collator used to prepare samples during finetuning.
Returns
- c.Callable[[list[t.Any]], dict[str, t.Any]] — The data collator.
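The concrete collator depends on the task, but any callable with this contract works: it takes a list of samples and returns a single batch dictionary. As an illustration (using a padding collator from transformers and an example checkpoint name), that contract looks like this:

```python
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")  # example checkpoint
collator = DataCollatorWithPadding(tokenizer=tokenizer)

# A list of tokenised samples goes in; a single padded batch dict comes out.
batch = collator([
    {"input_ids": [101, 7592, 102], "attention_mask": [1, 1, 1]},
    {"input_ids": [101, 7592, 2088, 999, 102], "attention_mask": [1, 1, 1, 1, 1]},
])
print(batch["input_ids"].shape)  # both samples padded to the same length
```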
source property BenchmarkModule.compute_metrics: ComputeMetricsFunction
The function used to compute the metrics.
Returns
- ComputeMetricsFunction — The function used to compute the metrics.
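The exact ComputeMetricsFunction signature is defined elsewhere in the package; purely to illustrate the role such a function plays, a sketch that turns model outputs and gold labels into a dictionary of named scores could look like this (the `(predictions, labels)` tuple input is an assumption):

```python
import numpy as np


def compute_accuracy(
    outputs_and_labels: tuple[np.ndarray, np.ndarray],
) -> dict[str, float]:
    """Hypothetical metric function: accuracy from logits or class indices."""
    predictions, labels = outputs_and_labels
    if predictions.ndim == 2:
        # Logits of shape (batch, num_labels): take the argmax per sample.
        predictions = predictions.argmax(axis=-1)
    return {"accuracy": float((predictions == labels).mean())}
```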
source property BenchmarkModule.extract_labels_from_generation: ExtractLabelsFunction
The function used to extract the labels from the generated output.
Returns
- ExtractLabelsFunction — The function used to extract the labels from the generated output.
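The exact ExtractLabelsFunction signature is likewise defined elsewhere; conceptually it maps raw generated text back to task labels. A sketch for a hypothetical two-label sentiment task, assuming the input is a batch of generated strings:

```python
def extract_sentiment_labels(generated_texts: list[str]) -> list[str]:
    """Map free-form generations to the canonical labels of a toy task."""
    labels = []
    for text in generated_texts:
        first_word = text.strip().split()[0].lower() if text.strip() else ""
        labels.append("positive" if first_word.startswith("pos") else "negative")
    return labels


print(extract_sentiment_labels(["Positive.", "negative, I think", ""]))
# ['positive', 'negative', 'negative']
```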
source property BenchmarkModule.trainer_class: t.Type[Trainer]
The Trainer class to use for finetuning.
Returns
- t.Type[Trainer] — The Trainer class.
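A subclass that needs custom finetuning behaviour can override this property to hand back its own Trainer subclass. A hypothetical fragment (class names invented, other abstract members omitted):

```python
from transformers import Trainer

from scandeval.benchmark_modules.base import BenchmarkModule


class WeightedLossTrainer(Trainer):
    """Hypothetical Trainer subclass that could, e.g., reweight the loss."""


class MyFinetuningModule(BenchmarkModule):
    # Remaining abstract members omitted for brevity.

    @property
    def trainer_class(self) -> type[Trainer]:
        # Hand the finetuning loop this custom Trainer instead of the default.
        return WeightedLossTrainer
```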
source method BenchmarkModule.prepare_datasets(datasets: list[DatasetDict], task: Task) → list[DatasetDict]
Prepare the datasets for the model.
This includes things like tokenisation.
Parameters
- datasets : list[DatasetDict] — The datasets to prepare.
- task : Task — The task to prepare the datasets for.
Returns
- list[DatasetDict] — The prepared datasets.
source method BenchmarkModule.prepare_dataset(dataset: DatasetDict, task: Task, itr_idx: int) → DatasetDict
Prepare the dataset for the model.
This includes things like tokenisation.
Parameters
- dataset : DatasetDict — The dataset to prepare.
- task : Task — The task to prepare the dataset for.
- itr_idx : int — The index of the dataset in the iterator.
Returns
- DatasetDict — The prepared dataset.
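As an illustration of what this hook typically does, a text-classification module might tokenise the text column of every split here. The `self._tokenizer` attribute, the "text" column name, and the class name are assumptions of the sketch:

```python
from datasets import DatasetDict

from scandeval.benchmark_modules.base import BenchmarkModule


class MyTokenisingModule(BenchmarkModule):
    # Remaining abstract members omitted for brevity.

    def prepare_dataset(self, dataset: DatasetDict, task, itr_idx: int) -> DatasetDict:
        def tokenise(examples: dict) -> dict:
            # Tokenise and truncate to the model's maximum sequence length.
            return self._tokenizer(
                examples["text"],
                truncation=True,
                max_length=self.model_max_length(),
            )

        # DatasetDict.map applies the function to every split (train/val/test).
        return dataset.map(tokenise, batched=True)
```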
source method BenchmarkModule.generate(inputs: dict) → GenerativeModelOutput
Generate outputs from the model.
Parameters
- inputs : dict — A batch of inputs to pass through the model.
Returns
- GenerativeModelOutput — The generated model outputs.
Raises
- NotImplementedError
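What `generate` does depends entirely on the backend (a local transformers model, a hosted API, vLLM, and so on). Purely as a shape illustration for a local model, and assuming that `GenerativeModelOutput` can be constructed from the decoded completion strings (its exact fields and import path live elsewhere in the package), a sketch could look like this:

```python
import torch

from scandeval.benchmark_modules.base import BenchmarkModule
# GenerativeModelOutput is assumed to be importable from the package's data
# models; the constructor argument used below is an assumption of this sketch.


class MyGenerativeModule(BenchmarkModule):
    # Remaining abstract members omitted for brevity.

    def generate(self, inputs: dict) -> "GenerativeModelOutput":
        with torch.inference_mode():
            generated_ids = self._model.generate(
                input_ids=inputs["input_ids"],
                attention_mask=inputs["attention_mask"],
                max_new_tokens=32,
            )
        # Decode only the newly generated tokens, not the prompt.
        completions = self._tokenizer.batch_decode(
            generated_ids[:, inputs["input_ids"].shape[1]:],
            skip_special_tokens=True,
        )
        return GenerativeModelOutput(sequences=completions)
```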
source classmethod BenchmarkModule.model_exists(model_id: str, benchmark_config: BenchmarkConfig) → bool | NeedsExtraInstalled | NeedsEnvironmentVariable
Check if a model exists.
Parameters
- model_id : str — The model ID.
- benchmark_config : BenchmarkConfig — The benchmark configuration.
Returns
- bool | NeedsExtraInstalled | NeedsEnvironmentVariable — Whether the model exists, or an error describing why we cannot check whether the model exists.
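Because the return type is a union, callers need to distinguish the three outcomes. A hypothetical usage, assuming a concrete subclass `MyModule` and an existing `benchmark_config`:

```python
result = MyModule.model_exists(
    model_id="organisation/some-model", benchmark_config=benchmark_config
)
if result is True:
    print("The model exists and can be handled by this module.")
elif result is False:
    print("The model does not exist (for this module).")
else:
    # A NeedsExtraInstalled or NeedsEnvironmentVariable value explains what is
    # missing (an optional dependency or an environment variable) before the
    # existence check can even be performed.
    print(result)
```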
source classmethod BenchmarkModule.get_model_config(model_id: str, benchmark_config: BenchmarkConfig) → ModelConfig
Fetch the model configuration.
Parameters
- model_id : str — The model ID.
- benchmark_config : BenchmarkConfig — The benchmark configuration.
Returns
- ModelConfig — The model configuration.
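The usual flow is to resolve the model configuration first and then construct the module with it. A hypothetical end-to-end usage, assuming a concrete subclass `MyModule` and pre-built `dataset_config` and `benchmark_config` objects:

```python
model_config = MyModule.get_model_config(
    model_id="organisation/some-model", benchmark_config=benchmark_config
)
module = MyModule(
    model_config=model_config,
    dataset_config=dataset_config,
    benchmark_config=benchmark_config,
)
print(f"{module.num_params():,} parameters")
```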