scandeval.benchmark_modules.vllm
source module scandeval.benchmark_modules.vllm
Generative models using the vLLM inference framework.
Classes
- VLLMModel — A generative model using the vLLM inference framework.
Functions
- load_model_and_tokenizer — Load the model and tokenizer.
- load_tokenizer — Load the tokenizer.
- clear_vllm — Clear the GPU memory used by the vLLM model, enabling re-initialisation.
- get_end_of_reasoning_token_id — Get the end of reasoning token ID for a generative model.
source class VLLMModel(model_config: ModelConfig, dataset_config: DatasetConfig, benchmark_config: BenchmarkConfig)
Bases: HuggingFaceEncoderModel
A generative model using the vLLM inference framework.
Initialise the vLLM model.
Parameters
- model_config : ModelConfig — The model configuration.
- dataset_config : DatasetConfig — The dataset configuration.
- benchmark_config : BenchmarkConfig — The benchmark configuration.
Attributes
- generative_type : GenerativeType | None — The generative type of the model.
- data_collator : c.Callable[[list[t.Any]], dict[str, t.Any]] — The data collator used to prepare samples during finetuning.
- compute_metrics : ComputeMetricsFunction — The function used to compute the metrics.
- extract_labels_from_generation : ExtractLabelsFunction — The function used to extract the labels from the generated output.
- trainer_class : t.Type[Trainer] — The Trainer class to use for finetuning.
Methods
- prepare_dataset — Prepare the dataset for the model.
- generate — Generate outputs from the model.
- model_exists — Check if a model exists.
- get_model_config — Fetch the model configuration.
source property VLLMModel.generative_type: GenerativeType | None
Get the generative type of the model.
Returns
- GenerativeType | None — The generative type of the model, or None if it has not been set yet.
source property VLLMModel.extract_labels_from_generation: ExtractLabelsFunction
The function used to extract the labels from the generated output.
Returns
- ExtractLabelsFunction — The function used to extract the labels from the generated output.
source method VLLMModel.prepare_dataset(dataset: DatasetDict, task: Task, itr_idx: int) → DatasetDict
Prepare the dataset for the model.
This includes things like tokenisation.
Parameters
- dataset : DatasetDict — The dataset to prepare.
- task : Task — The task to prepare the dataset for.
- itr_idx : int — The index of the dataset in the iterator.
Returns
- DatasetDict — The prepared dataset.
source method VLLMModel.generate(inputs: dict) → GenerativeModelOutput
Generate outputs from the model.
Parameters
- inputs : dict — A batch of inputs to pass through the model.
Returns
- GenerativeModelOutput — The generated model outputs.
Raises
source classmethod VLLMModel.model_exists(model_id: str, benchmark_config: BenchmarkConfig) → bool | NeedsExtraInstalled | NeedsEnvironmentVariable
Check if a model exists.
Parameters
- model_id : str — The model ID.
- benchmark_config : BenchmarkConfig — The benchmark configuration.
Returns
- bool | NeedsExtraInstalled | NeedsEnvironmentVariable — Whether the model exists, or an error describing why we cannot check whether the model exists.
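A caller can branch on this tri-state result: a plain boolean answers the question directly, while a sentinel object explains why the check could not be performed. The sketch below uses minimal stand-in classes for the two sentinels (the real types live in ScandEval and may carry different fields), so it illustrates only the dispatch pattern, not the actual API.

```python
# Stand-ins for ScandEval's sentinel types; the field names are assumptions.
class NeedsExtraInstalled:
    def __init__(self, extra: str) -> None:
        self.extra = extra

class NeedsEnvironmentVariable:
    def __init__(self, env_var: str) -> None:
        self.env_var = env_var

def describe_model_check(result) -> str:
    """Turn a model_exists-style result into a human-readable decision."""
    if result is True:
        return "model found: proceed with benchmarking"
    if result is False:
        return "model not found: skip"
    if isinstance(result, NeedsExtraInstalled):
        return f"cannot check: install the '{result.extra}' extra"
    if isinstance(result, NeedsEnvironmentVariable):
        return f"cannot check: set the {result.env_var} environment variable"
    raise TypeError(f"unexpected result: {result!r}")

print(describe_model_check(True))
print(describe_model_check(NeedsExtraInstalled("generative")))
```

Checking `is True` / `is False` before the `isinstance` tests keeps the boolean fast path unambiguous even though `bool` instances would not match the sentinel classes anyway.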
source classmethod VLLMModel.get_model_config(model_id: str, benchmark_config: BenchmarkConfig) → ModelConfig
Fetch the model configuration.
Parameters
- model_id : str — The model ID.
- benchmark_config : BenchmarkConfig — The benchmark configuration.
Returns
- ModelConfig — The model configuration.
Raises
source property VLLMModel.data_collator: c.Callable[[list[t.Any]], dict[str, t.Any]]
The data collator used to prepare samples during finetuning.
Returns
- c.Callable[[list[t.Any]], dict[str, t.Any]] — The data collator.
source property VLLMModel.trainer_class: t.Type[Trainer]
The Trainer class to use for finetuning.
Returns
- t.Type[Trainer] — The Trainer class.
source load_model_and_tokenizer(model_config: ModelConfig, benchmark_config: BenchmarkConfig, output_scores: bool) → tuple[LLM, PreTrainedTokenizer]
Load the model and tokenizer.
Parameters
- model_config : ModelConfig — The model configuration.
- benchmark_config : BenchmarkConfig — The benchmark configuration.
- output_scores : bool — Whether to output scores.
Returns
- tuple[LLM, PreTrainedTokenizer] — The loaded model and tokenizer.
Raises
source load_tokenizer(model_id: str, revision: str, adapter_base_model_id: str | None, trust_remote_code: bool, model_max_length: int, model_cache_dir: str, token: str | bool) → PreTrainedTokenizer
Load the tokenizer.
Parameters
- model_id : str — The model identifier.
- revision : str — The revision of the model.
- adapter_base_model_id : str | None — The base model ID for the adapter model. Can be None if the model is not an adapter model.
- trust_remote_code : bool — Whether to trust remote code.
- model_max_length : int — The maximum sequence length of the model.
- model_cache_dir : str — The cache directory for the model.
- token : str | bool — The Hugging Face API token.
Returns
- PreTrainedTokenizer — The loaded tokenizer.
Raises
source clear_vllm() → None
Clear the GPU memory used by the vLLM model, enabling re-initialisation.
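The general pattern behind such a cleanup helper can be sketched as follows: drop every Python reference to the loaded engine, force a garbage-collection pass, and ask the GPU runtime to release its cached memory. This is an illustrative sketch only, not ScandEval's implementation (the real clear_vllm may additionally tear down vLLM-specific state), and the torch call is guarded so the sketch runs in CPU-only environments.

```python
import gc

def clear_engine(holder: dict) -> None:
    """Drop references to a loaded engine and reclaim its memory.

    Simplified sketch of the clear-and-reinitialise pattern used by
    helpers like clear_vllm.
    """
    holder.clear()  # drop the last Python references to the engine
    gc.collect()    # collect the now-unreachable objects
    try:
        import torch
        if torch.cuda.is_available():
            # release cached GPU blocks back to the driver
            torch.cuda.empty_cache()
    except ImportError:
        pass  # no torch available: nothing GPU-side to free

holder = {"model": object()}
clear_engine(holder)
print(holder)  # → {}
```

Keeping the engine inside a mutable container (here a dict) makes it possible to drop the reference from inside the helper; a bare local variable in the caller would keep the object alive.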
source get_end_of_reasoning_token_id(model: LLM, tokenizer: PreTrainedTokenizer) → int | None
Get the end of reasoning token ID for a generative model.
This assumes that the end of reasoning is marked by a closing tag token such as </think>.
Parameters
- model : LLM — The vLLM model.
- tokenizer : PreTrainedTokenizer — The tokenizer.
Returns
- int | None — The end of reasoning token ID, or None if it could not be found.