
VLLMModelInterface

VLLMInferenceModel

Bases: BaseInferenceModel

VLLM inference model interface. This class extends the BaseInferenceModel to provide specific functionality for VLLM.

Source code in easyroutine/inference/vllm_model_interface.py
class VLLMInferenceModel(BaseInferenceModel):
    """
    VLLM inference model interface.
    This class extends the BaseInferenceModel to provide specific functionality for VLLM.
    """

    def __init__(self, config: BaseInferenceModelConfig):
        super().__init__(config)
        self.model = LLM(model=config.model_name, tensor_parallel_size=config.n_gpus, dtype=config.dtype)

    def convert_chat_messages_to_custom_format(self, chat_messages: List[dict[str, str]]) -> List[dict[str, str]]:
        """
        For now, VLLM is compatible with the chat template format we use.
        """
        return chat_messages

    def chat(self, chat_messages: List[dict[str, str]], use_tqdm=False, **kwargs) -> list:
        """
        Generate a response based on the provided chat messages.

        Arguments:
            chat_messages (List[dict[str, str]]): List of chat messages to process.
            use_tqdm (bool): Whether to display a progress bar during generation.
            **kwargs: Additional parameters for the model.

        Returns:
            list: The generated responses from the model.
        """
        chat_messages = self.convert_chat_messages_to_custom_format(chat_messages)

        # Build sampling parameters from the model config
        sampling_params = SamplingParams(
            temperature=self.config.temperature,
            top_p=self.config.top_p,
            max_tokens=self.config.max_new_tokens
        )

        # Generate response using VLLM
        response = self.model.chat(chat_messages, sampling_params=sampling_params, use_tqdm=use_tqdm) # type: ignore

        return response
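
A minimal usage sketch follows, assuming that BaseInferenceModelConfig exposes the fields the class reads above (model_name, n_gpus, dtype, temperature, top_p, max_new_tokens); the exact constructor signature may differ in your version of easyroutine, and the field values are purely illustrative.

from easyroutine.inference.vllm_model_interface import VLLMInferenceModel, VLLMInferenceModelConfig

# Hypothetical configuration; only the attribute names used by the class above are assumed.
config = VLLMInferenceModelConfig(
    model_name="meta-llama/Llama-3.1-8B-Instruct",
    n_gpus=1,
    dtype="bfloat16",
    temperature=0.7,
    top_p=0.9,
    max_new_tokens=256,
)

model = VLLMInferenceModel(config)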

chat(chat_messages, use_tqdm=False, **kwargs)

Generate a response based on the provided chat messages.

Parameters:

    chat_messages (List[dict[str, str]]): List of chat messages to process. Required.
    use_tqdm (bool): Whether to display a progress bar during generation. Default: False.
    **kwargs: Additional parameters for the model. Default: {}.

Returns:

    list: The generated responses from the model.

Source code in easyroutine/inference/vllm_model_interface.py
def chat(self, chat_messages: List[dict[str, str]], use_tqdm=False, **kwargs) -> list:
    """
    Generate a response based on the provided chat messages.

    Arguments:
        chat_messages (List[dict[str, str]]): List of chat messages to process.
        use_tqdm (bool): Whether to display a progress bar during generation.
        **kwargs: Additional parameters for the model.

    Returns:
        list: The generated responses from the model.
    """
    chat_messages = self.convert_chat_messages_to_custom_format(chat_messages)

    # Build sampling parameters from the model config
    sampling_params = SamplingParams(
        temperature=self.config.temperature,
        top_p=self.config.top_p,
        max_tokens=self.config.max_new_tokens
    )

    # Generate response using VLLM
    response = self.model.chat(chat_messages, sampling_params=sampling_params, use_tqdm=use_tqdm) # type: ignore

    return response
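
Continuing the sketch above, the call returns whatever vLLM's LLM.chat produces: a list of RequestOutput objects, one per conversation, each carrying its generated completions. The message contents below are illustrative.

chat_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize what vLLM does in one sentence."},
]

responses = model.chat(chat_messages, use_tqdm=True)

# Each RequestOutput holds one or more CompletionOutput objects; print the first completion's text.
for request_output in responses:
    print(request_output.outputs[0].text)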

convert_chat_messages_to_custom_format(chat_messages)

For now, VLLM is compatible with the chat template format we use.

Source code in easyroutine/inference/vllm_model_interface.py
def convert_chat_messages_to_custom_format(self, chat_messages: List[dict[str, str]]) -> List[dict[str, str]]:
    """
    For now, VLLM is compatible with the chat template format we use.
    """
    return chat_messages
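
Because this method is a pass-through, the role/content message dicts given to chat come back unchanged, for example:

messages = [{"role": "user", "content": "Hello!"}]
assert model.convert_chat_messages_to_custom_format(messages) == messages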

VLLMInferenceModelConfig dataclass

Bases: BaseInferenceModelConfig

Just a placeholder for now, as we don't have any specific config for VLLM.

Source code in easyroutine/inference/vllm_model_interface.py
@dataclass
class VLLMInferenceModelConfig(BaseInferenceModelConfig):
    """Just a placeholder for now, as we don't have any specific config for VLLM."""