Supported Models

GenAI Processors supports multiple model backends, enabling flexible deployments from cloud to local. Every model backend is a processor, so its inputs and outputs are handled like those of any other processor in the library: it accepts various content types as input and returns an async iterable of `ProcessorPart` as output. See Processor for more on handling processor inputs and outputs.
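To illustrate the output shape described above, the following sketch consumes an async iterable of parts with `async for`. It does not import the library: `FakePart` and `fake_model` are hypothetical stand-ins for `ProcessorPart` and a model backend, and they are assumed to carry only text.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class FakePart:
    """Hypothetical stand-in for ProcessorPart; it only carries text."""
    text: str


async def fake_model(prompt: str) -> AsyncIterator[FakePart]:
    """Hypothetical stand-in for a model backend: streams the prompt back word by word."""
    for word in prompt.split():
        yield FakePart(text=word + " ")


async def collect_text(parts: AsyncIterator[FakePart]) -> str:
    """Drains an async iterable of parts and joins their text."""
    chunks = []
    async for part in parts:
        chunks.append(part.text)
    return "".join(chunks).strip()


result = asyncio.run(collect_text(fake_model("Hello, world!")))
print(result)
```

With a real backend, the same `async for` loop would presumably receive parts as they are generated rather than all at once.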

Overview

| Backend | Class | Use Case |
| --- | --- | --- |
| Gemini API | `GenaiModel` | Cloud-based, full-featured |
| Ollama | `OllamaModel` | Local inference, many models |
| Transformers | `TransformersModel` | HuggingFace models |

Gemini API (`GenaiModel`)

`GenaiModel` is the primary backend and uses Google's Gemini models. It provides a unified processor interface for the Gemini API, handling text, image, audio, and video inputs.

See genai_model.py for more details.

Take every `model_name` below from the list at https://ai.google.dev/gemini-api/docs/models.

Basic Usage

import asyncio

from genai_processors.core import genai_model

model = genai_model.GenaiModel(model_name="gemini-...", api_key=...)

# Run the async call to completion and print the text of the reply.
print(asyncio.run(model("Hello, world!").text()))

Configuration

You can configure generation parameters, system instructions, and tools:

model = genai_model.GenaiModel(
    model_name="gemini-...",
    api_key=...,
    generate_content_config={
        "temperature": 0.7,
        "top_p": 0.9,
        "max_output_tokens": 1024,
        "tools": [my_tool1, my_tool2],  # Example function calling
        "system_instruction": "You are a helpful assistant.",
    },
)
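The `tools` entry above refers to `my_tool1` and `my_tool2` without defining them. As a rough sketch, and assuming tools can be supplied as plain Python callables with type hints and a docstring (this page does not confirm that), a tool and its configuration dict might look like the following. `get_weather` and `build_config` are hypothetical names introduced only for illustration.

```python
def get_weather(city: str) -> str:
    """Returns a canned weather report for `city` (hypothetical tool)."""
    return f"It is sunny in {city}."


def build_config(tools: list, temperature: float = 0.7) -> dict:
    """Builds a generate_content_config dict with the keys used on this page."""
    return {
        "temperature": temperature,
        "top_p": 0.9,
        "max_output_tokens": 1024,
        "tools": tools,  # Assumption: callables are accepted here.
        "system_instruction": "You are a helpful assistant.",
    }


config = build_config([get_weather])
print(config["tools"][0]("Paris"))
```

The resulting dict would be passed as `generate_content_config=config` when constructing the model.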

Multimodal Input

You can send images, audio, and video:

from genai_processors import content_api

# Image input (image_bytes holds the raw PNG bytes, e.g. read from a file)
image_part = content_api.ProcessorPart(image_bytes, mimetype="image/png")
content = [image_part, "What's in this image?"]

response = model(content)
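The example above assumes `image_bytes` is already in memory. One way to obtain the bytes and a matching MIME type from a file is sketched below using only the standard library. `load_media` is a hypothetical helper, not part of GenAI Processors.

```python
import mimetypes
from pathlib import Path


def load_media(path: str) -> tuple[bytes, str]:
    """Reads a file and infers its MIME type from the file extension."""
    mimetype, _ = mimetypes.guess_type(path)
    if mimetype is None:
        # Refuse to guess rather than send an unlabeled payload.
        raise ValueError(f"Cannot infer MIME type for {path!r}")
    return Path(path).read_bytes(), mimetype
```

The returned pair could then be passed along as `content_api.ProcessorPart(data, mimetype=mimetype)`, mirroring the call above.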

Ollama (`OllamaModel`)

Run models locally with Ollama. This backend supports many open models, including Google's Gemma family.

See ollama_model.py for more details.

Setup

1. Install Ollama: https://ollama.com
2. Pull a model, e.g., `ollama pull gemma3` or `ollama pull llama3`
3. Start Ollama: `ollama serve`
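After the steps above, it can help to confirm that the server is reachable and the model was pulled before constructing a processor. The sketch below queries Ollama's `/api/tags` endpoint with the standard library. The helper name `list_ollama_models` and the default address `http://localhost:11434` are assumptions for illustration.

```python
import json
import urllib.error
import urllib.request


def list_ollama_models(host: str = "http://localhost:11434", timeout: float = 2.0):
    """Returns names of locally pulled models, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply.
        return None
    return [m["name"] for m in payload.get("models", [])]


if __name__ == "__main__":
    models = list_ollama_models()
    if models is None:
        print("Ollama does not appear to be running.")
    else:
        print("Available models:", models)
```

A `None` result usually means `ollama serve` is not running, while an empty list means no model has been pulled yet.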

Basic Usage

from genai_processors.core import ollama_model

model = ollama_model.OllamaModel(model_name="gemma3")
response = model("Hello, Gemma!")

Configuration

model = ollama_model.OllamaModel(
    model_name="gemma3",
    host="http://localhost:11434",  # Ollama server
    generate_content_config={
        "temperature": 0.7,
        "seed": 42,
        "system_instruction": "You are a helpful assistant.",
    },
)

Transformers (`TransformersModel`)

Run models locally using HuggingFace Transformers. This allows access to a vast range of models from the HuggingFace Hub, including Gemma.

See transformers_model.py for more details.

Take the `model_name` for a Gemma model from its page on the HuggingFace Hub.

Basic Usage

from genai_processors.core import transformers_model

# Example with Gemma 2B
model = transformers_model.TransformersModel(
    model_name="google/gemma-..."
)
response = model("Hello from Transformers!")

Configuration

model = transformers_model.TransformersModel(
    model_name="google/gemma-...",
    generate_content_config={
        "temperature": 0.8,
        "max_output_tokens": 512,
        "system_instruction": "You are a helpful assistant.",
    }
)
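The Configuration examples on this page share some keys, such as `temperature` and `system_instruction`. When switching between backends, one option is to keep the shared values in one place and layer backend-specific overrides on top. This is only a sketch: `SHARED_CONFIG` and `config_for` are hypothetical names, and which keys each backend actually honors should be checked against its source file.

```python
from typing import Optional

# Values reused across backends (taken from the examples on this page).
SHARED_CONFIG = {
    "temperature": 0.7,
    "system_instruction": "You are a helpful assistant.",
}


def config_for(overrides: Optional[dict] = None) -> dict:
    """Returns a copy of SHARED_CONFIG with backend-specific overrides applied."""
    return {**SHARED_CONFIG, **(overrides or {})}


gemini_config = config_for({"top_p": 0.9, "max_output_tokens": 1024})
ollama_config = config_for({"seed": 42})
transformers_config = config_for({"temperature": 0.8, "max_output_tokens": 512})
```

Each dict would then be passed as `generate_content_config` to the corresponding model class.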
