Building AI Studio Applets for Live Agents

You can use Google's AI Studio to build custom web UIs called "Applets", which are ideal for rapidly prototyping interactive AI agents.

AI Studio Applets are web applications (HTML, CSS, and JavaScript/TypeScript) that run inside AI Studio. They are well suited to interactive agent demos that need live audio streaming from a microphone, camera access, or custom UI elements for displaying results.

GenAI Processors provides live_server.py to simplify building a backend for these applets. It wraps your processor in a WebSocket server so that your AI Studio Applet can communicate with it in real time.

The live_server Backend

The live_server module provides a simple way to serve a GenAI Processor over a WebSocket connection. You typically run this server on your local machine, and your AI Studio Applet connects to it.

To use it, provide a processor_factory function that creates an instance of your processor, then call live_server.run_server:

import asyncio
from typing import Any
from genai_processors import processor
from genai_processors.examples import live_server

def create_my_processor(config: dict[str, Any]) -> processor.Processor:
  # config contains configuration sent by the client applet.
  my_processor = ... # Your processor initialization
  return my_processor

async def main():
  await live_server.run_server(create_my_processor, port=8765)

if __name__ == '__main__':
  asyncio.run(main())

See the live commentator code for a complete example.

The AI Studio Applet Frontend

An AI Studio Applet consists of three main files:

  • index.html: The main HTML structure of your app.
  • index.js (or .ts/.tsx): The client-side logic, including WebSocket connection and UI updates.
  • metadata.json: Configures app name, description, and permissions (e.g., microphone).

You can create a new applet in AI Studio and paste your code there, or use AI Studio's integration with GitHub to load an applet from a repository.

The client-side JavaScript typically performs these tasks:

  1. Establish a WebSocket connection to your live_server (e.g., ws://localhost:8765).
  2. Send user inputs (such as microphone audio or text) to the server.
  3. Receive outputs from the server and update the UI.

See the live commentator applet for a full example of an applet.

Communication Protocol

The client (Applet) and server (live_server) communicate by exchanging JSON-stringified messages over WebSocket. Each message represents a ProcessorPart object.

Client-to-Server:

The client sends user input, such as text or audio, as ProcessorParts. For example, to send text:

{
  "part": {
    "text": "Hello World"
  },
  "role": "user"
}
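
When exercising the server from a script instead of an applet, the text message above can be built with the standard library alone. The following is a minimal sketch; the helper name make_text_message is hypothetical and not part of genai_processors.

```python
import json


def make_text_message(text: str, role: str = "user") -> str:
    """Serialize a text part into the JSON shape shown above.

    Hypothetical helper for illustration only.
    """
    return json.dumps({"part": {"text": text}, "role": role})


# The resulting string is what a client would send over the WebSocket.
message = make_text_message("Hello World")
print(message)
```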

Audio or image data should be Base64-encoded and sent as inline_data:

{
  "part": {
    "inline_data": {
      "data": "SGVsbG8gV29ybGQ=",
      "mime_type": "audio/l16;rate=24000"
    }
  },
  "role": "user",
  "substream_name": "realtime"
}
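
The Base64 step can be sketched in Python as follows. This assumes raw 16-bit PCM bytes are already available; the helper name make_audio_message and its sample_rate parameter are hypothetical, and the field names are copied from the message above.

```python
import base64
import json


def make_audio_message(pcm_bytes: bytes, sample_rate: int = 24000) -> str:
    """Wrap raw audio bytes in an inline_data message (hypothetical helper)."""
    payload = {
        "part": {
            "inline_data": {
                # JSON cannot carry raw bytes, so encode them as Base64 text.
                "data": base64.b64encode(pcm_bytes).decode("ascii"),
                "mime_type": f"audio/l16;rate={sample_rate}",
            }
        },
        "role": "user",
        "substream_name": "realtime",
    }
    return json.dumps(payload)


# Placeholder bytes stand in for a real audio chunk.
print(make_audio_message(b"Hello World"))
```

The same pattern would apply to image bytes, with an image mime_type in place of the audio one.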

On connection, the client can also send configuration to the server using the application/x-config mimetype. This configuration dictionary is passed to the processor_factory function:

{
  "mimetype": "application/x-config",
  "metadata": {
    "my_setting": "my_value"
  }
}
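
As an illustration of this handshake, the sketch below builds a configuration message and shows one way a receiver could recognize it. Both function names are hypothetical, and treating the metadata field as the configuration dictionary is an assumption based on the example above; live_server may handle this differently internally.

```python
import json
from typing import Any, Optional

CONFIG_MIMETYPE = "application/x-config"


def make_config_message(settings: dict[str, Any]) -> str:
    """Build the configuration message a client could send on connection."""
    return json.dumps({"mimetype": CONFIG_MIMETYPE, "metadata": settings})


def extract_config(raw_message: str) -> Optional[dict[str, Any]]:
    """Return the settings if raw_message is a config message, else None.

    Assumption: the configuration lives in the "metadata" field.
    """
    message = json.loads(raw_message)
    if message.get("mimetype") != CONFIG_MIMETYPE:
        return None
    return message.get("metadata", {})


raw = make_config_message({"my_setting": "my_value"})
print(extract_config(raw))
```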

Server-to-Client:

The server sends ProcessorParts generated by the processor back to the client. Text, images, and other data types are serialized to JSON. For example:

{
  "part": {
    "text": "This is a response from the model."
  },
  "role": "model",
  "metadata": {}
}

Images or other binary data are returned with Base64-encoded inline_data.
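
A receiving client has to distinguish text parts from inline_data parts and decode the latter. The sketch below shows one way to do that in Python; parse_server_message and the shape of the dictionary it returns are hypothetical, and only the two part types described above are handled.

```python
import base64
import json
from typing import Any


def parse_server_message(raw_message: str) -> dict[str, Any]:
    """Decode one server message into role, text, and binary data fields."""
    message = json.loads(raw_message)
    part = message.get("part", {})
    result: dict[str, Any] = {
        "role": message.get("role"),
        "text": None,
        "data": None,
        "mime_type": None,
    }
    if "text" in part:
        result["text"] = part["text"]
    elif "inline_data" in part:
        inline = part["inline_data"]
        # Reverse the Base64 encoding to recover the original bytes.
        result["data"] = base64.b64decode(inline["data"])
        result["mime_type"] = inline.get("mime_type")
    return result


raw = '{"part": {"text": "This is a response from the model."}, "role": "model", "metadata": {}}'
print(parse_server_message(raw)["text"])
```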

By combining AI Studio Applets with live_server, you can rapidly build and test interactive AI agents with custom user interfaces.

Generating Applets with AI

AI Studio can help generate the applet code for you. To do this, simply describe the desired UI and specify the communication protocol in your prompt.

For example, you can include: "The applet should connect to a WebSocket backend that sends and receives JSON dictionaries representing Google GenAI parts."