Building AI Studio Applets for Live Agents
You can use Google's AI Studio to build custom web UIs called "Applets", which are ideal for rapidly prototyping interactive AI agents.
AI Studio Applets are web applications (HTML, CSS, and JavaScript/TypeScript) that run inside AI Studio. They are perfect for building interactive demos for agents that require features like live audio streaming from a microphone, camera access, or custom UI elements for displaying results.
GenAI Processors provides live_server.py to simplify building a backend for these applets. It wraps your processor in a WebSocket server, allowing your AI Studio Applet to communicate with it in real time.
The live_server Backend
The live_server module provides a simple way to serve a GenAI Processor over a
WebSocket connection. You typically run this server on your local machine, and
your AI Studio Applet connects to it.
To use it, you need to provide a processor_factory function that creates an
instance of your processor, and then call live_server.run_server:
import asyncio
from typing import Any

from genai_processors import processor
from genai_processors.examples import live_server


def create_my_processor(config: dict[str, Any]) -> processor.Processor:
    # config contains configuration sent by the client applet.
    my_processor = ...  # Your processor initialization
    return my_processor


async def main():
    await live_server.run_server(create_my_processor, port=8765)


if __name__ == '__main__':
    asyncio.run(main())
See the live commentator code for a complete example.
The AI Studio Applet Frontend
An AI Studio Applet consists of three main files:
- index.html: The main HTML structure of your app.
- index.js (or .ts/.tsx): The client-side logic, including WebSocket connection and UI updates.
- metadata.json: Configures app name, description, and permissions (e.g., microphone).
You can create a new applet in AI Studio and paste your code there, or use AI Studio's integration with GitHub to load an applet from a repository.
The client-side JavaScript typically performs these tasks:
- Establish a WebSocket connection to your live_server (e.g., ws://localhost:8765).
- Send user inputs (such as microphone audio or text) to the server.
- Receive outputs from the server and update the UI.
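The same three tasks can be exercised from a small Python test client, which is handy for checking the backend before writing any applet code. This is only a sketch: it assumes the third-party websockets package is installed, that the server accepts a text part shaped like the message in main, and that at least one reply arrives. The names exchange and main are hypothetical helpers, not part of live_server.

```python
import asyncio
import json
from typing import Any


async def exchange(
    ws: Any, outgoing: list[dict[str, Any]], max_replies: int
) -> list[dict[str, Any]]:
    """Send each message as a JSON string, then read up to max_replies replies."""
    for message in outgoing:
        await ws.send(json.dumps(message))
    replies: list[dict[str, Any]] = []
    if max_replies <= 0:
        return replies
    # Stops early if the server closes the connection before max_replies arrive.
    async for raw in ws:
        replies.append(json.loads(raw))
        if len(replies) >= max_replies:
            break
    return replies


async def main() -> None:
    # Third-party dependency (pip install websockets), imported here so that
    # exchange() stays usable without it.
    import websockets

    # Assumed message shape: a text part sent with the "user" role.
    hello = {"part": {"text": "Hello there!"}, "role": "user"}
    async with websockets.connect("ws://localhost:8765") as ws:
        for reply in await exchange(ws, [hello], max_replies=5):
            print(reply)


# With live_server running locally, start the client with: asyncio.run(main())
```

Keeping the send/receive loop in exchange, separate from the connection setup, lets it be tested against any object that offers send and async iteration.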
See the live commentator applet for a full example of an applet.
Communication Protocol
The client (Applet) and server (live_server) communicate by exchanging
JSON-stringified messages over WebSocket. Each message represents a
ProcessorPart object.
Client-to-Server:
The client sends user input, such as text or audio, as ProcessorParts. For
example, to send text:
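A text message can be sketched in Python as follows. This is a minimal sketch, assuming the text goes in a text field of part, by analogy with the inline_data message in this section and the Google GenAI part format. The helper text_message is hypothetical and not part of live_server.

```python
import json


def text_message(text: str, role: str = "user") -> str:
    """Serialize a text part into the JSON string sent over the WebSocket."""
    # Assumption: text is carried under part["text"], mirroring inline_data.
    return json.dumps({"part": {"text": text}, "role": role})


print(text_message("Hello there!"))
# {"part": {"text": "Hello there!"}, "role": "user"}
```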
Audio or image data should be Base64-encoded and sent as inline_data:
{
  "part": {
    "inline_data": {
      "data": "SGVsbG8gV29ybGQ=",
      "mime_type": "audio/l16;rate=24000"
    }
  },
  "role": "user",
  "substream_name": "realtime"
}
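The message above can be produced from raw bytes with a few lines of Python. This sketch uses only the standard library. The helper inline_data_message is a hypothetical name, and its defaults for role and substream_name are copied from the example message rather than from any documented requirement.

```python
import base64
import json


def inline_data_message(
    data: bytes,
    mime_type: str,
    role: str = "user",
    substream_name: str = "realtime",
) -> str:
    """Base64-encode binary data and wrap it in an inline_data message."""
    encoded = base64.b64encode(data).decode("ascii")
    return json.dumps({
        "part": {"inline_data": {"data": encoded, "mime_type": mime_type}},
        "role": role,
        "substream_name": substream_name,
    })


message = json.loads(inline_data_message(b"Hello World", "audio/l16;rate=24000"))
print(message["part"]["inline_data"]["data"])  # SGVsbG8gV29ybGQ=
```

The placeholder payload in the JSON example is the Base64 form of the bytes "Hello World"; real audio chunks would be encoded the same way.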
On connection, the client can also send configuration to the server using the
application/x-config mimetype. This configuration dictionary is passed to the
processor_factory function:
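A configuration message might be built as sketched below. The exact envelope is an assumption: the sketch guesses that the configuration dictionary is JSON-encoded into the text field and that the mimetype is a top-level mimetype key. The helper config_message and the example key are hypothetical, so check the live_server source before relying on this shape.

```python
import json
from typing import Any

CONFIG_MIMETYPE = "application/x-config"


def config_message(config: dict[str, Any]) -> str:
    """Wrap a configuration dictionary in a message tagged as config."""
    return json.dumps({
        # Assumption: the config travels as a JSON string in the text field.
        "part": {"text": json.dumps(config)},
        "role": "user",
        # Assumption: the mimetype is a top-level key of the message.
        "mimetype": CONFIG_MIMETYPE,
    })


# "my_config_key" is a made-up key; use whatever your processor_factory reads.
print(config_message({"my_config_key": "my_value"}))
```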
Server-to-Client:
The server sends ProcessorParts generated by the processor back to the client.
Text, images, and other data types are serialized to JSON. For example:
Images or other binary data are returned with Base64-encoded inline_data.
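On the receiving side, a client can branch on which field the part carries. The sketch below assumes server messages use the same part envelope as client messages, with either a text field or a Base64-encoded inline_data field. The helper decode_part and its fallback mimetypes are hypothetical.

```python
import base64
import json
from typing import Union


def decode_part(raw: str) -> tuple[str, Union[str, bytes]]:
    """Return (mime_type, payload) for one JSON message from the server."""
    part = json.loads(raw).get("part", {})
    if "inline_data" in part:
        blob = part["inline_data"]
        # Binary payloads arrive Base64-encoded and are decoded back to bytes.
        mime_type = blob.get("mime_type", "application/octet-stream")
        return mime_type, base64.b64decode(blob["data"])
    # Anything else is treated as text; a missing text field yields "".
    return "text/plain", part.get("text", "")


print(decode_part('{"part": {"text": "Hello!"}, "role": "model"}'))
# ('text/plain', 'Hello!')
```

An applet would apply the same branching in JavaScript, for example routing text to the transcript and image or audio bytes to the matching UI element.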
By combining AI Studio Applets with live_server, you can rapidly build and
test interactive AI agents with custom user interfaces.
Generating Applets with AI
AI Studio can help generate the applet code for you. To do this, simply describe the desired UI and specify the communication protocol in your prompt.
For example, you can include: "The applet should connect to a WebSocket backend that sends and receives JSON dictionaries representing Google GenAI parts."