Real-time UI
The function calling framework described on the previous page makes it possible to create back-end applications that can be controlled by LLMs, executing handlers either in the current context (via controller modules) or over HTTP via RPC modules. This article describes one use case of this feature: building a real-time UI that can be driven by a text chat or a voice interface, updating UI components automatically while requiring minimal implementation effort for the components themselves. This is a safe alternative to “AI browsers”: the AI cannot execute arbitrary code in the browser, only invoke pre-defined functions that perform HTTP requests. Those HTTP requests, in turn, can be authorized.
To demonstrate this feature, I’ve created a demo project of a kanban board that you can find in the GitHub repository. The database, implemented with Postgres and Prisma ORM, has only two tables, users and tasks, but it can easily be extended with more tables in a real application. Note that the described project is just a demo and might need additional optimizations and improvements before being used in production.
Here’s how you can install and run it locally:
Clone the repository:
git clone https://github.com/finom/vovk-kanban-demo.git
cd vovk-kanban-demo
Install the dependencies:
yarn
Create a .env file in the root directory and add your OpenAI API key and database connection strings:
OPENAI_API_KEY=change_me
DATABASE_URL="postgresql://postgres:password@localhost:5432/vovk-kanban-demo-db?schema=public"
DATABASE_URL_UNPOOLED="postgresql://postgres:password@localhost:5432/vovk-kanban-demo-db?schema=public"
REDIS_URL=redis://localhost:6379
Run the Docker containers and the development server:
docker-compose up -d && yarn dev
Open http://localhost:3000 in your browser to see the result. The UI should be self-explanatory. Users and tasks can be created through the UI (click “+ Add Team Member”, etc.) or by clicking one of the available AI buttons: text or voice.
Prerequisites
Entity Registry Overview
In this article I’m going to describe an efficient way to synchronise application state with back-end data that can be applied to any application, even one that doesn’t use Vovk.ts at all. The idea is state-library-agnostic and database-agnostic, so you can implement it with any tools you prefer. I’m going to use Postgres as the database, Prisma as the ORM, Zustand as the state management library and, of course, Vovk.ts as the back-end framework. The demo also uses Redis as a database event bus to implement polling functionality, but that’s out of the scope of this article.
Let’s say we have a more or less complex application with multiple components that use the same data coming from the server. As the simplest example imaginable, we have a user profile component that displays the user’s full name in multiple places. Once the data is updated, we want to update all the components that use it.
Update your profile, John Doe!
The most straightforward way to do this is to have a global state that stores the user profile as an object and update it once the data is changed, or pass the object as a prop from a parent component.
export const UserProfile = ({ userProfile }: { userProfile: User }) => {
return <div>{userProfile.fullName}</div>
}
This works perfectly fine up to a certain complexity of the application, but as the application grows and more database entities are added, it becomes harder to manage the state and keep it in sync with the server.
A more efficient way to handle this is to have a normalized state that stores the database data as a dictionary of entities, where the key is the ID of the entity. In every component that uses a database row, we request the related entity from the state by its ID, avoiding passing the entire entity object as a prop or storing it in a global state explicitly.
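As a minimal, framework-free sketch of what “normalized state” means here (the User shape and the normalize helper are illustrative, not part of the demo):

```typescript
// A minimal sketch of entity normalization: turn an array of
// database rows into a dictionary keyed by entity ID.
interface User {
  id: string;
  fullName: string;
}

function normalize(rows: User[]): Record<string, User> {
  const byId: Record<string, User> = {};
  for (const row of rows) {
    byId[row.id] = row;
  }
  return byId;
}

const users = normalize([
  { id: "user-1", fullName: "John Doe" },
  { id: "user-2", fullName: "Jane Doe" },
]);

// Any component can now look up a user by ID instead of
// receiving the whole object as a prop.
console.log(users["user-1"].fullName); // "John Doe"
```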
This way, once the data is requested from the server and processed through a middleware function that updates the entity state, we can easily retrieve the entity object by its ID, and all the components that use this entity will be re-rendered automatically. Let’s call it an “entity registry”.
export const UserProfile = ({ userId }: { userId: User['id'] }) => {
const userProfile = useRegistry(state => state.user[userId]);
return <div>{userProfile.fullName}</div>
}
If all components that use database data follow this approach, we can request the data from the server in any desired way, whether it’s one big initial fetch or smaller incremental updates.
WebRTC-based Voice AI
As setting up the OpenAI Realtime API with function calling is a bit tricky, I’ve borrowed most of the code from the repository created by Cameron King. In short, the code authorizes the WebRTC connection in a separate controller, and then establishes a peer-to-peer connection with the OpenAI Realtime API, sending audio data from the microphone and receiving audio responses. The original hooks, such as useToolsFunctions and useWebRTCAudioSession, are left as is, with only a few minor changes.
As the connection is peer-to-peer, the audio data is sent directly from the browser to OpenAI servers, without passing through our back-end. The functions that are going to be executed are the methods of the RPC modules. In other words, the AI is going to make HTTP requests to our back-end from the browser, supporting the authorization that’s already implemented for normal requests. This makes the execution of the functions almost instant, with only a small delay caused by network latency.
Authentication
The app implements a very basic authentication mechanism with an optional PASSWORD stored in the .env file. Once the user enters the password, a session cookie is created that authorizes the user for further requests, with userId being a hashed version of the password. This makes it possible to invalidate all sessions in production by changing the PASSWORD env variable.
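The core of that idea can be sketched with Node’s built-in crypto module (a simplified illustration; the demo’s actual session code lives in src/lib/session.ts and the deriveUserId name is hypothetical):

```typescript
import { createHash } from "node:crypto";

// Derive a stable userId from the configured password.
// Changing PASSWORD in production changes the hash, which
// effectively invalidates every session derived from the
// old value.
function deriveUserId(password: string): string {
  return createHash("sha256").update(password).digest("hex");
}

const userId = deriveUserId(process.env.PASSWORD ?? "change_me");
console.log(userId); // same password → same userId on every request
```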
The authentication is an oversimplified version of the solution provided in the official Next.js authentication documentation. It implements a /login page with a form that invokes the login server action. The session is created in src/lib/session.ts and the Data Access Layer file is defined in src/lib/dal.ts.
The DAL file, in its turn, exports the verifySession function that is invoked in page.tsx and redirects the user to the login page if the session is invalid.
import { verifySession } from "@/lib/dal";
export default async function Home() {
await verifySession();
  // ...
}
The DAL also exports an isLoggedIn function, created for the sessionGuard decorator to check whether the user is logged in when controller methods are invoked.
The sessionGuard decorator is applied to all controller methods. The typeof req.url !== 'undefined' check is required to distinguish between HTTP requests and direct function invocations.
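The idea behind that check can be sketched framework-free (the wrapper below is a simplified illustration, not the demo’s actual sessionGuard decorator):

```typescript
// Simplified illustration: a handler wrapper that only enforces
// the session check for real HTTP requests. When a controller
// method is called directly as a function (e.g. from another
// module), `req` has no `url`, so the check is skipped.
type MaybeRequest = { url?: string };

function sessionGuard<T>(
  handler: (req: MaybeRequest) => T,
  isLoggedIn: (req: MaybeRequest) => boolean,
) {
  return (req: MaybeRequest): T => {
    // `url` is only defined for incoming HTTP requests
    if (typeof req.url !== "undefined" && !isLoggedIn(req)) {
      throw new Error("Unauthorized");
    }
    return handler(req);
  };
}

// Toy login check for the sketch only
const guarded = sessionGuard(
  () => "ok",
  (req) => req.url?.includes("token=secret") ?? false,
);

console.log(guarded({})); // "ok" — direct invocation, no check
console.log(guarded({ url: "/api?token=secret" })); // "ok" — authorized request
```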
Designing the database
The only requirement we have to fulfill is to make the server return entities that include the entity type (besides the ID) for every entity, so that the front-end entity handler can understand where the entity should be stored in the registry. In our case we have two entity types: user for the User model and task for the Task model. We’re going to create an enum with lower-cased entity names in singular form and add it to each table as a column with a default value (ideally, it would be read-only, but the tooling that we use doesn’t support this).
We’re also going to use prisma-zod-generator to generate Zod schemas from our Prisma models. This will let us define Zod models automatically and make our code much shorter.
Notice the triple-slash comments. As React components and other app logic are going to work with entity IDs, we need to distinguish between IDs of different entity types, implemented as branded types. We’re also using literal types for the entityType columns and defining examples and descriptions for better OpenAPI documentation generation and for LLM function calling.
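The branded-ID idea itself is plain TypeScript; here is a minimal sketch of how such types behave (the Brand helper and getUserName function are illustrative, not the generated code):

```typescript
// Branded (nominal) ID types: both are plain strings at runtime,
// but TypeScript refuses to mix them up at compile time.
type Brand<T, B extends string> = T & { readonly __brand: B };

type UserId = Brand<string, "user">;
type TaskId = Brand<string, "task">;

const userId = "user-1" as UserId;

// Accepts only user IDs, never task IDs.
function getUserName(id: UserId): string {
  return `name-of-${id}`;
}

console.log(getUserName(userId)); // "name-of-user-1"
// getUserName("task-1" as TaskId); // compile-time error: wrong brand
```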
That’s it. Each time you run npx prisma generate, the Zod schemas are generated automatically in the src/prisma/generated folder with all the necessary type specifics.
Setting up the back-end
As we’re going to use function calling, we’re going to apply a workaround described at Zod / Troubleshooting that allows generating JSON schemas from Zod using the draft-07 version, by creating a withZod helper function with createStandardValidation that creates a validation library with Standard Schema.
import { z } from 'zod';
import { createStandardValidation } from 'vovk';
export const withZod = createStandardValidation({
toJSONSchema: (model: z.core.$ZodType) => z.toJSONSchema(model, { target: 'draft-7' }),
});
For additional type safety, let’s create a BaseEntity interface that describes the base fields of all entities in the database.
import { EntityType } from "@prisma/client";
export interface BaseEntity {
id: string;
createdAt: string | Date;
updatedAt: string | Date;
entityType: EntityType;
}
For the sake of shorter code, we also need a reusable constant BASE_FIELDS that will be used to omit the id, entityType, createdAt and updatedAt fields from the Zod models, as well as a BASE_KEYS array that contains the keys of these fields to use with lodash.omit. This will help us build proper create/update Zod models and omit these fields from entity objects when we need to create input objects.
import type { BaseEntity } from "./types";
export const BASE_FIELDS = {
id: true,
entityType: true,
createdAt: true,
updatedAt: true,
} as const satisfies { readonly [key in keyof BaseEntity]: true };
export const BASE_KEYS = Object.keys(BASE_FIELDS) as (keyof BaseEntity)[];
For example, here’s how the UpdateUserSchema can be created by omitting the base fields from the generated UserSchema:
import { UserSchema } from "../../../prisma/generated/schemas";
import { BASE_FIELDS } from "../../../constants";
const UpdateUserSchema = UserSchema.omit(BASE_FIELDS);
The controllers and services are quite self-explanatory: we decorate each method with the @operation decorator and create handlers with the withZod function. They have some additional features that will be described later:
- The @operation decorator accepts x-tool-successMessage and x-tool-errorMessage for most of the endpoints. The values are used for MCP responses, described in the MCP article.
- The “get all” endpoints use the x-tool-disable operation option to exclude the endpoint from being used as a tool by default, so that the AI cannot access it and will use search instead, in order to simulate more realistic scenarios.
- Database requests are invoked using DatabaseService.prisma, where the prisma property is a normal Prisma client instance with extensions. One of the features relevant to this article is that it adds an __isDeleted property to deleted entities when prisma.xxx.delete methods are invoked; this will be explained below.
- Creations and updates are followed by EmbeddingService.generateEntityEmbedding calls, and the search endpoints use the EmbeddingService.vectorSearch function. The EmbeddingService itself won’t be described here as it’s out of the scope of this article (it’s already too big), but you can check the full code in the GitHub repository. In short, it uses OpenAI embeddings and pgvector to store and search embeddings in the database.
- The rest of the features will be described in the polling article.
Here is the code for UserController and UserService directly fetched from the project repository:
The Task’s controller and service are implemented similarly.
Setting up the Entity Registry
As we’ve discussed above, the application state is going to be normalized and store entities in a dictionary by their IDs. The registry is going to have a parse method that accepts any data and extracts entities from it, storing them in the registry. The method is going to be used to process all incoming data from the server, so that all components that use this data are going to be updated automatically.
import { EntityType } from "@prisma/client";
import { create } from "zustand";
import fastDeepEqual from "fast-deep-equal";
import type { BaseEntity } from "./types";
import type { UserType } from "../prisma/generated/schemas/models/User.schema";
import type { TaskType } from "../prisma/generated/schemas/models/Task.schema";
interface Registry {
[EntityType.user]: Record<UserType["id"], UserType>;
[EntityType.task]: Record<TaskType["id"], TaskType>;
parse: (data: unknown) => Partial<{
[key in EntityType]: BaseEntity[];
}>;
}
function getEntitiesFromResponse(
data: unknown,
entities: Partial<{ [key in EntityType]: BaseEntity[] }> = {},
) {
if (Array.isArray(data)) {
data.forEach((item) => getEntitiesFromResponse(item, entities));
} else if (typeof data === "object" && data !== null) {
Object.values(data).forEach((value) =>
getEntitiesFromResponse(value, entities),
);
if ("entityType" in data && "id" in data) {
const entityType = data.entityType as EntityType;
entities[entityType] ??= [];
entities[entityType].push(data as BaseEntity);
}
}
return entities;
}
export const useRegistry = create<Registry>((set, get) => ({
[EntityType.user]: {},
[EntityType.task]: {},
parse: (data) => {
const entities = getEntitiesFromResponse(data);
set((state) => {
const newState: Record<string, unknown> = {};
Object.entries(entities).forEach(([entityType, entityList]) => {
const type = entityType as EntityType;
const descriptors = Object.getOwnPropertyDescriptors(state[type] ?? {});
entityList.forEach((entity) => {
const descriptorValue = descriptors[entity.id]?.value;
const value = { ...descriptorValue, ...entity };
descriptors[entity.id] =
descriptorValue && fastDeepEqual(descriptorValue, value)
? descriptors[entity.id]
: ({
value,
configurable: true,
writable: false,
} satisfies PropertyDescriptor);
descriptors[entity.id].enumerable = !("__isDeleted" in entity);
});
newState[type] = Object.defineProperties({}, descriptors);
});
const resultState = { ...state, ...newState };
return resultState;
});
return entities;
},
}));
The code is small but quite complicated, so let’s break it down:
getEntitiesFromResponse is a recursive function that extracts entities from any data structure, based on the presence of entityType and id properties. Let’s say the server returns the following response:
{
"tasks": [
{
"id": "task-1",
"title": "Task 1",
"entityType": "task",
"user": {
"id": "user-1",
"fullName": "John Doe",
"entityType": "user"
}
},
{
"id": "task-2",
"title": "Task 2",
"entityType": "task",
"user": {
"id": "user-2",
"fullName": "Jane Doe",
"entityType": "user"
}
}
]
}
The function walks through the entire object and extracts all entities, turning them into the following preliminary format without modifying the original database objects:
{
"task": [
{ "id": "task-1", "title": "Task 1", "entityType": "task", "user": { /* stays the same */ } },
{ "id": "task-2", "title": "Task 2", "entityType": "task", "user": { /* stays the same */ } }
],
"user": [
{ "id": "user-1", "fullName": "John Doe", "entityType": "user" },
{ "id": "user-2", "fullName": "Jane Doe", "entityType": "user" }
]
}
Now let’s put our thinking caps on and break down the useRegistry store created by the create function from Zustand.
The first two bits, [EntityType.user]: {} and [EntityType.task]: {}, are the objects where the entities are stored, with entity IDs as keys, represented by the Record<UserType["id"], UserType> and Record<TaskType["id"], TaskType> types.
The third bit is the parse method that accepts any data, extracts entities from it and stores them in the registry. That’s where the magic happens. Instead of simply extending the state with new entities, we’re using Object.getOwnPropertyDescriptors to get the property descriptors of the existing entities. This way we can check if the entity already exists in the state and if it does, we can compare it with the new entity using fast-deep-equal library. If the entities are equal, we don’t update the state, otherwise we create a new property descriptor with the updated entity. This way we can avoid unnecessary re-renders of the components that use this entity.
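The reference-preserving part of that logic can be demonstrated in isolation (a toy example with a hypothetical upsert helper; JSON.stringify stands in for the fast-deep-equal library and assumes stable key order):

```typescript
// Toy illustration of reference-preserving updates: if the
// incoming entity deep-equals the stored one, keep the old
// object reference so selector-based subscribers see "no change"
// and skip the re-render.
type Entity = { id: string; [key: string]: unknown };

function upsert(
  store: Record<string, Entity>,
  incoming: Entity,
): Record<string, Entity> {
  const existing = store[incoming.id];
  const merged = { ...existing, ...incoming };
  if (existing && JSON.stringify(existing) === JSON.stringify(merged)) {
    return store; // unchanged — same reference, no re-render
  }
  return { ...store, [incoming.id]: merged };
}

const before = { "user-1": { id: "user-1", fullName: "John Doe" } };
const same = upsert(before, { id: "user-1", fullName: "John Doe" });
const changed = upsert(before, { id: "user-1", fullName: "Johnny Doe" });

console.log(same === before); // true — identical data, old reference kept
console.log(changed === before); // false — new data, new reference
```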
The __isDeleted property is used to mark entities as deleted without actually removing them from the state, avoiding errors in components that might still reference them. Once __isDeleted is received as part of the entity, the property descriptor is marked as non-enumerable, so it won’t be included in Object.values or Object.keys calls, making the entity effectively invisible to the components.
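The non-enumerable trick can also be shown in isolation (a toy example, separate from the registry code above):

```typescript
// Toy demonstration of hiding a "deleted" entity via a
// non-enumerable property: direct lookup by ID still works,
// but the entity disappears from Object.keys / Object.values.
const registry: Record<string, { id: string }> = {};

Object.defineProperty(registry, "task-1", {
  value: { id: "task-1" },
  enumerable: true,
  configurable: true,
});

Object.defineProperty(registry, "task-2", {
  value: { id: "task-2" },
  enumerable: false, // marked as deleted
  configurable: true,
});

console.log(Object.keys(registry)); // ["task-1"] — the deleted entity is invisible
console.log(registry["task-2"]?.id); // "task-2" — but still reachable by ID
```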
That’s it. Each time data is received from the server, it needs to go through the parse method of the registry, and all components that use the data will be updated automatically. Now we need a way to pass all incoming data from the server through this method automatically with a fetcher.
Setting up the fetcher
The next piece of the puzzle is to create a fetcher function (mentioned above as a “middleware”) that will be used by the application to request data from the server. You can find more details about it in the imports article.
The fetcher implements a transformResponse function that processes all incoming data, passing it to the parse method of the registry. The function also handles both regular JSON data and async iterables (if you use JSONLines responses).
import { useRegistry } from "@/registry";
import { createFetcher } from "vovk";
export const fetcher = createFetcher({
transformResponse: async (data) => {
const state = useRegistry.getState();
// check if data is a JSONLines response
if (
data && typeof data === "object" &&
Symbol.asyncIterator in data &&
"onIterate" in data &&
typeof data.onIterate === "function"
) {
data.onIterate(state.parse); // handle each item in the async iterable
return data;
}
state.parse(data); // parse regular JSON data
return data;
},
});
Declare the fetcher in the config. It will replace the default fetcher imported by the generated client.
// @ts-check
/** @type {import('vovk').VovkConfig} */
const config = {
outputConfig: {
imports: {
// ...
fetcher: "./src/lib/fetcher.ts",
},
}
};
export default config;
That’s it. From now on, each request is going to be processed by the registry’s parse method, and manual response handling is not required anymore. Here is a rough example:
import { useShallow } from "zustand/shallow";
import { useQuery } from "@tanstack/react-query";
import { UserRPC } from "vovk-client";
import { useRegistry } from "@/registry";
import { UserType } from "../../prisma/generated/schemas/models/User.schema";
interface Props {
userIds: UserType["id"][]; // an array of branded user IDs
}
const UsersExample = ({ userIds }: Props) => {
// request the data somewhere in the app
useQuery({
queryKey: UserRPC.getUsers.queryKey(),
queryFn: () => UserRPC.getUsers(),
});
// retrieve users from the registry
const users = useRegistry(
useShallow((state) => userIds.map((id) => state.user[id])),
);
return <div>
<ul>
{users.map((user) => (
<li key={user.id}>{user.fullName}</li>
))}
</ul>
</div>;
};
export default UsersExample;
As you can see, the useQuery invocation doesn’t require reading the data property anymore, as the response is processed by the registry automatically. The invocation of UserRPC.getUsers() or any other RPC method (with useQuery or without) can be placed anywhere in the app, and all components that use user data will be updated automatically.
Final bit: enabling function calling
As Cameron King provided a solid framework for Next.js + OpenAI Realtime API + function calling, the only thing left is to extend or replace the existing tools with the createLLMTools function from Vovk.ts, which turns RPC modules into tools that can be used by any LLM setup.
"use client";
import { useToolsFunctions } from "@/hooks/use-tools";
import useWebRTCAudioSession from "@/hooks/use-webrtc";
import { tools } from "@/lib/tools"; // default tools by Cameron King
import { useEffect, useState } from "react";
import { createLLMTools } from "vovk";
import { TaskRPC, UserRPC } from "vovk-client";
import Floaty from "./Floaty";
const { tools: llmTools } = createLLMTools({
modules: { TaskRPC, UserRPC },
});
const RealTimeDemo = () => {
// State for voice selection
const [voice] = useState<"ash" | "ballad" | "coral" | "sage" | "verse">(
"ash",
);
// WebRTC Audio Session Hook
const {
isSessionActive,
registerFunction,
handleStartStopClick,
currentVolume,
} = useWebRTCAudioSession(voice, [...tools, ...llmTools]);
// Get all tools functions
const toolsFunctions = useToolsFunctions();
useEffect(() => {
// Register all functions by iterating over the object
Object.entries(toolsFunctions).forEach(([name, func]) => {
const functionNames: Record<string, string> = {
timeFunction: "getCurrentTime",
partyFunction: "partyMode",
scrapeWebsite: "scrapeWebsite",
};
registerFunction(functionNames[name], func);
});
llmTools.forEach(({ name, execute }) => {
registerFunction(name, execute);
});
}, [registerFunction, toolsFunctions]);
return (
<div>
<Floaty
isActive={isSessionActive}
volumeLevel={currentVolume}
handleClick={handleStartStopClick}
/>
</div>
);
};
export default RealTimeDemo;
For more details, check the full code of the component at the GitHub repository, as well as the Floaty component.
Deploy the app
As this is just a demo, I’ve used free proprietary tools and services for easy deployment. Vovk.ts isn’t affiliated with any of them, and you can replace them with any other tools you prefer (switching away from Neon would require minor changes to DatabaseService.ts to remove Neon-specific code).
- OpenAI API for natural language processing
- Vercel for hosting and serverless functions
- Neon Vercel integration for Postgres/PGVector database hosting
- Redis Vercel integration for Redis hosting
To deploy the app with the same setup, create a new project at Vercel, link it to your fork of the GitHub repository, and add the above-mentioned integrations.
Add OPENAI_API_KEY to the project environment variables. Other variables, such as DATABASE_URL and REDIS_URL, are created automatically by the integrations. In case of a problem, check the .env.template file for reference.