Seabird: App Discovery & Scripting

This post introduces the concepts behind Seabird, an app framework for building discoverable and scriptable apps. A discoverable app onboards new users by teaching app functionality via an agent; a scriptable app lets experienced users compose app functions together with the agent to get more done, faster.

This is the first post in a multi-part series introducing Seabird. To stay in the loop, sign up at Seabird.dev and I will email you updates. Otherwise, check back here in a few days for the next post.

Motivation

The prototypical app is GUI-driven. If it offers an agent, that agent is usually a floating chatbox in the bottom right corner of the window. The agent supports less than half of the app's functionality. It likely does not know what the user is doing right now in the app.

This agent is more toy than tool. We can do much, much better.

Seabird's motivating axiom is that an in-app agent is useful if and only if its understanding and interface are merged with the GUI. An agent that understands only half of the app's context or functionality is mostly useless. Nothing is more frustrating than asking a question and receiving an unrelated answer, or issuing a command for a simple GUI task and being told the agent cannot do it.

A discoverable app is one that anyone can pick up and use. Each new user walks an app's learning curve to discover how it works, and if the curve is too steep, the user will bounce. An agent can flatten the learning curve by giving the user an outlet for “how do I…” questions and providing interactive tutorials. The more complex an app is, the more it benefits from discovery via natural language requests.

A scriptable app is one where the user can bypass tedious or repetitive work. Contemporary language models can translate a single request into a series of tool calls, each of which might have taken seconds or minutes to configure and run manually. With this automation, the user stays in control of the request without having to click through every step by hand.

In this post I’ll describe how Seabird integrates the in-app agent with the GUI for better onboarding and automation.

Background

Seabird focuses on traditional database-backed apps. The database is defined by a SQL schema, and given a schema we can generate equivalent types in our programming language. This is not new! Every modern ORM already generates code from a SQL schema.

App logic is defined by methods: functions that act on models and other objects. Methods include business logic, auth, validation, database reads and writes, and other side effects. Again, none of this is new: methods exist today across web and mobile apps. We may organize them as controllers, routes, or services, but fundamentally they all define handlers for application logic.

Now let’s add our first new-ish idea.

Tools as the unified client/agent API

We annotate each client-facing method as a tool. Each tool is automatically injected into a registry that will be exposed to both the agent and the client.

impl Game {
    #[tool(display_name = "Start Game", method = "create")] // this macro marks a tool
    async fn start(opponent_id: Uuid, session: Session) -> Self { ... }
}
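To make the registry idea concrete, here is a minimal sketch in Python rather than Rust. It is not Seabird's implementation: `ToolDef`, `REGISTRY`, and the `tool` decorator are hypothetical names, and the decorator stands in for the macro above. It only shows how annotating a method can register it, with its metadata, in a shared lookup table.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ToolDef:
    name: str          # identifier used in tool calls, e.g. "start_game"
    display_name: str  # human-readable label, e.g. "Start Game"
    method: str        # kind of action: "create", "delete", "send", ...
    func: Callable     # the underlying app method


# Hypothetical registry that both the agent and the client would read from.
REGISTRY: Dict[str, ToolDef] = {}


def tool(name: str, display_name: str, method: str):
    """Register the decorated function as a tool; the function is returned unchanged."""
    def decorator(func: Callable) -> Callable:
        if name in REGISTRY:
            raise ValueError(f"duplicate tool name: {name}")
        REGISTRY[name] = ToolDef(name, display_name, method, func)
        return func
    return decorator


@tool(name="start_game", display_name="Start Game", method="create")
def start_game(opponent_id: str, session: dict) -> dict:
    # Placeholder body: a real method would validate input and write to the database.
    return {"player": session["user_id"], "opponent": opponent_id, "turn": "player"}
```

Because registration happens at definition time, adding a new client-facing method is the only step needed to make it visible to every consumer of the registry.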

Each tool definition is inserted into the agent prompt.¹ The agent runs on behalf of a specific user, inheriting all user permissions and restrictions. When the user issues a natural language request to run app logic, the agent translates the request into tool calls. Each tool call is gated by software-defined access controls and risk checks.
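As a rough illustration of turning a tool definition into prompt text, the Python sketch below renders one line per tool from a function signature. The `render_tool` helper and its output format are assumptions, not Seabird's actual prompt format. The sketch also assumes that the `session` parameter is supplied by the server and therefore hidden from the agent, which is how the examples in this post read.

```python
import inspect


def start_game(opponent_id: str, session: dict) -> dict:
    """Start a game against another user."""
    return {"opponent": opponent_id}


def render_tool(func, hidden=("session",)) -> str:
    """Render one prompt line for a tool, omitting server-injected parameters."""
    params = [
        f"{p.name}: {getattr(p.annotation, '__name__', str(p.annotation))}"
        for p in inspect.signature(func).parameters.values()
        if p.name not in hidden
    ]
    doc = inspect.getdoc(func) or ""
    return f"{func.__name__}({', '.join(params)})  # {doc}"


# One line per registered tool, joined into a section of the agent prompt.
prompt_section = "\n".join(render_tool(f) for f in [start_game])
```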

A tool definition also generates a corresponding RPC interface: client code to invoke the tool from a web or native (TS, Swift, Kotlin, etc.) environment. This RPC code replaces API POST routes.²

// auto generated TypeScript client code for tool calls and types
export const startGame = async (opponentId: string): Promise<Game> => { ... };
type Game = { ... };

The end result is a unified interface for GUI-based and agentic tool calls. With a unified interface in place, we can add the following system-wide features:

Log the shared GUI and agent tool call history as context for future agent calls. This lets the user switch between GUI and agent without losing context.
Reflect both GUI and agent tool calls and results directly into GUI components. Each agent tool call implicitly teaches the user how to perform the action manually.

These two features, built atop the shared tool call interface, ensure agent-GUI equality.
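Both features fall out of routing every call through one dispatcher. The Python sketch below is illustrative only: `ToolBus`, `ToolCall`, and the listener mechanism are hypothetical names, not Seabird APIs. Each call is appended to a shared history, which can be formatted as agent context, and is broadcast to listeners, which stand in for GUI components.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class ToolCall:
    source: str            # "gui" or "agent"
    tool: str
    args: Dict[str, Any]
    result: Any


class ToolBus:
    """Hypothetical dispatcher: every call is logged and broadcast, whatever its source."""

    def __init__(self, tools: Dict[str, Callable[..., Any]]) -> None:
        self.tools = tools
        self.history: List[ToolCall] = []
        self.listeners: List[Callable[[ToolCall], None]] = []

    def call(self, source: str, tool: str, args: Dict[str, Any]) -> Any:
        result = self.tools[tool](**args)
        record = ToolCall(source, tool, args, result)
        self.history.append(record)        # shared history for future agent calls
        for listener in self.listeners:    # reflect the call into GUI components
            listener(record)
        return result

    def context(self, limit: int = 20) -> str:
        """Format the most recent calls as text for the next agent prompt."""
        return "\n".join(
            f"[{c.source}] {c.tool}({c.args}) => {c.result}"
            for c in self.history[-limit:]
        )


# Usage: one GUI call and one agent call share the same history and listeners.
names = ["John Morgan", "Ann Lee"]
bus = ToolBus({"search": lambda text: [n for n in names if text in n.lower()]})
rendered: List[str] = []
bus.listeners.append(lambda c: rendered.append(f"{c.tool}:{c.source}"))
bus.call("gui", "search", {"text": "john"})
bus.call("agent", "search", {"text": "ann"})
```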

Discovery

App discovery follows from the agent's complete knowledge of user-facing tools and agent-GUI interop. In the game example above, a new user might ask the agent "How do I start a game against John?". The agent composes the following tool calls:

# first call to agent
search(text="john", types=["User"])
=> [
    User(id="8960CC", name="Jon Russo", p=.99),
    User(id="735166", name="John Morgan", p=.007),
    User(id="E59509", name="John Wilson", p=.001),
    ...10 more
]
# second call to agent
start_game(opponent_id="8960CC")
=> Game(
    id="F5C6B6",
    player=User(...),
    opponent=User(...),
    turn="player",
    started_at="2025-07-07 11:24:00-04:00"
)

The GUI shows each agent tool call as it occurs:

On search, the searchbox automatically renders the search text and shows results as they are returned
On start_game, the app navigates to the game page

The agent both performs the user request and shows the user how to perform the same action manually.³

Scriptability

Seabird's software-gated tools make the app inherently scriptable on behalf of a particular user. A game host might want to upload a CSV of emails and say "use this CSV to create any user that doesn't already exist." Normally, this kind of work requires minutes or hours of manual data entry. The work is perfectly suited for an agent with underlying knowledge of app models and tools, though:

# built in fn to create an iterator from a list-like object
iter(csv)
=> Iterator(id="DFE863")

# built in fn to get next iterator item
next(id="DFE863")
=> ["Bob Daniels", "bob.daniels@gmail.com"]

# SQL query is parsed and validated before it runs (covered in the next post)
search(sql="SELECT COUNT(*) FROM users WHERE email = 'bob.daniels@gmail.com'")
=> 0

create_user(name="Bob Daniels", email="bob.daniels@gmail.com")
=> User(...)

next(id="DFE863")
=> ["Alice Palmer", "alicepalmer1972@hotmail.com"]

search(sql="SELECT COUNT(*) FROM users WHERE email = 'alicepalmer1972@hotmail.com'")
=> 1

# Alice Palmer already exists, so the agent skips to the next CSV row
next(id="DFE863")
...

The agent churns on the user request until the CSV has been fully processed. Every tool call is authenticated and authorized.
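The iterator built-ins in the transcript can be sketched as a server-side store of opaque handles. This Python example is an assumption about how such built-ins might work, not Seabird's code: `IteratorStore`, `open_iter`, `next_row`, and `import_users` are hypothetical names. The driver loop stands in for the agent, and a set of emails stands in for the `users` table.

```python
import csv
import io
import uuid
from typing import Dict, Iterator, List, Optional, Set, Tuple


class IteratorStore:
    """Hypothetical store mapping opaque ids to live server-side iterators."""

    def __init__(self) -> None:
        self._iters: Dict[str, Iterator[List[str]]] = {}

    def open_iter(self, csv_text: str) -> str:
        """Like iter(csv) in the transcript: return a handle, not the data."""
        handle = uuid.uuid4().hex[:6].upper()
        self._iters[handle] = csv.reader(io.StringIO(csv_text))
        return handle

    def next_row(self, handle: str) -> Optional[List[str]]:
        """Like next(id=...): return the next row, or None when exhausted."""
        return next(self._iters[handle], None)


def import_users(store: IteratorStore, handle: str,
                 existing_emails: Set[str]) -> List[Tuple[str, str]]:
    """Stand-in for the agent loop: create each user whose email is not yet known."""
    created = []
    while (row := store.next_row(handle)) is not None:
        name, email = row
        if email not in existing_emails:   # plays the role of the COUNT(*) query
            existing_emails.add(email)     # plays the role of create_user
            created.append((name, email))
    return created
```

Passing a handle instead of the whole file keeps each tool call small: the agent only ever sees one row at a time.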

In the future I'd like Seabird to move toward more general code generation. Generated code might execute tools inside a secure interpreter that supports not only function calls but also variables, loops, and conditionals:

for row in csv:
    name = row[0]
    email = row[1]
    count = search("SELECT COUNT(*) FROM users WHERE email = ?", email)
    if count == 0:
        create_user(name=name, email=email)

Code generation can be significantly more efficient than tool calling, but it is also more prone to model syntax errors that derail the agent. For now, tool calling interfaces are enough for most app scripting.

Safety

Safety is an essential Seabird requirement: the agent must never do what it shouldn't. There are two major components to safety in tool calling: authorization and risk management.

Authorization: the agent acts on behalf of a user and may only read and write data that the user can read and write manually. Authorization checks always run deterministically in software, never in the language model.

set_user_name(id="33C30B", name="Doofus")
=> Not Permitted

Authorization depends on an app-configurable permissions checker API.⁴ If the tool call is not authorized, it fails with a "Not Permitted" error.
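A minimal Python sketch of this gate follows. The checker signature, the `PERMISSIONS` table, and the deny-by-default rule for tools without a checker are all assumptions made for illustration; Seabird's actual permissions API is not shown in this post.

```python
from typing import Any, Callable, Dict


class NotPermitted(Exception):
    """Raised when a tool call fails its authorization check."""


USERS: Dict[str, str] = {"33C30B": "Dana", "A1B2C3": "Sam"}


def set_user_name(id: str, name: str) -> str:
    USERS[id] = name
    return name


# Hypothetical rule: a user may rename only their own account.
def can_set_user_name(session: Dict[str, Any], args: Dict[str, Any]) -> bool:
    return session["user_id"] == args["id"]


TOOLS: Dict[str, Callable[..., Any]] = {"set_user_name": set_user_name}
PERMISSIONS: Dict[str, Callable[[Dict[str, Any], Dict[str, Any]], bool]] = {
    "set_user_name": can_set_user_name,
}


def call_tool(session: Dict[str, Any], tool: str, args: Dict[str, Any]) -> Any:
    """Run the permission check in plain code before the tool executes."""
    checker = PERMISSIONS.get(tool)
    # Assumption: deny by default, so a tool without a checker is never callable.
    if checker is None or not checker(session, args):
        raise NotPermitted("Not Permitted")
    return TOOLS[tool](**args)
```

The same `call_tool` path would serve GUI and agent calls alike, so the agent can never do more than the user it acts for.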

Tool risk: the agent may call risky or destructive tools. Before any tool runs, Seabird software checks whether the tool is risky. If so, it asks the user to confirm. An app developer can tag any tool as risky by annotating it as a "delete" or "send" method, or force a confirmation on other kinds of tools by adding a confirm="always" tag.

impl User {
    // a tool with a "delete" method tag always confirms before execution
    #[tool(method = "delete", display_name = "Delete User")]
    async fn delete(self, session: Session) { ... }
}

Tool risk exists independently of authorization. Even if a user is fully authorized to make a risky tool call, Seabird will issue a user prompt. The goal here is minimizing user regret: we don't want the agent to accidentally delete data or unexpectedly send a message.

delete_user(id="CDF44A")
=> Confirm(tool="Delete User")

If Seabird issues a risk-based prompt, the tool does not execute until the user responds, and it runs only if the user confirms.
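The risk check can be sketched in a few lines of Python. The `Tool` and `Confirm` types and the `run_tool` function are hypothetical; only the rule itself (a "delete" or "send" method tag, or confirm="always") comes from the description above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

RISKY_METHODS = {"delete", "send"}


@dataclass
class Tool:
    display_name: str
    method: str
    func: Callable[..., Any]
    confirm: Optional[str] = None  # "always" forces a prompt on any tool


@dataclass
class Confirm:
    tool: str  # display name shown in the confirmation prompt


def needs_confirmation(tool: Tool) -> bool:
    return tool.method in RISKY_METHODS or tool.confirm == "always"


def run_tool(tool: Tool, args: Dict[str, Any], confirmed: bool = False) -> Any:
    """Return a Confirm request instead of running a risky tool unconfirmed."""
    if needs_confirmation(tool) and not confirmed:
        return Confirm(tool=tool.display_name)
    return tool.func(**args)


# Usage: the delete only happens once the user has confirmed.
deleted = []
delete_user = Tool("Delete User", "delete", lambda id: deleted.append(id) or id)
first = run_tool(delete_user, {"id": "CDF44A"})                   # prompt, no side effect
second = run_tool(delete_user, {"id": "CDF44A"}, confirmed=True)  # runs the tool
```

In this sketch a declined prompt needs no extra handling: the caller never retries with `confirmed=True`, so the tool never runs.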

Is a Framework Necessary?

It is technically possible to avoid a Seabird-like framework with handwritten code. However, after some testing I've concluded that Seabird's registry, code generator, and embedded agent make agent-driven apps far more tractable. Handwritten code must synchronize types and functionality across the database, backend, agent, and client. Seabird automates this process, reducing handwritten code to a normal developer workflow:

in the database, define a schema
on the backend, define methods for app logic
on the client, render the UI and call the backend API

MCP

You may be thinking, "This sounds like MCP." There is some overlap, but not enough to warrant building an in-app agent atop MCP. A tool is a tool, but Seabird and MCP are built with distinct purposes and feature sets.

Seabird provides in-app discovery and scripting that integrates with your app GUI. MCP exposes your tools to a third-party agent for headless remote execution. The primary benefits of Seabird are app-specific, while the primary benefit of an MCP server is interop with Claude, ChatGPT, or other MCP clients.

Implicit in MCP is the opening of app functionality and user data to third parties. Public APIs may benefit from an MCP interface, but many app APIs are closed for privacy, security, and competitive reasons. Seabird is app-specific, letting the app provide an agent to its users without exposing itself to the risks of third-party access.

Ultimately, Seabird and MCP are complementary. Your app will use a Seabird-like framework to integrate your agent and GUI with tools. Where appropriate, your app may expose a subset of those tools as an MCP server for external interop.

Alex Shapiro