The agent loop

Every Kiki agent runs the same loop: Perceive → Reason → Act. It's defined by the Agent trait in kiki-core and is the central control flow of the system.

rust

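async fn run(&self, ctx: &mut Context) -> Result<()> {
    loop {
        // Perceive: gather the current input.
        let perception  = self.perceive(ctx).await?;
        // Reason: turn the perception into a plan.
        let plan        = self.reason(ctx, perception).await?;
        // Act: run the plan and produce an observation.
        let observation = self.act(ctx, plan).await?;
        // Record the observation in history, then stop if it is terminal.
        ctx.push_observation(observation.clone());
        if observation.is_terminal() { break; }
    }
    Ok(())
}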

Each iteration appends its observation to the history, and the loop stops once an observation is terminal. State is checkpointed before each durable step, which is what makes mid-task migration possible.
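To make the checkpoint-before-step idea concrete, here is a minimal sketch. It assumes a hypothetical `StateBackend` trait with `save` and `load`, an in-memory implementation, and a `durable_step` helper. None of these names come from kiki-core, and plain strings stand in for real observations.

```rust
use std::collections::HashMap;

/// Hypothetical storage interface; the real state backend may look different.
trait StateBackend {
    fn save(&mut self, session: &str, snapshot: Vec<String>);
    fn load(&self, session: &str) -> Option<Vec<String>>;
}

/// Illustrative in-memory backend keyed by session id.
struct InMemory {
    snapshots: HashMap<String, Vec<String>>,
}

impl StateBackend for InMemory {
    fn save(&mut self, session: &str, snapshot: Vec<String>) {
        self.snapshots.insert(session.to_string(), snapshot);
    }

    fn load(&self, session: &str) -> Option<Vec<String>> {
        self.snapshots.get(session).cloned()
    }
}

/// Checkpoints the history *before* running the step, so another host
/// could resume from the pre-step state if this one goes away mid-step.
fn durable_step<B: StateBackend>(
    backend: &mut B,
    session: &str,
    history: &mut Vec<String>,
    step: impl FnOnce() -> String,
) {
    backend.save(session, history.clone());
    history.push(step());
}
```

In this sketch the saved snapshot always reflects the state before the step ran, so a resumed agent would repeat that step rather than skip it.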

Perceive

Perception gathers input from several sources, in order of preference: the Wayland tree and native app state (first-class), kernel events, AT-SPI for non-Kiki apps (fallback), and a screenshot only as a last resort. See Computer use.
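One way to read the preference order above is as a ranked list from which the best available source is chosen. The sketch below takes that reading. The `Source` enum, the `PREFERENCE` table, and `pick_source` are illustrative names rather than kiki-core types, and the real perception step may combine several sources instead of picking one.

```rust
/// Perception sources named in the text; the enum itself is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Source {
    WaylandTree,  // Wayland tree and native app state (first-class)
    KernelEvents, // kernel events
    AtSpi,        // AT-SPI, fallback for non-Kiki apps
    Screenshot,   // last resort
}

/// Sources listed from most to least preferred.
const PREFERENCE: [Source; 4] = [
    Source::WaylandTree,
    Source::KernelEvents,
    Source::AtSpi,
    Source::Screenshot,
];

/// Returns the most preferred source among those currently available.
fn pick_source(available: &[Source]) -> Option<Source> {
    PREFERENCE.iter().copied().find(|s| available.contains(s))
}
```

With only AT-SPI and screenshots available, this sketch selects AT-SPI. A screenshot is chosen only when nothing else is present.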

Reason

Reasoning turns perception into a plan: a thought plus an ordered list of steps, each naming a tool and its input. This is where the model is invoked. It is presented with the relevant tools and produces the plan.
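The shape of a plan can be sketched as a small data structure. The `Plan` and `Step` structs below, their field types, and the builder-style `then` method are assumptions made for illustration. Tool inputs are modeled as plain strings, which the real types may not do.

```rust
/// One plan step: the tool to call and the input to give it.
#[derive(Debug, Clone, PartialEq)]
struct Step {
    tool: String,
    input: String,
}

/// A thought plus an ordered list of steps (hypothetical layout).
#[derive(Debug, Clone, PartialEq)]
struct Plan {
    thought: String,
    steps: Vec<Step>,
}

impl Plan {
    /// Starts a plan with the model's thought and no steps yet.
    fn new(thought: &str) -> Self {
        Plan { thought: thought.to_string(), steps: Vec::new() }
    }

    /// Appends a step, preserving the order in which steps are added.
    fn then(mut self, tool: &str, input: &str) -> Self {
        self.steps.push(Step { tool: tool.to_string(), input: input.to_string() });
        self
    }

    /// Lists the tools the plan will call, in execution order.
    fn tools(&self) -> Vec<&str> {
        self.steps.iter().map(|s| s.tool.as_str()).collect()
    }
}
```

For example, `Plan::new("open network settings").then("launch_app", "settings").then("click", "Network")` builds a two-step plan. The tool names there are made up.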

Act

Acting runs a plan step and produces an observation (output, whether it errored, whether it's terminal). Each tool call is dispatched through the on-device tool hub and checked against the app's capabilities. The observation feeds the next perception.
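The capability check and the resulting observation can be sketched as follows. This is a simplified stand-in: the `Observation` fields mirror the three properties listed above, capabilities are modeled as a set of permitted tool names, and the `echo` and `finish` tools are hypothetical placeholders for whatever the tool hub actually dispatches to.

```rust
use std::collections::HashSet;

/// Result of running one step: output, error flag, terminal flag.
#[derive(Debug, Clone, PartialEq)]
struct Observation {
    output: String,
    is_error: bool,
    terminal: bool,
}

/// A plan step naming a tool and its input (hypothetical layout).
struct Step {
    tool: String,
    input: String,
}

/// Checks the step against the granted capabilities, then dispatches it.
/// Assumption: a denied call yields a non-terminal error observation.
fn act(granted: &HashSet<String>, step: &Step) -> Observation {
    if !granted.contains(&step.tool) {
        return Observation {
            output: format!("capability denied: {}", step.tool),
            is_error: true,
            terminal: false,
        };
    }
    // Stand-in for dispatch through the tool hub.
    match step.tool.as_str() {
        "echo" => Observation { output: step.input.clone(), is_error: false, terminal: false },
        "finish" => Observation { output: step.input.clone(), is_error: false, terminal: true },
        other => Observation {
            output: format!("unknown tool: {}", other),
            is_error: true,
            terminal: false,
        },
    }
}
```

Returning the denial as an ordinary observation, rather than aborting, lets the next reasoning step see the failure and plan around it. Whether Kiki behaves this way is not stated in the text.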

Context

The loop threads a Context carrying the agent and session identity, the granted capabilities, the observation history, the state backend, and an optional step limit, so the loop is bounded and auditable.
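A rough sketch of how a step limit could bound the loop is shown below. The field names and types are guesses rather than the kiki-core definition. The state backend is left out for brevity, observations are reduced to strings, and `limit_reached` and `run_bounded` are hypothetical helpers.

```rust
/// Simplified context; the real Context also carries a state backend.
struct Context {
    agent_id: String,
    session_id: String,
    capabilities: Vec<String>,
    history: Vec<String>,
    step_limit: Option<usize>,
}

impl Context {
    /// Appends an observation to the history, which doubles as an audit trail.
    fn push_observation(&mut self, observation: String) {
        self.history.push(observation);
    }

    /// True once the history holds as many observations as the limit allows.
    fn limit_reached(&self) -> bool {
        match self.step_limit {
            Some(limit) => self.history.len() >= limit,
            None => false,
        }
    }
}

/// Runs `step` until it reports a terminal observation or the limit is hit.
/// `step` returns (output, is_terminal). Returns the number of iterations run.
fn run_bounded<F>(ctx: &mut Context, mut step: F) -> usize
where
    F: FnMut(usize) -> (String, bool),
{
    let mut iterations = 0;
    while !ctx.limit_reached() {
        let (output, terminal) = step(iterations);
        ctx.push_observation(output);
        iterations += 1;
        if terminal {
            break;
        }
    }
    iterations
}
```

With `step_limit: Some(3)` and a step that never terminates, this sketch stops after three iterations and leaves three entries in the history.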

Telemetry

Each loop step emits a structured tracing span. On a device in the cloud, fleet telemetry aggregates these so you can observe what agents are doing across a whole fleet.

Next: State & migration.