The agent loop
Every Kiki agent runs the same loop: Perceive → Reason → Act. It's defined by the Agent trait in kiki-core and is the central control flow of the system.
    async fn run(&self, ctx: &mut Context) -> Result<()> {
        loop {
            let perception = self.perceive(ctx).await?;
            let plan = self.reason(ctx, perception).await?;
            let observation = self.act(ctx, plan).await?;
            ctx.push_observation(observation.clone());
            if observation.is_terminal() { break; }
        }
        Ok(())
    }
Each iteration appends its observation to history and stops when an observation is terminal. State is checkpointed before each durable step, which is what makes mid-task migration possible.
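To illustrate why checkpointing before each step allows a task to resume elsewhere, here is a minimal synchronous sketch. The `Checkpoint`, `StateBackend`, `InMemoryBackend`, and `run_steps` names are hypothetical and are not taken from kiki-core. The `interrupt_at` parameter only simulates a device going away mid-step.

```rust
/// Hypothetical snapshot of loop progress (not the real kiki-core type).
#[derive(Debug, Clone, PartialEq)]
pub struct Checkpoint {
    pub next_step: usize,
    pub history: Vec<String>,
}

/// Hypothetical state backend: stores and returns the latest checkpoint.
pub trait StateBackend {
    fn save(&mut self, checkpoint: Checkpoint);
    fn load(&self) -> Option<Checkpoint>;
}

/// In-memory backend used only for this illustration.
#[derive(Default)]
pub struct InMemoryBackend {
    latest: Option<Checkpoint>,
}

impl StateBackend for InMemoryBackend {
    fn save(&mut self, checkpoint: Checkpoint) {
        self.latest = Some(checkpoint);
    }
    fn load(&self) -> Option<Checkpoint> {
        self.latest.clone()
    }
}

/// Runs `steps`, resuming from the last checkpoint if one exists.
/// A checkpoint is saved BEFORE each step, so an interrupted step
/// is simply re-run by whoever resumes the task.
/// Returns `None` if the run was interrupted at `interrupt_at`.
pub fn run_steps(
    backend: &mut dyn StateBackend,
    steps: &[&str],
    interrupt_at: Option<usize>,
) -> Option<Vec<String>> {
    let mut state = backend.load().unwrap_or(Checkpoint {
        next_step: 0,
        history: Vec::new(),
    });
    while state.next_step < steps.len() {
        backend.save(state.clone());
        if interrupt_at == Some(state.next_step) {
            return None; // simulated interruption in the middle of a step
        }
        state.history.push(format!("did {}", steps[state.next_step]));
        state.next_step += 1;
    }
    Some(state.history)
}
```

In this sketch a second call to `run_steps` with the same backend picks up at the interrupted step rather than starting over, which is the property migration relies on.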
Perceive
Perception gathers input from several sources, in order of preference: the Wayland tree and native app state (first-class), kernel events, AT-SPI for non-Kiki apps (fallback), and a screenshot only as a last resort. See Computer use.
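The preference order above can be sketched as a simple fallback chain. This is an illustration under assumptions: the `Source`, `Perception`, and `perceive_with` names are hypothetical, and the sketch returns the first source that yields data, whereas the real implementation may combine several sources.

```rust
/// Hypothetical perception sources, named after the ones described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    WaylandTree,
    KernelEvents,
    AtSpi,
    Screenshot,
}

/// Preference order: first-class sources first, screenshot last.
pub const PREFERENCE: [Source; 4] = [
    Source::WaylandTree,
    Source::KernelEvents,
    Source::AtSpi,
    Source::Screenshot,
];

/// Hypothetical perception result: where the data came from and what it was.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Perception {
    pub source: Source,
    pub payload: String,
}

/// Tries each source in preference order and returns the first one
/// for which `probe` yields data. `probe` stands in for the real
/// source-specific readers, which are not shown here.
pub fn perceive_with<F>(mut probe: F) -> Option<Perception>
where
    F: FnMut(Source) -> Option<String>,
{
    PREFERENCE
        .iter()
        .find_map(|&source| probe(source).map(|payload| Perception { source, payload }))
}
```

With a probe that only succeeds for AT-SPI and screenshots, `perceive_with` returns the AT-SPI result and never takes the screenshot.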
Reason
Reasoning turns perception into a plan — a thought plus an ordered list of steps, each naming a tool and its input. This is where the model is invoked: the relevant tools are presented, and the model produces the plan.
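As a rough picture of the plan shape described above, the following sketch models a thought plus ordered steps. The `Plan` and `Step` field names, the `then` builder, and the `unknown_tools` check are assumptions made for illustration. Whether Kiki validates plans against the presented tools this way is not stated in this document.

```rust
/// Hypothetical plan step: the tool to call and the input to pass to it.
#[derive(Debug, Clone, PartialEq)]
pub struct Step {
    pub tool: String,
    pub input: String,
}

/// Hypothetical plan: a thought plus an ordered list of steps.
#[derive(Debug, Clone, PartialEq)]
pub struct Plan {
    pub thought: String,
    pub steps: Vec<Step>,
}

impl Plan {
    /// Starts an empty plan with the model's thought.
    pub fn new(thought: &str) -> Self {
        Plan {
            thought: thought.to_string(),
            steps: Vec::new(),
        }
    }

    /// Appends a step, preserving order.
    pub fn then(mut self, tool: &str, input: &str) -> Self {
        self.steps.push(Step {
            tool: tool.to_string(),
            input: input.to_string(),
        });
        self
    }

    /// Returns the planned tool names that were not among the tools
    /// presented to the model (an illustrative sanity check).
    pub fn unknown_tools<'a>(&'a self, offered: &[&str]) -> Vec<&'a str> {
        self.steps
            .iter()
            .map(|step| step.tool.as_str())
            .filter(|tool| !offered.iter().any(|o| *o == *tool))
            .collect()
    }
}
```

A plan built as `Plan::new("open settings").then("click", "settings-button")` keeps its steps in the order they were added, which is the order the act phase would run them in.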
Act
Acting runs a plan step and produces an observation (output, whether it errored, whether it's terminal). Each tool call is dispatched through the on-device tool hub and checked against the app's capabilities. The observation feeds the next perception.
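The dispatch-and-check flow can be sketched as follows. This is a simplified, synchronous illustration: `ToolHub`, `dispatch`, and the `echo` tool are hypothetical names, and it assumes a capability is identified by the tool's name, which may not match how Kiki actually models capabilities.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical observation: output, whether it errored, whether it is terminal.
#[derive(Debug, Clone, PartialEq)]
pub struct Observation {
    pub output: String,
    pub errored: bool,
    pub terminal: bool,
}

/// A tool takes its input and produces an observation.
pub type Tool = fn(&str) -> Observation;

/// Hypothetical stand-in for the on-device tool hub.
pub struct ToolHub {
    tools: HashMap<String, Tool>,
}

impl ToolHub {
    pub fn new() -> Self {
        ToolHub { tools: HashMap::new() }
    }

    pub fn register(&mut self, name: &str, tool: Tool) {
        self.tools.insert(name.to_string(), tool);
    }

    /// Checks the call against the granted capabilities, then runs the tool.
    /// Denied or unknown tools produce a non-terminal error observation,
    /// so the next iteration can perceive the failure and re-plan.
    pub fn dispatch(&self, granted: &HashSet<String>, tool: &str, input: &str) -> Observation {
        if !granted.contains(tool) {
            return Observation {
                output: format!("capability denied: {}", tool),
                errored: true,
                terminal: false,
            };
        }
        match self.tools.get(tool) {
            Some(run) => run(input),
            None => Observation {
                output: format!("unknown tool: {}", tool),
                errored: true,
                terminal: false,
            },
        }
    }
}

/// Example tool for the sketch: returns its input unchanged.
pub fn echo(input: &str) -> Observation {
    Observation {
        output: input.to_string(),
        errored: false,
        terminal: false,
    }
}
```

Checking capabilities before the tool lookup means a call the app is not allowed to make never reaches the tool, even if the tool is registered.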
Context
The loop threads a Context carrying the agent and session identity, the granted capabilities, the observation history, the state backend, and an optional step limit — so the loop is bounded and auditable.
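A minimal sketch of how a step limit bounds the loop is shown below. The field names, `Context::new`, and `run_bounded` are assumptions for illustration. The state backend is omitted, and the loop is synchronous to keep the example self-contained.

```rust
use std::collections::HashSet;

/// Hypothetical observation, reduced to what the loop needs here.
#[derive(Debug, Clone, PartialEq)]
pub struct Observation {
    pub output: String,
    pub terminal: bool,
}

/// Hypothetical context mirroring the fields described above
/// (the state backend is left out of this sketch).
pub struct Context {
    pub agent_id: String,
    pub session_id: String,
    pub capabilities: HashSet<String>,
    pub history: Vec<Observation>,
    pub step_limit: Option<usize>,
}

impl Context {
    pub fn new(agent_id: &str, session_id: &str, step_limit: Option<usize>) -> Self {
        Context {
            agent_id: agent_id.to_string(),
            session_id: session_id.to_string(),
            capabilities: HashSet::new(),
            history: Vec::new(),
            step_limit,
        }
    }

    pub fn push_observation(&mut self, observation: Observation) {
        self.history.push(observation);
    }
}

/// Runs `step` until it yields a terminal observation or the optional
/// step limit is reached. Returns the number of steps executed.
pub fn run_bounded<F>(ctx: &mut Context, mut step: F) -> usize
where
    F: FnMut(&Context) -> Observation,
{
    let mut steps = 0;
    while ctx.step_limit.map_or(true, |limit| steps < limit) {
        let observation = step(&*ctx);
        let terminal = observation.terminal;
        ctx.push_observation(observation);
        steps += 1;
        if terminal {
            break;
        }
    }
    steps
}
```

Because every observation is pushed to `history` before the limit is checked again, the history doubles as an audit trail of exactly the steps that ran.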
Telemetry
Each loop step emits a structured tracing span. For devices running in the cloud, fleet telemetry aggregates these spans so you can observe what agents are doing across the whole fleet.
Next: State & migration.