Working Research Prototype · JavaScript
Agent Simulation
Multi-Agent System · Information Reliability · Emergent Narrative Behavior
Engine: JavaScript (Vanilla)
Agent Count: 7 agents per run
Duration: 5-day simulation cycle
Status: Functional prototype
Purpose of This Document
This document describes the agent architecture underlying the Prison Sandbox simulation — a working research prototype demonstrating how autonomous agents with private motivations, unreliable information, and social relationships produce emergent narrative behavior. The simulation is a concrete testbed for the central research question: how does information reliability, treated as a first-class design variable, drive the emergence of social structure and narrative coherence in a multi-agent system?
The Agent — Core Architecture
What is an Agent?
An agent is an entity that exists inside the simulation with incomplete knowledge of the world, a reason to act, and a way of seeing other agents that is shaped by who they are — not by what is objectively true. Every agent is generated at the start of the simulation with a fixed set of properties that never change, and a set of dynamic properties that evolve through interaction.
Fixed Properties — Generated Once, Never Change
Identity & Role A name and a role: prisoner, guard, gang member, or independent. Role affects what beliefs they're likely to start with but doesn't lock behavior. A guard and a prisoner can both be honest or deceptive — role shapes probability, not determinism.
Truth Rate A float between 0.0 and 1.0. This is how reliably honest this agent is when they transmit information to others. It's fixed, hidden from every other agent, and only the Watcher knows it. An honest agent sits around 0.85. A sneaky one around 0.50. A habitual liar around 0.20. This isn't about intelligence — a liar can know exactly where the exit is and still transmit the wrong location. Crucially, truth rate is independent of knowledge.
Traits Up to two, drawn from a defined pool. Loyal, Sneaky, Tattler, Cowardly, Violent, Greedy, Connected, Paranoid, Lazy. Traits govern motivation and behavior thresholds — a Loyal agent protects their gang's information, a Paranoid agent misreads neutral behavior as threatening, a Connected agent starts with more beliefs than average.
Agenda What this agent is actually trying to achieve. Derived partly from traits, partly from a random roll at generation. Examples: survive until transfer, protect a specific ally, find the exit, control information flow in their block, frame another prisoner. The agenda is the hidden engine behind every decision the agent makes.
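Put together, generation might look like the following sketch. This is illustrative, not the prototype's actual code: the 1-vs-2 trait split and the uniform draw for truth rate are assumptions, and agenda selection is left as a placeholder.

// Sample up to two distinct traits from the pool of nine.
function sampleTraits(rng) {
  const pool = ['Loyal', 'Sneaky', 'Tattler', 'Cowardly', 'Violent',
                'Greedy', 'Connected', 'Paranoid', 'Lazy'];
  const count = rng() < 0.5 ? 1 : 2; // "up to two"; the split is assumed
  const picked = [];
  for (let i = 0; i < count; i++) {
    picked.push(pool.splice(Math.floor(rng() * pool.length), 1)[0]);
  }
  return picked;
}

function generateFixedProperties(name, role, rng) {
  return {
    name,
    role,             // 'prisoner' | 'guard' | 'gang' | 'independent'
    truthRate: rng(), // hidden from every other agent; only the Watcher sees it
    traits: sampleTraits(rng),
    agenda: null,     // derived partly from traits, partly a random roll (not sketched)
  };
}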
Trait System — Behavioral Modifiers
Loyal: Withholds sensitive information from non-allies; trust decay is slower toward allied agents.
Sneaky: Eligible to plant deferred traps; targets low-trust agents for distorted transmissions.
Paranoid: Suspicion score inflated by +0.30 in the perception engine; may form false beliefs from neutral events.
Tattler: Will transmit suspicion beliefs directly to guards rather than peers.
Connected: Starts with more initial beliefs; receives exit route information from external contacts.
Cowardly: Requires a higher confidence threshold before acting on beliefs; suspicion score reduced.
Violent: Low-trust interactions may result in a fight response rather than withheld information.
Greedy: Weighted toward transmitting information that produces personal benefit.
Lazy: Guard alertness reduced; perception suspicion score slightly dampened.
Dynamic Properties — Evolve Through Simulation
Belief State: A sparse, confidence-weighted map of what the agent currently believes about the world. Not every agent knows everything. A new arrival might have almost nothing. A Connected veteran might have partial knowledge of two escape routes and guard schedules for one shift. Each belief entry carries: what the agent believes, how confident they are, who they heard it from, and how many hops it has travelled through other agents before reaching them. Crucially — beliefs can be wrong from the moment of generation. The random sheet that seeds the simulation assigns beliefs probabilistically. An agent might start believing the exit is through the kitchen at 60% confidence. That belief might be false. They didn't get it from anyone — it's just what they came in thinking. Everything downstream of that initial wrong belief compounds.
Trust Scores: A float toward every other agent in the simulation, seeded randomly at generation and updated through interaction. If Agent B acts in a way that confirms what Agent A expected, A's trust toward B rises slightly. If B's information turns out to be wrong — or if A's perception engine flags B's behavior as suspicious — trust drops. Trust scores are not symmetrical. A can trust B at 0.7 while B trusts A at 0.2.
Deferred Actions: A queue of things this agent has set in motion that haven't resolved yet. This is where traps live. An agent can plant false information on Day 1 and schedule a follow-up action on Day 3 — tipping off a guard, positioning themselves, recruiting another agent to be present. The planted information was false when transmitted. By Day 3 the agent has made it partially true. Anyone who acted on the original tip gets caught not because they were wrong but because the trap was built around their belief.
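Taken together, an agent's dynamic state might be carried like this. A minimal sketch; the field shapes are assumptions, not the prototype's actual schema.

const dynamicState = {
  // subject → { content, confidence, heardFrom, hops }; sparse by design
  beliefs: new Map(),
  // agentName → float; asymmetric: A's trust in B is independent of B's trust in A
  trust: new Map(),
  // actions set in motion but not yet resolved, e.g. a trap planted on Day 1
  deferredActions: [], // e.g. { executeOnDay: 3, action: 'tipOffGuard', target: 'F' }
};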
The Perception Engine — Where Distortion Begins
This is where the research question lives
The gap between the ground truth event and Agent A's logged perception of it is the first point of distortion. Everything that propagates through the network downstream of that perception is built on this initial divergence — which emerged not from lying, but from the irreducible subjectivity of observation.
When Agent A observes Agent B, they don't receive a clean data packet. They receive an event — Agent B near the garbage bins at night, Agent B talking to a guard twice in one day, Agent B avoiding the yard. The perception engine takes that observable behavior, runs it through Agent A's own filters, and produces a logged belief. Those filters are: A's current trust score toward B, A's active agenda at that moment, A's relevant traits, and a dice roll — an instinct variable that represents the irreducible randomness of how a person reads a situation. The dice roll means two agents with identical stats watching the same event can reach different conclusions. That's not a flaw in the model. That's personality.
Perception Formula
suspicion = clamp(
  dice + paranoia_bonus + sneaky_bonus + role_bonus
  − loyal_discount − lazy_discount − cowardly_discount
  − (trust × 0.42),
  0, 1
)
dice — Random float [0,1], the instinct variable
paranoia_bonus — +0.30 if agent has Paranoid trait
sneaky_bonus — +0.10 if agent has Sneaky trait
role_bonus — +0.14 if guard observing prisoner; +0.08 if prisoner observing guard
loyal_discount — −0.20 if Loyal trait AND trust toward observed agent > 0.60
lazy_discount — −0.10 if Lazy trait
cowardly_discount — −0.08 if Cowardly trait
trust × 0.42 — The higher A's trust toward B, the more A interprets B's behavior charitably
Suspicion > 0.52 → SUSPICIOUS — Agent forms a false belief: "[observed] is hiding something at [location]"
Suspicion 0.28–0.52 → UNCERTAIN — Agent forms a weak, ambiguous belief at low confidence
Suspicion < 0.28 → BENIGN — No belief formed. Event is not logged in agent's belief state.
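The whole engine fits in a few lines. Here is a runnable sketch of the formula and thresholds above; the agent shape (trust as a Map, traits as an array of strings), the injected rng, and the returned verdict object are assumptions, while the constants come directly from the tables.

const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// Run one observation through the observer's filters and return a verdict.
function perceive(observer, observed, rng) {
  const trust = observer.trust.get(observed.name) ?? 0.5;
  const has = (t) => observer.traits.includes(t);

  const dice = rng(); // the instinct variable: irreducible randomness of reading a situation
  const roleBonus =
    observer.role === 'guard' && observed.role === 'prisoner' ? 0.14 :
    observer.role === 'prisoner' && observed.role === 'guard' ? 0.08 : 0;

  const suspicion = clamp(
    dice
      + (has('Paranoid') ? 0.30 : 0)
      + (has('Sneaky') ? 0.10 : 0)
      + roleBonus
      - (has('Loyal') && trust > 0.60 ? 0.20 : 0)
      - (has('Lazy') ? 0.10 : 0)
      - (has('Cowardly') ? 0.08 : 0)
      - trust * 0.42,
    0, 1
  );

  if (suspicion > 0.52) return { verdict: 'SUSPICIOUS', suspicion }; // logs a false belief
  if (suspicion >= 0.28) return { verdict: 'UNCERTAIN', suspicion }; // weak belief, low confidence
  return { verdict: 'BENIGN', suspicion };                           // nothing is logged
}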
Information Propagation — How Beliefs Travel
When Agent A decides to share information with Agent C, the transmission does not pass A's belief along untouched; it passes through A's truth rate first. And what A holds was never the event itself. A's log does not say "Agent B was cleaning the garbage bin on guard orders"; it says "Agent B is hiding something near the garbage bins." The ground truth and the perception are two separate entries, and only the Watcher holds both. So when A tells C what they saw, C receives A's interpretation, not the event. The garbage bin becomes a hiding spot that was never real. That belief travels, and it mutates at each hop. Agents who never went near the garbage bin now factor it into their plans.
Transmission is resolved by the first matching rule, evaluated top to bottom:
trust < 0.22 AND Violent trait → FIGHT — A refuses aggressively. Trust toward A drops by 0.08.
trust < 0.28 OR Loyal + cross-role → WITHHELD — A claims ignorance. Trust toward A drops by 0.02.
dice < truth_rate → HONEST — A transmits their actual belief.
Sneaky AND trust < 0.55 → PLANT — A transmits a false location AND queues a deferred trap.
fallthrough → DISTORTED — A transmits a false location drawn from known alternatives.
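A sketch of that cascade, assuming the same agent shape as before; whether the trust deltas are applied here or by the receiver is not specified, so they appear only as comments.

// First matching rule wins, in the order listed above.
function decideTransmission(sender, receiver, rng) {
  const trust = sender.trust.get(receiver.name) ?? 0.5;
  const has = (t) => sender.traits.includes(t);
  const crossRole = sender.role !== receiver.role;

  if (trust < 0.22 && has('Violent')) return 'FIGHT';                 // trust toward sender drops 0.08
  if (trust < 0.28 || (has('Loyal') && crossRole)) return 'WITHHELD'; // trust toward sender drops 0.02
  if (rng() < sender.truthRate) return 'HONEST';                      // transmit the actual belief
  if (has('Sneaky') && trust < 0.55) return 'PLANT';                  // false location + deferred trap
  return 'DISTORTED';                                                 // false location from known alternatives
}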
When a belief is received, the receiver applies a confidence discount based on their trust toward the sender, and confidence decays by a further 0.03 per hop. A belief held at 0.80 by its originator therefore arrives with meaningfully less certainty after travelling through two intermediate agents. The hop count is preserved so the Watcher can trace the propagation path.
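In code, reception might look like this sketch. The 0.03-per-hop decay comes from the text; the multiplicative trust discount is purely an assumption, since the document does not specify the discount curve.

// Fold a transmitted belief into the receiver's belief state.
function receiveBelief(receiver, sender, belief) {
  const trust = receiver.trust.get(sender.name) ?? 0.5;
  const confidence = Math.max(
    0,
    belief.confidence * trust // assumed: trust scales the sender's stated confidence
      - 0.03                  // one hop's worth of decay (from the text)
  );
  receiver.beliefs.set(belief.subject, {
    content: belief.content,
    confidence,
    heardFrom: sender.name,
    hops: belief.hops + 1, // preserved so the Watcher can trace the path
  });
}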
The Watcher — Research Instrument
The Watcher is the research instrument
The gap between the Watcher's log and any agent's belief state at any moment is the distortion score for that agent. Summed across all agents over time, mapped as a network — that is the data. That is what answers the research questions.
The Watcher is not an agent. It has no trust scores, no agenda, no beliefs. It cannot be interacted with, seen, or deceived. It exists in one layer above the simulation and records everything with perfect fidelity — every event as it actually happened, every agent perception of that event, every belief update, every deferred action queued and executed.
Distortion Score
distortion = (hops × 0.20) + ((1 − confidence) × 0.50)
> 0.60 — HIGH distortion (belief substantially wrong or very weakly held)
0.30–0.60 — MEDIUM distortion (partial distortion; belief may be directionally correct but unreliable)
< 0.30 — LOW distortion (belief is reasonably accurate)
0.00 — NONE (agent is the instigator or has no relevant belief)
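In code the score is a one-liner; a sketch, treating the instigator and no-belief cases as the NONE band:

// Distortion score, exactly as defined above.
function distortionScore(belief) {
  if (!belief) return 0; // NONE: agent is the instigator or has no relevant belief
  return belief.hops * 0.20 + (1 - belief.confidence) * 0.50;
}

Applied to the walkthrough that follows: Agent A's original belief (0 hops, confidence 0.65) scores 0.175, LOW; Agent C's copy (1 hop, 0.45) scores 0.475, MEDIUM; Agent D's (2 hops, 0.25) scores 0.775, HIGH. Distortion rises mechanically with every step away from the event.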
Scenario Walkthrough — One Event, Five Agents
WATCHER · GROUND TRUTH
Agent E is near the garbage bins at 23:00 because a guard instructed them to move a bag. Duration: four minutes. Nothing is hidden at the location.
Day 1 — Perception
Agent A observes the event. Trust toward E: 0.30. Trait: Paranoid (+0.30 suspicion). Dice roll: high. Result: SUSPICIOUS. Belief formed: "E is hiding something near the bins." Confidence: 0.65.

Agent B observes the same event. Trust toward E: 0.70 (gang ally). Trait: Loyal (−0.20 discount). Dice roll: low. Result: BENIGN. No belief formed.
Day 2 — Propagation
Agent A tells Agent C. C's trust toward A: 0.60. No prior belief about the bins. C updates: belief confidence 0.45. One hop, some decay. The false belief is now in a second agent.

Agent C tells Agent D. D's trust toward C: 0.40. D updates at low confidence: 0.25. D has the Sneaky trait and an agenda to control information. D files it away.
Day 3 — Trap
Agent D tips off a guard that Agent F — who D wants removed — has been using the bins. D fabricated F's involvement. The guard checks. Finds nothing at the bins. But F is now under suspicion.
Day 5 — Resolution
The deferred trap triggers. F is questioned. The suspicion is now institutionally real.
The Watcher logged every step. The original event was four minutes of bag-moving on guard orders. What it produced: a false suspicion, a planted accusation, and a permanently altered social dynamic. No one lied about the original event. The distortion emerged entirely from perception, propagation, and one agent's strategic use of a rumour received in good faith.
Research Variables as Implemented Systems
Information Reliability (RQ1): Truth rate, a fixed, hidden float per agent. Governs whether transmissions are honest, distorted, or planted. Varies continuously from 0.0 to 1.0.
Trust Modeling (RQ2): Per-agent trust scores, seeded randomly and updated through interaction. Governs perception discounting, sharing thresholds, and belief acceptance. Asymmetric and non-transitive.
Consequence Permanence (RQ3): The deferred action queue. Actions planted on Day N execute on Day N+K. Outcomes — caught, escaped, shot, warned — are irreversible within the run.
Simulation Properties
Agent Population: 7 agents per run (5 prisoners, 2 guards)
Simulation Duration: 5 days
Reproducibility: Fully deterministic given seed (Mulberry32 PRNG)
Instigator Selection: Weighted random — prisoners 3× more likely
Trait Assignment: Random sample from pool of 9, max 2 per agent
Initial Trust: Random float [0.15, 0.85] per agent pair