agentic-govbot/ARCHITECTURE.md
Commit bda868cb45 by Nathan Schneider: Implement LLM-driven governance architecture with structured memory
This commit completes the transition to a pure LLM-driven agentic
governance system with no hard-coded governance logic.

Core Architecture Changes:
- Add structured memory system (memory.py) for tracking governance processes
- Add LLM tools (tools.py) for deterministic operations (math, dates, random)
- Add audit trail system (audit.py) for human-readable decision explanations
- Add LLM-driven agent (agent_refactored.py) that interprets constitution

Documentation:
- Add ARCHITECTURE.md describing process-centric design
- Add ARCHITECTURE_EXAMPLE.md with complete workflow walkthrough
- Update README.md to reflect current LLM-driven architecture
- Simplify constitution.md to benevolent dictator model for testing

Templates:
- Add 8 governance templates (petition, consensus, do-ocracy, jury, etc.)
- Add 8 dispute resolution templates
- All templates work with generic process-based architecture

Key Design Principles:
- "Process" is central abstraction (not "proposal")
- No hard-coded process types or thresholds
- LLM interprets constitution to understand governance rules
- Tools ensure correctness for calculations
- Complete auditability with reasoning and citations

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Committed: 2026-02-08 14:24:23 -07:00


Govbot Architecture

Current State: Pure LLM-driven governance with structured memory and strong auditability

Design Principles

  1. No Hard-Coded Governance Logic: Constitution defines ALL governance rules in natural language
  2. LLM as Interpreter: Agent interprets constitution and makes all governance decisions
  3. Structured Memory: Explicit memory system tracks state and enables LLM reasoning
  4. Tools for Correctness: LLM uses tools for calculations (not reasoning about math)
  5. Auditability First: Every decision logged with reasoning and constitutional citation
  6. Human-Readable: All state and decisions must be inspectable by humans

Central Concept: Process

The core abstraction in Govbot is the process - a generic container for any governance activity that unfolds over time.

Process Types (examples, not exhaustive):

  • Proposals: Seeking decisions on policy, rules, actions
  • Disputes: Conflict resolution, mediation, arbitration
  • Elections: Selecting people for roles or responsibilities
  • Discussions: Facilitated conversations without a specific decision goal
  • Do-ocracy: Tracking autonomous actions taken by members
  • Reviews: Evaluating past decisions, actions, or outcomes
  • Juries: Random selection and deliberation processes
  • Any activity defined in your constitution

Why "Process" not "Proposal"?

  • Generic - doesn't assume voting or decisions
  • Flexible - works for conversations, actions, selections, etc.
  • Temporal - captures activities that unfold over time
  • Minimalist - one concept covers all governance activities

The LLM interprets your constitution to understand what types of processes exist and how they work. No process types are hard-coded.

Architecture Overview

┌─────────────────────────────────────────────┐
│           Governance Request                │
│     (Natural Language from User)            │
└─────────────────┬───────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────┐
│         Governance Agent (LLM)              │
│  • Interprets request                       │
│  • Consults constitution (RAG)              │
│  • Queries memory for context               │
│  • Uses tools for calculations              │
│  • Makes governance decisions               │
│  • Updates memory with reasoning            │
└─────────────────┬───────────────────────────┘
                  │
        ┌─────────┼─────────┐
        │         │         │
        ▼         ▼         ▼
┌─────────┐  ┌────────┐  ┌──────────┐
│ Memory  │  │ Tools  │  │ Audit    │
│ System  │  │        │  │ Trail    │
│         │  │        │  │          │
│ Tracks  │  │ Math   │  │ Explains │
│ State   │  │ Dates  │  │ Decisions│
│ Context │  │ Random │  │ Cites    │
└─────────┘  └────────┘  └──────────┘

Core Components

1. Governance Agent (LLM-Driven)

Role: Interprets constitution and makes all governance decisions

Responsibilities:

  • Parse user requests to understand intent
  • Query constitution for relevant rules (using RAG)
  • Query memory for current state
  • Reason about what action to take
  • Use tools for deterministic operations
  • Update memory with decisions
  • Generate audit trail

Key Feature: Agent does NOT execute hard-coded logic. Instead:

  • Reads constitution to understand rules
  • Uses tools to calculate/verify
  • Decides based on interpretation

Implementation: src/govbot/agent_refactored.py
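
A minimal sketch of this loop, for illustration only. The helper names (retrieve_passages, memory.query, tools.run, audit.render) are assumptions, not the actual agent_refactored.py API:

def handle_request(request: str, llm, constitution, memory, tools, audit) -> str:
    # 1. Retrieve constitutional passages relevant to the request (RAG)
    passages = constitution.retrieve_passages(request)

    # 2. Pull the current state the LLM may need (active processes, precedent)
    context = memory.query(status="active")

    # 3. Ask the LLM to interpret the request against constitution + state.
    #    The LLM may request tool calls (e.g. date math) before it decides.
    plan = llm.decide(request=request, passages=passages, context=context,
                      available_tools=tools.describe())
    for call in plan.tool_calls:
        call.result = tools.run(call.name, **call.arguments)

    # 4. Persist the decision with reasoning and citations, then explain it
    decision = llm.finalize(plan)
    memory.record_decision(decision)
    return audit.render(decision)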

2. Structured Memory System

Role: Persistent state that LLM can query and update

What It Tracks:

  • Processes: Active governance processes (proposals, disputes, etc.)
  • Events: Timeline of all governance events
  • Decisions: Bot decisions with reasoning
  • Participants: Who's involved in what
  • Context: Historical information for precedent

Key Features:

  • Queryable: LLM can search memory by criteria
  • Structured: Not just raw text, but typed records
  • Temporal: Tracks history and changes over time
  • Human-Readable: Can be inspected and understood
  • Versioned: Changes tracked for audit

Memory Schema:

from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Optional


@dataclass
class ProcessMemory:
    id: str
    type: str  # "proposal", "dispute", "election", etc.
    status: str  # "active", "completed", "cancelled"
    created_at: datetime
    created_by: str
    deadline: Optional[datetime]
    constitution_basis: List[str]  # Article/section citations
    state: Dict[str, Any]  # Flexible process-specific state
    events: List[Event] = field(default_factory=list)  # History of what happened
    decisions: List[Decision] = field(default_factory=list)  # Bot decisions about this process


@dataclass
class Event:
    timestamp: datetime
    actor: str
    event_type: str  # "vote_cast", "proposal_submitted", etc.
    data: Dict[str, Any]
    context: str  # Human description


@dataclass
class Decision:
    timestamp: datetime
    decision_type: str  # "threshold_met", "deadline_reached", etc.
    reasoning: str  # LLM's reasoning
    constitution_citations: List[str]
    calculation_used: Optional[str]  # If a tool was used
    result: Any
Important: The type field in ProcessMemory is completely flexible - it's not an enum or predefined list. The LLM reads your constitution to understand what process types exist and uses the same terminology. If your constitution mentions "lazy consensus", "sortition", or "restorative circle", those become valid process types. The system doesn't need to know about them in advance.

Implementation: src/govbot/memory.py
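
A small usage sketch, assuming an in-memory store built around the ProcessMemory and Event records from the schema above (the MemoryStore class and its methods are illustrative, not the memory.py API):

from typing import Any, Dict, List


class MemoryStore:
    """Illustrative in-memory index of ProcessMemory records (see schema above)."""

    def __init__(self) -> None:
        self._processes: Dict[str, ProcessMemory] = {}

    def add(self, process: ProcessMemory) -> None:
        self._processes[process.id] = process

    def query(self, **criteria: Any) -> List[ProcessMemory]:
        # Return processes whose attributes match every criterion,
        # e.g. query(status="active", type="proposal")
        return [p for p in self._processes.values()
                if all(getattr(p, k, None) == v for k, v in criteria.items())]

    def log_event(self, process_id: str, event: Event) -> None:
        self._processes[process_id].events.append(event)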

3. LLM Tools (Generic Primitives)

Role: Deterministic operations the LLM can use

Why Tools?

  • LLMs are unreliable at arithmetic
  • Need deterministic, verifiable results
  • Separates "what to do" (LLM) from "how to do it" (tool)

Available Tools:

  • calculate(expression, variables) - Evaluate mathematical expressions safely
  • get_datetime() - Current time
  • datetime_add(dt, days, hours) - Date calculations
  • is_past_deadline(deadline) - Check if deadline passed
  • random_select(items, count) - Random selection for juries
  • tally(items, key) - Count votes
  • filter_items(items, criteria) - Filter data
  • percentage(num, denom) - Calculate percentages

Before (hard-coded):

def check_threshold(counts, "simple_majority") -> bool
def check_threshold(counts, "3x_majority") -> bool

After (generic):

def calculate(expression: str, variables: Dict) -> Any
# LLM provides: "agree > disagree", {"agree": 10, "disagree": 3}

Implementation: src/govbot/tools.py
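
To illustrate, here is a hedged sketch of how such a calculate tool could be implemented with restricted AST evaluation; it shows the technique, not the actual tools.py code:

import ast
import operator
from typing import Any, Dict

# Whitelist of operators the illustrative evaluator will accept
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.USub: operator.neg,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
}


def calculate(expression: str, variables: Dict[str, Any]) -> Any:
    """Safely evaluate an arithmetic/comparison expression against variables."""

    def _eval(node: ast.AST) -> Any:
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            return variables[node.id]
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.Compare) and len(node.ops) == 1 and type(node.ops[0]) in _OPS:
            return _OPS[type(node.ops[0])](_eval(node.left), _eval(node.comparators[0]))
        raise ValueError(f"Unsupported expression element: {ast.dump(node)}")

    return _eval(ast.parse(expression, mode="eval"))


# calculate("agree > disagree", {"agree": 10, "disagree": 3}) -> True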

4. Audit Trail System

Role: Human-readable explanation of all decisions

What It Captures:

  • Decision: What was decided
  • Reasoning: Why (in natural language)
  • Constitutional Basis: Which articles/sections
  • Calculations: What math was done
  • Precedent: Related past decisions
  • Participants: Who was involved
  • Timeline: When things happened

Audit Output Example:

# GOVERNANCE DECISION AUDIT TRAIL

**Decision**: Proposal #23 has passed
**Timestamp**: 2026-02-15 18:00:00 UTC
**Process**: standard_proposal (ID: prop_23)

## Constitutional Basis
- Article 3, Section 3.1: "Standard Proposals address routine governance matters"
- Article 3, Section 3.1: "Passage threshold: More Agree than Disagree votes"

## Calculation
Expression: agree > disagree
Variables: {"agree": 12, "disagree": 3, "abstain": 2, "block": 0}
Result: 12 > 3 = True

## Reasoning
The proposal reached its deadline of Feb 15, 2026 at 18:00 UTC.
According to the constitution, this is a standard proposal requiring
more agree votes than disagree votes. The vote tally shows 12 agree
and 3 disagree, which satisfies the threshold. Therefore, the proposal passes.

## Related Precedent
- Proposal #18 (passed with similar threshold)
- Proposal #21 (failed with 8 agree, 10 disagree)

## Next Actions
- Announce result to community
- Log outcome in governance record
- Execute authorized actions (if any)

Implementation: src/govbot/audit.py
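
A hedged sketch of how such a report could be assembled from the Decision and ProcessMemory records defined in the memory schema (the render_audit_trail helper is illustrative, not the audit.py API):

def render_audit_trail(process: ProcessMemory, decision: Decision) -> str:
    """Render one decision in the format shown above."""
    lines = [
        "# GOVERNANCE DECISION AUDIT TRAIL",
        "",
        f"**Decision**: {decision.result}",
        f"**Timestamp**: {decision.timestamp:%Y-%m-%d %H:%M:%S} UTC",
        f"**Process**: {process.type} (ID: {process.id})",
        "",
        "## Constitutional Basis",
        *[f"- {citation}" for citation in decision.constitution_citations],
        "",
        "## Reasoning",
        decision.reasoning,
    ]
    if decision.calculation_used:
        lines += ["", "## Calculation", decision.calculation_used]
    return "\n".join(lines)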

5. Constitutional Reasoning (RAG)

Role: Retrieval-augmented generation for querying the constitution

How It Works:

  • Constitution chunked into semantic sections
  • Vector embeddings enable similarity search
  • LLM retrieves relevant constitutional passages
  • Provides context for decision-making

Implementation: src/govbot/governance/constitution.py
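
A hedged sketch of the retrieval step, with embed() left as a parameter for whatever embedding model is used; the chunking rule and helper names are illustrative, not the constitution.py API:

import math
from typing import Callable, List, Tuple


def chunk_constitution(text: str) -> List[str]:
    """Split the constitution into sections at markdown headings."""
    chunks, current = [], []
    for line in text.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str],
             embed: Callable[[str], List[float]], k: int = 3) -> List[str]:
    """Return the k sections most similar to the query (a real system would
    precompute and store chunk embeddings instead of embedding per query)."""
    query_vec = embed(query)
    scored: List[Tuple[float, str]] = [(cosine(query_vec, embed(c)), c) for c in chunks]
    return [c for _, c in sorted(scored, key=lambda s: s[0], reverse=True)[:k]]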

Complete Workflow Example

See ARCHITECTURE_EXAMPLE.md for a detailed walkthrough of a complete process lifecycle.

High-Level Flow (Example: Proposal Process)

This example uses a proposal to illustrate the flow, but the same pattern works for any process type (disputes, elections, discussions, etc.):

1. User initiates process (in this case, submitting a proposal)
2. Agent queries constitution: "What rules apply to this process?"
   → Constitution: Standard proposals need 6 days, more agree than disagree
3. Agent queries memory: "What active processes exist?"
   → Memory: 2 active processes currently
4. Agent updates memory:
   - Create process record (type: "proposal")
   - Log "process_initiated" event
   - Calculate deadline using datetime tool
   - Store decision: "Created process based on Article 3.1"
5. Agent announces to user with reasoning
6. [Time passes, users interact with the process]
7. Agent checks deadlines (scheduled task)
8. Agent queries memory: "What processes have reached deadline?"
   → Memory: Process #23 deadline was Feb 15
9. Agent queries memory: "What interactions occurred in process #23?"
   → Memory: 12 agree, 3 disagree, 2 abstain
10. Agent queries constitution: "What's the completion criteria?"
    → Constitution: "More agree than disagree"
11. Agent uses calculate tool: "agree > disagree", {agree: 12, disagree: 3}
    → Tool: True
12. Agent updates memory:
    - Store decision with reasoning
    - Log "process_completed" event
    - Update process status to "completed"
13. Agent generates audit trail
14. Agent announces result with full explanation

Key Point: The same flow works for any process type. The constitution defines what "initiating", "interacting", and "completing" mean for each process type.
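
The deadline-check portion of this flow (steps 7-13) might look like the following sketch; every name here (memory.query, retrieve_passages, llm.evaluate, tools.calculate, audit.publish) is an illustrative assumption, not the actual API:

from datetime import datetime, timezone


def check_deadlines(llm, constitution, memory, tools, audit) -> None:
    now = datetime.now(timezone.utc)  # assumes timezone-aware deadlines
    for process in memory.query(status="active"):
        if process.deadline is None or process.deadline > now:
            continue  # not due yet

        # Ask the constitution what "completion" means for this process type,
        # then let the LLM propose the expression and variables to check.
        rules = constitution.retrieve_passages(f"completion criteria for {process.type}")
        evaluation = llm.evaluate(process=process, rules=rules)
        outcome = tools.calculate(evaluation.expression, evaluation.variables)

        # Record the decision (reasoning + citations) and publish the audit trail
        memory.record_decision(process.id, evaluation.reasoning,
                               evaluation.citations, outcome)
        audit.publish(process.id)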

Key Architecture Differences

| Aspect           | Traditional (Hard-Coded) | Current (Agentic)                   |
|------------------|--------------------------|-------------------------------------|
| Governance Rules | In Python code           | In constitution (natural language)  |
| Thresholds       | 4 fixed types            | Any expression interpretable by LLM |
| State Storage    | Database records         | Structured memory                   |
| Decision Making  | if/else logic            | LLM reasoning with tools            |
| Flexibility      | Requires code changes    | Constitution changes only           |
| Auditability     | Code + logs              | Natural language reasoning          |
| Process Types    | Pre-defined              | Any process type in constitution    |
| Calculations     | Python code              | LLM + calculator tool               |

Benefits

1. Flexibility

No code changes needed for new governance models. Just update the constitution in natural language to define new process types.

Example:

## Lazy Consensus Process
Proposals pass unless blocked by 2+ members within 7 days.

## Restorative Circle Process
When harm occurs, affected parties meet with a facilitator to discuss
impact and agree on repairs. Process completes when all parties signal readiness.

## Sortition Selection Process
For jury roles, randomly select 5-7 members from those who opt in.

The LLM interprets these rules and handles each process type correctly using the generic tools. No code changes needed.
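
As a concrete (and purely illustrative) example, the lazy consensus rule above reduces to calls against the generic tools; the import path and the shape of tally()'s return value are assumptions:

from govbot.tools import calculate, is_past_deadline, tally


def lazy_consensus_passed(process, votes) -> bool:
    # The LLM would choose these calls after reading the Lazy Consensus section;
    # nothing lazy-consensus-specific is hard-coded anywhere.
    block_count = tally(votes, key="position").get("block", 0)  # assumes dict of counts
    window_over = is_past_deadline(process.deadline)            # the 7-day window
    not_blocked = calculate("blocks < 2", {"blocks": block_count})
    return window_over and not_blocked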

2. Auditability

Every decision includes:

  • Natural language reasoning
  • Constitutional citations
  • Calculation details
  • Related precedent

Non-programmers can understand exactly what happened and why.

3. Template Support

Works with diverse governance templates:

  • Petition (simple voting)
  • Consensus (blocks, concern resolution)
  • Do-ocracy (authority through action)
  • Jury (random selection, deliberation)
  • Circles (lazy consensus, domains)
  • All 8 dispute resolution processes

The same code handles all templates by interpreting their constitutional rules.

4. Transparency

Community members can:

  • Read audit trails in plain language
  • Verify constitutional citations
  • Inspect calculation details
  • Review precedent
  • Understand bot reasoning

5. Community Governance

Communities can amend their governance processes through the governance process itself, without requiring developer involvement.

File Structure

src/govbot/
├── memory.py              # Structured memory system
├── tools.py               # LLM tools for calculations
├── audit.py               # Audit trail generation
├── agent_refactored.py    # LLM-driven agent (current)
├── governance/
│   ├── primitives.py      # Generic platform actions
│   └── constitution.py    # RAG system
└── db/
    ├── models.py          # Database models
    └── queries.py         # Database queries

Considerations

LLM Reliability

Challenge: LLMs can make mistakes or interpret rules inconsistently

Mitigations:

  • Use tools for all math (not LLM reasoning)
  • Require constitutional citations
  • Store decisions as precedent
  • Enable community review and appeals
  • Validate critical decisions

Cost and Latency

Challenge: More LLM calls than hard-coded approach

Mitigations:

  • Cache constitutional interpretations (see the sketch after this list)
  • Use faster models for routine tasks
  • Batch deadline checks
  • Optimize prompts
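
A tiny sketch of the caching mitigation, reusing the illustrative retrieve() helper from the RAG sketch above (CHUNKS, a mapping from constitution version to its chunks, and the embed function are assumptions):

from functools import lru_cache


@lru_cache(maxsize=256)
def cached_passages(query: str, constitution_version: str) -> tuple:
    # Keying on the constitution version invalidates cached answers on amendment
    return tuple(retrieve(query, CHUNKS[constitution_version], embed))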

Context Window Limits

Challenge: Memory can grow large over time

Mitigations:

  • Hierarchical memory (summary → detail)
  • Relevance filtering
  • Only include relevant precedent
  • Summarize completed processes

Testing

See implementation files for unit tests:

  • tests/test_memory.py - Memory operations
  • tests/test_tools.py - Tool correctness
  • tests/test_audit.py - Audit generation
  • tests/test_agent.py - End-to-end workflows
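
Illustrative examples of the kind of checks tests/test_tools.py might contain; the import path and the percentage() return convention are assumptions about the real API:

from govbot.tools import calculate, percentage


def test_calculate_threshold_comparison():
    assert calculate("agree > disagree", {"agree": 12, "disagree": 3})


def test_percentage():
    # assuming percentage(num, denom) returns a value out of 100
    assert percentage(3, 12) == 25.0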

Success Criteria

Zero hard-coded governance logic: No if/else for proposal types, thresholds, etc.

Constitution is source of truth: All rules come from constitution text

LLM makes all decisions: The agent interprets and decides rather than executing programmed routines

Memory is queryable: Can ask "what proposals are active?" and get an answer

Audit trail is complete: Every decision has reasoning + citations

Human-readable: Non-programmers can understand what happened and why

Handles diverse templates: Works with consensus, do-ocracy, jury, etc.

Next Steps

Current implementation status:

  • Memory system complete
  • Tools system complete
  • Audit system complete
  • 🚧 LLM agent integration in progress
  • Production deployment pending testing

For implementation details and complete examples, see: