Infrastructure Maturity: User Isolation and Local Memory
Infrastructure work isn't glamorous, but it's what separates prototypes from production systems. Today we tackled two foundational upgrades: process isolation and memory architecture.
Dedicated System User
We moved our agent runtime from a personal user account to a dedicated system user with proper file permissions and service management. This brings several wins:
- Security: The runtime process can't access personal files or credentials it doesn't need
- Stability: System-level process management via a launchd LaunchDaemon restarts the service after a crash and starts it again after a reboot
- Clarity: Clear separation between human user data and agent operational data
In practice, this means our agents run in a controlled environment with explicit permissions, not ambient access to everything. Principle of least privilege, enforced at the OS level.
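To make the setup concrete, here is a minimal sketch of what a LaunchDaemon definition for a dedicated user could look like, generated with Python's standard plistlib module. The label, user name, and binary path are hypothetical placeholders, not our actual values. The keys (Label, UserName, ProgramArguments, RunAtLoad, KeepAlive) are standard launchd keys.

```python
import plistlib


def build_launch_daemon(label: str, user: str, program_args: list[str]) -> bytes:
    """Return a LaunchDaemon definition as an XML property list."""
    config = {
        "Label": label,                    # unique job identifier
        "UserName": user,                  # run as the dedicated user, not root
        "ProgramArguments": program_args,  # command and its arguments
        "RunAtLoad": True,                 # start when the job is loaded (e.g. at boot)
        "KeepAlive": True,                 # relaunch the process if it exits
    }
    return plistlib.dumps(config, fmt=plistlib.FMT_XML)


# All three values below are hypothetical, chosen only for illustration.
plist_bytes = build_launch_daemon(
    label="com.example.agent-runtime",
    user="_agent",
    program_args=["/usr/local/bin/agent-runtime", "--serve"],
)
print(plist_bytes.decode("utf-8"))
```

A file like this would typically be placed in /Library/LaunchDaemons and loaded with launchctl. The UserName key is what drops the process to the dedicated account, and KeepAlive is what provides the restart-on-crash behavior.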
Fully Local Memory
We migrated our agent memory system from a cloud-based embedding service to a fully local architecture:
- BM25 for keyword search (classic information retrieval)
- Vector embeddings for semantic similarity (modern neural search)
- Reranking to combine both signals and surface the best results
All 992 chunks from our knowledge base were re-indexed locally. Zero external API calls. Zero third-party data exposure.
Why local? Three reasons:
- Privacy: Agent memories often contain sensitive context. Keeping them on-device means they never leave our infrastructure.
- Cost: External embedding APIs charge per request. Local models cost only the electricity to run them.
- Reliability: No network dependency. No rate limits. No service outages we can't control.
The hybrid approach (BM25 + vectors + reranking) gives us the best of both worlds: exact keyword matches when agents search for specific terms, and semantic understanding when they ask conceptual questions.
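The hybrid idea can be sketched in a few dozen lines. This is an illustration under stated assumptions, not our production code: the three documents are made up, the two-dimensional vectors are hand-written stand-ins for what a local embedding model would produce, and the combining step uses reciprocal rank fusion (RRF) as a simple substitute for whatever reranker a real system runs.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rank(scores):
    """Document indices ordered best-first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def reciprocal_rank_fusion(rankings, k=60):
    """Combine several rankings: each list adds 1 / (k + position) per document."""
    fused = {}
    for ranking in rankings:
        for position, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + position)
    return sorted(fused, key=lambda d: fused[d], reverse=True)


# Hypothetical knowledge-base chunks.
docs = [
    "restart the service with launchctl after a crash",
    "rotate api credentials every ninety days",
    "the daemon recovers automatically when the process dies",
]
# Hand-written toy vectors standing in for a local embedding model's output.
doc_vectors = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2]]
query = "service crash recovery"
query_vector = [0.8, 0.2]

keyword_scores = bm25_scores(query, docs)
# Keep only documents that actually matched a query term.
keyword_ranking = [i for i in rank(keyword_scores) if keyword_scores[i] > 0]
vector_ranking = rank([cosine(query_vector, v) for v in doc_vectors])

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
print(fused)  # [0, 2, 1]
```

Document 0 wins because both signals agree on it. Document 2 shares no exact term with the query, so BM25 misses it, but the vector side still surfaces it in second place. That is the behavior the hybrid approach is meant to deliver.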
Production Readiness
These changes don't unlock new features. They make existing features sustainable. A prototype can run on a personal account with cloud dependencies. A production system needs isolation, observability, and resilience.
We're two weeks in. We're thinking about what it takes to run reliably for 1,400 days.
Tomorrow: we tackle authentication architecture.