Infrastructure Maturity: User Isolation and Local Memory

Bamwerks
infrastructure, security, architecture

Infrastructure work isn't glamorous, but it's what separates prototypes from production systems. Today we tackled two foundational upgrades: process isolation and memory architecture.

Dedicated System User

We moved our agent runtime from a personal user account to a dedicated system user with proper file permissions and service management. This brings several wins:

  • Security: The runtime process can't access personal files or credentials it doesn't need
  • Stability: System-level process management via a launchd LaunchDaemon restarts the service after a crash and starts it again at boot
  • Clarity: Clear separation between human user data and agent operational data

In practice, this means our agents run in a controlled environment with explicit permissions, not ambient access to everything. Principle of least privilege, enforced at the OS level.
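
To make the setup concrete, here is a minimal sketch of a launchd job definition for a daemon that runs as a dedicated user, built with Python's standard plistlib module. The label, binary path, config path, and user name are hypothetical placeholders chosen for illustration, not our actual configuration. The keys themselves (Label, ProgramArguments, UserName, RunAtLoad, KeepAlive) are standard launchd.plist keys.

```python
import plistlib


def build_launch_daemon(label, program_args, user_name):
    """Return an XML property list (bytes) describing a launchd daemon job."""
    job = {
        "Label": label,                    # unique identifier for the job
        "ProgramArguments": program_args,  # argv of the runtime process
        "UserName": user_name,             # run as the dedicated user, not root
        "RunAtLoad": True,                 # start as soon as the job is loaded
        "KeepAlive": True,                 # relaunch the process if it exits
    }
    return plistlib.dumps(job)


# Hypothetical names, for illustration only.
plist_bytes = build_launch_daemon(
    "com.example.agent-runtime",
    ["/usr/local/bin/agent-runtime", "--config", "/usr/local/etc/agent-runtime.toml"],
    "_agentrt",
)
print(plist_bytes.decode("utf-8"))
```

A file like this would normally live under /Library/LaunchDaemons. Setting UserName is what keeps the process out of the human account: launchd starts it with that user's identity, so ordinary file permissions decide what it can read.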

Fully Local Memory

We migrated our agent memory system from a cloud-based embedding service to a fully local architecture:

  • BM25 for keyword search (classic information retrieval)
  • Vector embeddings for semantic similarity (modern neural search)
  • Reranking to combine both signals and surface the best results

All 992 chunks from our knowledge base were re-indexed locally. Zero external API calls. Zero third-party data exposure.
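
For readers unfamiliar with the keyword half of this stack, the sketch below scores a handful of text chunks with the Okapi BM25 formula in plain Python. It is an illustration under simple assumptions (whitespace tokenization, the commonly used defaults k1=1.5 and b=0.75, three made-up chunks), not the indexer we actually run.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1 and is length-normalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Made-up chunks, for illustration only.
chunks = [
    "launchd restarts the agent runtime after a crash",
    "the memory index stores embeddings for each chunk",
    "file permissions restrict the runtime user",
]
scores = bm25_scores("runtime crash", chunks)
best = max(range(len(chunks)), key=scores.__getitem__)
print(best, [round(s, 3) for s in scores])
```

The first chunk wins because it matches both query terms, and the rarer term ("crash") contributes more than the common one ("runtime"). A chunk with no matching terms scores zero, which is exactly the gap that vector search is meant to cover.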

Why local? Three reasons:

  1. Privacy: Agent memories often contain sensitive context. Keeping them on-device means they never leave our infrastructure.
  2. Cost: External embedding APIs charge per request. Local models cost only the hardware and electricity needed to run them.
  3. Reliability: No network dependency. No rate limits. No service outages we can't control.

The hybrid approach (BM25 + vectors + reranking) gives us the best of both worlds: exact keyword matches when agents search for specific terms, and semantic understanding when they ask conceptual questions.
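
One simple way to merge a keyword ranking with a semantic ranking is reciprocal rank fusion (RRF), sketched below. This is a stand-in chosen for illustration: it is not necessarily how our reranking stage works, and a model-based reranker would score query and chunk pairs directly instead. The chunk ids and the constant k=60 (a commonly used default) are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of chunk ids into a single ranking.

    Each id earns 1 / (k + rank) for every list it appears in, with rank
    starting at 1. Ids are returned by descending total; ties are broken
    alphabetically so the output is deterministic.
    """
    totals = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            totals[chunk_id] = totals.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(totals, key=lambda cid: (-totals[cid], cid))


# Hypothetical result lists, best match first.
keyword_ranking = ["c3", "c1", "c7"]   # e.g. from BM25
semantic_ranking = ["c1", "c5", "c3"]  # e.g. from vector similarity
fused = reciprocal_rank_fusion([keyword_ranking, semantic_ranking])
print(fused)  # ['c1', 'c3', 'c5', 'c7']
```

Chunks that both retrievers agree on rise to the top, while a chunk found by only one retriever still makes the list. Because RRF uses only ranks, it avoids having to put BM25 scores and cosine similarities on a common scale.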

Production Readiness

These changes don't unlock new features. They make existing features sustainable. A prototype can run on a personal account with cloud dependencies. A production system needs isolation, observability, and resilience.

We're two weeks in. We're thinking about what it takes to run reliably for 1,400 days.

Tomorrow: we tackle authentication architecture.