The Always-On Hive: Orchestrating Persistent Agents via Tmux and SSH

From Chatbot to Workforce

Most people interact with LLMs through a browser tab. They type, they wait, they read. When they close the tab, the context vanishes. This is the “user” mindset.

To truly leverage multi-agent architectures, you need to shift to the “operator” mindset. You aren’t just chatting; you are managing a facility. Recently, I transitioned my workflow to run about 20 distinct Claude agents continuously. They don’t live in Chrome; they live in a headless Linux server, organized in tmux windows, accessible from anywhere.

The Architecture of the Hive

Before discussing the how, let’s look at what these agents are doing. As I explored in previous experiments, I rely on network architectures rather than single prompts.

1. The Creative Loop (Windows 1-2)

I have two agents, “Legolas” and “Melville,” engaged in an infinite poetry duel. They run on a loop: * Agent A (Legolas): Generates verse based on ethereal, woodland prompts. * Agent B (Melville): critiques and retorts with nautical, brooding prose. * The Infrastructure: They run in a detached tmux session. I don’t need to trigger them; I just check the logs to see what they’ve created.

2. The Supreme Court (Windows 3-8)

This is my “LLM-as-a-Judge” implementation. * The Judge: A high-reasoning model (like Claude 3.5 Sonnet or GPT-4o) acting as the orchestrator. * The Workers: Five smaller models (Llama 3, Haiku, etc.). * The Job: I dump complex queries into a queue. The workers vote, the judge synthesizes a majority opinion, and the result is written to a file. This runs asynchronously, 24/7.

The Operational Workflow: SSH and Tmux

Why use the terminal? Why tmux?

When you are coordinating 20 agents, you cannot rely on a browser that crashes or a laptop that goes to sleep. You need persistence.

The “Always-On” Advantage

tmux (terminal multiplexer) allows sessions to run on a server even when you disconnect. My agents are running Python scripts in infinite loops or listening for file system changes. 1. Start the work: I SSH into my server from my desktop. 2. Detach: I disconnect the session. The agents keep running. 3. Check-in: Later, while at a coffee shop, I SSH in from my phone using Termius or Blink. I re-attach to the session, check the output of the “Judge,” or see the latest poem from “Legolas,” and then disconnect.

The work never stops just because I closed my laptop.

The Tech Stack for Agility

You do not need to be a DevOps engineer to set this up. You need two things:

1. OpenRouter

Running 20 agents on separate API keys (OpenAI, Anthropic, Google) is a billing nightmare. I use OpenRouter. It offers a unified API. Whether my “Judge” is using Claude and my “Workers” are using Mistral, it is all one endpoint and one monthly bill. It simplifies the code significantly.

2. AI Coding Assistants

Don’t get bogged down in the theory of agent frameworks. You will see endless videos about LangChain or LangGraph. * Ignore the noise. * Use the AI. I simply ask my AI coding assistant: “Write a Python script that monitors a text file for new questions. When a question is added, send it to OpenRouter using the Claude 3.5 Sonnet model and append the answer to a log file.”

The AI writes the plumbing. I deploy it to a tmux window. Repeat.

Conclusion

We are moving past the era of the “chat.” We are entering the era of the “system.” By combining the reliability of old-school Linux tools with the intelligence of modern LLMs, you can build a personal operations center that works for you, even when you are asleep.

--- title: "The Always-On Hive: Orchestrating Persistent Agents via Tmux and SSH" author: "Bulent Soykan" date: "2026-01-19" categories: ["Workflow", "Multi-Agent Systems", "Linux", "AI Engineering"] description: "How to move beyond the browser and manage a network of persistent, always-running AI agents using simple terminal tools like tmux and SSH." image: "tmux-agent-hive.png" --- ## From Chatbot to Workforce Most people interact with LLMs through a browser tab. They type, they wait, they read. When they close the tab, the context vanishes. This is the "user" mindset. To truly leverage multi-agent architectures, you need to shift to the "operator" mindset. You aren't just chatting; you are managing a facility. Recently, I transitioned my workflow to run about 20 distinct Claude agents continuously. They don't live in Chrome; they live in a headless Linux server, organized in `tmux` windows, accessible from anywhere. ## The Architecture of the Hive Before discussing the *how*, let’s look at *what* these agents are doing. As I explored in previous experiments, I rely on network architectures rather than single prompts. ### 1. The Creative Loop (Windows 1-2) I have two agents, "Legolas" and "Melville," engaged in an infinite poetry duel. They run on a loop: * **Agent A (Legolas):** Generates verse based on ethereal, woodland prompts. * **Agent B (Melville):** critiques and retorts with nautical, brooding prose. * **The Infrastructure:** They run in a detached `tmux` session. I don't need to trigger them; I just check the logs to see what they've created. ### 2. The Supreme Court (Windows 3-8) This is my "LLM-as-a-Judge" implementation. * **The Judge:** A high-reasoning model (like Claude 3.5 Sonnet or GPT-4o) acting as the orchestrator. * **The Workers:** Five smaller models (Llama 3, Haiku, etc.). * **The Job:** I dump complex queries into a queue. The workers vote, the judge synthesizes a majority opinion, and the result is written to a file. This runs asynchronously, 24/7. ## The Operational Workflow: SSH and Tmux Why use the terminal? Why `tmux`? When you are coordinating 20 agents, you cannot rely on a browser that crashes or a laptop that goes to sleep. You need persistence. ### The "Always-On" Advantage `tmux` (terminal multiplexer) allows sessions to run on a server even when you disconnect. My agents are running Python scripts in infinite loops or listening for file system changes. 1. **Start the work:** I SSH into my server from my desktop. 2. **Detach:** I disconnect the session. The agents keep running. 3. **Check-in:** Later, while at a coffee shop, I SSH in from my phone using Termius or Blink. I re-attach to the session, check the output of the "Judge," or see the latest poem from "Legolas," and then disconnect. The work never stops just because I closed my laptop. ## The Tech Stack for Agility You do not need to be a DevOps engineer to set this up. You need two things: ### 1. OpenRouter Running 20 agents on separate API keys (OpenAI, Anthropic, Google) is a billing nightmare. I use [OpenRouter](https://openrouter.ai). It offers a unified API. Whether my "Judge" is using Claude and my "Workers" are using Mistral, it is all one endpoint and one monthly bill. It simplifies the code significantly. ### 2. AI Coding Assistants Don't get bogged down in the theory of agent frameworks. You will see endless videos about LangChain or LangGraph. * **Ignore the noise.** * **Use the AI.** I simply ask my AI coding assistant: *"Write a Python script that monitors a text file for new questions. When a question is added, send it to OpenRouter using the Claude 3.5 Sonnet model and append the answer to a log file."* The AI writes the plumbing. I deploy it to a `tmux` window. Repeat. ## Conclusion We are moving past the era of the "chat." We are entering the era of the "system." By combining the reliability of old-school Linux tools with the intelligence of modern LLMs, you can build a personal operations center that works for you, even when you are asleep.