AI and automation are two terms we’ve all heard plenty of over the past few years. Tools like Claude, ChatGPT, and agentic automation platforms like OpenClaw are everywhere right now, but some of us prefer to self-host our own stack. I’ve been experimenting with different setups to build an AI automation stack, and the best combination I’ve landed on includes n8n, Dify, and Ollama. n8n is already my favorite automation tool, and when you throw it into the mix with the other two, you get an AI automation stack that can rival cloud-hosted alternatives.
The stack that gets AI automation right
It separates three things that usually get mixed up
This stack works well because it separates three things that usually get mixed up in self-hosted AI setups. First is connecting different apps and systems, second is building LLM apps and RAG workflows, and third is running models locally for privacy.
n8n fits into the first layer. It’s built for integrations, with a large set of connectors and solid control over triggers, retries, and branching. It also scales well with queue mode using Redis and multiple workers, so things don’t fall apart as usage grows. Dify handles the second layer as the LLM app platform. It focuses on agentic workflows, RAG pipelines, and deployment. You can expose what you build as APIs, which makes it easy for n8n to plug into.
Ollama sits at the bottom as the model layer. It lets you run open models locally through a REST API and supports OpenAI-compatible endpoints. That makes it easy to connect with tools that already expect that format. All three tools work well together because they follow similar patterns. They use HTTP APIs, run well in containers, and avoid tight coupling. You can swap parts later without rebuilding everything.
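As a quick illustration of that OpenAI-compatible surface, here’s a minimal sketch of what a call to Ollama looks like. It assumes Ollama is on its default port (11434) and that a model called llama3 has been pulled; both are assumptions you’d adjust for your setup.

```python
import json
from urllib import request

# Assumptions: Ollama is running on its default port (11434) and a model
# named "llama3" has already been pulled; swap in whatever you actually use.
OLLAMA_OPENAI_URL = "http://localhost:11434/v1/chat/completions"

def build_chat_request(model: str, user_message: str) -> request.Request:
    """Build an OpenAI-style chat completion request aimed at Ollama."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return request.Request(
        OLLAMA_OPENAI_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# request.urlopen(build_chat_request("llama3", "...")) would send it; any
# OpenAI-compatible client pointed at Ollama's /v1 path works the same way.
```

Because the request shape matches OpenAI’s, tools that already speak that format can be pointed at Ollama by changing only the base URL.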
The main reason this setup works is that each tool sticks to what it does best: n8n handles automation, Dify handles LLM apps, and Ollama handles inference. The privacy argument is also straightforward. Ollama runs locally and binds its API to localhost by default, so conversation data never leaves your machine. Pair that with self-hosted Dify and n8n, and your prompts, documents, and workflow data remain within your infrastructure. That said, data safety still depends on how you configure outbound access for plugins, connectors, and webhooks.
This stack integrates flawlessly
Docker is the easiest way to run this stack
There are a few easy ways to connect these tools depending on where you want logic to live. The most common setup is n8n calling Dify as an AI layer, with Dify handling prompts, memory, and RAG, and then calling Ollama for inference. This keeps automation and AI logic separated.
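In n8n terms, that usually means an HTTP Request node hitting a Dify app endpoint. Here’s a rough sketch of the call it would make. The base URL and API key are placeholders, and while the endpoint and field names follow Dify’s app API, you should verify them against your instance’s API docs.

```python
import json
from urllib import request

# Placeholders: the base URL depends on your reverse proxy, and "app-xxxx"
# stands in for a real Dify app API key.
DIFY_URL = "http://dify.internal/v1/chat-messages"
DIFY_API_KEY = "app-xxxx"

def build_dify_request(query: str, user_id: str) -> request.Request:
    """The kind of call an n8n HTTP Request node would make to a Dify app."""
    payload = {
        "inputs": {},
        "query": query,
        "user": user_id,              # Dify tracks conversations per user id
        "response_mode": "blocking",  # wait for the full answer, no streaming
    }
    return request.Request(
        DIFY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {DIFY_API_KEY}",
        },
        method="POST",
    )
```

Everything AI-related (prompt, memory, retrieval) stays on the Dify side; n8n only sends a query and reads back the answer.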
If you want to keep things simple, n8n can call Ollama directly for tasks like classification or extraction, and only use Dify for more complex workflows. This avoids unnecessary overhead while still keeping Dify for cases that need context or reasoning.
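For one of those simple direct calls, a classification request to Ollama’s native API might look like this. The model name and label set are illustrative; the `format: "json"` option is Ollama’s way of constraining output to valid JSON, which makes the result easy to branch on in n8n.

```python
import json
from urllib import request

# Direct n8n -> Ollama call, no Dify in the middle. Assumes Ollama on its
# default port; the model name and label set are illustrative.
def build_classify_request(text: str) -> request.Request:
    prompt = (
        "Classify the following message as one of: bug, feature, question. "
        'Respond with JSON like {"label": "..."}.\n\n' + text
    )
    payload = {
        "model": "llama3",
        "prompt": prompt,
        "format": "json",  # ask Ollama to constrain output to valid JSON
        "stream": False,   # return one response instead of streamed chunks
    }
    return request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```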
You can also flip the flow, where Dify acts as the main workflow layer and calls n8n via webhooks when it needs external integrations. This works well if most of your logic sits inside Dify, but you still want to reuse n8n’s integrations as tools.
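The reverse direction is just an HTTP POST to an n8n webhook. A sketch, with a hypothetical webhook path (n8n generates the real URL when you add a Webhook trigger node to a workflow):

```python
import json
from urllib import request

# The webhook path "notify-team" is hypothetical; n8n generates the real
# URL when you add a Webhook trigger node to a workflow.
N8N_WEBHOOK_URL = "http://n8n.internal/webhook/notify-team"

def build_webhook_request(summary: str, channel: str) -> request.Request:
    """What a Dify tool or HTTP node would send to hand work back to n8n."""
    payload = {"summary": summary, "channel": channel}
    return request.Request(
        N8N_WEBHOOK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```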
To reduce setup effort, there are existing connectors and patterns across the ecosystem. n8n already supports local model setups with Ollama, community nodes exist for Dify integration, and Dify itself supports Ollama through plugins, which cuts down on custom wiring.
On the deployment side, everything typically sits behind a reverse proxy that handles access and routing. n8n and Dify expose APIs through it, Dify talks to Ollama internally, and databases, Redis, and vector stores sit in a separate data layer to keep things isolated.
Docker is the easiest way to run this stack. Dify runs as multiple services, including API, workers, and supporting components, while n8n can start simple and scale with queue mode. Ollama is the most lightweight, usually running as a single service with optional GPU support. It’s worth noting that n8n needs its data directory or database backed up to retain workflows and state, while Dify requires backing up its database, storage, and vector data.
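To make that concrete, here’s a minimal compose sketch for the n8n and Ollama pieces. Service and volume names are assumptions, and Dify ships its own multi-service compose file, so it’s easiest to run that alongside rather than fold it in here.

```yaml
# A minimal sketch, not a production config. Dify runs from its own compose
# file (api, worker, web, db, redis, vector store) on the same Docker network.
services:
  n8n:
    image: n8nio/n8n
    ports: ["5678:5678"]
    volumes:
      - n8n_data:/home/node/.n8n   # back this up to keep workflows and state

  ollama:
    image: ollama/ollama
    ports: ["11434:11434"]
    volumes:
      - ollama_models:/root/.ollama
    # For GPU support, add a deploy.resources.reservations.devices section.

volumes:
  n8n_data:
  ollama_models:
```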
While this stack gets everything right, one thing you need to keep in mind is the hardware. To run even basic models through Ollama locally, you’ll want at least 16 GB of RAM, and ideally a GPU. Resource usage also depends on how you use each layer. n8n and Dify scale based on workflow complexity and data, while Ollama’s requirements are driven by model size and configuration, which is where most of the compute cost usually sits.
