
Introduction

The Dria Compute Node (dria-node) is a single Rust binary that lets you serve AI models on the Dria network and earn rewards. It runs models locally using llama.cpp — no Ollama or external dependencies required.
  • Single binary — no Docker, no Ollama, no complex setup.
  • Built-in model management — downloads, caches, and benchmarks GGUF models automatically.
  • GPU acceleration — supports Apple Metal, NVIDIA CUDA, and AMD ROCm out of the box.
  • Auto-updates — checks for new versions on startup and updates itself.
  • Reconnection — automatically reconnects with exponential backoff if the connection drops.
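Exponential backoff means each failed reconnect attempt waits twice as long as the previous one. A minimal sketch of the idea (the attempt count and delay schedule here are illustrative assumptions, not the node's actual values):

```shell
# Illustrative exponential backoff: double the wait after each failed attempt.
delay=1
for attempt in 1 2 3 4 5; do
  echo "connection lost: retry $attempt in ${delay}s"
  delay=$((delay * 2))
done
```

This keeps retry traffic low during extended outages while still reconnecting quickly after a brief blip.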

Installation

brew install firstbatchxyz/dkn/dria-node

Setup

Run the interactive setup wizard to select and download a model:
dria-node setup
The wizard will:
  1. Detect your available RAM and filter models that fit your system.
  2. Let you pick a model from the supported list.
  3. Download the GGUF model file from HuggingFace.
  4. Run a test inference and print your benchmark TPS.
Models are cached in ~/.dria/models/. You only need to download each model once.
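To see what is already cached, you can list the models directory. This sketch assumes the default data directory layout described above, honoring the DRIA_DATA_DIR override from the Configuration section:

```shell
# List cached GGUF models (falls back to a message if nothing is cached yet).
DATA_DIR="${DRIA_DATA_DIR:-$HOME/.dria}"
ls "$DATA_DIR/models" 2>/dev/null || echo "no models cached yet"
```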

Starting Your Node

Once setup is complete, start your node:
dria-node start --wallet <YOUR_PRIVATE_KEY> --model <MODEL_NAME>
For example:
dria-node start --wallet abc123...def --model qwen3.5:9b
You can also serve multiple models by separating them with commas:
dria-node start --wallet abc123...def --model qwen3.5:9b,lfm2.5:1.2b
Your wallet is an Ethereum-compatible private key (64 hex characters). This is used for node identity, authentication, and reward tracking. Never share your private key.
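Since the key must be exactly 64 hex characters, a quick pre-flight check can catch a malformed or truncated key before starting the node. This is a hypothetical helper, not part of dria-node; the KEY value below is a placeholder, not a real key:

```shell
# Hypothetical pre-flight check: a wallet key must be exactly 64 hex characters (32 bytes).
KEY="0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"  # placeholder
if printf '%s' "$KEY" | grep -Eq '^[0-9a-fA-F]{64}$'; then
  KEY_OK=yes
else
  KEY_OK=no
fi
echo "wallet key valid: $KEY_OK"
```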
Stop the node with CTRL+C (Control+C on all platforms, including macOS). The node will gracefully drain in-flight tasks before shutting down.

Configuration

All flags can also be set via environment variables:
Flag               Env Var               Default             Description
--wallet           DRIA_WALLET           (required)          Ethereum private key (hex, 32 bytes)
--model            DRIA_MODELS           (required)          Model(s) to serve, comma-separated
--gpu-layers       DRIA_GPU_LAYERS       0 (CPU only)        GPU layers to offload (-1 = all)
--max-concurrent   DRIA_MAX_CONCURRENT   1                   Max parallel inference tasks
--data-dir         DRIA_DATA_DIR         ~/.dria             Directory for cached models
--quant            DRIA_QUANT            Per-model default   Override GGUF quantization (e.g. Q8_0)
--context-size     DRIA_CONTEXT_SIZE     Model's native      Max context window (tokens)
--kv-quant         DRIA_KV_QUANT         q8_0                KV cache quantization (f16, f32, q8_0, q4_0, q4_1, q5_0, q5_1)
--router-url       DRIA_ROUTER_URL       quic.dria.co:4001   Router URL
--skip-update      DRIA_SKIP_UPDATE      false               Skip auto-update check on startup
Set RUST_LOG=debug for verbose logging during troubleshooting.
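For example, the flag-based start command from earlier can be expressed entirely through environment variables (values below are placeholders):

```shell
# Equivalent env-var configuration; flags are then unnecessary.
export DRIA_WALLET="<YOUR_PRIVATE_KEY>"
export DRIA_MODELS="qwen3.5:9b"
export DRIA_GPU_LAYERS=-1
dria-node start
```

This form is convenient for systemd units, containers, or CI, where configuration lives outside the command line.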

GPU Acceleration

dria-node supports GPU acceleration for faster inference:
  • Apple Metal — Enabled automatically on macOS with Apple Silicon.
  • NVIDIA CUDA — Use the CUDA build or --features cuda when building from source.
  • AMD ROCm — Use the ROCm install script or --features rocm when building from source.
To offload all model layers to GPU:
dria-node start --wallet <KEY> --model qwen3.5:9b --gpu-layers -1

System Requirements

  • OS: macOS (Intel + Apple Silicon), Linux (x86_64, arm64), Windows (x86_64)
  • RAM: Minimum ~1 GB (for smallest model) to ~27 GB (for largest model) — see Selecting Models for details
  • Disk: Space for GGUF model files (0.5 GB to 24.5 GB depending on model)
  • Network: Outbound UDP port 4001 (QUIC connection to router)
  • GPU (optional): Apple Metal, NVIDIA CUDA, or AMD ROCm 6.x

How It Works

Your node connects to the Dria router network via QUIC (a fast, encrypted UDP-based protocol). Here’s what happens:
  1. Authentication — The router sends a random challenge. Your node signs it with your private key to prove identity.
  2. Registration — Your node announces which model(s) it can serve.
  3. Task assignment — The router forwards inference requests from users to your node based on model availability and capacity.
  4. Inference — Your node runs the model locally and streams results back.
  5. Backpressure — If your node is at capacity, it rejects new tasks and the router re-routes to another node.
The node supports text, vision (image), and audio inference depending on the model. See Selecting Models for model capabilities.
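The backpressure decision in step 5 amounts to a simple capacity check. A minimal sketch (the values are illustrative assumptions, and the real node tracks in-flight tasks internally):

```shell
# Backpressure sketch: reject new work once in-flight tasks reach --max-concurrent.
MAX_CONCURRENT=2   # illustrative value of --max-concurrent
IN_FLIGHT=2        # tasks currently running
if [ "$IN_FLIGHT" -ge "$MAX_CONCURRENT" ]; then
  DECISION="reject"   # router re-routes the task to another node
else
  DECISION="accept"
fi
echo "$DECISION"
```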

Running Multiple Nodes

You can run multiple nodes on the same machine or network, but each node must use a unique private key (wallet). Using the same key for multiple nodes will cause conflicts.
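A sketch of two nodes on one machine, each with its own key (KEY_A and KEY_B are placeholders for two distinct private keys):

```shell
# Each node gets a unique wallet; reusing a key across nodes causes conflicts.
dria-node start --wallet "$KEY_A" --model qwen3.5:9b &
dria-node start --wallet "$KEY_B" --model lfm2.5:1.2b &
wait
```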