How hali works
A guided tour through the machinery. From three words in a terminal to a globally discoverable torrent entry.
The three modes of hali
hali operates in three layers that compose together:
Layer 1 works 100% offline. Layer 2 requires a LAN with multicast enabled. Layer 3 requires internet access to the public registry. Each layer is independent — if one fails, the others keep running.
Life of a pull
When you type hali pull mistral, here's what actually happens:
Step by step
1. Parse the input. hali pull accepts three formats:
- A bare search query (mistral): runs a HF search and lets you pick interactively
- A HuggingFace repo path (TheBloke/Mistral-7B-Instruct-v0.2-GGUF): goes straight to that repo
- A canonical model ID (mistral:7b:instruct:q4_k_m): skips all prompts
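The three input formats can be told apart by shape alone. The sketch below is one way to do that; the function name classify_pull_input and both regular expressions are assumptions made for illustration, not hali's actual parsing rules.

```python
import re

# Hypothetical patterns: four colon-separated parts, or exactly one slash.
CANONICAL_ID = re.compile(r"^[^:/\s]+:[^:/\s]+:[^:/\s]+:[^:/\s]+$")
REPO_PATH = re.compile(r"^[^/\s:]+/[^/\s:]+$")


def classify_pull_input(arg: str) -> str:
    """Return 'model_id', 'repo_path', or 'search_query' for a pull argument."""
    arg = arg.strip()
    if CANONICAL_ID.match(arg):
        return "model_id"       # e.g. mistral:7b:instruct:q4_k_m
    if REPO_PATH.match(arg):
        return "repo_path"      # e.g. TheBloke/Mistral-7B-Instruct-v0.2-GGUF
    return "search_query"       # anything else is treated as a search


if __name__ == "__main__":
    for example in ("mistral",
                    "TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
                    "mistral:7b:instruct:q4_k_m"):
        print(example, "->", classify_pull_input(example))
```

Checking the most specific format first keeps a canonical ID from being mistaken for a search query.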
2. Search HuggingFace (if needed). The search endpoint returns GGUF models sorted by download count. You pick a repo, then pick a quantization from the list (sorted smallest-first).
3. Derive the model ID. From the repo name and file name, hali computes the canonical four-part ID: base:size:variant:quant. This is the identity the model lives under forever. Together with the HF revision hash, it uniquely identifies this exact version.
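As a rough illustration of the derivation, the sketch below turns the example repo and a GGUF file name into a four-part ID. The function derive_model_id, the regular expressions, and the "base" fallback for a missing variant are all assumptions; hali's real normalization rules are not documented here and may differ.

```python
import re


def derive_model_id(repo: str, filename: str) -> str:
    """Hypothetical base:size:variant:quant derivation from repo and file name."""
    repo_name = repo.split("/")[-1].lower()
    # Assumption: drop a trailing "-gguf" marker and version tokens like "-v0.2".
    repo_name = re.sub(r"-gguf$", "", repo_name)
    repo_name = re.sub(r"-v\d+(\.\d+)*", "", repo_name)

    m = re.match(
        r"^(?P<base>[a-z0-9]+)-(?P<size>\d+(?:\.\d+)?[bm])(?:-(?P<variant>.+))?$",
        repo_name,
    )
    if m is None:
        raise ValueError(f"cannot parse repo name: {repo!r}")

    # Assumption: the quantization tag is the last dotted or dashed token
    # before ".gguf", for example "Q4_K_M".
    q = re.search(r"[.\-](?P<quant>i?q\d\w*|bf16|f16|f32)\.gguf$", filename.lower())
    if q is None:
        raise ValueError(f"cannot parse quantization from: {filename!r}")

    variant = m.group("variant") or "base"  # placeholder when no variant is present
    return ":".join([m.group("base"), m.group("size"), variant, q.group("quant")])


if __name__ == "__main__":
    print(derive_model_id("TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
                          "mistral-7b-instruct-v0.2.Q4_K_M.gguf"))
```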
4. Check the cache. Before touching the network, hali checks if the model is already on disk. If it is, you're done in under a second.
5. Ask the neighbors. If the daemon is running, hali asks: "Does anyone on the LAN already have this model?" The daemon checks its in-memory index, built from UDP multicast announcements on 239.192.42.1:4269. If a match is found, the download happens over the LAN via BitTorrent, with peers located through Local Service Discovery (LSD). No internet traffic, and transfers run at local network speed.
6. Download from HuggingFace (if no LAN peer). A plain HTTP GET from HuggingFace's CDN. hali can optionally hash pieces in parallel (streaming_hash: true) to eliminate a second read pass later. Progress is displayed in the terminal every 150ms.
7. Write the receipt. metadata.json records the model ID, source repo, revision hash, file size, and download timestamp. The cache is self-documenting.
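A receipt of this kind might be written as follows. This is a sketch only: the key names (model_id, source_repo, revision, size_bytes, downloaded_at) and the helper write_receipt are assumed for illustration, and the real metadata.json schema may use different names.

```python
import json
import time
from pathlib import Path


def write_receipt(model_dir: Path, model_id: str, source_repo: str,
                  revision: str, size_bytes: int) -> Path:
    """Write a metadata.json receipt next to the model file (hypothetical schema)."""
    receipt = {
        "model_id": model_id,
        "source_repo": source_repo,
        "revision": revision,
        "size_bytes": size_bytes,
        "downloaded_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }
    path = model_dir / "metadata.json"
    tmp = model_dir / "metadata.json.tmp"
    # Write to a temporary file first, then rename, so a crash mid-write
    # never leaves a half-written receipt behind.
    tmp.write_text(json.dumps(receipt, indent=2) + "\n", encoding="utf-8")
    tmp.replace(path)
    return path
```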
8. Wake the daemon. If the daemon isn't running, hali launches one automatically. It spawns as a background process, detaches from your terminal, and opens ports on loopback + a random torrent port.
9. Hash and seal. The daemon reads the complete file in 16 MiB chunks, computes SHA-1 hashes for each piece, and derives the torrent infohash. If streaming_hash was enabled during download, the hashes arrive precomputed — zero additional I/O. The daemon then builds a .torrent file and a magnet URI.
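The piece-hashing step can be sketched in a way that covers both paths: the same function hashes chunks as they arrive during a download or as they are read back from disk. The names piece_hashes and read_chunks are illustrative, not hali's internals.

```python
import hashlib
from typing import Iterable, Iterator, List

PIECE_LENGTH = 16 * 1024 * 1024  # 16 MiB, the fixed piece length stated above


def piece_hashes(chunks: Iterable[bytes],
                 piece_length: int = PIECE_LENGTH) -> List[bytes]:
    """SHA-1 every fixed-size piece of a byte stream delivered in arbitrary chunks."""
    hashes: List[bytes] = []
    hasher = hashlib.sha1()
    filled = 0  # bytes fed into the current piece so far
    for chunk in chunks:
        view = memoryview(chunk)
        while len(view) > 0:
            take = min(piece_length - filled, len(view))
            hasher.update(view[:take])
            filled += take
            view = view[take:]
            if filled == piece_length:  # piece complete: emit and start a new one
                hashes.append(hasher.digest())
                hasher = hashlib.sha1()
                filled = 0
    if filled > 0:  # the final piece is usually shorter than piece_length
        hashes.append(hasher.digest())
    return hashes


def read_chunks(path: str, size: int = 1024 * 1024) -> Iterator[bytes]:
    """Yield a file's contents in blocks, for hashing a completed download."""
    with open(path, "rb") as f:
        while True:
            block = f.read(size)
            if not block:
                return
            yield block
```

Because the result depends only on the byte stream and the piece length, hashing during the download and hashing the finished file give the same list of digests.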
10. Submit to the registry. The daemon creates a signed manifest (Ed25519) and submits it to hali.network. Your model is now globally discoverable.
11. Broadcast to the LAN. On a jittered 25-40 second interval, the daemon announces model availability via UDP multicast. Every other machine on the LAN hears it and updates its in-memory index. LAN peers can now discover and download from you at local speed.
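The announce loop and the listener's index might look like the sketch below. It assumes the delay is drawn uniformly from the 25-40 second window and that the index is keyed by model ID and revision; the names Announcement, LanIndex, and next_announce_delay are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


def next_announce_delay(rng: random.Random) -> float:
    """Pick the next delay from the 25-40 second window (uniform is an assumption)."""
    return rng.uniform(25.0, 40.0)


@dataclass(frozen=True)
class Announcement:
    node_id: str
    model_id: str
    infohash: str
    revision: str


class LanIndex:
    """In-memory map from (model_id, revision) to the last announcement heard."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], Announcement] = {}

    def observe(self, ann: Announcement) -> None:
        # Called for every verified announcement received from the LAN.
        self._entries[(ann.model_id, ann.revision)] = ann

    def lookup(self, model_id: str, revision: str) -> Optional[str]:
        # Returns the announced infohash, or None if no peer has announced it.
        ann = self._entries.get((model_id, revision))
        return ann.infohash if ann is not None else None
```

Jitter keeps many daemons that started at the same moment from announcing in lockstep.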
Deterministic torrent generation
This is the trick that makes cross-seeding work:
Two people who independently download the same model version (same HF repo + same file + same revision) produce identical torrent infohashes. Their pieces are interchangeable. A third person can download from both simultaneously.
How it works:
- Piece length: fixed at 16 MiB
- Torrent comment: a compact JSON identity card (model_id, revision, format, source)
- private flag: set to 1 (keep DHT/PEX off for LAN mode)
- CreatedBy: bt/1
As long as everyone follows the same recipe, the hashes match.
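To see why a fixed recipe gives matching hashes, recall that a BitTorrent v1 infohash is the SHA-1 of the bencoded info dictionary. The sketch below builds a single-file info dictionary with a fixed piece length and private flag, then hashes it. The helpers bencode and infohash are illustrative and make no claim about hali's own code. In the standard v1 layout the comment and created-by fields sit outside the info dictionary, so they are left out here.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencoder for ints, bytes, str, lists, and dicts."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])  # keys must appear in sorted byte order
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def infohash(name: str, length: int, pieces: bytes,
             piece_length: int = 16 * 1024 * 1024) -> str:
    """SHA-1 of the bencoded info dictionary for a single-file torrent."""
    info = {
        "length": length,
        "name": name,
        "piece length": piece_length,
        "pieces": pieces,   # concatenated 20-byte SHA-1 piece digests
        "private": 1,
    }
    return hashlib.sha1(bencode(info)).hexdigest()
```

Two machines that feed in the same file name, length, and piece digests get the same infohash. Changing any one input, such as the piece length, produces a different swarm.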
LAN protocol
Announcement format (UDP multicast, every ~30s with jitter):
"I am node a3f29b.... I have mistral:7b:instruct:q4_k_m, infohash def789..., revision abc123...."
This packet is signed with HMAC-SHA256 using a shared secret stored in <DataDir>/lan.secret (auto-generated on first daemon run). Packets with missing or wrong signatures are silently dropped. A rogue machine on your network cannot inject fake announcements.
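Signing and checking an announcement with HMAC-SHA256 could be sketched as below. The wire layout (a 32-byte tag followed by a JSON payload) and the function names are assumptions for illustration; only the choice of HMAC-SHA256 with a shared secret comes from the description above.

```python
import hashlib
import hmac
import json
from typing import Optional

TAG_LEN = 32  # size of an HMAC-SHA256 digest in bytes


def sign_announcement(fields: dict, secret: bytes) -> bytes:
    """Serialize the fields and prepend an HMAC-SHA256 tag (hypothetical layout)."""
    payload = json.dumps(fields, sort_keys=True, separators=(",", ":")).encode("utf-8")
    tag = hmac.new(secret, payload, hashlib.sha256).digest()
    return tag + payload


def verify_announcement(packet: bytes, secret: bytes) -> Optional[dict]:
    """Return the decoded fields, or None if the packet should be dropped."""
    if len(packet) <= TAG_LEN:
        return None  # missing signature or empty payload
    tag, payload = packet[:TAG_LEN], packet[TAG_LEN:]
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        return None  # wrong signature
    return json.loads(payload)
```

Returning None instead of raising mirrors the silent-drop behavior: a listener simply ignores anything that does not verify.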
Discovery: LAN announcements cover the what — they tell you an infohash exists. The where (IP address + port) is discovered via BitTorrent's built-in Local Service Discovery (LSD), which broadcasts infohash queries and receives peer addresses in response.
Fallback: If multicast is unavailable (e.g., blocked by router policy), the daemon falls back to directed subnet broadcast on UDP 4269. If that fails too, LAN mode degrades silently and downloads go directly to HuggingFace.
Trust scoring
The public registry (hali.network) computes a trust score (0.0–1.0) for every model. This isn't a popularity contest — it's a quality signal that helps users identify reliable models in search results.
The score combines format validity, publisher reputation, download success rate, and model age. Models below the visibility threshold are hidden from search results entirely. New models enter at a moderate score and rise or fall as the network learns more about them.
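One plausible way to combine such signals is a clamped weighted sum, sketched below. Everything numeric here is invented for illustration: the weights, the threshold, and the assumption that each signal is already normalized to 0.0-1.0. The registry's actual formula and values are not stated in this document.

```python
# Illustrative weights only; they are NOT the registry's real values.
WEIGHTS = {
    "format_validity": 0.4,
    "publisher_reputation": 0.3,
    "download_success": 0.2,
    "age": 0.1,
}
VISIBILITY_THRESHOLD = 0.3  # hypothetical cutoff for appearing in search


def clamp(x: float) -> float:
    """Limit a value to the 0.0-1.0 range."""
    return min(1.0, max(0.0, x))


def trust_score(signals: dict) -> float:
    """Weighted sum of normalized signals, clamped to 0.0-1.0."""
    total = sum(weight * clamp(signals[name]) for name, weight in WEIGHTS.items())
    return clamp(total)


def is_visible(signals: dict) -> bool:
    """True when the score reaches the (assumed) visibility threshold."""
    return trust_score(signals) >= VISIBILITY_THRESHOLD
```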
Ed25519 signing
All submissions to the registry are signed with your Ed25519 private key. This proves authorship without requiring an account or password. The keypair is generated automatically on first run and stored locally — the registry never sees your private key.
Architecture boundaries
The three repos are independent:
- hali-client — your local tool. Works offline. Needs nothing else.
- hali-backend — the registry API. Stores manifests, verifies signatures, computes scores.
- hali-frontend — the public search UI at hali.network.
Each can fail without taking down the others.
Next steps
- Quick Start — your first model in 60 seconds
- Everyday Use — common workflows
- LAN Sharing — set up a team swarm
- Commands Reference — every CLI command documented