Everyday Use

Common workflows you'll use every day with hali.

Why hali instead of downloading from Hugging Face directly?

A direct HF download ends with you. A hali download keeps going.

Feature                               Direct from Hugging Face   With hali
Gets the file                         ✓                          ✓
Published to public registry          ✗                          ✓ automatically
Shared to LAN peers                   ✗                          ✓ autodiscovered
Others can download via BitTorrent    ✗                          ✓ immediately
You're credited as publisher          ✗                          ✓ signed with your key

What happens when you run hali pull:

1. The GGUF is downloaded from Hugging Face (or from a LAN peer who already has it).
2. hali seeds it to your local network. LAN peers discover it automatically and can pull it at full LAN speed with no Hugging Face traffic.
3. The model is published to the hali registry, signed with your Ed25519 key. Your public key is the publisher on record.
4. Anyone can now find the model on the registry and download it via BitTorrent, with you in the swarm as a seeder.

Your reputation score grows each time someone downloads a model you published.

Pulling models

Search, then pull

The most common path: hali searches Hugging Face and you pick from the results interactively.

hali pull mistral

You'll be prompted to pick a repo, then a quantization: two numeric choices in total.

Direct repo path

Skip the search — go straight to a known repo:

hali pull TheBloke/Mistral-7B-Instruct-v0.2-GGUF

hali shows the list of GGUF files in the repo and you pick one.

Canonical model ID (fastest)

Skip all prompts entirely:

hali pull mistral:7b:instruct:q4_k_m

If the model isn't cached, hali searches Hugging Face for the repo and downloads the matching file. No interaction is needed, which makes this form well suited to scripts and CI.
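Because only the canonical form skips the prompts, a script may want to reject anything else before calling hali. The sketch below is one way to do that, assuming a canonical ID is exactly four non-empty, colon-separated fields. The function names and the `HALI_BIN` override (handy for dry runs) are hypothetical and not part of hali.

```shell
# Return 0 if $1 looks like base:size:variant:quant, 1 otherwise.
is_canonical_id() {
  case $1 in
    *:*:*:*:*)  return 1 ;;  # more than four fields
    *::*|:*|*:) return 1 ;;  # an empty field
    */*|*' '*)  return 1 ;;  # repo path or search phrase
    *:*:*:*)    return 0 ;;  # exactly four fields
    *)          return 1 ;;  # fewer than four fields
  esac
}

# Pull only when the ID is canonical, so the call can never block on a prompt.
# HALI_BIN is a hypothetical override; it defaults to the real hali binary.
pull_noninteractive() {
  if ! is_canonical_id "$1"; then
    echo "refusing to pull '$1': expected base:size:variant:quant" >&2
    return 2
  fi
  "${HALI_BIN:-hali}" pull "$1"
}
```

In CI, `pull_noninteractive mistral:7b:instruct:q4_k_m` behaves like a plain `hali pull`, while `pull_noninteractive mistral` fails fast instead of waiting for input.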

Listing cached models

hali list

MODEL ID                                    SIZE        DOWNLOADED
------------------------------------------  ----------  ----------
mistral:7b:instruct:q4_k_m                 4.14 GB     2026-05-23
llama:8b:instruct:q5_k_m                   5.67 GB     2026-05-22
codellama:13b:code:q4_k_m                  7.83 GB     2026-05-20

The list shows every model in your local cache with its canonical ID, size, and download date.
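To see how much disk space the cache uses, the SIZE column can be summed with awk. This is a rough sketch that assumes the column layout shown in the listing above and that sizes are reported in GB; rows with any other unit are ignored. `total_cached_gb` is a hypothetical helper name.

```shell
# Sum the SIZE column of `hali list` output read from stdin.
# Only rows whose unit field is exactly "GB" are counted.
total_cached_gb() {
  awk '$3 == "GB" { total += $2 } END { printf "%.2f\n", total }'
}

# Sample input copied from the listing above.
# In practice: hali list | total_cached_gb
sample_list=$(cat <<'EOF'
MODEL ID                                    SIZE        DOWNLOADED
------------------------------------------  ----------  ----------
mistral:7b:instruct:q4_k_m                 4.14 GB     2026-05-23
llama:8b:instruct:q5_k_m                   5.67 GB     2026-05-22
codellama:13b:code:q4_k_m                  7.83 GB     2026-05-20
EOF
)
printf '%s\n' "$sample_list" | total_cached_gb   # prints 17.64
```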

Checking daemon status

hali daemon status

Daemon running  PID 12345  uptime 2h15m0s  port 51234

SEEDING                                     STATUS    PEERS
------------------------------------------  --------  -----
mistral:7b:instruct:q4_k_m                 seeding   3 peers
  magnet: magnet:?xt=urn:btih:a3f9c21b4d67...
llama:8b:instruct:q5_k_m                   hashing   —

LAN AVAILABLE                               PEERS     INFOHASH
------------------------------------------  -----     --------
mistral:7b:instruct:q4_k_m                 1         a3f9c21b4d67…

What each section means:

SEEDING — models your daemon is actively sharing. "hashing" means the torrent metadata is still being computed; "seeding" means it's ready.
LAN AVAILABLE — models announced by other machines on your LAN that you could pull from. If someone else has already downloaded a model, you can get it from them at LAN speed.
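A script that needs to know whether a model has finished hashing can read the STATUS column of the SEEDING section. The sketch below assumes the status layout shown above; since the same model ID can also appear under LAN AVAILABLE, it tracks which section it is in. `seeding_state` is a hypothetical helper, and the sample is abridged from the output above.

```shell
# Print the SEEDING status ("seeding", "hashing", ...) for model ID $1,
# reading `hali daemon status` output from stdin. Prints nothing if the
# model is not listed in the SEEDING section.
seeding_state() {
  awk -v id="$1" '
    /^SEEDING/       { section = "seeding"; next }
    /^LAN AVAILABLE/ { section = "lan"; next }
    section == "seeding" && $1 == id { print $2; exit }
  '
}

# Abridged sample of the status output shown above.
# In practice: hali daemon status | seeding_state <model-id>
sample_status=$(cat <<'EOF'
Daemon running  PID 12345  uptime 2h15m0s  port 51234

SEEDING                                     STATUS    PEERS
------------------------------------------  --------  -----
mistral:7b:instruct:q4_k_m                 seeding   3 peers
llama:8b:instruct:q5_k_m                   hashing   -

LAN AVAILABLE                               PEERS     INFOHASH
------------------------------------------  -----     --------
mistral:7b:instruct:q4_k_m                 1         a3f9c21b4d67...
EOF
)
printf '%s\n' "$sample_status" | seeding_state "llama:8b:instruct:q5_k_m"   # prints hashing
```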

Watching live stats

# Terminal — live download/upload speeds
hali stats

# Browser — web dashboard with graphs and magnet links
hali stats --web

The dashboard at http://127.0.0.1:47433 auto-refreshes with live transfer data, per-model peer counts, and clickable magnet URIs.

Removing models

hali doesn't have a remove command yet (this is v0.1). To free up space, delete the model directory directly:

# Linux/macOS
rm -rf ~/.hali/cache/<base>/<size>-<variant>/<quant>/

# Linux service mode
sudo rm -rf /var/lib/hali/cache/<base>/<size>-<variant>/<quant>/
sudo systemctl restart halid

# Windows
Remove-Item -Recurse "$env:ProgramData\Hali\cache\<base>\<size>-<variant>\<quant>\"
hali service restart

The daemon will notice the missing model and stop announcing it on the next cycle.
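Typing the cache path by hand is error-prone, so it can help to derive it from the canonical ID. The sketch below assumes the four ID fields map one-to-one onto the `<base>/<size>-<variant>/<quant>/` layout shown above. `model_cache_dir` is a hypothetical helper, and its second argument is the hali data root (defaulting to `~/.hali`).

```shell
# Print the cache directory for canonical model ID $1 under root $2.
# Fails (status 1) if the ID does not have exactly four fields.
model_cache_dir() {
  local id root base size variant quant rest
  id=$1
  root=${2:-"$HOME/.hali"}
  case $id in
    *:*:*:*:*|*::*|:*|*:) echo "bad model ID: $id" >&2; return 1 ;;
    *:*:*:*) ;;
    *) echo "bad model ID: $id" >&2; return 1 ;;
  esac
  base=${id%%:*};      rest=${id#*:}
  size=${rest%%:*};    rest=${rest#*:}
  variant=${rest%%:*}; quant=${rest#*:}
  printf '%s/cache/%s/%s-%s/%s\n' "$root" "$base" "$size" "$variant" "$quant"
}

# Usage (deletes only if the path was built successfully):
#   dir=$(model_cache_dir mistral:7b:instruct:q4_k_m) && rm -rf -- "$dir"
```

For Linux service mode, pass `/var/lib/hali` as the second argument and run the removal with `sudo`.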

Viewing the web dashboard

hali daemon start     # Ensure daemon is running
hali stats --web      # Open in browser

Or visit http://127.0.0.1:47433 directly.

The dashboard shows:

Live speeds: upload and download rates with an SVG sparkline
Session totals: how much data moved this session
Active models: each model's seeding status, peer count, and magnet link
Peers: which machines are currently connected

Quick config tweaks

# See everything
hali config show

# Common changes
hali config set streaming_hash true      # Hash while downloading (faster seeding)
hali config set debug true               # Verbose daemon logs
hali config set models_dir /mnt/data     # Store models on a different drive
hali config set max_upload_mbps 50       # Cap upload bandwidth
hali config set max_download_mbps 200    # Cap download bandwidth

# Restart to apply
hali service restart
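When several machines should share the same settings, the `hali config set` calls can be driven from a small key/value list. This is only a sketch: `apply_hali_config` and the `HALI_BIN` override are hypothetical, and the keys are the ones listed above. Setting `HALI_BIN=echo` prints the arguments of each call instead of running hali, which gives a dry run.

```shell
# Read "key value" pairs from stdin (one per line) and apply each with
# `hali config set`. Blank lines and lines starting with # are skipped.
apply_hali_config() {
  local key value
  while read -r key value; do
    case $key in ''|'#'*) continue ;; esac
    # </dev/null keeps the command from consuming the remaining input lines.
    "${HALI_BIN:-hali}" config set "$key" "$value" </dev/null || return 1
  done
}

# Usage:
#   apply_hali_config <<'EOF'
#   streaming_hash true
#   max_upload_mbps 50
#   EOF
#   hali service restart
```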

Pulling with scripts and CI

The canonical model ID (base:size:variant:quant) is designed for non-interactive use:

#!/bin/bash
# CI script: ensure the model is cached, then export it to Ollama.
# With a canonical ID, hali pull skips all prompts. If the model is
# already cached, it returns instantly ("Already downloaded").
set -euo pipefail   # stop at the first failing step

hali pull mistral:7b:instruct:q4_k_m
hali export ollama mistral:7b:instruct:q4_k_m
ollama run mistral:7b:instruct:q4_k_m "Summarize this PR"

Where things live

Linux/macOS (user mode)

~/.hali/
  config.json
  logs/
  cache/             <base>/<size>-<variant>/<quant>/model.gguf + metadata.json
  torrents/          <infohash>.torrent
  daemon.addr        (IPC address)
  profile.json       (publisher profile, if created)

Linux (systemd service mode)

/var/lib/hali/      cache/, torrents/, models/
/var/log/hali/      log files
/run/hali/          IPC socket

Windows

%ProgramData%\Hali\
  config.json
  logs\              hali.log
  models\            <base>\<size>-<variant>\<quant>\model.gguf
  torrents\          <infohash>.torrent
