# Health & Metrics

Whisp provides commands to monitor daemon health and track usage statistics.
## Health Check

Get a quick overview of daemon status:

```shell
whisp health
```

Output:

```
Daemon Health
─────────────
Version: 0.5.0
Uptime: 2d 4h 32m 15s
Provider: openai
Requests: 1,247
Errors: 12
Active: 1 connection
Memory: 24.5 MB
Last Request: 3s ago
```

### Health Fields
| Field | Description |
|---|---|
| Version | Daemon version |
| Uptime | Time since daemon started |
| Provider | Current AI provider |
| Requests | Total requests handled |
| Errors | Failed requests |
| Active | Current concurrent connections |
| Memory | Memory usage (Linux only) |
| Last Request | Time since last activity |
### Using Health for Troubleshooting

```shell
# Check if daemon is responsive
whisp health

# High error count? Check logs
whisp health | grep Errors

# Memory growing? Consider restart
whisp restart
```

## Detailed Metrics
Get comprehensive usage statistics:

```shell
whisp metrics
```

Output:
```
Whisp Metrics (last week)
─────────────────────────
Performance
  Average response: 245ms
  P50 response: 180ms
  P95 response: 520ms
  P99 response: 890ms
  Requests/hour: 12.4
Token Usage
  Input tokens: 45.2K
  Output tokens: 23.1K
  Total tokens: 68.3K
Requests by Type
  query: 842
  explain: 156
  error: 89
  chat: 72
  dryrun: 45
  pipe: 34
```

### Metrics Fields
Performance:
| Metric | Description |
|---|---|
| Average response | Mean response time |
| P50/P95/P99 | Percentile response times |
| Requests/hour | Request rate |
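These latency figures can feed a simple alert. A minimal sketch, with a sample JSON snippet standing in for `whisp metrics --json` (the field name matches the JSON Output section below):

```shell
# Alert when p95 latency crosses a threshold; in practice, replace the
# sample string with the output of `whisp metrics --json`
metrics_json='{"p95_response_ms": 520}'
p95=$(printf '%s' "$metrics_json" | sed -n 's/.*"p95_response_ms": *\([0-9]*\).*/\1/p')
if [ "$p95" -gt 500 ]; then
  echo "p95 latency high: ${p95}ms"
fi
```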
Token Usage:
| Metric | Description |
|---|---|
| Input tokens | Tokens sent to AI |
| Output tokens | Tokens received from AI |
| Total tokens | Combined usage |
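Total tokens is simply the sum of the other two, as a quick sanity check against the metrics output shown earlier:

```shell
# Input + output tokens = total (values from the sample metrics above)
input=45234
output=23156
echo "$((input + output)) total tokens"   # matches the ~68.3K shown
```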
Request Types:
| Type | Description |
|---|---|
| query | Natural language command generation |
| explain | Command explanations (,. cmd) |
| error | Error recovery suggestions |
| chat | Interactive chat messages |
| dryrun | Dry-run analysis (,d cmd) |
| pipe | Piped input processing |
### Time Period Filter

Filter metrics by time period:

```shell
# Today only
whisp metrics --period day

# Last 7 days (default)
whisp metrics --period week

# Last 30 days
whisp metrics --period month

# All time
whisp metrics --period all
```

## JSON Output
Get metrics as JSON for scripts or monitoring:

```shell
whisp metrics --json
```

```json
{
  "uptime_secs": 186735,
  "avg_response_ms": 245,
  "p50_response_ms": 180,
  "p95_response_ms": 520,
  "p99_response_ms": 890,
  "requests_per_hour": 12.4,
  "total_input_tokens": 45234,
  "total_output_tokens": 23156,
  "request_type_counts": {
    "query": 842,
    "explain": 156,
    "error": 89,
    "chat": 72,
    "dryrun": 45,
    "pipe": 34
  }
}
```

## Diagnostic Check
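The numeric fields are convenient for post-processing. For example, the uptime_secs value from the sample above can be rendered human-readable with plain shell arithmetic:

```shell
# Convert uptime_secs (from `whisp metrics --json`) into days/hours/minutes
uptime_secs=186735
days=$((uptime_secs / 86400))
hours=$(( (uptime_secs % 86400) / 3600 ))
mins=$(( (uptime_secs % 3600) / 60 ))
echo "${days}d ${hours}h ${mins}m"   # → 2d 3h 52m
```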
Run comprehensive diagnostics:

```shell
whisp doctor
```

Output:

```
Whisp Diagnostic Check
──────────────────────
✓ Daemon is running (PID 12345)
✓ Socket exists at /tmp/whisp.sock
✓ Socket is responsive
✓ Configuration is valid
✓ Config file permissions OK (600)
✓ API key configured for openai
✓ Python3 available
✓ netcat (nc) available

All checks passed!
```

### Checks Performed
| Check | Description |
|---|---|
| Daemon running | Process exists and PID file valid |
| Socket exists | Unix socket file present |
| Socket responsive | Can communicate with daemon |
| Config valid | TOML syntax correct |
| Config permissions | File not world-readable |
| API key | Key configured for active provider |
| Python3 | Required tool available |
| netcat | Required tool available |
### When Checks Fail

```
✗ Daemon is not running
  Hint: Run 'whisp start' to start the daemon

⚠ Config file permissions too open (644)
  Hint: Run 'chmod 600 ~/.config/whisp/config.toml'

✗ API key not configured for anthropic
  Hint: Run 'whisp init' to configure or set ANTHROPIC_API_KEY
```

## Monitoring Integration
### Prometheus/Grafana

Export metrics periodically:

```shell
# Cron job to export metrics every 5 minutes
*/5 * * * * whisp metrics --json >> /var/log/whisp-metrics.jsonl
```

### Health Check Script
```shell
#!/bin/bash
# healthcheck.sh
if ! whisp health > /dev/null 2>&1; then
    echo "Whisp daemon unhealthy, restarting..."
    whisp restart
fi
```

### Alerting on Errors
```shell
# Check the error rate; fetch the metrics once so both jq queries
# see the same snapshot
metrics=$(whisp metrics --json)
errors=$(echo "$metrics" | jq '.request_type_counts.error // 0')
total=$(echo "$metrics" | jq '[.request_type_counts[]] | add')
if [ "$errors" -gt 0 ] && [ "$total" -gt 0 ]; then
    error_rate=$(echo "scale=2; $errors / $total * 100" | bc)
    if (( $(echo "$error_rate > 10" | bc -l) )); then
        echo "High error rate: ${error_rate}%"
    fi
fi
```

## Understanding Token Usage
Tokens are the billing unit for AI providers. Each query uses:
- Input tokens: Your query + context (directory, history, etc.)
- Output tokens: The AI's response
Typical usage per request type:
| Type | Input | Output | Total |
|---|---|---|---|
| query | ~150 | ~80 | ~230 |
| explain | ~200 | ~150 | ~350 |
| chat | ~300+ | ~200 | ~500+ |
| dryrun | ~180 | ~120 | ~300 |
Chat uses more tokens because it includes conversation history.
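A back-of-envelope monthly estimate follows directly from these per-request figures. The usage level below is hypothetical, purely for illustration:

```shell
# Rough monthly token estimate: requests/day × tokens/request × 30 days
requests_per_day=50          # hypothetical usage level
tokens_per_request=230       # typical "query" total from the table above
echo "$((requests_per_day * tokens_per_request * 30)) tokens/month"
```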
## Cost Estimation
Whisp automatically fetches model pricing from LiteLLM's community-maintained pricing database. Pricing data is cached locally for 24 hours at ~/.whisp/model_prices.json.
This means cost estimates are automatically accurate for:
- OpenAI models (GPT-4o, GPT-5-nano, etc.)
- Anthropic models (Claude)
- Google Gemini models
- Cerebras models
- Most major providers
Local providers (Ollama) are automatically marked as free.
### Per-Token Pricing
Cost is calculated as: (input_tokens × input_price) + (output_tokens × output_price)
| Provider | Model | Input $/1M | Output $/1M |
|---|---|---|---|
| OpenAI | gpt-5-nano | ~$0.10 | ~$0.40 |
| OpenAI | gpt-4o-mini | ~$0.15 | ~$0.60 |
| OpenAI | gpt-4o | ~$2.50 | ~$10.00 |
| Anthropic | claude-haiku | ~$0.25 | ~$1.25 |
| Anthropic | claude-sonnet | ~$3.00 | ~$15.00 |
| Gemini | gemini-1.5-flash | ~$0.075 | ~$0.30 |
| Cerebras | gpt-oss-120b | Varies | Varies |
| Ollama | any | Free | Free |
Prices change frequently. Check provider pricing pages for current rates.
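As a worked example of the per-token formula above, using gpt-4o-mini's approximate rates and the token totals from the earlier metrics output:

```shell
# (45,234 input × $0.15/1M) + (23,156 output × $0.60/1M)
awk 'BEGIN {
  input = 45234; output = 23156
  cost = (input * 0.15 + output * 0.60) / 1000000
  printf "$%.4f\n", cost      # prints $0.0207
}'
```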
### Cost by Provider (Approximate)
Based on typical whisp usage (~500 tokens/request):
| Provider | Model | Cost per 1000 requests |
|---|---|---|
| OpenAI | gpt-4o-mini | ~$0.08 |
| OpenAI | gpt-4o | ~$2.50 |
| Anthropic | claude-haiku | ~$0.13 |
| Anthropic | claude-sonnet | ~$1.50 |
| Gemini | gemini-1.5-flash | ~$0.04 |
| Ollama | any | Free (local) |
Prices are estimates based on typical usage. Actual costs depend on query complexity and context size.
### Viewing Cost Data

```shell
# Get total token usage for the month
whisp metrics --period month --json | jq '{
  input_tokens: .total_input_tokens,
  output_tokens: .total_output_tokens,
  total_tokens: (.total_input_tokens + .total_output_tokens)
}'
```

### Cache Behavior
Pricing data is fetched from GitHub and cached at ~/.whisp/model_prices.json. The cache strategy:
- Fresh cache (< 24 hours old): Use cached data immediately
- Stale cache (> 24 hours): Attempt to fetch new data
- Network failure: Fall back to stale cache if available
- No cache: Fetch with 10-second timeout
Delete the cache file to force a refresh: rm ~/.whisp/model_prices.json
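The stale-cache rule can also be reproduced externally with find's -mmin test. A sketch using a temp file for safety; point it at ~/.whisp/model_prices.json for real use:

```shell
# Remove a pricing cache older than 24 h (1440 minutes)
CACHE=$(mktemp)
touch -t 202001010000 "$CACHE"   # simulate an old cache file
if [ -n "$(find "$CACHE" -mmin +1440 2>/dev/null)" ]; then
  rm "$CACHE"
  echo "stale pricing cache removed"
fi
```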
See Providers & Models for detailed pricing information.
## Performance Optimization

If response times are slow:

- Check provider: Some providers are faster than others
- Use a faster model: gpt-5-nano-2025-08-07 vs gpt-4o
- Use Ollama locally: No network latency
- Check network: ping api.openai.com

If memory usage is high:

```shell
# Restart daemon to clear memory
whisp restart
```