
Pricing Tables

Computed cost tables by model, quantization, and hardware.

Pricing Formula

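cost_per_1K_tokens = (parameterCountB * 0.1) * quantizationMultiplier / 10,000

Here parameterCountB is the model's parameter count in billions, and parameterCountB * 0.1 is the value shown in the Complexity column of the q4 table. The result is in US dollars.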

See Pricing Protocol for the full specification.
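As a minimal sketch, the formula above can be written as a small Python function. The function name, the dictionary name, and the choice to key multipliers by quantization label are illustrative assumptions, not part of the Pricing Protocol; the multiplier values are the ones stated in the table headings on this page.

```python
# Quantization multipliers as stated in the table headings on this page.
# The dictionary itself is a hypothetical helper for this sketch.
QUANT_MULTIPLIER = {"q4": 1.0, "q8": 1.5, "fp16": 2.0}


def cost_per_1k_tokens(parameter_count_b: float, quantization: str) -> float:
    """Return the token-based price in USD per 1,000 tokens.

    parameter_count_b is the parameter count in billions (e.g. 8 for an 8B model).
    """
    complexity = parameter_count_b * 0.1
    return complexity * QUANT_MULTIPLIER[quantization] / 10_000


if __name__ == "__main__":
    # Print the price for an 8B model at each quantization level.
    for quant in ("q4", "q8", "fp16"):
        print(quant, f"${cost_per_1k_tokens(8, quant):.6f}")
```

Evaluating the function for each model and quantization level should reproduce the rows in the tables below.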

Cost per 1K Tokens by Model

q4 Quantization (1.0x multiplier)

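| Model | Parameters | Complexity | Cost per 1K Tokens |
|---|---|---|---|
| Llama 3.2 1B | 1B | 0.1 | $0.00001 |
| Llama 3.2 3B | 3B | 0.3 | $0.00003 |
| Gemma 3 4B | 4B | 0.4 | $0.00004 |
| Llama 3.1 8B | 8B | 0.8 | $0.00008 |
| Qwen 3 8B | 8B | 0.8 | $0.00008 |
| Phi 4 14B | 14B | 1.4 | $0.00014 |
| Mistral Small 24B | 24B | 2.4 | $0.00024 |
| Gemma 3 27B | 27B | 2.7 | $0.00027 |
| Qwen 3 32B | 32B | 3.2 | $0.00032 |
| Llama 4 Scout 109B | 109B | 10.9 | $0.00109 |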

q8 Quantization (1.5x multiplier)

| Model | Parameters | Cost per 1K Tokens |
|---|---|---|
| Llama 3.2 1B | 1B | $0.000015 |
| Llama 3.2 3B | 3B | $0.000045 |
| Gemma 3 4B | 4B | $0.000060 |
| Llama 3.1 8B | 8B | $0.000120 |
| Qwen 3 8B | 8B | $0.000120 |
| Phi 4 14B | 14B | $0.000210 |
| Mistral Small 24B | 24B | $0.000360 |
| Gemma 3 27B | 27B | $0.000405 |
| Qwen 3 32B | 32B | $0.000480 |
| Llama 4 Scout 109B | 109B | $0.001635 |

fp16 Quantization (2.0x multiplier)

| Model | Parameters | Cost per 1K Tokens |
|---|---|---|
| Llama 3.2 1B | 1B | $0.000020 |
| Llama 3.2 3B | 3B | $0.000060 |
| Gemma 3 4B | 4B | $0.000080 |
| Llama 3.1 8B | 8B | $0.000160 |
| Qwen 3 8B | 8B | $0.000160 |
| Phi 4 14B | 14B | $0.000280 |
| Mistral Small 24B | 24B | $0.000480 |
| Gemma 3 27B | 27B | $0.000540 |
| Qwen 3 32B | 32B | $0.000640 |
| Llama 4 Scout 109B | 109B | $0.002180 |

Electricity Floor Examples

The electricity floor ensures that providers cover their power costs plus a 20% margin. The examples below assume an electricity rate of $0.12/kWh and a generation speed of 30 tokens/second.

Formula

seconds = tokenCount / tokensPerSecond
kWh = (watts * seconds) / 3,600,000
floor = kWh * costPerKWh * 1.2
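The three steps above translate directly into code. The sketch below is illustrative only: the function name and the default arguments are assumptions chosen to match the example figures on this page ($0.12/kWh, 30 tokens/second, 1.2x margin), not an official implementation.

```python
def electricity_floor(
    token_count: int,
    watts: float,
    tokens_per_second: float = 30.0,  # assumed generation speed from this page
    cost_per_kwh: float = 0.12,       # assumed electricity rate from this page
    margin: float = 1.2,              # power cost plus 20% margin
) -> float:
    """Return the electricity floor in USD for generating token_count tokens."""
    seconds = token_count / tokens_per_second
    # watts * seconds gives watt-seconds (joules); 3,600,000 of them make 1 kWh.
    kwh = (watts * seconds) / 3_600_000
    return kwh * cost_per_kwh * margin


if __name__ == "__main__":
    # Floors for 1,000 tokens on the three example devices.
    for name, watts in (("Apple M1", 20), ("Apple M4 Pro", 38), ("NVIDIA RTX 4090", 300)):
        print(name, f"${electricity_floor(1_000, watts):.7f}")
```

Because this sketch carries unrounded intermediate values, its results can differ in the last displayed digit from the tables below, which appear to round the seconds column before computing kWh.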

Apple M1 (20W)

| Tokens | Seconds | kWh | Electricity Cost | Floor (1.2x) |
|---|---|---|---|---|
| 100 | 3.3s | 0.0000183 | $0.0000022 | $0.0000026 |
| 1,000 | 33.3s | 0.000185 | $0.0000222 | $0.0000266 |
| 10,000 | 333s | 0.00185 | $0.000222 | $0.000266 |

Apple M4 Pro (38W)

| Tokens | Seconds | kWh | Electricity Cost | Floor (1.2x) |
|---|---|---|---|---|
| 100 | 3.3s | 0.0000348 | $0.0000042 | $0.0000050 |
| 1,000 | 33.3s | 0.000352 | $0.0000422 | $0.0000506 |
| 10,000 | 333s | 0.00352 | $0.000422 | $0.000506 |

NVIDIA RTX 4090 (300W)

| Tokens | Seconds | kWh | Electricity Cost | Floor (1.2x) |
|---|---|---|---|---|
| 100 | 3.3s | 0.000275 | $0.0000330 | $0.0000396 |
| 1,000 | 33.3s | 0.00278 | $0.000333 | $0.000400 |
| 10,000 | 333s | 0.0278 | $0.00333 | $0.00400 |

Token Price vs. Electricity Floor

For most models on efficient hardware (Apple Silicon), the token-based price exceeds the electricity floor. The floor becomes relevant for:

  • Small models on power-hungry hardware (e.g., a 1B model on an RTX 4090, where the q4 token price of $0.00001 per 1K tokens is well below the $0.000400 floor)
  • High electricity rates (e.g., >$0.30/kWh)
  • Slow generation speed (CPU-only inference)

The effective price is always max(token_price, electricity_floor).
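A minimal sketch of the max rule follows. It restates both formulas from this page so the block runs on its own. All function names are hypothetical, and the sketch assumes the token-based price scales linearly with token count (the per-1K rate times tokens / 1,000), which this page does not state explicitly.

```python
def token_price(parameter_count_b: float, quant_multiplier: float, token_count: int) -> float:
    """Token-based price in USD, assuming linear scaling from the per-1K rate."""
    per_1k = (parameter_count_b * 0.1) * quant_multiplier / 10_000
    return per_1k * token_count / 1_000


def electricity_floor(token_count: int, watts: float,
                      tokens_per_second: float = 30.0,
                      cost_per_kwh: float = 0.12) -> float:
    """Electricity floor in USD: power cost plus a 20% margin."""
    seconds = token_count / tokens_per_second
    kwh = (watts * seconds) / 3_600_000
    return kwh * cost_per_kwh * 1.2


def effective_price(parameter_count_b: float, quant_multiplier: float,
                    token_count: int, watts: float) -> float:
    """Return the larger of the token-based price and the electricity floor."""
    return max(
        token_price(parameter_count_b, quant_multiplier, token_count),
        electricity_floor(token_count, watts),
    )


if __name__ == "__main__":
    # 1B q4 on a 300 W GPU: the floor applies.
    print(f"${effective_price(1, 1.0, 1_000, 300):.6f}")
    # 8B q4 on a 20 W Apple M1: the token price applies.
    print(f"${effective_price(8, 1.0, 1_000, 20):.6f}")
```

Under the example assumptions, the first call is governed by the floor and the second by the token price, matching the cases described above.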

Revenue Split

| Recipient | Share |
|---|---|
| Provider | 95% |
| Network fee | 5% |

Example: Serving 1,000 tokens of Qwen 3 8B (q4):

  • Total cost: $0.00008
  • Provider earns: $0.000076
  • Network fee: $0.000004
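The split in this example can be sketched as follows. The function and constant names are hypothetical, and the sketch simply divides whatever total it is given; whether the split applies to the token price or to the effective price after the electricity floor is not specified on this page.

```python
PROVIDER_SHARE = 0.95  # provider share from the Revenue Split table


def split_revenue(total_cost: float) -> tuple[float, float]:
    """Return (provider_earnings, network_fee) in USD for a given total cost."""
    provider_earnings = total_cost * PROVIDER_SHARE
    # The network fee is the remaining 5%, computed as the remainder so the
    # two parts always sum back to the total.
    network_fee = total_cost - provider_earnings
    return provider_earnings, network_fee


if __name__ == "__main__":
    # 1,000 tokens of Qwen 3 8B (q4) at a total cost of $0.00008.
    provider, fee = split_revenue(0.00008)
    print(f"Provider earns: ${provider:.6f}")
    print(f"Network fee:    ${fee:.6f}")
```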