Model Management

Endpoints for loading, downloading, and unloading models on the node.

Load Model

POST /v1/app/models/load

Load a downloaded model into GPU memory for inference.

Authentication

Optional. Required when allow_network_access is enabled.

Request Body

Field	Type	Required	Description
`modelID`	string	Yes	ID of the model to load

{
  "modelID": "llama-3.1-8b-q4"
}

Response

{
  "status": "loaded",
  "modelID": "llama-3.1-8b-q4"
}

Example

curl -X POST http://localhost:11435/v1/app/models/load \
  -H "Content-Type: application/json" \
  -d '{"modelID": "llama-3.1-8b-q4"}'

Download Model

POST /v1/app/models/download

Download a model to local storage. The model must be downloaded before it can be loaded.

Authentication

Optional. Required when allow_network_access is enabled.

Request Body

Field	Type	Required	Description
`modelID`	string	Yes	ID of the model to download

{
  "modelID": "llama-3.1-8b-q4"
}

Response

{
  "status": "downloading",
  "modelID": "llama-3.1-8b-q4"
}

Example

curl -X POST http://localhost:11435/v1/app/models/download \
  -H "Content-Type: application/json" \
  -d '{"modelID": "llama-3.1-8b-q4"}'

Unload Model

POST /v1/app/models/unload

Unload the currently loaded model from GPU memory.

Authentication

Optional. Required when allow_network_access is enabled.

Request Body

None required.

Response

{
  "status": "unloaded"
}

Example

curl -X POST http://localhost:11435/v1/app/models/unload

Load Model​

Authentication​

Request Body​

Response​

Example​

Download Model​

Authentication​

Request Body​

Response​

Example​

Unload Model​

Authentication​

Request Body​

Response​

Example​

Load Model

Authentication

Request Body

Response

Example

Download Model

Authentication

Request Body

Response

Example

Unload Model

Authentication

Request Body

Response

Example