Overview
Model containers have three routes: /status, /run, and /shutdown. During the Model Deployment process, Modzy performs two tests: first, a call to the /status route to ensure the model can be loaded, and second, a call to the /run route to ensure the model can run. These tests return a step number, a step name, and a test completion percentage. The Model service has APIs to run these tests, get their status, and get the inference job’s results.
Load test
The load test is where Modzy tests the model’s /status route. It spins up the container and verifies the model image is valid and can be loaded into our environment.
Run this test once the model’s memory requirements, hardware requirements, and timeout are set. The test also performs model initializations, where applicable.
Requirements
- Memory and hardware
- Timeout
Results
A successful result returns a 200 OK status with an empty body.
Exceptions
The model container shuts down if:
- the model container doesn’t spin up before the status timeout,
- the memory and hardware requirements set are not sufficient for the model.
Run test
The run test is where Modzy tests the model’s /run route. It runs the model with the sample input provided. In the Model Deployment UI, it returns the output as a downloadable file so you can validate if the model returned the expected results.
Run this test once the input details, output details, and sample data are provided.
Requirements
- Input details
- Output details
- Sample input
Results
A successful result returns a 200 OK status with an empty body. Call the get results route to get the inference job’s results.
Exceptions
The model container shuts down if:
- the model container doesn’t spin up before the status timeout,
- the input item’s inference doesn’t complete before the run timeout,
- the input item’s media type doesn’t match the model’s input media types,
- the input item’s name doesn’t match the model’s input name,
- the input item’s size exceeds the model’s input maximum size,
- the output file size exceeds the model’s output maximum size.
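Several of the exceptions above (name, media type, and size mismatches) can be caught locally before the run test is started. The following is a minimal sketch of such a pre-check, not part of Modzy's API: `InputSpec` and `check_sample_input` are hypothetical names, and the field layout is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class InputSpec:
    """Hypothetical description of one model input (not a Modzy type)."""
    name: str
    media_types: Tuple[str, ...]
    max_size_bytes: int


def check_sample_input(spec: InputSpec, name: str, media_type: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means no mismatch was found."""
    problems = []
    if name != spec.name:
        problems.append(f"input name {name!r} does not match {spec.name!r}")
    if media_type not in spec.media_types:
        problems.append(f"media type {media_type!r} is not one of {list(spec.media_types)}")
    if size_bytes > spec.max_size_bytes:
        problems.append(f"size {size_bytes} exceeds maximum {spec.max_size_bytes}")
    return problems


# Example values are invented for illustration.
spec = InputSpec(name="input.txt", media_types=("text/plain",), max_size_bytes=1_000_000)
print(check_sample_input(spec, "input.txt", "text/plain", 2_048))
print(check_sample_input(spec, "image.png", "image/png", 2_048))
```

A check like this only covers the input-side conditions; the timeouts and the output size limit can only be observed by actually running the test.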
The load object
{
"step": 2,
"stepName": "Checking status endpoint",
"percentage": 25
}
Parameter | Type | Description |
---|---|---|
step | number | The current step in the load process. There are two steps. |
stepName | string | The current step’s details in the load process. |
percentage | number | The container image load completion percentage. |
error | string | When applicable, an error description. |
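As an illustration of the fields in the table above, a client could map the object onto a small typed record. This is a sketch only: the `Progress` class and `parse_progress` helper are hypothetical names, and the code assumes that `error` is simply absent when there is nothing to report.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Progress:
    """Hypothetical client-side view of a load or run object."""
    step: int
    step_name: str
    percentage: int
    error: Optional[str] = None  # assumed to be omitted when no error occurred


def parse_progress(payload: str) -> Progress:
    """Parse the JSON text of a load or run object into a Progress record."""
    data = json.loads(payload)
    return Progress(
        step=data["step"],
        step_name=data["stepName"],
        percentage=data["percentage"],
        error=data.get("error"),
    )


progress = parse_progress('{"step": 2, "stepName": "Checking status endpoint", "percentage": 25}')
print(progress.step, progress.step_name, progress.percentage, progress.error)
```

Because the run object has the same four fields, the same record can be reused for both tests.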
The run object
{
"step": 3,
"stepName": "Submitting inputs",
"percentage": 50
}
Parameter | Type | Description |
---|---|---|
step | number | The current step in the run process. |
stepName | string | The current step’s details in the run process. |
percentage | number | The container image run completion percentage. |
error | string | When applicable, an error description. |
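Because both tests report a step, a step name, and a percentage, a client can poll the status until the test finishes or reports an error. The sketch below shows one way to structure that loop. It assumes that a percentage of 100 means the test is complete, and it takes a caller-supplied `fetch_progress` function (hypothetical) instead of naming any specific Modzy route or client call.

```python
import time
from typing import Callable, Dict


def wait_for_test(
    fetch_progress: Callable[[], Dict],
    interval_s: float = 1.0,
    max_polls: int = 60,
) -> Dict:
    """Poll until the test reports 100 percent, an error, or the poll budget runs out."""
    for _ in range(max_polls):
        progress = fetch_progress()
        if progress.get("error"):
            raise RuntimeError(
                f"step {progress['step']} ({progress['stepName']}): {progress['error']}"
            )
        if progress["percentage"] >= 100:  # assumption: 100 means the test is done
            return progress
        time.sleep(interval_s)
    raise TimeoutError("test did not finish within the polling budget")


# Fake status source for demonstration; the second step name is made up.
fake = iter([
    {"step": 3, "stepName": "Submitting inputs", "percentage": 50},
    {"step": 4, "stepName": "Finished (illustrative)", "percentage": 100},
])
final = wait_for_test(lambda: next(fake), interval_s=0.0)
print(final["stepName"], final["percentage"])
```

Passing the fetch function in keeps the loop independent of how the status is actually retrieved, so it can be exercised with canned data as shown.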