API Reference

Resources

Overview

Processing engines

Every account has a running capacity, given by the number of processing engines available to run jobs. Each engine processes one input item at a time, so the number of available engines determines the maximum number of input items a job can process in parallel. Inputs that are not being processed wait in the input queue until an engine picks them up.
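The queueing behavior described above can be illustrated with a small simulation. This is a minimal sketch, not Modzy's scheduler: the function name `simulate_engines` and the per-item durations are hypothetical, and it assumes each free engine picks up the next queued input in order.

```python
import heapq
from typing import List


def simulate_engines(durations: List[float], engine_count: int) -> List[float]:
    """Return the finish time of each input item.

    Hypothetical model: `engine_count` engines each process one item at a
    time, and items wait in a first-in, first-out queue until one is free.
    """
    if engine_count < 1:
        raise ValueError("engine_count must be a positive integer")

    # Min-heap holding the time at which each engine becomes free.
    free_at = [0.0] * engine_count
    heapq.heapify(free_at)

    finish_times = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available engine
        end = start + duration
        finish_times.append(end)
        heapq.heappush(free_at, end)    # that engine is busy until `end`
    return finish_times


# Five inputs of one time unit each, processed by two engines.
print(simulate_engines([1.0] * 5, engine_count=2))
```

With two engines, the five inputs finish in three rounds; with five or more engines they would all finish in one.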

Set a model version's processing capacity to manage the number of the account's processing engines that the model can use. If all the processing engines are in use, new job requests wait in the queue until engines become available again.

Nodes

Nodes are virtual or physical machines that have resources available to run processing engines. Resources include CPU, GPU, memory, and a maximum number of processing engines that can be scheduled onto the node. See the Kubernetes documentation for more details.

The processing object

{
    "minimumParallelCapacity": 1,
    "maximumParallelCapacity": 3
}
minimumParallelCapacity (number): The minimum number of processing engines a model's version can run. It is a positive integer.
maximumParallelCapacity (number): The maximum number of processing engines a model's version can run. It is a positive integer.
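A client might check a processing object before sending it. The sketch below is illustrative only: `validate_processing` is a hypothetical helper, and the rule that the minimum must not exceed the maximum is an assumption, since the reference only states that both values are positive integers.

```python
from typing import Any, Dict


def validate_processing(processing: Dict[str, Any]) -> None:
    """Raise ValueError if the processing object looks invalid."""
    for field in ("minimumParallelCapacity", "maximumParallelCapacity"):
        value = processing.get(field)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{field} must be a positive integer, got {value!r}")

    # Assumption (not stated in the reference): minimum <= maximum.
    if processing["minimumParallelCapacity"] > processing["maximumParallelCapacity"]:
        raise ValueError("minimumParallelCapacity exceeds maximumParallelCapacity")


# The sample object from this section passes the check.
validate_processing({"minimumParallelCapacity": 1, "maximumParallelCapacity": 3})
```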

The engines object

{
  "name" : "...",
  "createdAt" : "...",
  "ready" : true
}
name (string): The engine's name.
createdAt (string): The engine's creation date in ISO 8601 (YYYY-MM-DDThh:mm:ss.sTZD) format.
ready (boolean): The engine's status.
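An engines object can be mapped onto a typed record so that `createdAt` becomes a real timestamp. This is a sketch under assumptions: the `Engine` class and `parse_engine` function are hypothetical, and the sample values are invented for illustration because the reference elides them.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Engine:
    name: str
    created_at: datetime
    ready: bool


def parse_engine(raw: dict) -> Engine:
    """Convert a decoded engines object into an Engine record."""
    return Engine(
        name=raw["name"],
        # fromisoformat accepts offsets such as +00:00; a trailing "Z"
        # is only accepted from Python 3.11 onward.
        created_at=datetime.fromisoformat(raw["createdAt"]),
        ready=bool(raw["ready"]),
    )


# Invented sample values, not real API output.
sample = json.loads(
    '{"name": "engine-a", "createdAt": "2021-03-04T12:30:45.500+00:00", "ready": true}'
)
engine = parse_engine(sample)
print(engine.name, engine.created_at.isoformat(), engine.ready)
```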

The model deployment state object

{
  "hasError": false,
  "ready": true,
  "beingMonitored": true
}
hasError (boolean): When true, an error is preventing the engine from starting. Modzy keeps trying to start it.
ready (boolean): When true, the engine is ready to process inputs.
beingMonitored (boolean): When true, the API is continuously checking the engine's status.
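The three flags can be folded into a single label for display or logging. The following is a minimal sketch: `describe_state` and its labels are hypothetical, and giving `hasError` precedence over `ready` is an assumption rather than documented behavior.

```python
def describe_state(state: dict) -> str:
    """Summarize a model deployment state object as a short label."""
    if state.get("hasError"):
        # Per the field description, startup is still being retried.
        label = "error, retrying"
    elif state.get("ready"):
        label = "ready"
    else:
        label = "starting"

    if not state.get("beingMonitored"):
        label += " (not monitored)"
    return label


print(describe_state({"hasError": False, "ready": True, "beingMonitored": True}))
```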

The nodes object

{
    "name": "...",
    "creationTimestamp": "...",
    "annotation": [
      {
        "key": "...",
        "value": "..."
      }
    ],
    "labels": [
      {
        "key": "...",
        "value": "..."
      }
    ],
    "status": {
      "allocatable": {
        "memory": "...",
        "cpu": "...",
        "gpu": "...",
        "pods": 10
      },
      "capacity": {
        "memory": "...",
        "cpu": "...",
        "gpu": "...",
        "pods": 10
      },
      "conditions": [],
      "images": []
    }
}
name (string): A node's name. Nodes may be virtual or physical machines.
creationTimestamp (string): The time when the node was added.
annotation (array): An array of key-value pairs with node metadata.
labels (array): An array of key-value pairs that tag the node.
status (object): An object that contains the node's status.
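Because `annotation` and `labels` arrive as arrays of key-value pairs, it can be convenient to turn them into dictionaries before filtering nodes. The helpers `pairs_to_dict` and `nodes_with_label` below are hypothetical, and the node names and label values are invented for illustration.

```python
from typing import Dict, List


def pairs_to_dict(pairs: List[dict]) -> Dict[str, str]:
    """Convert [{"key": ..., "value": ...}, ...] into a plain dict."""
    return {pair["key"]: pair["value"] for pair in pairs}


def nodes_with_label(nodes: List[dict], key: str, value: str) -> List[str]:
    """Return the names of nodes whose labels contain key=value."""
    return [
        node["name"]
        for node in nodes
        if pairs_to_dict(node.get("labels", [])).get(key) == value
    ]


# Invented sample data, not real API output.
nodes = [
    {"name": "node-1", "labels": [{"key": "pool", "value": "gpu"}]},
    {"name": "node-2", "labels": [{"key": "pool", "value": "cpu"}]},
]
print(nodes_with_label(nodes, "pool", "gpu"))
```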

The node status object

{
  "allocatable": {
    "memory": "...",
    "cpu": "...",
    "gpu": "...",
    "pods": 10
  },
  "capacity": {
    "memory": "...",
    "cpu": "...",
    "gpu": "...",
    "pods": 10
  },
  "conditions": [],
  "images": []
}
allocatable (object): The amount of a node's resources that can be allocated to processing engines.
capacity (object): A node's total amount of resources.
conditions (array): Describes the node's status conditions. Conditions include Ready, DiskPressure, MemoryPressure, PIDPressure, and NetworkUnavailable.
images (array): The name and size of the containers required to run an application.
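A node status object can be inspected for reserved pod slots and for conditions that signal trouble. This sketch rests on assumptions: the sample `conditions` array in this reference is empty, so the code assumes Kubernetes-style entries with `type` and `status` fields whose status is the string "True" or "False". The helper names and sample values are hypothetical.

```python
from typing import Any, Dict, List

# Assumption: healthy values follow Kubernetes conventions, where Ready
# should be "True" and every pressure or unavailability flag "False".
HEALTHY = {
    "Ready": "True",
    "DiskPressure": "False",
    "MemoryPressure": "False",
    "PIDPressure": "False",
    "NetworkUnavailable": "False",
}


def unallocatable_pods(status: Dict[str, Any]) -> int:
    """Pod slots counted in capacity but not allocatable to engines."""
    return status["capacity"]["pods"] - status["allocatable"]["pods"]


def unhealthy_conditions(status: Dict[str, Any]) -> List[str]:
    """Return condition types whose status differs from the healthy value."""
    problems = []
    for condition in status.get("conditions", []):
        expected = HEALTHY.get(condition.get("type"))
        if expected is not None and condition.get("status") != expected:
            problems.append(condition["type"])
    return problems


# Invented sample values, not real API output.
status = {
    "allocatable": {"pods": 8},
    "capacity": {"pods": 10},
    "conditions": [
        {"type": "Ready", "status": "True"},
        {"type": "MemoryPressure", "status": "True"},
    ],
    "images": [],
}
print(unallocatable_pods(status), unhealthy_conditions(status))
```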