Resources

Overview

Processing engines

Every account has a running capacity, set by the number of processing engines available to run jobs. Each engine processes one input item at a time, so the number of available engines determines the maximum number of input items a job can process in parallel. Inputs that are not being processed wait in the input queue until an engine picks them up.

Set a model version’s processing capacity to control how many of the account’s processing engines the model can use. If all the processing engines are busy running models, new job requests wait in the queue until engines become available again.
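The relationship between engine count, parallelism, and queueing can be sketched with a little arithmetic. This is an idealized illustration only: it assumes every input takes the same time to process, and the function names `processing_rounds` and `queued_inputs` are hypothetical helpers, not part of any Modzy API.

```python
import math


def processing_rounds(input_count: int, engines: int) -> int:
    """Rounds needed when each engine handles one input at a time.

    Assumption: all inputs take equally long, so engines work in lockstep.
    """
    if engines < 1:
        raise ValueError("at least one engine is required")
    return math.ceil(input_count / engines)


def queued_inputs(input_count: int, engines: int) -> int:
    """Inputs left waiting in the queue right after the job starts."""
    return max(0, input_count - engines)


# A job with 10 inputs on an account with 3 available engines.
print(processing_rounds(10, 3))
print(queued_inputs(10, 3))
```

With 3 engines, at most 3 inputs run at once, so the remaining 7 wait in the queue and the job needs 4 rounds in this simplified model.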

Nodes

Nodes are virtual or physical machines that have resources available to run processing engines. Resources include CPU, GPU, memory, and a maximum number of processing engines that can be scheduled onto the node. See the Kubernetes documentation for more details.

The processing object

{
    "minimumParallelCapacity": 1,
    "maximumParallelCapacity": 3
}


minimumParallelCapacity	number	The minimum number of processing engines a model version can run. It is a positive integer.
maximumParallelCapacity	number	The maximum number of processing engines a model version can run. It is a positive integer.
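A client might check a processing object before sending it. The sketch below is a minimal example under stated assumptions: `validate_processing` is a hypothetical helper, and the rule that the minimum must not exceed the maximum is an assumption that the field descriptions above do not state explicitly.

```python
import json
from typing import List


def validate_processing(processing: dict) -> List[str]:
    """Return a list of problems found in a processing object (empty if none)."""
    errors = []
    for key in ("minimumParallelCapacity", "maximumParallelCapacity"):
        value = processing.get(key)
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool) or value < 1:
            errors.append(f"{key} must be a positive integer")
    if not errors:
        low = processing["minimumParallelCapacity"]
        high = processing["maximumParallelCapacity"]
        # Assumption: a minimum above the maximum is treated as invalid.
        if low > high:
            errors.append("minimumParallelCapacity exceeds maximumParallelCapacity")
    return errors


example = json.loads('{"minimumParallelCapacity": 1, "maximumParallelCapacity": 3}')
print(validate_processing(example))
```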

The engines object

{
  "name" : "...",
  "createdAt" : "...",
  "ready" : true
}


name	string	The engine’s name.
createdAt	string	The engine’s creation date in ISO8601 (YYYY-MM-DDThh:mm:ss.sTZD) format.
ready	boolean	The engine’s status.
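As an illustration of how the `ready` flag might be used, the following sketch splits a list of engines objects into ready and not-ready names. The helper `split_by_readiness` and the engine names are made up for this example, and the `createdAt` values are left elided as in the sample above.

```python
from typing import List, Tuple


def split_by_readiness(engines: List[dict]) -> Tuple[List[str], List[str]]:
    """Return (names of ready engines, names of engines that are not ready)."""
    ready = [e["name"] for e in engines if e.get("ready") is True]
    not_ready = [e["name"] for e in engines if e.get("ready") is not True]
    return ready, not_ready


# Hypothetical engines objects; names are placeholders.
engines = [
    {"name": "engine-a", "createdAt": "...", "ready": True},
    {"name": "engine-b", "createdAt": "...", "ready": False},
]

ready, not_ready = split_by_readiness(engines)
print(f"{len(ready)} of {len(engines)} engines ready; waiting on: {not_ready}")
```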

The model deployment state object

{
  "hasError": false,
  "ready": true,
  "beingMonitored": true
}


hasError	boolean	When true, an error is preventing the engine from starting. Modzy keeps trying to spin it up.
ready	boolean	When true, the engine is ready to process inputs.
beingMonitored	boolean	When true, the API is continuously checking the engine’s status.
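The three flags can be combined into a single human-readable summary. The sketch below is one possible interpretation: the function name `describe_deployment_state` is hypothetical, and the order in which the flags are checked (error first, then ready, then monitored) is an assumption rather than documented behavior.

```python
def describe_deployment_state(state: dict) -> str:
    """Summarize a model deployment state object as a short message."""
    # Assumption: an error takes precedence over the other flags.
    if state.get("hasError"):
        return "error: the engine cannot start; Modzy keeps retrying"
    if state.get("ready"):
        return "ready: the engine can process inputs"
    if state.get("beingMonitored"):
        return "starting: the engine status is being checked"
    return "unknown: not ready and not monitored"


state = {"hasError": False, "ready": True, "beingMonitored": True}
print(describe_deployment_state(state))
```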

The nodes object

{
    "name": "...",
    "creationTimestamp": "...",
    "annotation": [
      {
        "key": "...",
        "value": "..."
      }
    ],
    "labels": [
      {
        "key": "...",
        "value": "..."
      }
    ],
    "status": {
      "allocatable": {
        "memory": "...",
        "cpu": "...",
        "gpu": "...",
        "pods": 10
      },
      "capacity": {
        "memory": "...",
        "cpu": "...",
        "gpu": "...",
        "pods": 10
      },
      "conditions": [],
      "images": []
    }
}


name	string	A node’s name. Nodes may be virtual or physical machines.
creationTimestamp	string	The time when the node was added.
annotation	array	An array of key-value pairs with node metadata.
labels	array	An array of key-value pairs that tag the node.
status	object	An object that contains the node’s status.

The node status object

{
  "allocatable": {
    "memory": "...",
    "cpu": "...",
    "gpu": "...",
    "pods": 10
  },
  "capacity": {
    "memory": "...",
    "cpu": "...",
    "gpu": "...",
    "pods": 10
  },
  "conditions": [],
  "images": []
}


allocatable	object	The amount of a node’s resources that can be allocated to processing engines.
capacity	object	A node’s total amount of resources.
conditions	array	Describes the status of all running nodes. Conditions include Ready, DiskPressure, MemoryPressure, PIDPressure, and NetworkUnavailable.
images	array	The name and size of the container images required to run an application.
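A node status object can be inspected for headroom and health, as in the sketch below. It rests on assumptions that go beyond the sample above: each entry in `conditions` is assumed to follow the Kubernetes node condition shape, with a `type` and a `status` of "True" or "False", where Ready is healthy when "True" and the other conditions are healthy when "False". The helper names and the sample values are hypothetical.

```python
from typing import List


def pods_reserved(status: dict) -> int:
    """Pod slots in total capacity that are not allocatable."""
    return status["capacity"]["pods"] - status["allocatable"]["pods"]


def unhealthy_conditions(status: dict) -> List[str]:
    """Names of conditions whose status differs from the healthy value."""
    problems = []
    for cond in status.get("conditions", []):
        # Assumption: Kubernetes-style conditions with "type" and "status".
        expected = "True" if cond["type"] == "Ready" else "False"
        if cond["status"] != expected:
            problems.append(cond["type"])
    return problems


# Hypothetical node status; resource strings are elided as in the sample.
status = {
    "allocatable": {"memory": "...", "cpu": "...", "gpu": "...", "pods": 8},
    "capacity": {"memory": "...", "cpu": "...", "gpu": "...", "pods": 10},
    "conditions": [
        {"type": "Ready", "status": "True"},
        {"type": "MemoryPressure", "status": "True"},
    ],
    "images": [],
}

print(pods_reserved(status))
print(unhealthy_conditions(status))
```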