Saturday, January 17, 2026

Introducing Pipelines for Long-Running AI Workflows

This blog post focuses on new features and improvements. For a complete list, including bug fixes, please see the release notes.

Clarifai’s Compute Orchestration lets you deploy models on your own compute, control how they scale, and decide where inference runs across clusters and nodepools.

As AI systems move beyond single inference calls toward long-running tasks, multi-step workflows, and agent-driven execution, orchestration needs to do more than just start containers. It needs to manage execution over time, handle failure, and route traffic intelligently across compute.

This release builds on that foundation with native support for long-running pipelines, model routing across nodepools and environments, and agentic model execution using the Model Context Protocol (MCP).

Introducing Pipelines for Long-Running, Multi-Step AI Workflows

AI systems don’t break at inference. They break when workflows span multiple steps, run for hours, or need to recover from failure.

Today, teams rely on stitched-together scripts, cron jobs, and queue workers to manage these workflows. As agent workloads and MLOps pipelines grow more complex, this setup becomes hard to operate, debug, and scale.

With Clarifai 12.0, we’re introducing Pipelines, a native way to define, run, and manage long-running, multi-step AI workflows directly on the Clarifai platform.

Why Pipelines

Most AI platforms are optimized for short-lived inference calls. But real production workflows look very different:

  • Multi-step agent logic that spans tools, models, and external APIs

  • Long-running jobs like batch processing, fine-tuning, or evaluations

  • End-to-end MLOps workflows that require reproducibility, versioning, and control

Pipelines are built to handle this class of problems.

Clarifai Pipelines act as the orchestration backbone for advanced AI systems. They let you define container-based steps, control execution order or parallelism, manage state and secrets, and monitor runs from start to finish, all without bolting together separate orchestration infrastructure.

Each pipeline is versioned, reproducible, and executed on Clarifai-managed compute, giving you fine-grained control over how complex AI workflows run at scale.

Let’s walk through how Pipelines work, what you can build with them, and how to get started using the CLI and API.

How Pipelines Work

At a high level, a Clarifai Pipeline is a versioned, multi-step workflow made up of containerized steps that run asynchronously on Clarifai compute.

Each step is an isolated unit of execution with its own code, dependencies, and resource settings. Pipelines define how these steps connect, whether they run sequentially or in parallel, and how data flows between them.

You define a pipeline once, upload it, and then trigger runs that can execute for minutes, hours, or longer.

Initialize a pipeline project
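
If you’ve used the Clarifai CLI to scaffold custom models, the flow is the same. As a sketch, assuming the pipeline CLI mirrors the existing clarifai model init command (the exact command name and flags may differ):

    clarifai pipeline init my-pipeline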

This scaffolds a complete pipeline project using the same structure and conventions as Clarifai custom models.

Each pipeline step follows the exact same footprint developers already use when uploading models to Clarifai: a configuration file, a dependency file, and an executable Python entrypoint.

A typical scaffolded pipeline looks like this:
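
(The layout below is illustrative, modeled on Clarifai’s custom-model project structure; the names the CLI generates may differ.)

    my-pipeline/
    ├── config.yaml               # pipeline-level orchestration config
    └── steps/
        └── step-a/
            ├── config.yaml       # step inputs, runtime, compute
            ├── requirements.txt  # Python dependencies for the step
            └── 1/
                └── pipeline_step.py   # step execution logic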

At the pipeline level, config.yaml defines how steps are connected and orchestrated, including execution order, parameters, and dependencies between steps.
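
As a rough sketch of the idea only (the keys below are hypothetical, not the real schema; see the documentation for the exact format):

    pipeline:
      id: my-pipeline
      steps:                       # hypothetical: steps in execution order
        - name: preprocess
        - name: evaluate
          depends_on: [preprocess] # runs only after preprocess completes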

Each step is a self-contained unit that looks and behaves just like a custom model:

  • config.yaml defines the step’s inputs, runtime, and compute requirements

  • requirements.txt specifies the Python dependencies for that step

  • pipeline_step.py contains the actual execution logic, where you write code to process data, call models, or interact with external systems (see the sketch after this list)
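
A minimal sketch of a step entrypoint, assuming a step is ultimately just a Python program that receives parameters and emits results (the real step API surface is simplified away here):

    # pipeline_step.py -- illustrative sketch only
    import json
    import sys

    def run(params: dict) -> dict:
        # Step logic goes here: process data, call models, hit external APIs.
        return {"processed": len(params.get("items", []))}

    if __name__ == "__main__":
        # Assumption: parameters arrive as a JSON CLI argument; real steps
        # receive their inputs through the Clarifai pipeline-step runtime.
        params = json.loads(sys.argv[1]) if len(sys.argv) > 1 else {}
        print(json.dumps(run(params)))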

This means building pipelines feels immediately familiar. If you’ve already uploaded custom models to Clarifai, you’re working with the same configuration style, the same versioning model, and the same deployment mechanics, just composed into multi-step workflows.

Upload the pipeline
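
Again, assuming the pipeline CLI follows the clarifai model upload pattern (command illustrative):

    clarifai pipeline upload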

Clarifai builds and versions each step as a containerized artifact, ensuring reproducible runs.

Run the pipeline
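
Triggering a run is likewise a single command; a sketch under the same assumption:

    clarifai pipeline run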

Once running, you can monitor progress, inspect logs, and manage executions directly through the platform.

Under the hood, pipeline execution is powered by Argo Workflows, allowing Clarifai to reliably orchestrate long-running, multi-step jobs with proper dependency management, retries, and fault handling.

Pipelines are designed to support everything from automated MLOps workflows to advanced AI agent orchestration, without requiring you to operate your own workflow engine.

Note: Pipelines are currently available in Public Preview.

You can start trying them today, and we welcome your feedback as we continue to iterate. For a step-by-step guide on defining steps, uploading pipelines, managing runs, and building more advanced workflows, check out the detailed documentation here.

Model Routing with Multi-Nodepool Deployments

With this release, Compute Orchestration now supports model routing across multiple nodepools within a single deployment.

Model routing allows a deployment to reference multiple pre-existing nodepools through a deployment_config.yaml. These nodepools can belong to different clusters and can span cloud, on-prem, or hybrid environments.
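
The core idea, as a hedged sketch (the keys below are illustrative, not the exact deployment_config.yaml schema):

    deployment:
      id: llm-prod
      nodepools:                          # ordered priority list
        - nodepool_id: gpu-pool-primary
          compute_cluster_id: aws-us-east-1
        - nodepool_id: gpu-pool-overflow  # warmed when the primary is fully loaded
          compute_cluster_id: on-prem-dc1
      autoscale_config:
        min_replicas: 1                   # applies to the primary nodepool only
        max_replicas: 4                   # applies per nodepool, not globally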

Here’s how model routing works:

  • Nodepools are treated as an ordered priority list. Requests are routed to the first nodepool by default.

  • A nodepool is considered fully loaded when queued requests exceed configured age or quantity thresholds and the deployment has reached its max_replicas, or the nodepool has reached its maximum instance capacity.

  • When this happens, the next nodepool in the list is automatically warmed and a portion of traffic is routed to it.

  • The deployment’s min_replicas applies only to the primary nodepool.

  • The deployment’s max_replicas applies independently to each nodepool, not as a global sum.

This approach enables high availability and predictable scaling without duplicating deployments or manually managing failover. Deployments can now span multiple compute pools while behaving as a single, resilient service.

Read more about Multi-Nodepool Deployment here.

Agentic Capabilities with MCP Support

Clarifai expands support for agentic AI systems by making it easier to combine agent-aware models with Model Context Protocol integration. Models can discover, call, and reason over both custom and open-source MCP servers during inference, while remaining fully managed on the Clarifai platform.

Agentic Models with MCP Integration

You can upload models with agentic capabilities by using the AgenticModelClass, which extends the standard model class to support tool discovery and execution. The upload workflow stays the same as existing custom models, using the same project structure, configuration files, and deployment process.
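
As a rough sketch of the shape this takes (the import path and method are assumptions extrapolated from Clarifai’s standard custom-model pattern, not the confirmed API):

    # model.py -- hypothetical sketch, not the confirmed AgenticModelClass API
    from clarifai.runners.models.agentic_model_class import AgenticModelClass  # path assumed

    class MyAgenticModel(AgenticModelClass):
        def load_model(self):
            # Load weights as usual; MCP tool discovery and the tool-calling
            # loop are handled by the agentic base class during inference.
            ...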

Agentic models are configured to work with MCP servers, which expose tools that the model can call during inference.

Key capabilities include:

  • Iterative tool calling within a single predict or generate request

  • Tool discovery and execution handled by the agentic model class

  • Support for both streaming and non-streaming inference

  • Compatibility with the OpenAI-compatible API and Clarifai SDKs (see the example after this list)
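
For instance, here is what calling an agentic model through the OpenAI-compatible API looks like with the standard openai Python client; the base URL and model path are illustrative placeholders:

    import os
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.clarifai.com/v2/ext/openai/v1",  # Clarifai's OpenAI-compatible endpoint
        api_key=os.environ["CLARIFAI_PAT"],  # a Clarifai Personal Access Token
    )

    response = client.chat.completions.create(
        model="your-user/your-app/models/your-agentic-model",  # illustrative model path
        messages=[{"role": "user", "content": "Summarize today's AI news."}],
    )
    print(response.choices[0].message.content)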

A complete example of uploading and running an agentic model is available here. This repository shows how to upload a GPT-OSS-20B model with agentic capabilities enabled using the AgenticModelClass.

Deploying Public MCP Servers on Clarifai

Clarifai already supported deploying custom MCP servers, allowing teams to build their own tool servers and run them on the platform. This release expands that capability by making it easy to deploy public MCP servers directly on the platform.

Public MCP servers can now be uploaded using a simple configuration, without requiring teams to host or manage the server infrastructure themselves. Once deployed, these servers can be shared across models and workflows, allowing agentic models to access the same tools.
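
Once a server is deployed, any MCP-capable client can connect to it over HTTP. A sketch using the fastmcp client library, assuming an endpoint URL of this general shape (the exact pattern for your server is shown in the platform):

    import asyncio
    import os

    from fastmcp import Client
    from fastmcp.client.transports import StreamableHttpTransport

    # URL pattern is an assumption; copy the real endpoint of your deployed server.
    transport = StreamableHttpTransport(
        url="https://api.clarifai.com/v2/ext/mcp/v1/users/YOUR_USER/apps/YOUR_APP/models/YOUR_MCP_SERVER",
        headers={"Authorization": "Bearer " + os.environ["CLARIFAI_PAT"]},
    )

    async def main():
        async with Client(transport) as client:
            tools = await client.list_tools()
            print([tool.name for tool in tools])  # tools exposed by the server

    asyncio.run(main())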

This example demonstrates how to deploy a public, open-source MCP server on Clarifai as an API endpoint.

Pay-As-You-Go Billing with Prepaid Credits

We’ve launched a new Pay-As-You-Go (PAYG) plan to make billing simpler and more predictable for self-serve users.

The PAYG plan has no monthly minimums and far fewer feature gates. You prepay credits, use them across the platform, and pay only for what you consume. To improve reliability, the plan also includes auto-recharge, so long-running jobs don’t stop unexpectedly when credits run low.

To help you get started, every verified user receives a one-time $5 welcome credit, which can be used across inference, Compute Orchestration, deployments, and more. You can also claim an additional $5 for your organization.

If you’d like a deeper breakdown of how prepaid credits work, what’s changing from earlier plans, and why we made this shift, get more details in this blog.

Clarifai as an Inference Provider in the Vercel AI SDK

Clarifai is now available as an inference provider in the Vercel AI SDK. You can use Clarifai-hosted models directly through the OpenAI-compatible interface in @ai-sdk/openai-compatible, without changing your existing application logic.

This makes it easy to swap in Clarifai-backed models for production inference while continuing to use the same Vercel AI SDK workflows you already rely on. Learn more here.

New Reasoning Models from the Ministral 3 Family

We’ve published two new open-weight reasoning models from the Ministral 3 family on Clarifai:

  • Ministral-3-3B-Reasoning-2512

    A compact reasoning model designed for efficiency, offering strong performance while remaining practical to deploy on realistic hardware.

  • Ministral-3-14B-Reasoning-2512

    The largest model in the Ministral 3 family, delivering reasoning performance close to much larger systems while retaining the benefits of an efficient open-weight design.

Both models are available now and can be used across Clarifai’s inference, orchestration, and deployment workflows.

Additional Changes

Platform Updates

We’ve made a few targeted improvements across the platform to improve usability and day-to-day workflows.

  • Added cleaner filters in the Control Center, making charts easier to navigate and interpret.

  • Improved the Team & Logs view to ensure today’s audit logs are included when selecting the last 7 days.

  • Enabled stopping responses directly from each panel when using Compare mode in the Playground.

Python SDK Updates

This release includes a broad set of improvements to the Python SDK and CLI, focused on stability, local runners, and developer experience.

  • Improved reliability of local model runners, including fixes for vLLM compatibility, checkpoint downloads, and runner ID conflicts.

  • Introduced better artifact management and interactive config.yaml creation during the model upload flow.

  • Expanded test coverage and improved error handling across runners, model loading, and OpenAI-compatible API calls.

Several additional fixes and improvements are included, covering dependency upgrades, environment handling, and CLI robustness. Learn more here.

Ready to Start Building?

You can start building with Clarifai Pipelines today to run long-running, multi-step workflows directly on the platform. Define steps, upload them with the CLI, and monitor execution across your compute.

For production deployments, model routing lets you scale across multiple nodepools and clusters with built-in spillover and high availability.

If you’re building agentic systems, you can also enable agentic model support with MCP servers to give models access to tools during inference.

Pipelines are available in public preview. We’d love your feedback as you build.


