Announcement

Jul 9, 2025

Introducing Cerebrium run: The Fastest Way to Execute Cloud Code

Michael Louis

Founder & CEO

One of the biggest challenges developers face today—especially when working with AI and cloud applications—is the time it takes to go from idea to execution. Waiting for infrastructure to spin up, setting up local environments, and managing GPU access can all slow down iteration dramatically.

At Cerebrium, we’ve been building infrastructure that gets out of your way so you can move fast. Previously, our customers could deploy their application and have it live in the cloud in 8-10 seconds - acceptable for many, but we still weren’t satisfied.

Introducing cerebrium run - a dead-simple way to execute code in the cloud in 1-2 seconds with no provisioning delays and no CI/CD.

Whether you’re testing, debugging, or iterating on new features, cerebrium run is your shortcut to speed and flexibility. Backed by CPUs and GPUs, with access to your secrets, persistent storage, and real-time logs, it makes development dramatically better!

With cerebrium run, you can:

  • Quickly iterate on application logic without worrying about CI/CD or the spinning wheel of death

  • Run unit tests in your actual production environment, with real secrets and volumes

  • Trigger one-off tasks like compiling TensorRT engines, preprocessing data, or running migrations

  • Leverage GPU acceleration for any compute-heavy job, without provisioning or scaling anything

Here’s how it works:

Say you have a simple function like this:

def squared(value: int):
    # Cast defensively in case the CLI passes the value through as a string
    result = int(value) ** 2
    print(f"The square of {value} is {result}.")
    return {"result": result}

With cerebrium run, you can execute it remotely in the cloud with the following command:

cerebrium run main.py::squared --value 2

Within ~2 seconds, your code runs in the cloud, and you’ll see real-time logs streamed back to your CLI.

If you want to change hardware resources, simply add a cerebrium.toml file to the directory. You also have access to all the typical Cerebrium features, such as accessing secrets stored through your dashboard or writing to your persistent storage volume.
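
For illustration, a minimal cerebrium.toml might look roughly like the sketch below; the section and field names follow Cerebrium's config format but are placeholders here, so check the docs for the exact schema:

[cerebrium.deployment]
name = "my-app"            # illustrative app name
python_version = "3.11"

[cerebrium.hardware]
cpu = 2                    # vCPUs
memory = 16.0              # GB of RAM
compute = "AMPERE_A10"     # GPU type; CPU-only jobs can leave this out

Let’s look at an example of copying a file from an S3 bucket to a Cerebrium storage volume.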

import os
import boto3

def run():
    # Get credentials from environment
    aws_access_key = os.environ.get("AWS_ACCESS_KEY_ID")
    aws_secret_key = os.environ.get("AWS_SECRET_ACCESS_KEY")
    bucket_name = "my-cerebrium-bucket"
    object_key = "test-file.txt"

    # Set up S3 client
    s3 = boto3.client(
        "s3",
        aws_access_key_id=aws_access_key,
        aws_secret_access_key=aws_secret_key,
    )

    # Download the file to the persistent storage volume
    local_path = f"/persistent-storage/{object_key}"
    s3.download_file(bucket_name, object_key, local_path)

    print(f"Downloaded {object_key} to {local_path}")

    return {"status": "success", "path": local_path}
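
Assuming the function above lives in main.py, you would kick it off the same way as before:

cerebrium run main.py::run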

Once you’re satisfied, you can run cerebrium deploy and your code becomes a REST endpoint that can scale to thousands of containers in seconds. The difference between cerebrium run and cerebrium deploy is that the former is ephemeral: it runs your code once and then goes away, while the latter keeps your application live behind an autoscaling endpoint.
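
As a rough sketch of that flow (the endpoint URL and auth header below are placeholders - copy the real ones from your Cerebrium dashboard):

cerebrium deploy

curl -X POST https://api.cortex.cerebrium.ai/v4/<project-id>/<app-name>/run \
  -H "Authorization: Bearer <YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{}'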

Why We Built This

We built cerebrium run because testing and executing cloud code shouldn’t require a full deployment pipeline or manual infrastructure setup. Developers deserve a tool that’s as fast and flexible as their local machine — but with the power of the cloud.

Everything runs inside an isolated, serverless environment with access to your secrets, storage volumes, and environment variables - exactly like your production setup.

In short: Our customers needed something that makes cloud development feel local, and we built it.

Still Early — But Powerful

This is the first version of cerebrium run, and there’s a lot more coming:

  • Even faster performance

  • Support for Dockerfiles

  • Support for FastAPI apps

  • Support for background jobs and async functions

To try it out, install the latest CLI version (pip install --upgrade cerebrium) and test it on any Python file.

If you have feedback or ideas, we’d love to hear from you — we’re building this for you.
