Build and deploy AI models quickly.

A machine learning framework that makes it easy to train, deploy, and monitor models with just a few lines of code.
$10 free credit - no credit card required
Trusted & Backed By Our Users
01 · Deploy

Serverless GPU Model Deployment

Deploy all major ML frameworks, such as PyTorch, ONNX, and XGBoost, with one line of code. Don't have your own models? Deploy our prebuilt models, which have been optimised to run with sub-second latency.
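As a rough illustration of that one-line workflow, here is a minimal sketch that deploys a saved XGBoost model; the deploy() helper, model_type constant, and argument order shown are assumptions made for this example, not the documented API.

# Illustrative sketch only: deploy(), model_type and the argument order are
# assumptions for this example, not the documented Cerebrium API.
from cerebrium import deploy, model_type

endpoint = deploy(
    (model_type.XGBOOST_CLASSIFIER, "model.xgb"),  # framework flavour + saved model file
    "fraud-classifier",                            # name of the deployment
    "<CEREBRIUM_API_KEY>",                         # your project API key
)
print(endpoint)  # REST endpoint you can now call for sub-second inference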
Pre/post-processing functions
Add logic before and after your models to transform inputs and outputs into the right format (see the sketch after these highlights).
Automatic Versioning
Every time you deploy a model, we version it so you can roll back with the click of a button.
Cold starts < 1s
We have put a lot of work into our architecture so that cold starts, even for LLMs like Flan-T5 and GPT-NeoX, are under 1 second.
Powerful serverless GPUs
We run a combination of NVIDIA T4s, A10s, and A100s, and only charge you for your inference time.
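To make the pre/post-processing idea concrete, here is a minimal sketch; the hook names and signatures are illustrative assumptions, not the documented interface.

# Illustrative sketch only: the hook names and signatures are assumptions,
# not the documented Cerebrium interface.
def pre_process(raw_request: dict) -> list:
    # Reshape the incoming JSON payload into the flat feature vector the model expects.
    return [float(x) for x in raw_request["features"]]

def post_process(prediction: float) -> dict:
    # Map the raw model score back into an API-friendly response.
    return {"label": int(prediction > 0.5), "score": prediction}

Hooks like these keep clients decoupled from the exact input and output formats a model expects.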
02 · Training

Effortless fine-tuning (Beta)

Fine-tune smaller models on specific tasks to decrease cost and latency while increasing performance. It takes just a few lines of code, and you don't need to worry about infrastructure: we handle it.
Join Training Beta
from cerebrium import trainer

hyperparameters = {
    "num_train_epochs": 30,
    "num_warmup_steps": 100,
    "batch_size": 15,
    "weight_decay": 0.01
}

run_id = trainer("gpt-neox-20b", "train_dataset.jsonl", hyperparameters)

# Output:
# Training job started! Est time: 2h32m
# run_id = b5440db1-f739-47f9-9fd5-083753643019

The same call works with other base models, e.g. flan-t5-xl (est. 1h52m) and gpt-j-6b (est. 2h32m).
03 · Monitoring

Monitoring made simple

Integrate with top ML observability platforms to be alerted about feature or prediction drift, compare model versions, and resolve issues quickly.
Set up monitors and alerts
Create thresholds so your team can be alerted about issues with your models, as in the sketch after these highlights.
Monitor Feature and Prediction Drift
Discover the root causes of prediction and feature drift to resolve degraded model performance.
Compare model versions
Compare different versions of your models against each other and against your baseline model.
Explainability
Understand which features contribute most to the performance of your model.
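For a sense of what threshold-based alerting can look like, here is a hedged sketch; the drift metric, threshold value, and send_alert helper are assumptions for illustration and not tied to any particular observability platform's API.

# Illustrative sketch only: the PSI metric, threshold and send_alert helper are
# assumptions showing threshold-based drift alerting, not a specific platform API.
PREDICTION_DRIFT_THRESHOLD = 0.15  # population stability index (PSI) above this triggers an alert

def send_alert(message: str) -> None:
    # Stand-in for a Slack, PagerDuty or webhook integration.
    print(f"[ALERT] {message}")

def check_prediction_drift(psi_score: float) -> None:
    # Compare the latest drift score against the team's agreed threshold.
    if psi_score > PREDICTION_DRIFT_THRESHOLD:
        send_alert(f"Prediction drift PSI={psi_score:.2f} exceeds {PREDICTION_DRIFT_THRESHOLD}")

check_prediction_drift(0.22)  # prints: [ALERT] Prediction drift PSI=0.22 exceeds 0.15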

Loved by engineers everywhere

I have been looking for something like this - it's just amazing!

Teodor
Machine learning engineer

Inference for Flan-T5 is consistently 500ms - I never get cold starts

Ondrej
Machine learning engineer

Sample size is very small, but performance is lovely so far!

Farouq
Software Engineer

Just discovered @cerebriumai and my life is never going to be the same!

Andrew
Entrepreneur