I have been looking for something like this - it's just amazing!
Inference for Flan-T5 is consistently 500 ms - I never get cold starts.
Sample size is very small, but performance is lovely so far!
Just discovered @cerebriumai and my life is never going to be the same!