FlexiFlow

Adaptive model-switching for reliable and efficient end-to-end ML workflows.

About

FlexiFlow is a programming model for automatic model switching, designed to maximize performance in machine learning workflows.

With increased use of machine learning in production systems, engineers face practical challenges in deploying and maintaining ML models and workflows. A recurring issue is soft failures: situations where a model does not crash but returns degraded predictions, often due to factors such as data drift.

In multi-step workflows, these failures reduce end-to-end quality and increase operational burden. Static model selection at each step often fails to preserve accuracy across diverse real-world inputs. If one model underperforms, re-running the entire workflow with alternatives can significantly increase latency and cost.

FlexiFlow introduces a dataflow approach that dynamically switches to alternative models when the current model shows low accuracy. It learns a ranking of models through a multi-armed bandit strategy that incorporates runtime, assertion pass probability, and workflow structure.
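To make the idea concrete, the following is a minimal sketch of bandit-style model ranking for a single workflow step. It is not FlexiFlow's actual algorithm or API. The names ModelArm, ModelSelector, run_step, runtime_weight, and explore are hypothetical, the score is a simple UCB-style heuristic chosen for illustration, and workflow structure is left out for brevity.

```python
import math
import time
from dataclasses import dataclass


@dataclass
class ModelArm:
    """Running statistics for one candidate model (hypothetical helper)."""
    name: str
    passes: int = 0
    trials: int = 0
    total_runtime: float = 0.0

    def pass_rate(self) -> float:
        # Laplace-smoothed estimate, so an untried model starts at 0.5.
        return (self.passes + 1) / (self.trials + 2)

    def mean_runtime(self) -> float:
        # Assumed default of 1.0 second for a model that has never run.
        return self.total_runtime / self.trials if self.trials else 1.0


class ModelSelector:
    """Ranks candidate models by pass rate, exploration bonus, and runtime."""

    def __init__(self, names, runtime_weight=0.1, explore=1.0):
        self.arms = {n: ModelArm(n) for n in names}
        self.runtime_weight = runtime_weight  # penalty per second of runtime
        self.explore = explore                # strength of the exploration bonus

    def score(self, arm: ModelArm) -> float:
        total = sum(a.trials for a in self.arms.values())
        # UCB-style bonus: larger for models that have been tried less often.
        bonus = self.explore * math.sqrt(math.log(total + 1) / (arm.trials + 1))
        return arm.pass_rate() + bonus - self.runtime_weight * arm.mean_runtime()

    def rank(self):
        # Highest score first.
        return sorted(self.arms, key=lambda n: self.score(self.arms[n]), reverse=True)

    def update(self, name: str, passed: bool, runtime: float) -> None:
        arm = self.arms[name]
        arm.trials += 1
        arm.passes += int(passed)
        arm.total_runtime += runtime


def run_step(selector, models, assertion, x):
    """Try models in ranked order and switch to the next one when the assertion fails."""
    for name in selector.rank():
        start = time.perf_counter()
        y = models[name](x)
        elapsed = time.perf_counter() - start
        ok = assertion(y)
        selector.update(name, ok, elapsed)
        if ok:
            return name, y
    raise RuntimeError("no candidate model satisfied the assertion")


# Usage with two toy "models": one always violates the assertion, one satisfies it.
models = {"small": lambda x: -1, "large": lambda x: x * 2}
selector = ModelSelector(models)
chosen, output = run_step(selector, models, lambda y: y >= 0, 3)
```

In this sketch, only the failing step is retried with the next-ranked model, so the rest of the workflow does not have to be re-run. Each outcome updates the statistics used for later rankings.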

Figure: FlexiFlow architecture.

Docs

Documentation links will be published with the public repository release.

Code

The FlexiFlow repository is currently being prepared for open-source release. The Dockerfile used to build the FlexiFlow environment will be available in the repository root.

Papers

Upcoming papers will be added here.

Team

Collaborators