Designing a User-Friendly ML Platform with Django

Machine Learning Mar 05, 2020

When building machine learning systems, user experience is rarely a first-order consideration. Typically, accuracy, performance, and scalability take precedence over how easy it is for users to get predictions out of a model.

And yet, no matter how good your DeepBERTceptionCycleGANv7 is, people outside ML research circles may hesitate to use it if the process for invoking inference reads like some dark magic incantation. We've all seen a README that contains a command like this:

# It's easy to run inference! Just substitute in your own settings.
python bertception.py \
    --mode=infer \
    --input_file=where_did_these_preprocessed_features_come_from.hdf5 \
    --checkpoint_path=$MODEL_DIR/model.ckpt-2e16 \
    --config_file=$MODEL_DIR/config.json \
    --vocab_file=$VOCAB_DIR/requires_separate_pipeline.vocab \
    --batch_size=256 \
    --beam_size=10 \
    --boot_size=10.5 US \
    --your_ml_phd_from=<stanford|mit|cmu> \
    --verbose
"I think I'll just stick to my linear regression model."

Beyond the CLI

Here at Reverie Labs, we're working to put ML and other computational tools at the core of the drug discovery process. In addition to engineers, our team includes scientists with chemistry, physics, and other non-ML backgrounds. Because these team members need to run our models on a daily basis, even a well-factored Bash command creates too much friction to be practical.

Last year, the engineering team decided it was time to create a user-friendly platform to make it dead simple for team members to get predictions from our ML models. We already had robust infrastructure in place for serving containerized models via Docker and Kubernetes (which Ankit described in detail in a recent post!). It was just a matter of creating a web interface to allow users to interact with this infrastructure.

To tackle this problem, we turned to Django, a popular, open-source web framework. For us, Django comes with a number of advantages over similar frameworks:

  • Django is built on Python, which has also become the standard language for machine learning. It's easy for ML engineers like me who don't have extensive web dev experience to pick up Django.
  • Django bills itself as "The Web framework for perfectionists with deadlines." For a startup like ours, the ability to rapidly prototype an idea without investing too many resources upfront is key.
  • Django projects can contain multiple semi-independent applications. This structure makes it easy to ship new apps with entirely different functionality.
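To make that last point concrete, here is a rough sketch of how a project-level urls.py might mount several independent apps under their own URL prefixes. (The app names here are hypothetical illustrations, not our actual codebase.)

```python
# project/urls.py: each internal tool lives in its own Django app,
# mounted under its own URL prefix.
from django.urls import include, path

urlpatterns = [
    path("predictions/", include("predictions.urls")),
    path("visualization/", include("visualization.urls")),
    path("retrosynthesis/", include("retrosynthesis.urls")),
]
```

Because each app carries its own URLs, views, and templates, shipping a new tool is mostly a matter of adding one line to this list.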

Using Django, we were able to put together a V0 of our ML serving platform in under two weeks. Today, Reverie scientists routinely use this platform to run any model in our rapidly-growing suite of predictive tools with the ease of ordering online takeout.
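At its core, the web layer mostly translates a form submission into a request against the model-serving backend. A minimal, hypothetical helper captures the idea (the function names, payload schema, and endpoint are assumptions for illustration, not our actual API):

```python
import json
from urllib import request


def build_inference_payload(model_name, smiles_list, batch_size=256):
    """Package a user's form inputs into the JSON body expected by a
    (hypothetical) model-serving endpoint."""
    return {
        "model": model_name,
        "inputs": smiles_list,
        "batch_size": batch_size,
    }


def submit_inference(endpoint, payload):
    """POST the payload to the serving endpoint and return the parsed
    JSON response."""
    req = request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


# Inside a Django view, the flow would be roughly:
#   payload = build_inference_payload(form.cleaned_data["model"],
#                                     form.cleaned_data["smiles"])
#   results = submit_inference(settings.MODEL_SERVING_URL, payload)
```

The point is that the user never sees checkpoints, vocab files, or beam sizes; sensible defaults live in the view, and the form exposes only what a scientist actually needs to change.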

What's more, our Django platform has expanded to become an entire ecosystem of internal apps spanning the drug discovery process: from visualizing protein-ligand interactions to assisting chemists with retrosynthetic planning.

Hands-on with Reverie's Django Platform

In January, we gave a talk at DjangoBoston, an awesome meetup where Boston-based Django practitioners gather to showcase their work and share insights. The talk goes into more technical detail about our Django platform, along with some background on the Reverie team for those who are curious to learn more about who we are and what we do.

We're hiring!

We’re actively hiring engineers across our tech stack, including Full Stack Engineers, DevOps Engineers, Cloud Architects, and Infrastructure Engineers to work on exciting challenges that are critical to our approach to developing life-saving cancer drugs. You will work with a deeply technical (all engineers and scientists!) YC-backed team that is growing in size and scope. You can read more about us at www.reverielabs.com, and please reach out if you’re interested in learning more.

Gabe Grand

Gabe is an ML engineer at Reverie Labs. He has worked on the TensorFlow team at Google AI and is an open-source TF contributor. He graduated from the Harvard CS program in 2018.