10 tips for data scientists who don't have much experience productionizing ML applications

ML systems are increasingly used in day-to-day applications. Data scientists often spend a lot of time designing the model but little on what happens to it post-deployment. In contrast, the software development life cycle emphasizes testing and monitoring production systems after deployment. ML systems will flourish and make more impact if they adopt the best practices of software engineering.
Machine learning systems can directly or indirectly influence the customers they are intended to serve. A misclassification could be catastrophic for an individual customer. It is therefore prudent for data scientists to pay meticulous attention to testing and production.
This article is an extension of a tweet I posted. I would like to thank the Practical AI podcast members and Tania Allard for sharing useful tips.

Tips on testing and production — ask these questions

  1. ML system testing should cover the entire pipeline rather than only checking the performance of the model post-deployment (see the pipeline test sketch below).
  2. Check whether the model is deterministic post-deployment. Is there any deviation from expectations or from what was seen during training? (See the determinism check below.)
  3. Keep an eye on infrastructure — constantly monitor the compute time for predictions: is resource usage increasing? Many cloud vendors offer built-in dashboards for instance workloads. Also, make sure elasticity is in place to scale up in case of congestion. (See the latency logging sketch below.)
  4. Monitor implicit bias — are certain cohorts of the population or certain data points more influential? This goes beyond checking correlations and should include spotting changes in the input distribution with respect to the target variable using techniques such as density plots and skewness detection. (See the drift detection sketch below.)
  5. Continuous integration/continuous delivery is an important aspect that data scientists often disregard. Determine how long the current model will stay in production, implement versioning, and plan for the next release.
  6. Draw an impact matrix — make sure the system maintains privacy and compliance. Even the transformations applied to the data should preserve these guarantees.
  7. Comparing the production model against a baseline model is mandatory. The baseline comparison should not be restricted to model selection; it should also be used to assess the deployed version. (See the baseline comparison sketch below.)
  8. Reproducibility is key from an implementation perspective. Is the system reproducible at all stages of deployment? How easy is it to replicate in a different environment? (See the seeding sketch below.)
  9. Make sure explainability exists — what is the use of building a model that is a black box? Choose an explainable model over an opaque one. The system should be able to explain why a customer was given a particular score during an audit. (See the feature importance sketch below.)
  10. Collaborate — every ML system's development depends on the sponsoring organization; the system can be integrated into an existing product or delivered as a new independent one. It is important to perform regression testing and to collaborate with different teams during and after deployment. For instance, data scientists need to work with a DevOps engineer to deploy the model.
Rubric — scoring guide:
0 points — no testing is in place to check the above.
1 point — a data scientist manually checks the above points; many current ML systems fall into this category.
2 points — the above steps are versioned and automated; only a scant fraction of current production ML systems are implemented this way. Automating the tracking of ML systems is undeniably a tough task, but the rewards are worth it.
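
On tip 1, here is a minimal pytest-style sketch of pipeline-level testing. The module `mypackage.pipeline`, its `preprocess` and `load_model` functions, and the model path are hypothetical placeholders for your own project layout.

```python
# test_pipeline.py: a pipeline-level test sketch (hypothetical project layout).
import numpy as np

from mypackage.pipeline import preprocess, load_model  # hypothetical module


def test_preprocess_handles_missing_values():
    # The feature step should not emit NaNs for rows with missing fields.
    raw = [{"age": 34, "income": None}, {"age": None, "income": 52000}]
    features = preprocess(raw)
    assert not np.isnan(features).any()


def test_end_to_end_prediction_shape_and_range():
    # Exercise the full path: raw input -> features -> prediction.
    model = load_model("models/model-v1.pkl")  # hypothetical path
    preds = model.predict_proba(preprocess([{"age": 34, "income": 52000}]))
    assert preds.shape == (1, 2)
    assert np.all((preds >= 0) & (preds <= 1))
```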
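
On tip 2, a minimal determinism check: feed the same batch through the deployed model several times and confirm the outputs agree. The `model` and `X_sample` arguments stand in for your own estimator and a held-out batch of inputs.

```python
import numpy as np


def check_determinism(model, X_sample, n_runs=5, atol=1e-8):
    """Return True if repeated predictions on identical input agree."""
    baseline = model.predict(X_sample)
    for _ in range(n_runs - 1):
        if not np.allclose(baseline, model.predict(X_sample), atol=atol):
            return False
    return True
```

Comparing these outputs against predictions recorded at training time also helps catch training/serving skew.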
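
On tip 3, a minimal latency-logging wrapper around the prediction call; in practice the timings would feed a metrics store or the cloud vendor's dashboard rather than a plain log.

```python
import logging
import time

logger = logging.getLogger("inference")


def timed_predict(model, X):
    # Log wall-clock latency per prediction batch for later monitoring.
    start = time.perf_counter()
    preds = model.predict(X)
    elapsed_ms = (time.perf_counter() - start) * 1000
    logger.info("predict latency_ms=%.2f batch_size=%d", elapsed_ms, len(X))
    return preds
```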
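
On tip 4, one simple way to spot input distribution changes is a two-sample Kolmogorov-Smirnov test from scipy. Here `train_col` and `live_col` are assumed to be 1-D arrays holding the same feature at training time and in production.

```python
from scipy.stats import ks_2samp


def feature_drifted(train_col, live_col, alpha=0.01):
    # A small p-value suggests the live distribution differs from training.
    stat, p_value = ks_2samp(train_col, live_col)
    return p_value < alpha
```

Pair the test with density plots so a flagged feature can be inspected visually, and run it per cohort to surface bias against specific groups.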
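
On tip 7, a minimal sketch of comparing the deployed model against a trivial baseline on the same labelled holdout batch; the names and the margin are illustrative.

```python
from sklearn.dummy import DummyClassifier
from sklearn.metrics import f1_score


def beats_baseline(deployed_model, X_holdout, y_holdout, margin=0.05):
    # The dummy classifier only learns the majority class here.
    baseline = DummyClassifier(strategy="most_frequent").fit(X_holdout, y_holdout)
    f1_deployed = f1_score(y_holdout, deployed_model.predict(X_holdout))
    f1_baseline = f1_score(y_holdout, baseline.predict(X_holdout))
    return f1_deployed >= f1_baseline + margin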
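
On tip 8, a minimal helper that pins the common sources of randomness so a run can be replicated in another environment. Extend it with library-specific seeds (for example torch.manual_seed) as your stack requires, and pin dependency versions alongside it.

```python
import os
import random

import numpy as np


def set_seeds(seed=42):
    # Fix hashing, stdlib, and numpy randomness for repeatable runs.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
```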
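
On tip 9, a minimal explainability sketch using scikit-learn's permutation importance to rank the features that drive a score; the fitted `model`, the validation data, and `feature_names` are assumptions about your setup.

```python
from sklearn.inspection import permutation_importance


def top_features(model, X_val, y_val, feature_names, k=5):
    # Features whose shuffling hurts the score most matter most.
    result = permutation_importance(
        model, X_val, y_val, n_repeats=10, random_state=0
    )
    ranked = sorted(
        zip(feature_names, result.importances_mean),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]
```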
