Sunday, September 15, 2019

Machine Learning System Architecture

What is Architecture?

In simple terms, architecture is the way software components are arranged and how they interact with each other.

Why is it important at the start?

Maintaining ML systems is challenging. They carry all the technical debt of traditional software systems, plus issues of their own.

Clarity in planning and architecture design therefore helps to mitigate potential issues and errors.

A shared understanding of the system architecture and responsibilities is essential for effective cooperation between data science, engineering, and devops teams.

Specific Challenges of ML Systems

1. The need for reproducibility (versioning everywhere)
   This is essentially the ability to reproduce an ML model exactly. Depending on the business, it can be necessary for research, model improvements, audits, or regulatory reasons.

2. Entanglement
   If we change one input feature, the importance, weights, or use of the remaining features may all change as well. The challenge is that inputs are not independent; this is referred to as the "changing anything changes everything" principle.

3. Data dependencies

4. Configuration issues
   Models need to be improved incrementally and experimented with, which creates a temptation to build models on top of each other and introduce subtle dependencies. The challenge is to keep configurations flexible while making it easy to see the difference in configuration between two models. This is not straightforward and requires specific steps to be taken.

5. Data and feature preparation
   Systems run the risk of accumulating a massive amount of supporting code written only to get data into and out of the expected formats, e.g. for scikit-learn or TensorFlow consumption.

6. Model errors can be hard to detect with traditional tests.
 
7. Separation of expertise
   Data scientists develop the models, software engineers take those models and put them into applications, DevOps handles the deployments, and on the business side executives and product managers determine the requirements. In this context there is a risk of code being thrown over the wall from one department to another, with no one understanding the full process. Mitigating the resulting risk of errors and wasted time is important.
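
To make challenges 1 and 4 a little more concrete, here is a minimal Python sketch of one possible approach: keep each model's configuration as plain data, fingerprint it so that a trained model can be tied to an exact configuration, and diff two configurations to see what changed. The helper names (config_fingerprint, config_diff) and the example configurations are hypothetical, invented for illustration; this is not a prescribed method, only one way the idea could look.

```python
import hashlib
import json

# Sentinel used when a key exists in only one of the two configurations.
MISSING = "<missing>"


def config_fingerprint(config: dict) -> str:
    """Return a stable SHA-256 hex digest of a JSON-serialisable config.

    Keys are sorted before hashing, so key order does not affect the result.
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def config_diff(old: dict, new: dict) -> dict:
    """Map every top-level key whose value differs to an (old, new) pair."""
    keys = sorted(set(old) | set(new))
    return {
        key: (old.get(key, MISSING), new.get(key, MISSING))
        for key in keys
        if old.get(key, MISSING) != new.get(key, MISSING)
    }


# Hypothetical configurations for two versions of the same model.
model_v1 = {
    "features": ["age", "income"],
    "learning_rate": 0.1,
    "seed": 42,
    "data_version": "2019-09-01",
}
model_v2 = {
    "features": ["age", "income", "tenure"],
    "learning_rate": 0.1,
    "seed": 42,
    "data_version": "2019-09-01",
}

changes = config_diff(model_v1, model_v2)
print("Changed keys:", changes)
print("v1 fingerprint:", config_fingerprint(model_v1)[:12])
print("v2 fingerprint:", config_fingerprint(model_v2)[:12])
```

Storing the fingerprint next to the trained model artifact gives a cheap check that a model was built from the configuration you think it was, and the diff shows at a glance that only the feature list differs between the two versions. A real system would also need to version the training data and the code, which this sketch only hints at through the data_version field.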








