Sunday, September 15, 2019

Design Approaches to ML System Architecture

General ML Architectures

1. Train by batch, predict on the fly, serve via REST API.
     The model is trained and persisted offline, then loaded into a web application, where it gives real-time predictions for input data sent by the client via a REST API.
2. Train by batch, predict by batch, serve through a shared database.

3. Train and predict by streaming.

4. Train by batch, predict on mobile (or other client).





In Pattern 1, we can serve predictions in near real time, which also makes A/B testing easy.
One drawback is that, because prediction happens on the fly, we cannot use a slow algorithm, and scaling the service adds complexity.
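As a rough illustration of Pattern 1, here is a minimal sketch using only the Python standard library. The `ThresholdModel` class, the JSON persistence format, the `handle_predict` helper, and the port are all assumptions made for this example; a real system would more likely persist a model from an ML library and serve it from a proper web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class ThresholdModel:
    """Hypothetical stand-in for a model trained offline in a batch job."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, features):
        # Predict 1 when the mean of the features exceeds the threshold.
        return int(sum(features) / len(features) > self.threshold)


# Offline step: "train" and persist the model (here, as a JSON string).
PERSISTED_MODEL = json.dumps({"threshold": 0.5})

# Online step: load the model once at application startup, not per request.
MODEL = ThresholdModel(**json.loads(PERSISTED_MODEL))


def handle_predict(body: bytes) -> bytes:
    """Turn a JSON request body into a JSON prediction response."""
    features = json.loads(body)["features"]
    return json.dumps({"prediction": MODEL.predict(features)}).encode()


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = handle_predict(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Start the prediction service; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the service, and a client can then POST a body such as `{"features": [0.9, 0.8]}` to get a prediction back. Because `predict` runs inside the request, anything slow in the model shows up directly as response latency.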

In Pattern 2, the front end and the batch job can run on separate systems, so each can use different languages and frameworks. Model versions and prediction results are easier to manage, and we can use a slow, complex algorithm. On the other hand, there is a lag between ingesting data and having a prediction for it, so this pattern is not suitable for many types of consumer applications.
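A small sketch of Pattern 2, assuming SQLite as the shared database. The table layout, the `score` function, and the `model_version` column are hypothetical choices for this example.

```python
import sqlite3


def score(features):
    """Hypothetical model; a slow, complex algorithm could go here."""
    return sum(features) / len(features)


def run_batch_job(conn, rows, model_version):
    """Batch side: score every row and write the results to the shared table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS predictions ("
        "user_id TEXT PRIMARY KEY, score REAL, model_version TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO predictions VALUES (?, ?, ?)",
        [(user_id, score(features), model_version) for user_id, features in rows],
    )
    conn.commit()


def lookup_prediction(conn, user_id):
    """Front-end side: read a precomputed prediction; no model code needed."""
    return conn.execute(
        "SELECT score, model_version FROM predictions WHERE user_id = ?",
        (user_id,),
    ).fetchone()  # None if the batch job has not scored this user yet


conn = sqlite3.connect(":memory:")
run_batch_job(conn, [("alice", [1.0, 3.0]), ("bob", [0.0, 1.0])], "v1")
```

The front end only runs a keyed lookup, so it could be written in any language that can talk to the database. A user who arrives after the last batch run gets `None`, which is the lag this pattern trades for its simplicity.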

In Pattern 3, we can predict with very low latency and update the model interactively. On the downside, this requires fairly complex infrastructure.
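To give a feel for Pattern 3, here is a toy sketch of a model that predicts on each event and then learns from it. The `OnlineLinearModel` class and the synthetic stream are assumptions for this example, and a plain Python list stands in for the streaming infrastructure, which is the hard part in practice.

```python
class OnlineLinearModel:
    """Linear regression trained one event at a time with SGD."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, x, y):
        # One gradient step on the squared error for this single event.
        error = self.predict(x) - y
        self.weights = [
            w - self.learning_rate * error * xi for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error


def process_stream(model, events):
    """Predict on each event first, then learn from its label."""
    for x, y in events:
        yield model.predict(x)
        model.update(x, y)


# Synthetic stream (an assumption for this sketch) following y = 2 * x.
model = OnlineLinearModel(n_features=1)
stream = [([x], 2.0 * x) for _ in range(2000) for x in (0.0, 0.5, 1.0)]
predictions = list(process_stream(model, stream))
```

The first prediction comes from an untrained model, and later ones improve as events flow through, with no separate batch training step.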

In Pattern 4, prediction latency is low, but the model is tightly coupled to the device, so we are limited to the algorithms that are available on the device.
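The coupling in Pattern 4 can be sketched as follows, with Python standing in for the client-side code. The JSON artifact format, the `model_type` field, and the `SUPPORTED_ON_DEVICE` set are hypothetical; real deployments typically use a dedicated on-device model format.

```python
import json
import math


def export_model(weights, bias):
    """Server side: package the batch-trained parameters for the client."""
    return json.dumps(
        {"model_type": "logistic_regression", "weights": weights, "bias": bias}
    )


# The client can only run the model types it ships inference code for.
SUPPORTED_ON_DEVICE = {"logistic_regression"}


def predict_on_device(artifact, features):
    """Client side: run inference locally, with no network round trip."""
    spec = json.loads(artifact)
    if spec["model_type"] not in SUPPORTED_ON_DEVICE:
        raise ValueError("unsupported model type: " + spec["model_type"])
    z = spec["bias"] + sum(w * x for w, x in zip(spec["weights"], features))
    return 1.0 / (1.0 + math.exp(-z))


artifact = export_model(weights=[1.0, -1.0], bias=0.0)
probability = predict_on_device(artifact, [2.0, 2.0])
```

Prediction is a local function call, which is where the low latency comes from. Shipping a new kind of model means shipping new client code as well, which is the coupling described above.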

Pattern 1 is the best trade-off for most cases.
