MACHINE LEARNING - From Research to Deployment

ABOUT

A journey from deep theoretical aspects of machine learning to professional deployment engineering.


OVERVIEW


The term 'machine learning' has been around for quite some time; in almost any field you can name today, this buzzword is hiding in some corner. Although ML may seem relatively new, its core concepts have been around since the early 20th century, and mathematicians have long studied them under the name of inverse problems.

How, then, did 'machine learning' become such a trending term?

Improved computing capabilities, together with elegant and efficient algorithms for the core mathematical operations, allowed the ML community to realize its research findings computationally. Another group of computer scientists then worked on adapting these computations to real-time settings, pushing towards deployment in real-life applications. As the gap between research and deployment kept narrowing, more funding flowed in and the field became ever more exciting.

In the last five years in particular, an enormous amount of work has gone towards bringing research and deployment closer together.

  • Computing derivatives lies at the heart of most optimization algorithms, and a great deal of research has gone into speeding up this computation. The ability to express these computations and their derivatives as directed acyclic graphs (DAGs) and evaluate them in parallel led to ML frameworks such as TensorFlow, Torch and Keras. These frameworks abstract away the performance-critical code, paving the way for researchers to pipe their programs towards deployment, which had previously been an uphill task. (A short code sketch after this list makes the idea concrete.)

  • Various open-source tools and libraries have been developed to ease mundane engineering tasks. Cross-platform builds are far easier with containerization tools like Docker: the code is compiled and tested inside the container and can then be expected to run across a range of operating systems and devices, whereas testing the code separately on each platform would have been time-consuming.

  • Efficient networking and storage have made computing resources on the cloud much more accessible, scaling to business settings at affordable prices. Moreover, with code migration made easier by Docker, one can even switch cloud providers with relative ease. All of these developments have made cloud set-up far simpler: once you know how to instantiate a virtual machine running some flavour of Linux, all sorts of applications can be built at production grade.

  • The availability of cheaper system-on-chip boards and programmer-friendly microcontrollers has pushed machine learning into embedded systems, enabling practical applications that break out of the software boundary and into the mechatronic world.
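
To make the first point concrete, here is a minimal sketch of reverse-mode automatic differentiation using PyTorch's autograd. The choice of framework and the toy objective are purely illustrative assumptions; TensorFlow and Keras expose the same mechanism of recording operations on a DAG and traversing it backwards.

    import torch

    # Leaf tensors flagged with requires_grad become nodes of the computation graph.
    x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
    w = torch.tensor([0.5, -1.0, 2.0], requires_grad=True)

    # Building the expression records a DAG of operations behind the scenes.
    y = (w * x).sum()          # forward pass: y = w . x
    loss = (y - 4.0) ** 2      # a simple scalar objective (illustrative only)

    # Reverse-mode traversal of the recorded DAG fills in the .grad fields.
    loss.backward()

    print(loss.item())   # value of the objective
    print(x.grad)        # d(loss)/dx, with no hand-written derivative
    print(w.grad)        # d(loss)/dw

Running this prints the objective value and the gradients of the loss with respect to x and w; it is this abstraction that lets the same research code be optimized and deployed across different hardware.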

It is now time to make use of all these advancements to strengthen the way we think about and solve problems. Future generations should not see ML as a separate field of study, but as something that augments their problem-solving capabilities.



THE MAKING ...


There are a number of tutorials and lectures on this subject out there, but the end-to-end pipeline for taking a solution from research to deployment is mostly available only in bits and pieces. MLR2d aims to organize this material to give you a comprehensive view. A view alone is not enough, though; we also need comprehensive skills to achieve technological perfection!

After completing my Masters in Simulation Sciences at RWTH Aachen, Germany, in 2017, I got a good taste of the bleeding-edge advancements in ML. I also realized that this field is far vaster than it appears and that there is a lot that needs to be mastered. In my attempt to achieve perfection, I have spent my time penetrating as deep into the subject as possible in two orthogonal directions - one towards mathematical research and the other towards deployment. The material grew voluminous, and remembering it all became quite challenging. This website was developed as I revised my notes, although less than half of them are currently in digital form.



WHAT TO GET FROM HERE?


These contents are ideal for graduate or doctoral students, or for anyone trying to do something serious with ML. The material has been categorized into three sections:

  1. Research: Starting from advanced functional analysis and moving into parts of applied mathematics
  2. Bridge: Important concepts that enable the adaptation of mathematical formulae into computational form
  3. Deploy: Everything required to adapt your solution to different kinds of production settings and extend it to various other devices.

Each category has several modules, arranged in an order that builds on the concepts of the previous ones. Each topic within a module is also available as an IPython notebook, along with links to code in my GitHub repository (where applicable). Since the contents were initially made for my own reference, they can be crisp yet quite extensive, and need to be studied from start to end for complete understanding. The idea is that once you study a section, you should be comfortable enough to follow the concepts published in journals, or to quickly understand the documentation or Stack Overflow answers related to that subject.

I have a GitHub repository for every active section in each module, containing the IPython notebook files of the contents. This website automatically renders the HTML version of these notebooks from the master branch of my git repository, so the contents stay perfectly in sync.

The contents are constantly being updated and still require several rounds of revision. Eventually, I want to turn this into a place where you can find precise, neat and well-organized information, so that anyone can master the subject comprehensively.

Before getting started, read through the overview of each block in the flowchart to understand the overall contents. It is recommended that you start from 'ML frameworks' and proceed in either direction to get a better picture. A penetrating approach has been adopted, so that as you proceed along the flow lines, deeper aspects are uncovered.





7th March 2019

Aravind

MODULES