TSM - Machine learning in the cloud

Roland Szabo - Junior Python Developer

One of the most important factors behind the huge successes of machine learning in the last 10 years has been the increase in computing power. In 2010, Dan Cireșan et al. set a new state of the art for handwritten digit recognition using an algorithm developed in the 1980s, augmenting the data set with a procedure described in 1990. The only difference was the amount of computing power: using a modern GPU, they finished training in one day, where a CPU might have taken 50 days.

In parallel with the increase in processor clock speed, however, the amount of information to be processed grew at an even faster rate. To deal with this, many cloud-based solutions have appeared, some offered by startups specializing in various areas of machine learning.

One of the first startups to offer machine learning services in the cloud was BigML, which launched about two years ago.

They started by offering decision trees and have since extended their product with various improvements, such as pruning strategies and ensemble methods.

After training a model, we can visualize it using two different types of diagrams, which help us see the influence each feature has on the outcome. One visualization is an interactive tree that we can walk, observing the decision made at each level. The other is called a "Sunburst diagram", and it shows how many data instances support each decision and the expected error for it.

Their service can be used in two ways: from a web interface and through an HTTP API.
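As a sketch of the HTTP route: all BigML resources (sources, datasets, models, predictions) share one URL scheme, authenticated through the query string. The flow below is illustrative rather than exact, and `my_user`/`my_key` are placeholder credentials:

```python
# Illustrative sketch of BigML's REST workflow; credentials are placeholders.
BIGML_AUTH = "username=my_user;api_key=my_key"  # hypothetical credentials
BASE = "https://bigml.io"

def endpoint(resource):
    # Sources, datasets, models and predictions all live under the
    # same URL scheme, authenticated via the query string.
    return "{0}/{1}?{2}".format(BASE, resource, BIGML_AUTH)

# Typical flow with an HTTP client such as `requests` (each POST
# returns JSON containing the id of the newly created resource):
#   1. requests.post(endpoint("source"), files={"file": open("grades.csv", "rb")})
#   2. requests.post(endpoint("dataset"), json={"source": source_id})
#   3. requests.post(endpoint("model"), json={"dataset": dataset_id})
#   4. requests.post(endpoint("prediction"),
#                    json={"model": model_id, "input_data": {"000001": 7}})

print(endpoint("model"))
```

The web interface performs the same four steps behind the scenes, so anything built by hand there can be reproduced programmatically.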

A decision tree for a model of the grades obtained by a student
Sunburst diagram for the same model

Ersatz Labs is a more recent startup, still in private beta, but they plan to go public around April or May.

The charts for accuracy and loss

Their specialization is deep learning. They offer various models of neural networks, for which you can set various hyperparameters before starting the training, which runs on GPUs.

They analyze the data you upload and, using some heuristics, suggest hyperparameter values that might work best for it. With only a few adjustments, I managed to build, in 5 minutes, a model that recognized letters and digits from receipts with 80% accuracy.

After the training is done, we can see how the accuracy, the cost function and the max norms of the neural network's weights evolved over the iterations. Using this information, we can fine-tune our hyperparameters to obtain better results.
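Ersatz's exact configuration format is behind the beta wall, but the knobs involved are the usual neural-network hyperparameters. A generic sketch of such a configuration, with one common fine-tuning step, might look like this (all names are illustrative, not Ersatz's actual fields):

```python
# Generic neural-network hyperparameters of the kind such services expose;
# the keys below are illustrative, not Ersatz's actual configuration fields.
hyperparams = {
    "hidden_layers": [512, 512],   # units per hidden layer
    "activation": "relu",          # sigmoid, relu, maxout...
    "learning_rate": 0.01,
    "momentum": 0.9,
    "max_norm": 2.0,               # cap on the norm of each weight vector
    "epochs": 100,
}

def halve_learning_rate(params):
    # A common fine-tuning step after inspecting the loss curve:
    # if training diverges or oscillates, halve the learning rate
    # and rerun. Returns a new dict, leaving the original intact.
    params = dict(params)
    params["learning_rate"] /= 2.0
    return params

tuned = halve_learning_rate(hyperparams)
```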

Even though beta invites usually arrive the same day, the fact that Ersatz is still in beta must be kept in mind. While testing the service, the first dataset I uploaded failed with a mysterious "Training failed" error. I contacted customer support and was told I had found a bug in Pylearn2, the Python library they use for neural networks. They fixed it in two days, but even after that, the service had some hiccups.

The models they offer so far are autoencoders for image data, recurrent neural networks for time series, neural networks with sigmoid or ReLU layers, and convolutional networks with or without maxout.

PredictionIO is a bit different from the other products on this list. Even though it is developed by TappingStone, which offers commercial support for it, the actual product is distributed on GitHub under an open source license.

PredictionIO is a recommendation engine built on scalable technologies such as MongoDB and Hadoop. Using historical data about a user's actions (such as viewing, buying or rating a product), it can recommend other products the user might be interested in.

The engine has two components. The first is a web interface where we can manage the algorithms used for making recommendations: we can select various algorithms (or implement our own custom ones), set their hyperparameters and run simulated evaluations. The other component is an HTTP API (with SDKs for various languages) through which we can add new users, products and actions, and then get recommendations.
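For illustration, a user-to-item action call might be assembled as below. The `pio_`-prefixed field names follow PredictionIO's HTTP API as documented around this time, but should be checked against the version you deploy; the app key is a placeholder:

```python
APP_KEY = "my_app_key"  # placeholder; issued per application in the web interface

def u2i_action(uid, iid, action):
    # Payload for POST /actions/u2i.json: records that user `uid`
    # performed `action` ("view", "like", "rate", "conversion"...)
    # on item `iid`. Field names as in PredictionIO's HTTP API of
    # early 2014; verify against your version's documentation.
    return {
        "pio_appkey": APP_KEY,
        "pio_uid": uid,
        "pio_iid": iid,
        "pio_action": action,
    }

payload = u2i_action("user42", "product7", "view")
# Recommendations for the same user would then be fetched with something like
# GET /engines/itemrec/<engine_name>/topn.json?pio_uid=user42&pio_n=5
```

The SDKs wrap exactly these HTTP calls, so the raw API is a reasonable mental model even when using a client library.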

Using MongoDB and Hadoop makes PredictionIO quite powerful, but also more complicated. If you want to scale up from the default Hadoop configuration, which runs on a single machine, you are on your own with managing the cluster. For the other services listed here, when you need more processing power, all you have to do is click a button in the browser (and switch to a more expensive plan).

AlchemyAPI offers deep learning services as well, but at a much higher level than Ersatz. You don't get to train neural networks on your own data; instead, they have pretrained networks for natural language processing tasks such as entity extraction, keyword finding, sentiment analysis, and identifying the author and language of a piece of text. All of this can be accessed through their API.

They don"t offer much in terms of customization, most of the service being already implemented. As long as we only use the languages for which they have support, it will work quite well, because the problems of entity extraction, sentiment analysis and others are general enough to work in any domain. However, when you want to use it on a language that is not well "known" to them, such as Romanian, the service doesn"t know what to answer.

AlchemyAPI can be used through the SDKs they offer for various languages, such as Python, PHP, Node.js, C++, Java and C#, which can then be integrated into our applications.
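Under the hood, the SDKs issue plain HTTP requests. A minimal entity-extraction request could be assembled as follows (the endpoint path and parameter names follow AlchemyAPI's public documentation at the time, but treat them as a hedged sketch; the key is a placeholder):

```python
ALCHEMY_KEY = "my_api_key"  # placeholder; obtained by registering with AlchemyAPI

ENTITY_ENDPOINT = ("http://access.alchemyapi.com/calls/text/"
                   "TextGetRankedNamedEntities")  # endpoint name as documented then

def entity_params(text):
    # Parameters for a POST to the entity-extraction endpoint; with
    # outputMode=json the response contains a ranked list of entities,
    # each with a type (Person, Company...) and a relevance score.
    return {"apikey": ALCHEMY_KEY, "text": text, "outputMode": "json"}

params = entity_params("Stephen Wolfram announced a new language.")
```

The other tasks (keywords, sentiment, language detection) follow the same pattern with a different endpoint path.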

Entities found by AlchemyAPI in a post about Stephen Wolfram's new language.

These are only a few of the cloud-based machine learning services. There are many others, ranging from the Google Prediction API (which is completely closed and doesn't say which algorithms it uses for making predictions) to ŷhat, which is exactly the opposite: they don't offer any algorithms, only a framework for Python and R with which you can build scalable solutions.