Machine Learning 101 with Microsoft ML.NET (part 3/3)

Daniel Costea
Senior Software Developer @ EU Agency

To conclude what we have covered so far: when building a model, selecting the trainer is not the most difficult part. AutoML can suggest a list of the best models, thanks to the evaluation metrics that accompany every model. What is much more complex (and time-consuming) is the data preparation which, together with the training pipeline, produces a model ready to make predictions.

After we have a Machine Learning model, we are ready to consume it and predict some data.

var sampleData = new ModelInput
{
  Luminosity = lux,
  Temperature = temp,
  Infrared = infra,
  CreatedAt = DateTime.Now
    .ToString("dd/MM/yyyy hh:mm:ss"),
  Distance = 0
};

var predictor = mlContext.Model
  .CreatePredictionEngine<ModelInput, ModelOutput>(model);

var predicted = predictor.Predict(sampleData);

Console.WriteLine(predicted.PredictedLabel);
Console.WriteLine(predicted.Score);

Whether we have just trained the model or loaded it from a previously saved file, all we have to do is instantiate a prediction engine with CreatePredictionEngine and call its Predict method; the result carries the PredictedLabel along with its Score.
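Saving a model and loading it back from a file can be sketched as follows. This is a minimal sketch, not the author's code: the ModelStore class and its parameter names are hypothetical, and it assumes the Microsoft.ML NuGet package is referenced.

```csharp
using Microsoft.ML;

// Hypothetical helper, introduced only for illustration.
public static class ModelStore
{
    // Persists a trained model together with the schema of the data it expects.
    public static void Save(
        MLContext mlContext, ITransformer model,
        IDataView trainingData, string path)
    {
        mlContext.Model.Save(model, trainingData.Schema, path);
    }

    // Loads a previously saved model; the input schema is returned
    // through the out parameter and ignored here.
    public static ITransformer Load(MLContext mlContext, string path)
    {
        return mlContext.Model.Load(path, out DataViewSchema inputSchema);
    }
}
```

The ITransformer returned by Load can be passed to CreatePredictionEngine exactly like a freshly trained model.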

Consume Model in Enterprise Scenario

The previous approach for prediction works well with simple scenarios (like console applications), but what happens when we want to scale up for a more complex scenario?

The PredictionEngine object is not thread-safe, so web applications, where many requests are served concurrently, need a more careful approach.

Our first impulse in multithreaded scenarios such as web applications (using HTTP or WebSockets) would be to register the PredictionEngine object as a singleton, but because it is not thread-safe, sharing one instance across requests can break the functionality. Creating and destroying a prediction engine for every request avoids that problem, but it is expensive. Therefore, we have to use another approach.

The Microsoft.Extensions.ML NuGet package provides an object called PredictionEnginePool, which offers a pool of initialized PredictionEngine objects that are ready to use.

To initialize the object pool, we call AddPredictionEnginePool on the service collection in the Startup.cs file.

public void ConfigureServices(
  IServiceCollection services)
{
  services
    .AddPredictionEnginePool<ModelInput, ModelOutput>()
    .FromFile(modelName: "Model",
      filePath: "Model.zip");
}

And then it can be consumed as follows:

public class PredictController : Controller
{
  private readonly
    PredictionEnginePool<ModelInput, ModelOutput> _engine;

  public PredictController(
    PredictionEnginePool<ModelInput, ModelOutput> engine)
  {
    _engine = engine;
  }

  [HttpPost]
  public IActionResult Post(ModelInput input)
  {
    ModelOutput prediction = _engine.Predict(
      modelName: "Model", example: input);

    return Ok(prediction);
  }
}

Machine learning models can be retrained and redeployed anytime, and this can introduce downtime for our application. Worry not, the PredictionEnginePool service provides a way to reload a retrained model without taking your application down.
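One way to get this reload behaviour is the watchForChanges flag of FromFile, which asks the pool to watch the model file and pick up a new version when it changes. The sketch below assumes the Microsoft.Extensions.ML package; the PredictionPoolSetup class is hypothetical, and the empty ModelInput and ModelOutput classes are stand-ins for the real ones used earlier.

```csharp
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.ML;

// Stand-ins for the article's classes; the real fields are omitted here.
public class ModelInput { }
public class ModelOutput { }

// Hypothetical helper, introduced only for illustration.
public static class PredictionPoolSetup
{
    public static IServiceCollection AddReloadablePool(
        IServiceCollection services)
    {
        services
            .AddPredictionEnginePool<ModelInput, ModelOutput>()
            // watchForChanges: true reloads Model.zip when the file
            // is replaced, without restarting the application.
            .FromFile(modelName: "Model",
                filePath: "Model.zip",
                watchForChanges: true);

        return services;
    }
}
```

With this registration, redeploying a retrained model amounts to overwriting Model.zip; the controller code stays unchanged.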

Extensions for Deep Learning

"ML.NET has been designed as an extensible platform. Therefore you can consume other popular ML frameworks (TensorFlow, ONNX, Infer.NET, and more)", the documentation says and this is how ML.NET is stepping into the Deep Learning world. ML.NET is not (yet) capable of training a deep learning model from scratch.

Consume TensorFlow Models

TensorFlow is an open-source framework for Deep Learning and Machine Learning created by Google in 2015. TensorFlow has support for Python, Java, C++, C# and other languages. For C#, you can work with the TensorFlow.NET SDK if you plan to build, train and infer Deep Learning models. TensorFlow.NET follows the Python naming conventions very closely and it is open source as well. Of course, it takes some time to learn and you need data science skills. What happens if you have neither the time nor the skills?

Currently, ML.NET (through the Microsoft.ML.TensorFlow NuGet package) is limited to scoring and transfer learning. The advantage is that you can use, in a very simple manner, TensorFlow models trained with other frameworks (Azure Custom Vision, Keras etc.) for various use cases like computer vision, image recognition, voice recognition, language translation, handwriting recognition and more. This matters because training a Deep Learning model like Inception from scratch may take several days or even weeks (depending on the computing power).

For example, you can do scoring on images using the TensorFlow Inception model, which is a frozen model saved in protobuf format (.pb). This model is used for image classification and was pre-trained with photos of different objects like animals, vegetables and other things you can find in day-to-day life. The images are classified into 1000 (one thousand) different classes, and the model outputs the most similar classes for your image, along with their scores.

In the first line of code we call the LoadFromEnumerable method with an empty list of objects. Since we plan to do only scoring, we don't need to load input data when building the pipeline, yet we still need the empty list for reading the data schema. Next, the pipeline is built from an image loader, a few transformations (resizing the image and extracting its pixels) that prepare the image for scoring, and the model loader.

var data = mlContext.Data.LoadFromEnumerable(
 new List<ImageNetData>());

var pipeline = mlContext
  .Transforms.LoadImages(
    outputColumnName: "input",
    imageFolder: imagesFolder,
    inputColumnName: nameof(ImageNetData.ImagePath))
  .Append(mlContext.Transforms.ResizeImages(
    outputColumnName: "input",
    imageWidth: ImageNetSettings.imageWidth,
    imageHeight: ImageNetSettings.imageHeight,
    inputColumnName: "input"))
  .Append(mlContext.Transforms.ExtractPixels(
    outputColumnName: "input",
    interleavePixelColors: ImageNetSettings
     .channelsLast,
    offsetImage: ImageNetSettings.mean))
  .Append(mlContext.Model
    .LoadTensorFlowModel(modelLocation)
    .ScoreTensorFlowModel(
      inputColumnNames: new[] { "input" },
      outputColumnNames: new[] { "softmax2" },
      addBatchDimensionInput: true));

ITransformer model = pipeline.Fit(data);

var predictor = mlContext.Model
  .CreatePredictionEngine<ImageNetData,
    ImageNetPrediction>(model);

predictor.Predict(new ImageNetData 
  { ImagePath = path, Label = label });
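Turning the raw scores into the most similar classes takes one more small step. The sketch below assumes that the prediction exposes the softmax2 output as a float array with one score per class, and that the class names are available as a string array in the same order. Both are assumptions, and the ScoreInterpreter helper is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, introduced only for illustration.
public static class ScoreInterpreter
{
    // Returns the k highest-scoring (label, score) pairs, best first.
    public static List<(string Label, float Score)> TopK(
        float[] scores, string[] labels, int k)
    {
        if (scores.Length != labels.Length)
            throw new ArgumentException(
                "scores and labels must have the same length");

        return scores
            .Select((score, index) =>
                (Label: labels[index], Score: score))
            .OrderByDescending(pair => pair.Score)
            .Take(k)
            .ToList();
    }
}
```

For example, TopK(new[] { 0.1f, 0.7f, 0.2f }, new[] { "cat", "dog", "bird" }, 2) returns "dog" followed by "bird".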

But that's not all. Many times you don't want to limit the predictions to the existing classes. Instead, you might want to intercept the final layer and complete the training yourself, with your own set of images, for your desired classes.

Identifying the layers in a graph is not black magic, but most probably we don't know the model's internals, so a tool like Netron is an excellent way to visualize the graph.

Let's observe the softmax layer at the end. Normally, for classifying the image using the original thousand classes, we set the softmax layer as output, so the final classification (identification of the class) is done by the model itself.

When our intention is to complete the training with our dataset (classes and images), we call that transfer learning.

Let's see the code first and then let's get into the details.

var data = mlContext.Data.LoadFromTextFile<ImageNetData>(dataLocation);

var pipeline = mlContext
  .Transforms.Conversion.MapValueToKey(
    outputColumnName: LabelToKey,
    inputColumnName: nameof(ImageNetData.Label))
  .Append(mlContext.Transforms.LoadImages(
    outputColumnName: "input",
    imageFolder: trainImagesFolder,
    inputColumnName: nameof(ImageNetData.ImagePath)))
  .Append(mlContext.Transforms.ResizeImages(
    outputColumnName: "input",
    imageWidth: ImageNetSettings.imageWidth,
    imageHeight: ImageNetSettings.imageHeight,
    inputColumnName: "input"))
  .Append(mlContext.Transforms.ExtractPixels(
    outputColumnName: "input",
    interleavePixelColors: ImageNetSettings
      .channelsLast,
    offsetImage: ImageNetSettings.mean))
  .Append(mlContext.Model
    .LoadTensorFlowModel(modelLocation)
    .ScoreTensorFlowModel(
      inputColumnNames: new[] { "input" },
      outputColumnNames: new[] {
        "softmax2_pre_activation" },
      addBatchDimensionInput: true))
  .Append(mlContext.MulticlassClassification
    .Trainers.LbfgsMaximumEntropy(
      labelColumnName: LabelToKey,
      featureColumnName: "softmax2_pre_activation"))
  .Append(mlContext.Transforms.Conversion
    .MapKeyToValue(PredictedLabelValue,
      PredictedLabel))
  .AppendCacheCheckpoint(mlContext);

ITransformer model = pipeline.Fit(data);

var predictor = mlContext.Model
  .CreatePredictionEngine<ImageNetData, 
   ImageNetPrediction>(model);

predictor.Predict(new ImageNetData 
  { ImagePath = path, Label = label });

If we compare the code for transfer learning with the code for scoring, we notice that:

The LoadFromTextFile method takes the location of the data as its argument. The data consists of the images and a .csv file containing their classes.
The pipeline stops at the softmax2_pre_activation layer, and the remaining layers (like softmax2) are ignored.
We have to take care of the multiclass classification ourselves, because we use our own classes (instead of the one thousand original classes).
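The transfer learning listing uses a few names that are not defined in the text: LabelToKey, PredictedLabelValue, PredictedLabel and the two data classes. A plausible set of definitions is sketched below. All of it is an assumption, including the column order of the data file and the ColumnNames class, which would have to be brought into scope (for example with using static) for the bare names to resolve.

```csharp
using Microsoft.ML.Data;

// Assumed input row: image file name and its class, in that column order.
public class ImageNetData
{
    [LoadColumn(0)]
    public string ImagePath;

    [LoadColumn(1)]
    public string Label;
}

// Assumed output row for the transfer learning pipeline.
public class ImageNetPrediction
{
    // One score per class, produced by the multiclass trainer.
    public float[] Score;

    // Predicted class, mapped back from its key to the original label.
    public string PredictedLabelValue;
}

// Hypothetical holder for the column names used in the pipeline.
public static class ColumnNames
{
    public const string LabelToKey = "LabelToKey";
    public const string PredictedLabelValue = "PredictedLabelValue";

    // Default output column name of ML.NET multiclass trainers.
    public const string PredictedLabel = "PredictedLabel";
}
```

Note that the scoring-only example would need a different prediction class, one that maps the softmax2 output column instead of Score.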

Consume ONNX Models

ONNX is a standard, interoperable and open format created by Facebook and Microsoft for Deep Learning models. With ONNX, AI developers can more easily move models between state-of-the-art tools. As with TensorFlow, a lot of use cases are covered by the existing ONNX models: image classification, object detection & image segmentation, face & gesture analysis, image manipulation, speech & audio processing, machine translation, language modelling, and other interesting models.

YOLO (You Only Look Once) is a well-known Deep Learning model for real-time multi-object detection (~30 fps on CPU) capable of identifying objects in 80 classes. There are larger versions like YOLO9000 which extends YOLO to detect objects in more than 9000 classes. A smaller version trained with just 20 object classes called Tiny YOLO is what we will use in our code sample. The existing classes are:

person
bird, cat, cow, dog, horse, sheep
aeroplane, bicycle, boat, bus, car, motorbike, train
bottle, chair, dining table, potted plant, sofa, tv/monitor

From a code perspective, we load the data from an empty list (as we did for TensorFlow scoring) in order to read the data schema, and the rest of the pipeline is very similar to the scoring pipeline. Considerably more logic is needed to interpret the results, because the output data is complex: the model returns a list of the best predicted objects, each with its confidence score and bounding box.

var data = mlContext.Data.LoadFromEnumerable(
  new List<ImageNetData>());

var pipeline = mlContext.Transforms.LoadImages(
    outputColumnName: "image",
    imageFolder: "",
    inputColumnName: nameof(ImageNetData.ImagePath))
  .Append(mlContext.Transforms.ResizeImages(
    outputColumnName: "image",
    imageWidth: ImageNetSettings.imageWidth,
    imageHeight: ImageNetSettings.imageHeight,
    inputColumnName: "image"))
  .Append(mlContext.Transforms.ExtractPixels(
    outputColumnName: "image"))
  .Append(mlContext.Transforms.ApplyOnnxModel(
    modelFile: modelLocation,
    outputColumnNames: new[]
      { TinyYoloModelSettings.ModelOutput },
    inputColumnNames: new[]
      { TinyYoloModelSettings.ModelInput }));

var model = pipeline.Fit(data);
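To give an idea of the interpretation logic, here is a minimal sketch of reading one candidate box out of the raw output. It assumes the Tiny YOLO output is a flattened tensor of shape 125 x 13 x 13: a 13 x 13 grid, 5 boxes per cell, and 25 values per box (x, y, width, height, confidence, then 20 class scores). The TinyYoloOutput class is hypothetical, and bounding box decoding and overlap filtering are left out.

```csharp
using System;
using System.Linq;

// Hypothetical helper; the tensor layout below is an assumption.
public static class TinyYoloOutput
{
    public const int GridSize = 13;      // 13 x 13 cells
    public const int BoxesPerCell = 5;
    public const int ClassCount = 20;
    // x, y, width, height, confidence + one score per class
    public const int ValuesPerBox = 5 + ClassCount;

    // Index of a value in the flattened, channel-major output.
    public static int Offset(int channel, int row, int col) =>
        channel * GridSize * GridSize + row * GridSize + col;

    public static float Sigmoid(float v) =>
        1f / (1f + MathF.Exp(-v));

    // Numerically stable softmax over the class scores.
    public static float[] Softmax(float[] values)
    {
        float max = values.Max();
        float[] exp = values
            .Select(v => MathF.Exp(v - max)).ToArray();
        float sum = exp.Sum();
        return exp.Select(e => e / sum).ToArray();
    }

    // Best class of one box and its score
    // (box confidence multiplied by class probability).
    public static (int ClassIndex, float Score) BestClass(
        float[] output, int row, int col, int box)
    {
        int firstChannel = box * ValuesPerBox;
        float confidence = Sigmoid(
            output[Offset(firstChannel + 4, row, col)]);

        float[] classScores = Enumerable.Range(0, ClassCount)
            .Select(c =>
                output[Offset(firstChannel + 5 + c, row, col)])
            .ToArray();

        float[] probabilities = Softmax(classScores);
        int best = Array.IndexOf(
            probabilities, probabilities.Max());

        return (best, confidence * probabilities[best]);
    }
}
```

Calling BestClass for every cell and box, and keeping only the results above a chosen threshold, yields the list of candidate detections.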

Microsoft ML.NET is continuously improving, adding new features, trainers and training scenarios. For example, GPU support (through CUDA) was added recently for [re]training our models locally, as was inference support for Blazor client-side applications (using WebAssembly).

Why I love ML.NET

ML.NET is not going to replace existing frameworks like TensorFlow, but considering that AI is going to be adopted by the majority of applications, as a .NET developer without a data science background, I prefer a code-first framework.

ML.NET is very easy to learn and you can use it on premises on different platforms like Linux, macOS or Windows with C# or F#.

You can find me on github.com/dcostea and twitter.com/dfcostea for more information and cool projects.

(end of part 3/3)