Machine Learning over Container

Kunal Maheshwari
May 26, 2021

Launching a Python environment with Docker to run ML models

Greetings!

Containers create an isolated environment for the tasks we need to perform. Implementing machine learning algorithms to train models in Python requires libraries such as scikit-learn and pandas. But when other algorithms or other tasks need different versions of those libraries, the versions may conflict. It is therefore ideal to work in an isolated environment for every task, and that is exactly what containers provide.

In this article, I’ll deploy a container with the libraries required to run machine learning programs that train models for us. Let’s begin.

First, we need to create an environment by installing Python and the required libraries, which for my example are scikit-learn and pandas.

For the base image, I’ll use centos:7. Let’s pull this image:

docker pull centos:7

After that, let’s create a test container in which to build the environment:

docker run -it --name creating_py_env centos:7

Python is not installed in this container. Let’s fix that:

yum install python36 -y

Now, let’s set up the environment by installing the required libraries:
— scikit-learn
— pandas

Installing scikit-learn and pandas with pip3:

pip3 install scikit-learn pandas

The required libraries are now installed.
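One way to confirm this from inside the container is a small Python check. The sketch below is only an illustration: the helper name report_versions is hypothetical, and it assumes each package exposes a __version__ attribute, which scikit-learn and pandas do.

```python
import importlib


def report_versions(module_names):
    """Return {module name: version string, or None if the import fails}."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is not installed in this environment.
            versions[name] = None
            continue
        # Most packages expose __version__; fall back to "unknown" if not.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    # scikit-learn is imported under the module name "sklearn".
    for name, version in report_versions(["sklearn", "pandas"]).items():
        status = version if version is not None else "NOT INSTALLED"
        print(f"{name}: {status}")
```

Running the same check in two different containers is a quick way to see which library versions each environment actually carries.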

Now, let’s commit this container as an image to use as the base environment:

docker commit creating_py_env ml_python:v1

Another, more suitable way to create the environment is a Dockerfile. For reference, a Dockerfile for the same environment would look something like this:

FROM centos:7
RUN yum install python36 -y
RUN pip3 install scikit-learn pandas
CMD /bin/bash

Moving on, let’s launch our environment to run the ML program:

docker run -it --name ml_exec_env ml_python:v1

For the machine learning part, I’ll use a simple linear regression example on a dataset called SalaryData.csv, which looks like this:

YearsExperience,Salary
1.1,39343.00
1.3,46205.00
1.5,37731.00
2.0,43525.00
2.2,39891.00
2.9,56642.00
3.0,60150.00
3.2,54445.00
3.2,64445.00
3.7,57189.00
3.9,63218.00
4.0,55794.00
4.0,56957.00
4.1,57081.00
4.5,61111.00
4.9,67938.00
5.1,66029.00
5.3,83088.00
5.9,81363.00
6.0,93940.00
6.8,91738.00
7.1,98273.00
7.9,101302.00
8.2,113812.00
8.7,109431.00
9.0,105582.00
9.5,116969.00
9.6,112635.00
10.3,122391.00
10.5,121872.00

This dataset is used by the following program:

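import pandas as pd
from sklearn.linear_model import LinearRegression
import joblib

# Load the dataset; scikit-learn expects X as a 2-D array of shape (n_samples, 1)
dataset = pd.read_csv("SalaryData.csv")
X = dataset["YearsExperience"].values.reshape(-1, 1)
y = dataset["Salary"]

# Fit a simple linear regression and save the trained model to disk
model = LinearRegression()
model.fit(X, y)
joblib.dump(model, "salary_model.pk1")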

Here, the joblib module saves the trained model to the file salary_model.pk1.

Finally, we can use this model for prediction, for which I’ll use a load_model.py program.
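The listing for load_model.py does not appear in the text, so the following is only a minimal sketch of what such a script could look like, not the original program. It assumes the model was saved as salary_model.pk1 by the training program above; the function name predict_salary and the example input of 5.0 years are hypothetical.

```python
import os

import joblib

# Path written by the training program (taken from the article).
MODEL_PATH = "salary_model.pk1"


def predict_salary(model_path, years_experience):
    """Load a saved regression model and predict the salary for one input."""
    model = joblib.load(model_path)
    # The model was trained on a 2-D array of shape (n_samples, 1),
    # so a single value has to be wrapped as [[value]].
    prediction = model.predict([[years_experience]])
    return float(prediction[0])


if __name__ == "__main__":
    years = 5.0  # hypothetical example input
    if os.path.exists(MODEL_PATH):
        salary = predict_salary(MODEL_PATH, years)
        print(f"Predicted salary for {years} years of experience: {salary:.2f}")
    else:
        print(f"{MODEL_PATH} not found; run the training program first.")
```

Keeping the loading and prediction in one small function makes it easy to call the model from other scripts in the same container.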

Running this program produces the model’s prediction.

That’s it!

Conclusion:

We’ve used containers to provision an isolated environment for executing and testing a machine learning model.

Thank you.
