In this post, we will deploy a machine learning model to AWS Lambda using the Serverless Framework. The set-up of Serverless is discussed here.
Let’s create a directory
mkdir scikit-regression && cd scikit-regression
On Windows 10, we can create a virtual environment and activate it as follows:
Install Virtualenv
In your VS Code command shell prompt type
pip install virtualenv
Start virtualenv
virtualenv env
Activate virtualenv
On Windows, virtualenv (venv) creates a batch file called
env\Scripts\activate.bat
To activate the virtual environment on Windows, run the activate script in the Scripts folder:
path\to\env\Scripts\activate
Example:
C:\Users\'Username'\env\Scripts\activate.bat
Create a requirements.txt file
Add the scikit-learn version
scikit-learn==0.22.0
Run in your command prompt
pip install -r requirements.txt
For more information on how to set up a virtual environment, please visit here.
Train and Save Model
Linear Regression on Boston Housing Dataset
This data was originally part of the UCI Machine Learning Repository and has since been removed from it. It also ships with the scikit-learn version pinned above (load_boston was removed in scikit-learn 1.2, so the pinned version matters). There are 506 samples and 13 feature variables in this dataset. The objective is to predict house prices using the given features.
The description of all the features is given below:
CRIM: Per capita crime rate by town
ZN: Proportion of residential land zoned for lots over 25,000 sq. ft
INDUS: Proportion of non-retail business acres per town
CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
NOX: Nitric oxide concentration (parts per 10 million)
RM: Average number of rooms per dwelling
AGE: Proportion of owner-occupied units built prior to 1940
DIS: Weighted distances to five Boston employment centers
RAD: Index of accessibility to radial highways
TAX: Full-value property tax rate per $10,000
PTRATIO: Pupil-teacher ratio by town
B: 1000(Bk – 0.63)², where Bk is the proportion of [people of African American descent] by town
LSTAT: Percentage of lower status of the population
MEDV: Median value of owner-occupied homes in $1000s
We are going to use three variables, ‘LSTAT’, ‘AGE’ and ‘RM’, as features and ‘MEDV’ as the target variable.
# %%
from sklearn.datasets import load_boston
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import joblib
import time
import numpy as np
RANDOM_STATE = 42
# %%
boston_dataset = load_boston()
# %%
print(boston_dataset.DESCR)
# %%
boston = pd.DataFrame(boston_dataset.data, columns=boston_dataset.feature_names)
boston.head()
# %%
boston.info()
# %%
boston.describe()
# %%
# Prepare the data for training
X = pd.DataFrame(np.c_[boston['LSTAT'], boston['AGE'], boston['RM']], columns = ['LSTAT','AGE','RM'])
_, y = load_boston(return_X_y=True)
# %% [markdown]
# Splitting the data into training and testing sets
# %%
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=RANDOM_STATE)
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)
# %%
def create_model():
    model = Pipeline([
        ('scaler', StandardScaler()),
        ('selector', SelectKBest(score_func=f_regression, k='all')),
        ('lr', LinearRegression())
    ])
    return model
# %%
model = create_model()
# %%
model.fit(X_train, y_train)
# %%
y_pred_test = model.predict(X_test)
print(mean_squared_error(y_test, y_pred_test))
# %%
model = create_model()
# %%
model.fit(X_train, y_train)
# %%
model_id = str(time.time())
model_name = 'model_' + model_id + '.joblib'
joblib.dump(model, model_name, compress=False)
print(model_name, 'saved.')
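Because the file name contains a timestamp, every training run produces a new name, and handler.py has to be kept in sync with it by hand. As a minimal sketch (the helper name latest_model_path is hypothetical and not part of the original project), the newest file could be looked up like this before copying its name into the handler:

```python
from pathlib import Path
from typing import Optional


def latest_model_path(directory: str = ".") -> Optional[Path]:
    """Return the newest model_<timestamp>.joblib file in `directory`, or None."""
    candidates = []
    for path in Path(directory).glob("model_*.joblib"):
        # Cut the timestamp out of 'model_<timestamp>.joblib'
        stamp = path.name[len("model_"):-len(".joblib")]
        try:
            candidates.append((float(stamp), path))
        except ValueError:
            continue  # skip files whose middle part is not a number
    if not candidates:
        return None
    # Pick the file with the largest (most recent) timestamp
    return max(candidates, key=lambda item: item[0])[1]


if __name__ == "__main__":
    print(latest_model_path())
```

This only reads file names; it does not load the model, so it runs without scikit-learn or joblib installed.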
Create Serverless Project and Handler Prototype
sls create --template aws-python3 --name boston-housing
Install a plugin for Python requirements
This will automatically add the plugin to your project’s package.json and the plugins section of its serverless.yml. The plugin will now bundle your Python dependencies specified in your requirements.txt or Pipfile when you run sls deploy. We also install a particular version of the plugin.
sls plugin install -n serverless-python-requirements@4.2.4
handler.py
import json
import joblib

# Load the model once, outside the handler, so warm invocations can reuse it
model_name = 'model_1616253959.4820366.joblib'
model = joblib.load(model_name)

def predict(event, context):
    body = {
        "message": "OK",
    }
    # queryStringParameters can be missing or None when no query string is sent
    params = event.get('queryStringParameters')
    if params:
        # query string values arrive as strings, so convert them to floats
        LSTAT = float(params['LSTAT'])
        AGE = float(params['AGE'])
        RM = float(params['RM'])
        inputVector = [LSTAT, AGE, RM]
        data = [inputVector]
        predictedPrice = model.predict(data)[0]  # MEDV is in units of $1000s
        predictedPrice = round(float(predictedPrice), 2)
        body['predictedPrice'] = predictedPrice
    else:
        body['message'] = 'queryStringParameters not in event.'
        print(body['message'])
    response = {
        "statusCode": 200,
        "body": json.dumps(body),
        "headers": {
            "Content-Type": 'application/json',
            "Access-Control-Allow-Origin": "*"
        }
    }
    return response

# to test locally
def do_main():
    event = {
        'queryStringParameters': {
            'LSTAT': 7.14,
            'AGE': 28.14,
            'RM': 6.62
        }
    }
    response = predict(event, None)
    body = json.loads(response['body'])
    print('Price:', body['predictedPrice'])
    # save the event so it can be reused with sls invoke
    with open('event.json', 'w') as event_file:
        event_file.write(json.dumps(event))

# do_main()
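The handler reads LSTAT, AGE and RM straight from the query string. Behind API Gateway those values normally arrive as strings, and a parameter can be missing entirely, so it may help to validate them before calling the model. The following is a minimal sketch of one way to do that; parse_features and error_response are hypothetical helper names, not part of the original handler, and returning a 400 status is an assumption about how you want bad requests reported.

```python
import json

FEATURES = ("LSTAT", "AGE", "RM")  # same order the model was trained on


def parse_features(event):
    """Return (values, None) on success or (None, error_message) on failure."""
    # queryStringParameters may be absent or None when no query string is sent
    params = event.get("queryStringParameters") or {}
    values = []
    for name in FEATURES:
        if name not in params:
            return None, "missing query parameter: " + name
        try:
            values.append(float(params[name]))
        except (TypeError, ValueError):
            return None, "query parameter " + name + " is not a number"
    return values, None


def error_response(message):
    """Build a 400 response in the same shape the handler returns."""
    return {
        "statusCode": 400,
        "body": json.dumps({"message": message}),
        "headers": {"Content-Type": "application/json"},
    }


if __name__ == "__main__":
    event = {"queryStringParameters": {"LSTAT": "7.14", "AGE": "28.14", "RM": "6.62"}}
    print(parse_features(event))
```

Inside predict, the returned list could be passed to model.predict([values]) when the error message is None, and error_response(message) returned otherwise.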
Test function locally using Serverless
# invoke lambda function locally
# change serverless.yml file
service: boston-housing # NOTE: update this with your service name
functions:
  predict-price:
    handler: handler.predict
    memorySize: 512
    timeout: 30
    events:
      - http:
          path: get-price
          method: get
          request:
            parameters:
              queryStrings:
                LSTAT: true
                AGE: true
                RM: true
# event.json file we created earlier using the local test
sls invoke local --function predict-price --path event.json
You can also test the function by invoking globally or using unit tests.
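For the unit-test route, one option is to exercise the request-handling logic with a stub in place of the trained model, so the test does not need the .joblib file. The sketch below is an assumption about how such a test could look: StubModel and predict_with are hypothetical names, and predict_with is a simplified copy of the handler logic that takes the model as an argument rather than the handler itself.

```python
import json
import unittest


class StubModel:
    """Stand-in for the trained pipeline; always predicts 25.0."""

    def predict(self, rows):
        return [25.0 for _ in rows]


def predict_with(model, event):
    """Simplified copy of the handler logic with the model passed in."""
    body = {"message": "OK"}
    params = event.get("queryStringParameters")
    if params:
        row = [float(params[name]) for name in ("LSTAT", "AGE", "RM")]
        body["predictedPrice"] = round(model.predict([row])[0], 2)
    else:
        body["message"] = "queryStringParameters not in event."
    return {"statusCode": 200, "body": json.dumps(body)}


class PredictTest(unittest.TestCase):
    def test_returns_price(self):
        event = {"queryStringParameters": {"LSTAT": "7.14", "AGE": "28.14", "RM": "6.62"}}
        body = json.loads(predict_with(StubModel(), event)["body"])
        self.assertEqual(body["predictedPrice"], 25.0)

    def test_missing_parameters(self):
        body = json.loads(predict_with(StubModel(), {})["body"])
        self.assertNotIn("predictedPrice", body)
```

Saved as, for example, test_predict.py, this can be run with python -m unittest.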

Deploy Model to AWS
serverless.yml
service: boston-housing # NOTE: update this with your service name
provider:
  name: aws
  runtime: python3.8
  lambdaHashingVersion: 20201221
  stage: dev
  region: us-east-1
# you can add packaging information here
package:
  # include:
  #   - include-me.py
  #   - include-me-dir/**
  exclude:
    - node_modules/**
    - .vscode/**
    - __pycache__/**
    - .ipynb_checkpoints/**
    - (*).ipynb
    - env/**
functions:
  predict-price:
    handler: handler.predict
    memorySize: 512
    timeout: 30
    events:
      - http:
          path: get-price
          method: get
          request:
            parameters:
              queryStrings:
                LSTAT: true
                AGE: true
                RM: true
plugins:
  - serverless-python-requirements
custom:
  pythonRequirements:
    dockerizePip: non-linux
    slim: true
Deploy using the following command
set AWS_ACCESS_KEY_ID=<your-key-here>
set AWS_SECRET_ACCESS_KEY=<your-secret-key-here>
# AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are now available for serverless to use
sls deploy
To learn how to create AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY please visit here.

Invoke function globally
sls invoke --function predict-price --path event.json
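Since the function is also wired to an http event on the get-price path, it should be reachable over HTTPS once deployed. The following is a minimal sketch of calling it from Python with only the standard library. The ENDPOINT value is a placeholder, not a real URL: replace it with the endpoint that sls deploy prints. The helper names build_url and fetch_price are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder: replace with the endpoint that `sls deploy` prints for get-price.
ENDPOINT = "https://<api-id>.execute-api.us-east-1.amazonaws.com/dev/get-price"


def build_url(endpoint, lstat, age, rm):
    """Append the three model features to the endpoint as a query string."""
    query = urlencode({"LSTAT": lstat, "AGE": age, "RM": rm})
    return endpoint + "?" + query


def fetch_price(endpoint, lstat, age, rm, timeout=30):
    """Call the deployed function over HTTPS and return the predicted price."""
    with urlopen(build_url(endpoint, lstat, age, rm), timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["predictedPrice"]
```

With a real endpoint in place, fetch_price(ENDPOINT, 7.14, 28.14, 6.62) sends the same values as event.json.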
To debug
set SLS_DEBUG=true
# or logs using
sls logs --function predict-price
References
- https://www.udemy.com/course/deploy-serverless-machine-learning-models-to-aws-lambda/
- https://github.com/serverless/examples