Automating Jupyter Notebooks with Discord Bot, AWS Lambda, and Sagemaker

Efficient Jupyter notebook Report Generation and Delivery for Data Scientists and Business Analysts

Mahesh
20 min read · Jul 23, 2023
Photo by Jamie Street on Unsplash

Many times, data scientists or business analysts use Jupyter Notebook to prepare custom ad-hoc reports for management. And there are instances when these reports become so useful that they need to be productionized.

In this article, we will explore various ways to automate notebook execution using Discord bot, AWS Lambda, and Sagemaker Processing jobs. Our approach involves:

  1. Discord bot: Utilizing the Discord bot to send specific reports to designated customers. This enables anyone to request a report without the need to add report generation logic to the application’s front-end and back-end.
  2. AWS Lambda: Employing AWS Lambda to handle different Discord bot queries, such as generating a report or checking its status. This is particularly useful when report execution might take some time, and it is essential to know if any failures occur. Using AWS Lambda also helps save costs on instances, as we do not have continuous loads.
  3. Sagemaker Processing job: Leveraging Sagemaker Processing job to handle the heavy lifting of report execution. This includes uploading the generated report to S3 and sending the download link to the original requester through Discord webhook.

Prerequisites

  1. Python knowledge, because we will not go through every line of code
  2. Some Docker experience
  3. DevOps knowledge will be beneficial
  4. Familiarity with how Jupyter notebooks work
  5. AWS Lambda experience
  6. Sagemaker experience is good to have but not required

System design

Creating Discord Bot

There are numerous articles available online that provide step-by-step guides on creating a Discord bot. Therefore, if you encounter any difficulties with the following steps, feel free to refer to these comprehensive resources for assistance.

  1. Enable developer mode in your discord app
Developer mode in Discord

2. Create a new server (or guild in the language of discord)

  • Click on the plus icon
  • Click on ‘Create My Own’
  • Select either, although you may want to go ahead with ‘For me and my friends’
  • Enter a suitable server name, and click on Create

A newly created server will look like this:

3. Go to https://discord.com/developers/applications and create a new application

Give a name to your application (this will not be your bot’s name) and agree to the TOS and policy before clicking on create

You can put in description and tags as you like but they are optional.

  • Go to the “Bot” tab from the left-side menu and click on “Add Bot” > “Yes, do it!”
  • If it shows “Too many people have this username,” change your application name to something less generic. Remember, bots and applications can have different names, but they are initialized to the same when you first create the bot.
  • Head to the “OAuth2” tab and navigate to its child URL Generator tab. Check the “applications.commands” box. This step allows you to add new slash commands to your bot, enabling it to respond to specific user inputs.
  • Go to the URL shown at the bottom of the box; it’ll look something like this, after you select your test server from the dropdown list:

Click on authorize.

4. Go to your AWS account, go to Lambda and create a new function

Feel free to choose any name you like for this function as it will be responsible for handling all our Discord commands. The function runtime should be set to Python 3.9, and for the architecture, select x86_64. If your system operates on ARM architecture, opt for arm64 instead. This choice is essential as we will be creating some Lambda layers later, and the system architecture on which the layers are built must match the Lambda architecture.

If you wish, you can enable tags in the advanced settings for better organization and management.

Paste the following code in your lambda function:
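The original code embed is not reproduced here; a minimal sketch of what the function might contain, assuming PyNaCl for signature verification and an API Gateway proxy event (header casing can differ depending on your gateway configuration):

import json
import os

from nacl.exceptions import BadSignatureError
from nacl.signing import VerifyKey

PUBLIC_KEY = os.environ["PUBLIC_KEY"]


def verify_signature(event):
    # Discord signs every interaction; reject anything that fails verification.
    raw_body = event["body"]
    signature = event["headers"]["x-signature-ed25519"]
    timestamp = event["headers"]["x-signature-timestamp"]
    VerifyKey(bytes.fromhex(PUBLIC_KEY)).verify(
        f"{timestamp}{raw_body}".encode(), bytes.fromhex(signature)
    )


def command_handler(body):
    command = body["data"]["name"]
    if command == "blep":
        return {
            "statusCode": 200,
            "body": json.dumps({"type": 4, "data": {"content": "Hello World"}}),
        }
    return {"statusCode": 400, "body": json.dumps("unhandled command")}


def lambda_handler(event, context):
    try:
        verify_signature(event)
    except (KeyError, BadSignatureError):
        return {"statusCode": 401, "body": json.dumps("invalid request signature")}

    body = json.loads(event["body"])
    if body["type"] == 1:  # PING from Discord -> reply with PONG (type 1)
        return {"statusCode": 200, "body": json.dumps({"type": 1})}
    return command_handler(body)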

In the above code, we have three functions:

a. Verify signature: This function verifies the Ed25519 signature sent in the Discord request headers against the request payload, using the bot’s public key.

b. Command handler: The command handler function manages different types of commands. We will extend this function later on, but currently, it only responds to the “blep” command.

c. Main Lambda handler: This function first verifies the signature and then checks the type of message. If the type is 1, it means it’s a ping type command, and in that case, it returns a JSON type 1 response. Otherwise, it responds with the command handler function.

To make this code work, we will need a Lambda layer and an environment variable PUBLIC_KEY .

5. Go back to your Discord developer portal; in the General Information tab there is a button to copy the public key

Add that public key to your lambda environment variable.

6. We need to add the PyNaCl package to the Lambda layer so that the function can import it. You have the option to either install it in the same directory as your code and deploy the entire directory as the Lambda function or separate the Lambda code from the PyNaCl package.

  • In your project directory create a new folder called layer and inside this folder create a new folder python . We will install the PyNaCl package inside the python folder and then zip it to upload as a layer on AWS Lambda.
  • Do a targeted pip install of PyNaCl into the python folder: pip install -t layer/python PyNaCl -q . Make sure the conda or pip environment you have activated has python and pip pre-installed.
  • (Optional) Install zip: sudo apt install zip
  • Zip the folder so that it can be uploaded to Lambda layers: cd layer && zip -q --recurse-paths layer.zip .

Use the following commands to achieve the same result from a Jupyter notebook:

!rm -rf layer/python
!mkdir -p layer/python
!pip install -q --target layer/python PyNaCl
!cd layer && zip -q --recurse-paths layer.zip .
  • In the AWS Lambda service, navigate to Layers, click on the “Create Layer” button, provide a suitable name, upload your zip file, and select the correct architecture (in our case, it is X86). Choose Python 3.9 as the compatible runtime.
  • Now, go back to your Lambda function, click on “Add a layer” under the Layers section. Select “Custom layer” from the custom layers dropdown, and then choose your layer from the list. In the version dropdown, select the latest version (there should be one if you have uploaded the zip file only once). Finally, click on “Add.”

7. Now, we need to add an API Gateway endpoint that our Discord bot will invoke to send commands. This API Gateway URL will be connected as a trigger to our Lambda function.

  • In your Lambda function interface, click on “Add trigger.”
  • Under trigger configuration, select API Gateway, and then choose “Create a new API.” Select REST API and mark the security as Open. Leave the advanced settings as they are (unless you know what you are doing), and click on the “Add” button.
  • Come back to your Lambda function interface and copy the API endpoint from the configuration panel.
  • Go back to your Discord developer portal of your app, in OAuth2 settings, under General, select both scopes and check Administrator bit permissions.

After you have finished creating an MVP system, start removing the permissions one by one to practice the least access policy principle.

  • Go back to the General Information tab; at the bottom, under the INTERACTION ENDPOINT URL input, paste the API endpoint you got from the Lambda trigger.

It will throw an error if Discord cannot hit your endpoint and receive a response. In that case, either the lambda function code has an error, or the API gateway is misconfigured.

8. Now, we need to register a command for our Discord bot so that we can interact with it.

There are two types of commands: global commands and guild commands. Global commands may take some time to reflect, while guild commands are instantly registered, making them ideal for R&D purposes.

For our hello world, we will register a guild command. To do that, you will need your server ID and bot token.

  • Right-click on your server and click on “Copy Server ID” to save it.
  • In your Discord developer portal, go to the Bot menu option, click on “Reset Token,” then “Yes, do it,” and finally copy and save the token.
  • In your project directory, create a new file named “register_command.py” and paste the following code. Don’t forget to replace the server ID and bot token.
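Since the snippet embed is missing here, a minimal sketch of register_command.py might look like the following (assuming the requests library; the endpoint is Discord’s documented guild-commands API):

import requests

APP_ID = "YOUR_APP_ID"  # General Information -> Application ID
SERVER_ID = "YOUR_SERVER_ID"
BOT_TOKEN = "YOUR_BOT_TOKEN"

# Guild commands register instantly; drop the /guilds/... part for a global command.
url = f"https://discord.com/api/v10/applications/{APP_ID}/guilds/{SERVER_ID}/commands"

payload = {
    "name": "blep",
    "description": "Say hello",
    "type": 1,  # CHAT_INPUT, i.e. a slash command
}

response = requests.post(
    url, json=payload, headers={"Authorization": f"Bot {BOT_TOKEN}"}
)
print(response.status_code, response.json())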

Now, from the terminal, run python register_command.py . It should print the API response for the registered command.

9. Go to your discord app, type /blep and it should show the blep command we registered.

I registered two commands, one global and one for the server. According to Discord’s documentation, when making a command POST request, it upserts the command, meaning it creates a new one if it doesn’t exist or updates the existing command if it does. In my case, it took a while for the global command to appear.

Send the message with /blep command and it will return Hello World.

To add the bot to your server, click on the bot response and then click on “Add to Server.”

If you’ve successfully done that, congratulations!

Next, we will register two new commands, edit our lambda code to handle those commands, create an SQS to start a Sagemaker processing job, and create processing job scripts and a container image. All of this will be covered in the next section.

Create a Dynamodb table

To store user requests, track them, and update their status, we need an efficient mechanism for persistent storage. In this example, I have used DynamoDB, but you can choose any key-value pair database, an RDS, or even object storage like S3.

  1. Go to Dynamodb service in AWS and click on create table

2. Give a suitable table name.

3. In this example, our partition keys are unique (we will use the Discord request ID as the tracking ID for report execution), but you can also use non-unique keys as partition keys, such as customer_id for whom the report was generated or the report requester ID. In these cases, you must also add a unique sort key like request_id, or you can leave it blank.

4. For our example, you can leave it with the default settings and add tags as you desire before clicking on “Create Table.”
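If you prefer defining the table in code over clicking through the console, an equivalent boto3 call might look like this (the table name is a placeholder):

import boto3

# Mirrors the console steps above: request_id as partition key, on-demand billing.
boto3.client("dynamodb").create_table(
    TableName="report-requests",
    KeySchema=[{"AttributeName": "request_id", "KeyType": "HASH"}],
    AttributeDefinitions=[{"AttributeName": "request_id", "AttributeType": "S"}],
    BillingMode="PAY_PER_REQUEST",
)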

Edit Lambda

Edit the lambda we created earlier with the following code:
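The full gist is not embedded here; a condensed sketch of the new pieces might look like this (verify_signature and the main handler are unchanged from before; the table name is a placeholder):

import json
import os
import time

import boto3

TABLE_NAME = "report-requests"  # hypothetical; use your table name
SQS_URL = os.environ["SQS_SAGEMAKER_PIPELINE_TRIGGER"]

dynamodb = boto3.resource("dynamodb")
sqs = boto3.client("sqs")


def parse_options(body):
    # Slash-command options arrive as a list of {"name": ..., "value": ...} dicts.
    return {opt["name"]: opt["value"] for opt in body["data"].get("options", [])}


def parse_user(body):
    user = body["member"]["user"]
    return user["id"], user["username"]


def parse_request_details(body):
    # The interaction id doubles as our tracking/request id.
    return body["id"], str(int(time.time()))


def add_item_to_dynamodb(request_id, request_timestamp, user_id, options):
    dynamodb.Table(TABLE_NAME).put_item(
        Item={
            "request_id": request_id,
            "request_timestamp": request_timestamp,
            "user_id": user_id,
            "status": "Requested",  # the processing job updates this later
            **options,
        }
    )


def get_dynamo_item(request_id):
    resp = dynamodb.Table(TABLE_NAME).get_item(Key={"request_id": request_id})
    return resp.get("Item", {}).get("status", "NOT_FOUND")


def send_sqs_message(payload):
    sqs.send_message(QueueUrl=SQS_URL, MessageBody=json.dumps(payload))


def command_handler(body):
    command = body["data"]["name"]
    options = parse_options(body)
    user_id, _ = parse_user(body)
    request_id, request_timestamp = parse_request_details(body)

    if command == "blep":
        content = "Hello World"
    elif command == "get_report":
        add_item_to_dynamodb(request_id, request_timestamp, user_id, options)
        send_sqs_message({"request_id": request_id, "user_id": user_id,
                          "request_timestamp": request_timestamp, **options})
        content = f"Report requested. Tracking id: {request_id}"
    elif command == "check_status":
        content = f"Status: {get_dynamo_item(options['request_id'])}"
    else:
        return {"statusCode": 400, "body": json.dumps("unhandled command")}

    return {
        "statusCode": 200,
        "body": json.dumps({"type": 4, "data": {"content": content}}),
    }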

Let’s go through the code at a high level. It has nine functions.

  1. verify_signature — this function has not changed since the last version of the code
  2. parse_options — parses the Discord message options
  3. parse_user — parses user details from the event body
  4. parse_request_details — parses the request ID and request timestamp from the request object in the event
  5. add_item_to_dynamodb — adds the message details to DynamoDB. Our processing job will later get items from the same table using this request_id and will update the fields that we are leaving blank here
  6. get_dynamo_item — is called when the check_status command is invoked. Using request_id, it gets the status key from the DynamoDB table
  7. send_sqs_message — sends the command with the necessary data to SQS, which in turn triggers the Lambda that executes the report. Remember to add the environment variable SQS_SAGEMAKER_PIPELINE_TRIGGER with the queue URL once we create the SQS queue later on
  8. command_handler — we have extended this function to handle two more commands, check_status and get_report
  9. lambda_handler — the main Lambda handler hasn’t changed

Environment Variables

This function has 2 environment variables:

  1. PUBLIC_KEY — obtained from your Discord developer portal; under your app’s General Information there is a button to copy the public key
  2. SQS_SAGEMAKER_PIPELINE_TRIGGER — the SQS queue URL; add this once we create the SQS queue later on.

IAM

To run this lambda we will have to add some more permissions:

dynamodb:GetItem
dynamodb:PutItem

On your dynamodb table

sqs:SendMessage
sqs:GetQueueAttributes

On the SQS queue

logs:CreateLogGroup
logs:CreateLogStream
logs:PutLogEvents

(logs can have wildcard * resource)

Register commands

We are going to register two new commands,

  • get_report — which will start report execution, and
  • check_status — which will check status of report execution

You can also extend your bot with other commands based on your application’s needs, such as generating a new presigned URL for an already generated report or checking when a certain report was last requested, etc.

Don’t forget to replace SERVER_ID and BOT_TOKEN. APP_ID can be retrieved from discord developer portal, in General Information, Application ID section.
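A sketch of the registration payloads, following the same pattern as register_command.py (option type 3 is STRING; the option names mirror the arguments used later in this article):

import requests

APP_ID = "YOUR_APP_ID"
SERVER_ID = "YOUR_SERVER_ID"
BOT_TOKEN = "YOUR_BOT_TOKEN"

url = f"https://discord.com/api/v10/applications/{APP_ID}/guilds/{SERVER_ID}/commands"
headers = {"Authorization": f"Bot {BOT_TOKEN}"}

commands = [
    {
        "name": "get_report",
        "description": "Start a report execution",
        "type": 1,
        "options": [
            {"name": "report_type", "description": "Type of report",
             "type": 3, "required": True},
            {"name": "customer_id", "description": "Customer the report is for",
             "type": 3, "required": True},
        ],
    },
    {
        "name": "check_status",
        "description": "Check the status of a report execution",
        "type": 1,
        "options": [
            {"name": "request_id", "description": "Tracking id from get_report",
             "type": 3, "required": True},
        ],
    },
]

for command in commands:
    response = requests.post(url, json=command, headers=headers)
    print(response.status_code, response.json())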

You will receive a bigger response for server commands compared to global commands.

After this, when you type / in your server, you will see three commands: blep, get_report and check_status.

Change the above lambda code to test whether input from these two new commands successfully reaches lambda or not.

Build image for Sagemaker Processing job container

We will use a custom image for our processing job, which will include the papermill and discordwebhook packages, along with basic packages such as plotly, pandas, ipykernel, and boto3.
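The Dockerfile embed itself is missing here; a minimal sketch with a micromamba base might look like this (the image tag and pip path are assumptions based on the mambaorg/micromamba image layout):

FROM mambaorg/micromamba:1.4.9

LABEL maintainer="Mahesh Rajput"

# The micromamba base image ships without a pinned Python; install 3.9 plus pip.
RUN micromamba install -y -n base -c conda-forge python=3.9 pip && \
    micromamba clean --all --yes

RUN /opt/conda/bin/pip install --no-cache-dir -q ipykernel ipywidgets pandas \
    plotly papermill boto3 discordwebhook==1.0.3

# _entrypoint.sh activates the base env before running the given command.
ENTRYPOINT ["/usr/local/bin/_entrypoint.sh", "python"]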

In the above Dockerfile, I am using micromamba as the base image. Alternatively, you can choose other base images like python:3.9-slim-buster. However, keep in mind that it will require adjustments to the entrypoint in the Dockerfile and the processing job API.

There is no need to ‘burn’ the notebook into the image, as the Sagemaker processing job will automatically download it from S3 in production. For “local” runs, it will upload it to S3 along with our entrypoint script.

The micromamba base image does not come with a pinned Python version; that’s why we install Python 3.9 ourselves. If you use a Python 3.9 base image, there is no need to install Python and pip again.

  1. Create a new private ECR repository (Sagemaker is not compatible with public repos).
  2. Build your Docker image and then upload it to ECR. You can use push commands by going to your repo and clicking on the “View push commands” button. Make sure you have AWS CLI installed and your AWS profile configured, otherwise, you will not be able to push images to ECR.

If you choose a different base image, your Dockerfile might look like this (your processing job API entrypoint will change if you use a different base image):

FROM python:3.9-slim-buster

LABEL maintainer="Mahesh Rajput"

RUN pip3 install --no-cache-dir ipykernel \
ipywidgets \
pandas \
plotly \
papermill \
boto3 \
discordwebhook==1.0.3 -q

ENTRYPOINT ["python"]

Main report

We will prepare a sample notebook that will print a set of parameters (which are what you will likely use in production), along with 2 dummy Plotly charts so that our final HTML is not very bland.

To inject parameters into our report and execute it through the CLI or a Python command, we will use papermill.

But before that, enable tags in your notebook by going to View -> Cell Toolbar -> Tags. In the cell that has parameters, add a parameters tag.
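The tagged cell itself might look like this; the names mirror the arguments we pass later, and the values are just defaults that papermill overrides at run time:

# This cell carries the "parameters" tag; papermill injects overrides after it.
report_type = "default"
customer_id = "00000"
user_id = ""
request_timestamp = ""
request_id = ""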

To test the report, I suggest creating a local conda [recommended] or virtualenv environment with all the packages from container-env.yaml installed.

To test the report, run papermill report.ipynb gen_report.ipynb -r report_type copy_everything_at_once -p customer_id 8989898 . You should now have another file, gen_report.ipynb , in the same directory as report.ipynb . Refer to the official papermill documentation for more info on execution commands.

To convert the .ipynb to HTML, run jupyter nbconvert --to html gen_report.ipynb --no-input . You can also add tags like remove-cell to individual cells and configure nbconvert to strip them from the final HTML.

R&D Sagemaker Processing job

If you have experience with Sagemaker jobs, pipelines, etc., then you can skip this section.

We will run processing job locally in this section to test our report execution and debug the errors.

  • Install a new kernelspec that our notebook will use. The environment of this kernelspec must have sagemaker and boto3.
    python -m ipykernel install --user --name <ENV_NAME> --display-name "Python Sagemaker RnD"
  • Start jupyter notebook or jupyterlab from any env that has jupyter installed.
  • Imports
import os

import boto3
import sagemaker
from sagemaker.processing import ProcessingInput, ScriptProcessor
  • Sagemaker execution role (you can copy it from IAM roles)
role = ""
  • Create instance of script processor

report_processor = ScriptProcessor(
    role=role,
    image_uri="report-bot-conda:2",  # "ECR_REPO_IMAGE_URI"
    command=["/usr/local/bin/_entrypoint.sh", "python"],
    instance_count=1,
    instance_type="local",  # <- this makes the job run locally
    volume_size_in_gb=10,
    # max_runtime_in_seconds=3600,  # Not supported in local mode
    base_job_name="Report-Executor",
    tags=[
        {"Key": "type:personel", "Value": "blog"},
        {"Key": "Name", "Value": "Blog-Report-Processing-Job"},
    ],
)

For local testing we can use the image name from our local Docker builds. Note the command argument: because we are using micromamba, it wraps python in the activation entrypoint; it changes to just [“python”] if we use the python:3.9 base image.

The instance type “local” is what makes the execution run on our machine; if we change this to a valid processing instance type like ml.t3.medium, it will use Sagemaker resources instead.

Because the max_runtime_in_seconds argument is not supported in “local” mode, we comment it out.

  • Start the job
report_processor.run(
    code="main.py",
    inputs=[
        ProcessingInput(
            source="./report.ipynb",
            destination="/opt/ml/processing/input/report",
            input_name="report",
        )
    ],
    outputs=[],
    arguments=[
        "--report_type", "mistaken",
        "--customer_id", "998989",
        "--user_id", "945250524129800203",
        "--request_timestamp", "2387823283",
        "--request_id", "e06316a1-b12e-4ee8-a0d2-04cd9300675e",
    ],
    wait=True,
    logs=True,
)

The code argument is our entrypoint script. It installs the kernel in the container, executes the report using papermill, updates the report status in the DynamoDB table, and calls the Discord webhook to send a message to the user on Discord.

  • Contents of main.py
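The gist is not embedded here; a minimal sketch under the assumptions below (bucket, webhook URL, table name, and S3 key layout are placeholders) might look like:

import argparse
import shutil
import subprocess

import boto3
import papermill as pm
from discordwebhook import Discord

S3_BUCKET = "YOUR_S3_BUCKET"
DISCORD_WEBHOOK_URL = "YOUR_WEBHOOK_URL"
TABLE_NAME = "report-requests"  # hypothetical; use your table name


def parse_args():
    parser = argparse.ArgumentParser()
    for arg in ("report_type", "customer_id", "user_id",
                "request_timestamp", "request_id"):
        parser.add_argument(f"--{arg}", type=str, required=True)
    return parser.parse_args()


def create_presigned_url(key, expiration=90):
    return boto3.client("s3").generate_presigned_url(
        "get_object",
        Params={"Bucket": S3_BUCKET, "Key": key},
        ExpiresIn=expiration,
    )


def run_papermill(args):
    pm.execute_notebook(
        "report.ipynb",
        "gen_report.ipynb",
        parameters=vars(args),
        kernel_name="python3",
    )


def upload_artifacts_to_s3(local_path, key):
    boto3.client("s3").upload_file(local_path, S3_BUCKET, key)
    return create_presigned_url(key)


def pull_discord_webhook(user_id, message):
    Discord(url=DISCORD_WEBHOOK_URL).post(content=f"<@{user_id}> {message}")


def update_dynamodb(request_id, status):
    boto3.resource("dynamodb").Table(TABLE_NAME).update_item(
        Key={"request_id": request_id},
        UpdateExpression="SET #s = :s",
        ExpressionAttributeNames={"#s": "status"},
        ExpressionAttributeValues={":s": status},
    )


if __name__ == "__main__":
    args = parse_args()
    # The processor mounts the notebook under /opt/ml/processing/input/report.
    shutil.copy("/opt/ml/processing/input/report/report.ipynb", "report.ipynb")
    # Install a kernelspec so papermill can find a kernel to run the notebook.
    subprocess.run(["python", "-m", "ipykernel", "install", "--user",
                    "--name", "python3"], check=True)
    try:
        run_papermill(args)
        subprocess.run(["jupyter", "nbconvert", "--to", "html",
                        "gen_report.ipynb", "--no-input"], check=True)
        url = upload_artifacts_to_s3(
            "gen_report.html", f"reports/{args.request_id}.html")
        pull_discord_webhook(
            args.user_id, f"Report ready (link expires in 90s): {url}")
        update_dynamodb(args.request_id, "Completed")
    except Exception as exc:
        pull_discord_webhook(args.user_id, f"Report failed: {exc}")
        update_dynamodb(args.request_id, "Failed")
        raise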

Our main.py has 6 functions.

  1. parse_args — parses arguments received during job execution. We will pass these arguments as parameters to our notebook.
  2. create_presigned_url — generates a presigned URL for an S3 object. I have set the expiration time to 90 seconds; you can change it as you desire.
  3. run_papermill — executes our notebook through papermill
  4. upload_artifacts_to_s3 — uploads our generated report HTML to S3 and also calls the presigned URL function to get a download link
  5. pull_discord_webhook — calls the Discord webhook to notify the report requester about the report status. Don’t forget to change DISCORD_WEBHOOK_URL to your webhook link
  6. update_dynamodb — updates our DynamoDB table with the report status.

In the __name__ == "__main__" block, we first parse the arguments, move the report to the correct directory (because the processor input parameter puts it in a different location), install the kernelspec (otherwise papermill will not find a kernel), run the report, convert the .ipynb to an HTML file, upload it to S3, generate a presigned URL (which expires 90 seconds after generation), send a message to the user with the report download link, and finally update our DynamoDB table with the necessary details.

For me, working on the main.py file was one of the most time-consuming portions of this solution.

Change the S3_BUCKET and DISCORD_WEBHOOK_URL variables with your values.

Once you execute the report_processor.run cell, it will start the execution. Debug the code based on the errors you see.

Production suggestions

  1. If you want to interact with databases that are behind the VPC, use the vpc_config argument to provide subnets and security groups to the processing job.
  2. Always store database credentials and other secrets in AWS Secrets Manager (if using AWS). DO NOT mount them as environment variables; it is an anti-pattern. If credentials are changed, you will have to manually update them in all places of all your production solutions.

Create an SQS queue and a new Lambda function

Why

The above Lambda needs to respond quickly to user commands (Discord expects a response to an interaction within 3 seconds) and stay free to handle other user requests. For report executions, we will send each get_report command to SQS.

We’ll create a new Lambda that will be triggered by new messages in SQS, and this Lambda will start the Sagemaker processing job.

The Sagemaker processing job will execute the notebook using Papermill, update the report status (success or failure), and send a message to the report requester via a Discord webhook about the status of the report. If the report execution is successful, it will include a presigned download link obtained from S3 in the message.

  1. In Amazon SQS, click on Create Queue
  • I chose Standard; depending on your problem you can also go for FIFO, but keep in mind the difference in throughput between them.
  • Give it a suitable name.
  • Under configuration, choose a suitable retention period and set the other parameters to your liking. Normally I set the message wait time to 20 seconds for long polling, but your project can demand a different configuration.
  • For the access policy, you can leave the default or add your Lambda permissions.
  • We will not create a dead-letter queue for this SQS, but you should create one for your production solution.
  • Add tags if you desire.

Copy the SQS URL and add it to the environment variables of the first Lambda we created.

2. Go to AWS Lambda and create a new function with the same runtime (Python 3.9) and architecture as before.

Once it is created, we will need 5 things:

a. Function code
b. Adding SQS trigger
c. Adding sagemaker layer
d. Adding environment variables
e. Adding roles to your function IAM policy

a. Function code
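The gist is again not reproduced; a condensed sketch of what it might look like (instance type and S3 key are assumptions; partial batch responses must be enabled on the SQS trigger for the failure reporting to work):

import json
import os

from sagemaker.processing import ProcessingInput, ScriptProcessor

IMAGE_URI = os.environ["IMAGE_URI"]
ROLE = os.environ["SAGEMAKER_EXECUTION_ROLE"]
S3_BUCKET = os.environ["S3_BUCKET"]


def execute_processing_job(message):
    processor = ScriptProcessor(
        role=ROLE,
        image_uri=IMAGE_URI,
        command=["/usr/local/bin/_entrypoint.sh", "python"],
        instance_count=1,
        instance_type="ml.t3.medium",
        volume_size_in_gb=10,
        max_runtime_in_seconds=3600,
        base_job_name="Report-Executor",
    )
    processor.run(
        code="main.py",  # entrypoint script packaged with this Lambda
        inputs=[
            ProcessingInput(
                source=f"s3://{S3_BUCKET}/report.ipynb",
                destination="/opt/ml/processing/input/report",
                input_name="report",
            )
        ],
        arguments=[
            "--report_type", message["report_type"],
            "--customer_id", str(message["customer_id"]),
            "--user_id", str(message["user_id"]),
            "--request_timestamp", str(message["request_timestamp"]),
            "--request_id", message["request_id"],
        ],
        wait=False,  # start the job and exit; the job reports its own status
    )


def main_lambda_handler(event, context):
    failures = []
    for record in event["Records"]:
        try:
            execute_processing_job(json.loads(record["body"]))
        except Exception:
            # Report only the failed messages so SQS retries them individually.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}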

In the above code, we have 2 functions:

  1. “execute_processing_job” — which starts the processing job and exits.
  2. “main_lambda_handler” — which calls “execute_processing_job” and also reports back to SQS on failed messages. This is very important; if you do not return this response, your message will stay in the queue until it expires (i.e. SQS will not delete the message).

Note that we have set the argument wait=False in the Lambda because, unlike the notebook, we want our Lambda to start the job and terminate. The report status will be handled by the processing job.

b. In your Lambda function, click on “Add trigger” and select SQS; choose the right queue from the dropdown and change the batch size and batch window as you desire.

It is better to first comment out the call that starts the processing job in the Lambda function code and test the trigger. Once you feel confident, uncomment it.

c. Adding sagemaker layer

Just like we created a PyNaCl layer earlier, repeat the same steps but for sagemaker sdk.

!rm -rf sm-layers/python
!mkdir -p sm-layers/python
!pip install -q --target sm-layers/python sagemaker
!cd sm-layers && zip -q --recurse-paths layer.zip .

Upload the zip file in Lambda service layers. Add the layer in your lambda function.

d. Environment variables

  • DISCORD_WEBHOOK_URL — go to your Discord server settings, then Integrations, and create a new webhook. Click on “Copy Webhook URL” and add it as an env variable.
  • SAGEMAKER_EXECUTION_ROLE — in your AWS IAM roles, search for sagemaker and you will see AmazonSageMaker-ExecutionRole-XXXXXXX; click on this role and copy its ARN. Add this as another env variable.
  • IMAGE_URI — copy the URI of the image we uploaded to ECR
  • S3_BUCKET — an S3 bucket name of your choice that stores our .ipynb Jupyter notebook.

e. Adding roles to Lambda IAM Policy

To be able to receive messages from SQS, report message failures back to it, and execute the Sagemaker processing job (including the S3 transfers), I added:

  • AmazonSageMaker-ExecutionPolicy-XXXXXXXX (available on aws)
  • Sagemaker-Processing-job permission
"sagemaker:DescribeProcessingJob"
"iam:PassRole"
"sagemaker:AddTags"
"sagemaker:CreateProcessingJob"

with wildcard * resource
  • AWSLambdaSQSQueueExecutionRole — available on aws

For the CI/CD of this solution, I have uploaded our Sagemaker processing job entrypoint script along with the Lambda function (otherwise, you should upload it to S3 and refer to the S3 URI in the processing job code parameter).

Alternate architectures

You don’t have to use Sagemaker processing job. I can think of the following alternate architectures:

Using ECS

Instead of starting a processing job, our SQS-triggered Lambda will start a Fargate task. This can save you costs depending on your report type, and it is as scalable as a Sagemaker processing job. In fact, we use part of this architecture in my company’s production. Having been the one who suggested that architecture, I don’t want to invite controversy by basing my whole blog on it, but I assure you it works well too. Just be mindful of the CI/CD for this one.

Using Sagemaker pipeline

Sagemaker pipeline has a callback as well as a Lambda step. The callback sends a message to an SQS, and a new message in SQS triggers a Lambda. Then this Lambda can update the user on Discord with the report status.

This allows us to decouple the processing job from updating the status and sending the response to the Discord user.

Using State machine (Step Functions)

Use this only if you are sure your report can execute within the limitations of Lambda.

Apache Airflow

Use Airflow with either Lambda or Sagemaker and integrate the report generation with your existing data pipelines. Your data engineers will thank you for this 😊

End to end run

  1. Send a message on discord

Note the tracking ID cda28, the report type, and the customer ID

2. You will see logs in your Discord bot command handler function and the SQS queue; the Lambda triggered by the new SQS message will run, and lastly Sagemaker will show a new processing job:

3. Check back after a while and it should show either Failed or Completed status.

4. If you miss checking the job status, the Discord bot will notify you with a new message containing the report status. Note the tracking ID cda28 in the response.

5. And lastly we can check the status of report with the check_status bot command

6. I also downloaded the HTML from the URL, and it shows the correct input report type as well as the customer ID.

If you encounter back-to-back errors, do not get disheartened. It took me 2 weekends and dozens of failed attempts to complete this seemingly simple project. Keep trying, and you’ll get there!

Cost

The main service that cost me anything on my AWS account was Sagemaker, because everything else fell within the AWS Free Tier. However, even outside the free tier, the total cost of this R&D project is not likely to exceed $3–$5.

To further reduce costs, you can use your local computer as an endpoint instead of Lambda for your Discord bot and have the server only start the processing job. This way, you can optimize expenses while still achieving your objectives.

Alternate uses of Discord bot

In my opinion, using a Discord bot with Lambda is best suited for solutions like QA chatbots and other generative AI applications. Of course, I am speaking from the perspective of a Machine Learning Engineer.

There are numerous other tasks that can be automated using Discord bots, such as:

  • New employee induction
  • Sending birthday wishes
  • Assigning tasks
  • Reminders and scheduling for meetings
  • Checking system status, and many more.

The flexibility and ease of implementation make Discord bots a versatile tool for streamlining various processes and enhancing overall efficiency.

Thank you for reading the article.

You can connect with me on LinkedIn: https://www.linkedin.com/in/maheshrajput/

My website: https://mrmaheshrajput.github.io/
