Overview
Amazon SageMaker provides cloud-based services that sit at the heart of many Data Science workflows. By customizing notebook instances for our tasks, we can increase Data Science productivity.
Introduction
Amazon SageMaker
Amazon SageMaker is a group of ML services fully managed by AWS. It supports frameworks and toolkits such as Jupyter, TensorFlow, and PyTorch, and it enables developers to create, train, and deploy ML models in the cloud.
Background on Amazon SageMaker Notebooks
A SageMaker notebook instance is an ML compute instance running the Jupyter Notebook application. It is like running a Jupyter notebook locally, except that it runs on AWS with a choice of compute and memory capacity. Notebook instances are an efficient and easily deployed way to prepare and process data, write code to train ML models, and deploy or host them. Notebooks can run on prebuilt kernels that are optimized for specific tasks, for example conda_pytorch_p36, Sparkmagic, and conda_python3. Notebook instances run on Amazon Linux 2 (AL2) or Amazon Linux (AL1), and AWS maintains the operating system. They are available in many instance types with differing CPU and memory capacity, so we can pick the one that fits our requirements.
Within a single AWS account, we can create multiple notebook instances, and each one runs as a standalone instance.
Some of the features that SageMaker notebook instances offer are:
- Fully managed and scalable cloud infrastructure
As it is a fully managed service, AWS takes care of the infrastructure for you. This includes software and security updates/patches, maintenance, etc.
- Support for TensorFlow, MXNet, Keras, etc.
Every notebook instance ships with common ML libraries by default, and other libraries can be installed or customized from the start using a lifecycle configuration to suit a particular ML project or task.
- Automated labeling tool and workflow
SageMaker Ground Truth can be used for data labeling tasks, which can be pivotal for ML models.
There are a lot of other features that are native to the SageMaker instances and others that can be integrated with it.
Lifecycle Configuration: Customizing a Notebook Instance
A lifecycle configuration is essentially a set of shell scripts that run when a notebook instance is created or started. The on-create script runs once, when the instance is first created, and the on-start script runs every time the instance starts, including the first start after creation. A lifecycle configuration is created once and can then be attached to any number of notebook instances.
Lifecycle scripts always run as the root user. To install packages into a specific Jupyter kernel rather than system-wide, the script activates that kernel's conda environment with source activate and then uses pip install to add whatever packages the notebooks need. Note that all of this happens inside conda environments, which back most of the kernels.
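As a rough illustration of the pattern described above (not the author's original script), the sketch below uses Python to assemble a lifecycle script that switches from root to ec2-user, activates one conda environment, and installs packages into it. The function name `build_single_env_script`, the environment name `python3`, and the package names are assumptions chosen for illustration. The `/home/ec2-user/anaconda3/bin/activate` path is assumed from the default Anaconda layout on notebook instances and should be checked on your own instance.

```python
import shlex

# Assumed default Anaconda location on a notebook instance.
CONDA_BIN = "/home/ec2-user/anaconda3/bin"


def build_single_env_script(env_name, packages):
    """Return a lifecycle shell script that pip-installs packages into one conda env."""
    if not packages:
        raise ValueError("at least one package is required")
    # Quote each package so unusual characters cannot break the shell command.
    pkg_args = " ".join(shlex.quote(p) for p in packages)
    lines = [
        "#!/bin/bash",
        "set -e",
        "# Lifecycle scripts start as root; run the install as ec2-user instead.",
        "sudo -u ec2-user -i <<'EOF'",
        f"source {CONDA_BIN}/activate {shlex.quote(env_name)}",
        f"pip install --upgrade {pkg_args}",
        f"source {CONDA_BIN}/deactivate",
        "EOF",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical environment and packages, for illustration only.
script = build_single_env_script("python3", ["scikit-learn", "xgboost"])
print(script)
```

The printed text is what would be pasted into the script area of the lifecycle configuration. Only the chosen environment, and therefore only the kernel backed by it, receives the packages.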
We can also make a package available in all conda environments by looping over the base directory that holds them, “/home/ec2-user/anaconda3/envs/*”. We can also install monitoring on an instance every time it starts, which can then send logs to Amazon CloudWatch for generating alerts. A lifecycle configuration makes this easy, since we don’t have to repeat the setup manually from the terminal every time we start a notebook instance; we define it once from the AWS console. Another use case is running tasks periodically, which often requires a Lambda function running in the background.
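The all-environments case could look roughly like the sketch below, which again builds the shell script as a Python string. It loops over the environments directory named in the text and installs the same packages into each one. The helper name `build_all_envs_script`, the package names, and the choice to skip an environment called `JupyterSystemEnv` (assumed here to be the environment that runs Jupyter itself) are illustrative assumptions, not something the original post specifies.

```python
import shlex

# Environments directory named in the text; the bin path is an assumed default.
ENVS_GLOB = "/home/ec2-user/anaconda3/envs/*"
CONDA_BIN = "/home/ec2-user/anaconda3/bin"


def build_all_envs_script(packages, skip_env="JupyterSystemEnv"):
    """Return a lifecycle shell script that installs packages into every conda env."""
    if not packages:
        raise ValueError("at least one package is required")
    pkg_args = " ".join(shlex.quote(p) for p in packages)
    lines = [
        "#!/bin/bash",
        "set -e",
        "sudo -u ec2-user -i <<'EOF'",
        f"for env in {ENVS_GLOB}; do",
        '    name=$(basename "$env")',
        "    # Leave the environment that runs Jupyter itself untouched (assumed name).",
        f'    if [ "$name" = {shlex.quote(skip_env)} ]; then continue; fi',
        f'    source {CONDA_BIN}/activate "$name"',
        f"    pip install --upgrade {pkg_args}",
        f"    source {CONDA_BIN}/deactivate",
        "done",
        "EOF",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical package list, for illustration only.
all_envs_script = build_all_envs_script(["pandas", "seaborn"])
print(all_envs_script)
```

Installing into every environment takes longer than targeting a single one, so for large package lists it may be worth limiting the loop to the environments that are actually used.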
Step-by-Step Process to Implement Lifecycle Configuration for a Notebook Instance
To create a lifecycle configuration, we must do the following:
1. In the SageMaker console, under the SageMaker dashboard, go to Lifecycle configurations.
2. Create a lifecycle configuration with a script that will run every time the notebook instance starts.
3. Write a custom script that installs the packages/libraries you need every time the notebook instance starts.
The script starts as root and switches to ec2-user, the default non-root user on the instance, then selects the environments that need the libraries. It loops through those environments, activating each one, installing the libraries, and deactivating it again before moving on. I pasted the script into the script area and then clicked Create configuration.
4. When creating a notebook instance, under Additional configuration, select the lifecycle configuration created above.
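The console steps above can also be approximated from code. The following is a minimal sketch, assuming the boto3 SageMaker client, whose `create_notebook_instance_lifecycle_config` call expects script content as a base64-encoded string. The helper names, the demo script, the `ml.t3.medium` instance type, and every resource name are hypothetical, and boto3 is imported lazily so the encoding helper runs without AWS credentials.

```python
import base64


def encode_script(script_text):
    """Base64-encode a lifecycle script, the form the SageMaker API expects."""
    return base64.b64encode(script_text.encode("utf-8")).decode("ascii")


def _sagemaker_client(client=None):
    """Return the given client, or build a boto3 one (third-party, needs credentials)."""
    if client is None:
        import boto3
        client = boto3.client("sagemaker")
    return client


def create_lifecycle_config(name, on_start_script, client=None):
    """Register on_start_script as the on-start hook of a new lifecycle configuration."""
    return _sagemaker_client(client).create_notebook_instance_lifecycle_config(
        NotebookInstanceLifecycleConfigName=name,
        OnStart=[{"Content": encode_script(on_start_script)}],
    )


def create_notebook_with_config(instance_name, role_arn, config_name,
                                instance_type="ml.t3.medium", client=None):
    """Create a notebook instance with the lifecycle configuration attached."""
    return _sagemaker_client(client).create_notebook_instance(
        NotebookInstanceName=instance_name,
        InstanceType=instance_type,
        RoleArn=role_arn,
        LifecycleConfigName=config_name,
    )


# Hypothetical placeholder script; a real one would install packages.
demo_script = "#!/bin/bash\nset -e\necho 'notebook instance started'\n"
encoded = encode_script(demo_script)
print(len(encoded), "base64 characters")
```

With credentials configured, calling `create_lifecycle_config("my-config", demo_script)` followed by `create_notebook_with_config("my-notebook", role_arn, "my-config")` would mirror steps 2 to 4. The names and the role ARN are placeholders to replace with your own.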
Conclusion
SageMaker provides us with fast, scalable notebook instances that can be launched in a matter of minutes. With the help of lifecycle configurations, we can tailor notebook instances to our needs, making them highly customizable and easy to manage from the moment they start until they stop running. In this way, we can be more productive in our Data Science projects without having to repeat the same setup work, saving us a lot of time.
FAQs
1. What are some of the benefits of using SageMaker?
ANS: – SageMaker is a fully managed machine-learning cloud service that can be used to build ML models at scale. We can create, manage, and automate SageMaker notebooks with little effort, which is one of the major advantages of using Amazon SageMaker.
2. Are there pre-trained models available on SageMaker?
ANS: – Yes. SageMaker Studio offers a wide range of pre-trained models, algorithms, and solutions for anyone who wants to deploy and test quickly or use them in a project.
3. Can I customize my SageMaker notebooks?
ANS: – Yes, SageMaker notebooks can be configured using lifecycle configurations to cater to one’s needs and requirements.

WRITTEN BY Mohmmad Shahnawaz Ahangar
Shahnawaz is a Research Associate at CloudThat. He is certified as a Microsoft Azure Administrator. He has experience working on Data Analytics, Machine Learning, and AI project migrations on the cloud for clients from various industry domains. He is interested in learning new technologies and writing blogs on advanced tech topics.