Overview
AWS Lake Formation is a fully managed service that helps organizations build, secure, and manage data lakes on Amazon Web Services (AWS). It simplifies setting up and managing data lakes by providing capabilities for data ingestion, storage, and data access. One of its key features is a fine-grained permission model, which gives organizations granular control over data access and security.
The fine-grained permission model in AWS Lake Formation lets organizations define precise access controls at different levels, including databases, tables, columns, and rows. This granularity ensures that data is accessed only by authorized users or groups, which protects sensitive information and supports data privacy and compliance. Amazon Athena can query data that is registered with AWS Lake Formation. This post uses Amazon Athena as the example, but Amazon QuickSight, Amazon Redshift, Amazon EMR, and Amazon SageMaker can also be used with AWS Lake Formation.
Athena User Request Workflow
Source: https://docs.aws.amazon.com/images/athena/latest/ug/images/lake-formation-athena.png
Let's look at how a query initiated in Amazon Athena is handled:
Step 1: Register the S3 bucket in AWS Lake Formation.
Step 2: Once the bucket is registered, set up user access to the data.
Step 3: When a user runs a query in Amazon Athena to access the data, Athena sends the user's credentials to AWS Lake Formation.
Step 4: AWS Lake Formation validates the credentials and provides a temporary token for accessing the data.
Step 5: The user then gets access to the data based on the temporary token.
Reference Architecture for Demonstration
We have two IAM users:
- Nehal – Administrator User
- Tina – Consumer User
Step 1: Both users initiate a query request via Amazon Athena with their individual IAM roles.
Step 2: Using the lakeformation:GetDataAccess permission, Amazon Athena requests temporary credentials from AWS Lake Formation to access the data.
Step 3: AWS Lake Formation then checks whether its service role has access to the data in S3.
Step 4: Based on that access, it obtains a temporary token.
Step 5: The temporary token is then passed back to Amazon Athena, which assumes the AWS Lake Formation service role.
Step 6: Finally, the S3 GetObject API call is made through the AWS Lake Formation service role, and the data is returned to the user.
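To make the flow above a little more concrete, here is a toy, in-memory model of it in Python. It is only a sketch for building intuition and does not reflect how AWS Lake Formation is implemented internally. The names `TemporaryToken`, `get_data_access`, and `read_rows`, as well as the `customers` table and `salary` column used in the usage note, are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple

# Toy stand-in for Lake Formation grants: (principal, table) -> allowed columns.
Grants = Dict[Tuple[str, str], FrozenSet[str]]


@dataclass(frozen=True)
class TemporaryToken:
    """Hypothetical short-lived credential scoped to one table and a column set."""
    principal: str
    table: str
    columns: FrozenSet[str]


def get_data_access(grants: Grants, principal: str, table: str) -> TemporaryToken:
    """Mimic steps 2-4: hand out a token only if the principal holds a grant."""
    allowed = grants.get((principal, table))
    if not allowed:
        raise PermissionError(f"{principal} has no grant on {table}")
    return TemporaryToken(principal, table, allowed)


def read_rows(token: TemporaryToken, rows: List[dict]) -> List[dict]:
    """Mimic steps 5-6: return only the columns that the token covers."""
    return [
        {column: value for column, value in row.items() if column in token.columns}
        for row in rows
    ]
```

With a grant of `{"name", "city", "salary"}` for Nehal and `{"name", "city"}` for Tina on a hypothetical `customers` table, `read_rows(get_data_access(grants, "Tina", "customers"), rows)` drops the `salary` field from every row, while the same call for Nehal returns the rows unchanged.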
Demonstration
The goal is that Nehal, being the administrator user, should get access to the entire dataset, while Tina, being a consumer, should be able to access only selected fields such as name, phone number, dob, address, and city.
Step 1: Create two IAM users: Nehal, with administrator access, and Tina, a consumer with the following set of permissions.
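The exact permission set from the original setup is not reproduced in this text, so the policy below is an assumption: a minimal sketch of what a consumer such as Tina would typically need to query Lake Formation data through Amazon Athena. The function name `consumer_policy` and the results bucket name are hypothetical. Note that the policy grants no S3 access to the data bucket itself, because AWS Lake Formation vends the credentials for that.

```python
import json


def consumer_policy(athena_results_bucket: str) -> dict:
    """Build an assumed IAM policy document for an Athena consumer user."""
    bucket_arn = f"arn:aws:s3:::{athena_results_bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Run queries and fetch their results.
                "Sid": "QueryWithAthena",
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:GetWorkGroup",
                ],
                "Resource": "*",
            },
            {   # Read table metadata from the Glue Data Catalog.
                "Sid": "ReadCatalogMetadata",
                "Effect": "Allow",
                "Action": [
                    "glue:GetDatabase",
                    "glue:GetDatabases",
                    "glue:GetTable",
                    "glue:GetTables",
                    "glue:GetPartitions",
                ],
                "Resource": "*",
            },
            {   # Ask Lake Formation for temporary data-access credentials.
                "Sid": "LakeFormationCredentialVending",
                "Effect": "Allow",
                "Action": ["lakeformation:GetDataAccess"],
                "Resource": "*",
            },
            {   # Write and read Athena query results (not the data lake bucket).
                "Sid": "AthenaQueryResults",
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:ListBucket",
                    "s3:GetObject",
                    "s3:PutObject",
                ],
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            },
        ],
    }


policy_json = json.dumps(consumer_policy("example-athena-results"), indent=2)
```

The resulting `policy_json` string could be attached to the consumer user as an inline or managed policy; tighten the `Resource` entries to match your own account.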
Step 2: Next, we create an S3 bucket and upload the sample data to a data folder.
Our sample data looks like this.
Step 3: Next, we set up AWS Lake Formation to build our data lake.
Step 4: First, we create a database inside AWS Lake Formation.
Step 5: Next, in AWS Lake Formation, we register our S3 bucket using the AWS Lake Formation service role.
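If you prefer scripting to the console, the database creation and bucket registration can be sketched as request payloads for boto3. The helper names, the database name, and the bucket name below are hypothetical, and `UseServiceLinkedRole=True` assumes that "service role" here means the Lake Formation service-linked role.

```python
def create_database_request(name: str) -> dict:
    """Keyword arguments for the Glue client's create_database call."""
    return {"DatabaseInput": {"Name": name}}


def register_location_request(bucket: str, prefix: str = "") -> dict:
    """Keyword arguments for the Lake Formation client's register_resource call."""
    prefix = prefix.strip("/")
    arn = f"arn:aws:s3:::{bucket}" + (f"/{prefix}" if prefix else "")
    # Assumption: register the location with the Lake Formation service-linked role.
    return {"ResourceArn": arn, "UseServiceLinkedRole": True}


db_request = create_database_request("lf_demo_db")
location_request = register_location_request("example-data-lake-bucket", "data")
```

These dictionaries are meant to be passed as `boto3.client("glue").create_database(**db_request)` and `boto3.client("lakeformation").register_resource(**location_request)`.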
Step 6: Next, we grant the Glue role permissions on the database we created so that, when the Glue Crawler runs, it can populate the tables inside this database.
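A scripted version of this grant might look like the sketch below. The post does not say which database permissions were chosen, so `CREATE_TABLE` is an assumption (it is what a crawler needs to add tables). The helper name, role ARN, and database name are hypothetical.

```python
from typing import Sequence


def grant_database_request(
    principal_arn: str,
    database: str,
    permissions: Sequence[str] = ("CREATE_TABLE",),
) -> dict:
    """Keyword arguments for the Lake Formation client's grant_permissions call."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {"Database": {"Name": database}},
        "Permissions": list(permissions),
    }


glue_grant = grant_database_request(
    "arn:aws:iam::111122223333:role/ExampleGlueCrawlerRole",  # placeholder ARN
    "lf_demo_db",
)
```

Pass the result as `boto3.client("lakeformation").grant_permissions(**glue_grant)`. Depending on your setup, the crawler role may also need access to the registered data location.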
Step 7: Here, we create a Glue Crawler to catalog the data stored in S3.
Step 8: Once the Glue Crawler is created, run it.
Step 9: The Glue Crawler runs and takes a few minutes to populate the tables inside the database we created.
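The crawler steps can also be scripted. The sketch below builds the `create_crawler` arguments and polls the crawler state until it returns to `READY`. The helper names, crawler name, role ARN, and S3 path are hypothetical; `glue` is expected to be a boto3 Glue client (or any object exposing a compatible `get_crawler` method).

```python
import time
from typing import Callable


def crawler_request(name: str, role_arn: str, database: str, s3_path: str) -> dict:
    """Keyword arguments for the Glue client's create_crawler call."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def wait_until_ready(
    glue,
    name: str,
    poll_seconds: float = 15.0,
    max_polls: int = 80,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll get_crawler until the state is READY; return the number of polls."""
    for attempt in range(1, max_polls + 1):
        state = glue.get_crawler(Name=name)["Crawler"]["State"]
        if state == "READY":  # other states are RUNNING and STOPPING
            return attempt
        sleep(poll_seconds)
    raise TimeoutError(f"crawler {name} not READY after {max_polls} polls")
```

A typical sequence would be `glue.create_crawler(**crawler_request(...))`, then `glue.start_crawler(Name=...)`, then `wait_until_ready(glue, ...)`.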
Step 10: You can check the schema populated in the database in AWS Lake Formation.
Step 11: Now switch to Amazon Athena to query the data. Since Nehal is the administrator user, she can see all the data when she runs a SQL query in Amazon Athena.
Step 12: Now, let's grant the Tina IAM user, who is a consumer, permission on selected columns.
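As a sketch, the same column-level grant can be expressed as a boto3 `grant_permissions` payload using the `TableWithColumns` resource. The column identifiers below are assumptions derived from the field names mentioned earlier (name, phone number, dob, address, city); the real names depend on what the crawler produced. The helper name, user ARN, database, and table names are hypothetical.

```python
from typing import Sequence


def grant_columns_request(
    principal_arn: str, database: str, table: str, columns: Sequence[str]
) -> dict:
    """Keyword arguments for grant_permissions: SELECT on specific columns only."""
    if not columns:
        raise ValueError("at least one column is required")
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


tina_grant = grant_columns_request(
    "arn:aws:iam::111122223333:user/Tina",  # placeholder account ID
    "lf_demo_db",
    "customers",
    ["name", "phone_number", "dob", "address", "city"],  # assumed column names
)
```

Pass the result as `boto3.client("lakeformation").grant_permissions(**tina_grant)`.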
Step 13: Validate the results by logging in as the Tina user and running the query in Amazon Athena.
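The validation query can be run from the Athena console, or scripted roughly as follows. This is a minimal sketch: the helper names, table, database, and results location are hypothetical, and the request is meant for the boto3 Athena client's `start_query_execution` call, run with Tina's credentials.

```python
from typing import Sequence


def select_sql(table: str, columns: Sequence[str] = ()) -> str:
    """Build a simple SELECT; an empty column list means SELECT *."""
    projection = ", ".join(f'"{c}"' for c in columns) if columns else "*"
    return f'SELECT {projection} FROM "{table}"'


def athena_query_request(sql: str, database: str, output_location: str) -> dict:
    """Keyword arguments for the Athena client's start_query_execution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


request = athena_query_request(
    select_sql("customers"),
    "lf_demo_db",
    "s3://example-athena-results/tina/",
)
```

When Tina runs `SELECT *`, the result should contain only the columns granted to her, whereas the same query run as Nehal should return every column.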
Summary
The fine-grained permission model in AWS Lake Formation provides organizations with granular control over data access and security in their data lakes. By defining precise permissions at different levels, organizations can protect sensitive data, enforce data governance policies, and ensure compliance with regulations. The scalability and flexibility of the permission model, combined with integration with other AWS services, make AWS Lake Formation a powerful solution for managing and securing data lakes in the cloud.
References
https://docs.aws.amazon.com/athena/latest/ug/lf-athena-access.html
https://docs.aws.amazon.com/athena/latest/ug/what-is.html
https://docs.aws.amazon.com/lake-formation/latest/dg/what-is-lake-formation.html
WRITTEN BY Nehal Verma