Amazon Security Lake: A Centralized, Purpose-Built Data Lake for Security Data


Security best practices call for effective logging across resources and for processes that centralize and analyze security event data. Logs from firewalls, on-premises infrastructure, and cloud services such as Amazon VPC and AWS CloudTrail are typically collected in Amazon S3, with AWS Lake Formation used to simplify management of the data lake. Even so, security-specific concerns such as data ownership, normalization, and enrichment remain hard to implement. Amazon Security Lake lets you analyze security data to get a more complete view of your security posture across the entire organization. With it, you can create a purpose-built, customer-owned data lake that automatically centralizes security data from the cloud, on-premises, and custom sources, which in turn helps you protect your applications, workloads, and data.

In this blog, we discuss Amazon Security Lake, a new service launched in November 2022 that centralizes, manages, and optimizes large volumes of log and event data. It supports threat detection, investigation, and incident response, so that you can analyze and address security issues with your preferred analytics tools.

Challenges for Security Teams

Customers want to protect their entire organization from future security events. To do so, they collect logs and event data from different sources, identify potential threats and vulnerabilities, assess security alerts, and respond accordingly. To gather security insights, the data first needs to be aggregated and normalized into a consistent form. This is time-consuming and costly, because customers use different security solutions for specific use cases, each with its own data store and format. Security teams face four main challenges when analyzing organization-wide security data:

1. Large Volumes of Security Data: Logs and event data are collected from on-premises infrastructure, the cloud, and custom sources, so a huge amount of data accumulates in a short time. Getting effective security insights from the aggregated data sometimes requires storing it for a long duration, and storage can grow to gigabytes or terabytes.

2. Inconsistent and Incomplete Data: Because logs and event data come from different sources, each log type has its own format, which makes them difficult to query together. You cannot gain visibility without the log data, so security solutions must be configured properly to produce logs for your applications and workloads. In addition, some security solutions retain logs only for a limited period, such as 30 days, which is a problem when you need data over a longer period.

3. Lack of Data Ownership: Direct data ownership is another challenge. Customers ingest security data into analytics solutions to get insights, which leaves the data locked inside those solutions and isolated from the rest of the security ecosystem. Many innovations in the security industry depend on customers owning their data directly.

4. More Data Wrangling, Less Analysis: Tracking infrastructure changes, generating alarms, monitoring performance, and regularly normalizing data all require significant staff time. Doing this within a fixed budget is difficult, so teams end up spending more effort on data wrangling than on accurate analysis.

Amazon Security Lake addresses these challenges by automating the collection, normalization, and management of security data across your entire organization.


Amazon Security Lake

Amazon Security Lake automatically centralizes security data into a purpose-built data lake in your account. It aggregates, normalizes, and manages security data from across your entire organization, so that you can analyze it with your preferred analytics solutions.

Features of Amazon Security Lake

1. Data aggregation: Amazon Security Lake creates a purpose-built security data lake in your account, collects logs and event data from the cloud, on-premises, and custom sources, and stores them in Amazon S3 buckets so that you retain control and ownership of your security data.

2. Data Normalization and Support for OCSF: Security Lake has adopted an open standard, the Open Cybersecurity Schema Framework (OCSF), to normalize and combine security data from AWS and from various enterprise security data sources. You can aggregate and normalize data into OCSF format from Amazon VPC Flow Logs, AWS CloudTrail management events, Amazon Route 53 Resolver query logs, security findings from solutions integrated through AWS Security Hub, custom data, and third-party security solutions. With support for OCSF, Security Lake makes security data available to your preferred analytics tool.

3. Multi-account and multi-Region support: Amazon Security Lake can be enabled across multiple accounts and in every Region where the service is available. Security data across accounts can be aggregated per Region or consolidated from multiple Regions into roll-up Regions to meet compliance requirements.

4. Data lifecycle management and optimization: Amazon Security Lake manages the lifecycle of security data through configurable retention periods and controls storage costs with automated tiering. It also automatically partitions security data and converts it to the storage- and query-efficient Apache Parquet format.
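To make the normalization feature more concrete, the following minimal sketch maps one VPC Flow Logs record to a simplified, OCSF-style event. It is an illustration only and not how Security Lake itself is implemented. The function name flow_log_to_ocsf_like is hypothetical, the output covers a small hand-picked subset of OCSF Network Activity attributes, and the sample record assumes the default flow log format with no missing ("-") fields.

```python
# Field order of the default (version 2) VPC Flow Logs record format.
FLOW_LOG_FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def flow_log_to_ocsf_like(line):
    """Map one space-separated flow log line to a simplified OCSF-style dict.

    Hypothetical helper: only a subset of attributes is produced, and records
    with "-" placeholders (NODATA / SKIPDATA) are not handled.
    """
    values = line.split()
    if len(values) != len(FLOW_LOG_FIELDS):
        raise ValueError(
            f"expected {len(FLOW_LOG_FIELDS)} fields, got {len(values)}"
        )
    raw = dict(zip(FLOW_LOG_FIELDS, values))
    return {
        "class_uid": 4001,                 # OCSF Network Activity class
        "class_name": "Network Activity",
        "time": int(raw["start"]) * 1000,  # epoch seconds -> milliseconds
        "src_endpoint": {"ip": raw["srcaddr"], "port": int(raw["srcport"])},
        "dst_endpoint": {"ip": raw["dstaddr"], "port": int(raw["dstport"])},
        "connection_info": {"protocol_num": int(raw["protocol"])},
        "traffic": {"packets": int(raw["packets"]), "bytes": int(raw["bytes"])},
        # Fields this sketch does not map to a specific OCSF attribute.
        "unmapped": {
            "account_id": raw["account_id"],
            "interface_id": raw["interface_id"],
            "action": raw["action"],
        },
    }


SAMPLE = (
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
)
event = flow_log_to_ocsf_like(SAMPLE)
print(event["src_endpoint"], "->", event["dst_endpoint"])
```

Once every source is mapped to the same attribute names, a single query over src_endpoint or dst_endpoint can cover data that originally arrived in different formats, which is the point of adopting a common schema.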

Configure Amazon Security Lake for Security Data collection

Prerequisites for configuring Amazon Security Lake

  1. To start with Amazon Security Lake, first designate a delegated administrator account for it from the management account of your organization in AWS Organizations. The delegated account enables Amazon Security Lake, which aggregates security data across multiple accounts and Regions. You can also enable Amazon Security Lake for a standalone AWS account.
  2. To allow Amazon Security Lake to perform ETL (Extract, Transform, and Load) jobs on logs and event data from various sources, create an IAM role named AmazonSecurityLakeMetaStoreManager. This role is needed before you can create the data lake or query its data.

Once the AWS account is delegated to enable Amazon Security Lake and the role is created, you can configure Security Lake in your account to aggregate, normalize, and manage security data from various data sources.

Step 1: Define Collection Objective: To enable Amazon Security Lake, select the data sources, Regions, and accounts, and specify the ARN of the role created in the prerequisites.

Figure 1: Define Collection Objective

Step 2: Define Target Objective: In this step, you define the roll-up Region and, if required, set storage classes, so that data from multiple Regions and accounts in your organization is consolidated.

Figure 2: Define Target Objective and Enable Amazon Security Lake  
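The console choices in this step can also be thought of as a per-Region configuration list. The sketch below builds such a list in plain Python. The helper name build_data_lake_configurations is hypothetical, and the key names (region, lifecycleConfiguration, replicationConfiguration) follow our understanding of the Security Lake CreateDataLake request shape, so treat them as assumptions and verify them against the current API reference before use. The replication role ARN that a real request would need is omitted here.

```python
import json


def build_data_lake_configurations(rollup_region, contributing_regions,
                                   transition_days, storage_class,
                                   expiration_days):
    """Return one configuration dict per Region.

    Contributing Regions point at the roll-up Region; the roll-up Region
    itself has no replication block. Key names are assumptions modeled on
    the CreateDataLake request.
    """
    if rollup_region in contributing_regions:
        raise ValueError("roll-up Region must not also be a contributor")
    if transition_days >= expiration_days:
        raise ValueError("data must transition before it expires")

    def lifecycle():
        # Build a fresh dict each time so configurations do not share state.
        return {
            "transitions": [
                {"days": transition_days, "storageClass": storage_class}
            ],
            "expiration": {"days": expiration_days},
        }

    configurations = [
        {"region": rollup_region, "lifecycleConfiguration": lifecycle()}
    ]
    for region in contributing_regions:
        configurations.append({
            "region": region,
            "lifecycleConfiguration": lifecycle(),
            # A real request would also carry a replication role ARN here.
            "replicationConfiguration": {"regions": [rollup_region]},
        })
    return configurations


# Example values are illustrative only.
configs = build_data_lake_configurations(
    rollup_region="us-east-1",
    contributing_regions=["us-west-2", "eu-west-1"],
    transition_days=30,
    storage_class="STANDARD_IA",
    expiration_days=365,
)
print(json.dumps(configs, indent=2))
```

Keeping the roll-up relationship on the contributing Regions, rather than on the roll-up Region, matches the idea that each Region decides where its data is consolidated.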

Step 3: In the Sources section, you can enable data sources such as CloudTrail, VPC Flow Logs, Route 53, and Security Hub findings in all Regions or in specific Regions.

Figure 3: Different Data Sources enabled across multiple regions.  
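The source selection in this step can be sketched as a small data structure. The helper build_log_sources below is hypothetical, and the identifiers CLOUD_TRAIL_MGMT, VPC_FLOW, ROUTE53, and SH_FINDINGS reflect our understanding of the source names used by the Security Lake API, so treat them as assumptions to check against the current documentation. The account ID is a placeholder.

```python
# Names as they appear in this article, mapped to assumed API identifiers.
SOURCE_NAMES = {
    "CloudTrail": "CLOUD_TRAIL_MGMT",
    "VPC Flow Logs": "VPC_FLOW",
    "Route 53": "ROUTE53",
    "Security Hub Findings": "SH_FINDINGS",
}


def build_log_sources(enabled, accounts, regions):
    """Return one source entry per enabled data source.

    Raises ValueError for names that are not in SOURCE_NAMES, so a typo
    does not silently drop a source.
    """
    unknown = [name for name in enabled if name not in SOURCE_NAMES]
    if unknown:
        raise ValueError(f"unknown data sources: {unknown}")
    return [
        {
            "sourceName": SOURCE_NAMES[name],
            "accounts": list(accounts),
            "regions": list(regions),
        }
        for name in enabled
    ]


sources = build_log_sources(
    enabled=["CloudTrail", "VPC Flow Logs"],
    accounts=["123456789012"],          # placeholder account ID
    regions=["us-east-1", "us-west-2"],
)
for source in sources:
    print(source["sourceName"], source["regions"])
```

Listing the Regions per source mirrors the console option of enabling a source everywhere or only in specific Regions.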

Step 4: You can view the Regions in which buckets have been created, and open those buckets to see the logs stored in Apache Parquet format.

Figure 4: Region-wise buckets and Logs in Apache Parquet format  
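Because the data is partitioned, object keys in these buckets typically carry partition values as name=value path segments. The sketch below parses such segments and filters keys by day. The example keys, including the partition names region, accountId, and eventDay and the file names, are hypothetical and only illustrate the layout; check the keys in your own buckets for the actual structure.

```python
def parse_partitions(object_key):
    """Extract name=value partition segments from an S3 object key."""
    partitions = {}
    for segment in object_key.split("/"):
        name, separator, value = segment.partition("=")
        if separator and name and value:
            partitions[name] = value
    return partitions


def keys_for_day(object_keys, event_day):
    """Keep only keys whose eventDay partition equals event_day (YYYYMMDD)."""
    return [
        key for key in object_keys
        if parse_partitions(key).get("eventDay") == event_day
    ]


# Hypothetical object keys, for illustration only.
EXAMPLE_KEYS = [
    "aws/VPC_FLOW/region=us-east-1/accountId=123456789012/"
    "eventDay=20230115/part-0000.parquet",
    "aws/VPC_FLOW/region=us-east-1/accountId=123456789012/"
    "eventDay=20230116/part-0000.parquet",
    "aws/ROUTE53/region=us-west-2/accountId=123456789012/"
    "eventDay=20230115/part-0000.parquet",
]

print(parse_partitions(EXAMPLE_KEYS[0]))
print(len(keys_for_day(EXAMPLE_KEYS, "20230115")))
```

Query engines use the same partition values to skip objects that cannot match a filter, which is one reason partitioned Parquet data is cheaper to query than raw logs.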


Amazon Security Lake is a fully managed security data lake service that automatically centralizes, normalizes, and manages security data from AWS and third-party sources in a data lake stored in your AWS account, where it is ready for analysis. With a few clicks, you can enable it and start aggregating logs and event data from the cloud, on-premises, and custom sources.

