
Exporting Filtered AWS CloudTrail Lake Events to Amazon S3 for Targeted Analysis


Introduction

Monitoring cloud activity is essential for ensuring security, compliance, and operational excellence in any AWS environment. AWS CloudTrail Lake allows you to store, aggregate, and analyze activity logs, providing deep insights into API usage across your AWS accounts. With the integration of Amazon Athena, organizations can now perform zero-ETL queries directly on CloudTrail Lake data, making analysis more efficient and manageable.

Amazon Athena is a serverless interactive query service that lets you run SQL queries against data in Amazon S3 and other supported sources. Security and operations teams can investigate events more effectively by combining Athena’s querying capabilities with AWS CloudTrail Lake’s comprehensive activity logs. This is especially useful when correlating logs from CloudTrail with application or traffic logs stored in Amazon S3.

AWS CloudTrail, by default, captures a wide range of activity, making it challenging to isolate specific events relevant to a particular investigation or compliance requirement. For instance, if your organization only needs to monitor changes to Amazon S3 buckets for compliance reporting, analyzing the full set of AWS CloudTrail logs is inefficient. To address this, you can export a filtered subset of AWS CloudTrail Lake logs to Amazon S3, enabling teams to conduct focused, targeted analysis while avoiding data overload.

This blog walks you through exporting a subset of AWS CloudTrail Lake events to Amazon S3 in Parquet format. Whether your goal is to meet compliance objectives, perform security investigations, or streamline operational monitoring, this automated solution provides a clear and efficient pathway.


Solution Overview

This solution helps you automate the export of filtered AWS CloudTrail Lake data to Amazon S3 using an AWS CloudFormation template. The export is triggered by an Amazon EventBridge scheduled rule, which invokes an AWS Lambda function to run a customized Amazon Athena query. The results are stored in Amazon S3 in Parquet format for efficient downstream processing or third-party integration.

Key Components

  1. Amazon EventBridge Rule: This scheduled rule defines when the Lambda function is invoked. You can set this to run at regular intervals (e.g., daily or hourly).
  2. AWS Lambda Function:
    • Reads a SQL query from a predefined Amazon S3 bucket.
    • Combines it with a CREATE TABLE AS SELECT (CTAS) statement (an illustrative CTAS shape is sketched after this list).
    • Executes the query using Amazon Athena.
    • Saves the results to Amazon S3 in Parquet format.
    • Tracks the execution time using AWS Systems Manager Parameter Store.
  3. Amazon Athena Database and Table:
    Used for querying and storing temporary output tables created during query execution.
  4. Amazon S3:
    • Stores the exported logs in compressed Parquet format.
    • Hosts the SQL file that defines the custom filtering logic for the export.
  5. AWS CloudFormation Template:
    Automates deployment by provisioning all the required components, including AWS IAM roles, Amazon EventBridge rule, AWS Lambda function, Amazon Athena workgroup, and associated configurations.
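
To make the AWS Lambda function's CTAS step concrete, the statement it assembles might take roughly the following shape. This is a minimal sketch rather than the exact statement the function generates: the table name, Amazon S3 path, and the federated database and table names (commonly aws:cloudtrail and your Event Data Store ID) are placeholders you should verify in your own account.

    -- Illustrative CTAS wrapper around the custom query from demo.sql
    -- Table name, S3 path, and event data store ID are placeholders
    CREATE TABLE temp_table_20250101_000000
    WITH (
      format = 'PARQUET',
      write_compression = 'SNAPPY',
      external_location = 's3://<your-export-bucket>/exports/20250101_000000/'
    ) AS
    SELECT eventTime, eventSource, eventName, awsRegion, userIdentity.arn AS user_arn
    FROM "aws:cloudtrail"."<your-event-data-store-id>"
    WHERE eventName = 'GetBucketAcl';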

Pre-requisites

Before deploying the solution, ensure you have the following:

  • An Event Data Store in AWS CloudTrail Lake with query federation enabled. This setting allows Amazon Athena to query the event data directly (a quick verification query is sketched below this list).
  • The Event Data Store ID, which you will supply as a parameter when deploying the AWS CloudFormation stack.
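
Before deploying, you can confirm that federation is working by running a quick query from the Amazon Athena console. The database and table names below are assumptions based on how CloudTrail Lake typically registers federated event data stores (a database named aws:cloudtrail with a table named after the Event Data Store ID); confirm the exact names in your account.

    -- Confirm the federated event data store is reachable from Athena
    -- Replace the placeholder with your Event Data Store ID
    SELECT eventTime, eventSource, eventName
    FROM "aws:cloudtrail"."<your-event-data-store-id>"
    LIMIT 10;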

Step-by-Step Walkthrough

Step 1: Deploy the AWS CloudFormation Template

Download the DeployResources.yaml AWS CloudFormation template and use the AWS Console to create a new stack:

  1. Go to the AWS CloudFormation console.
  2. Click on Create Stack > With new resources.
  3. Provide the following parameters:
    • Stack Name.
    • AWS Account Number and Region (for the Amazon Athena output location).
    • AWS CloudTrail Lake Event Data Store ID.
  4. The template auto-fills certain storage configurations.
  5. Leave default settings for stack options.
  6. Acknowledge AWS IAM resource creation and launch the stack.

Once the stack is deployed, the AWS Lambda function, AWS IAM roles, Amazon EventBridge rule, and required Amazon S3 buckets will be created.

Step 2: Upload Custom Query

Upload the demo.sql file containing the SQL query to your “Custom Query Bucket.” By default, this file includes a filter that exports only GetBucketAcl events from CloudTrail, which correspond to reads of Amazon S3 bucket access control lists.
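
For reference, the default filter in demo.sql might look roughly like the sketch below. The column list and the federated database and table names are assumptions; adjust them to match your event data store.

    -- Illustrative default filter: export only GetBucketAcl calls against Amazon S3
    SELECT eventTime, eventName, awsRegion, sourceIPAddress,
           userIdentity.arn AS user_arn, requestParameters
    FROM "aws:cloudtrail"."<your-event-data-store-id>"
    WHERE eventSource = 's3.amazonaws.com'
      AND eventName = 'GetBucketAcl';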

You can modify this query to suit your needs. For example:

  • Filter by service (eventSource = 'dynamodb.amazonaws.com')
  • Filter by user action (e.g., disabling multi-factor authentication)

Step 3: Test the Solution

Instead of waiting for the scheduled EventBridge rule to trigger the Lambda function, you can test the setup manually:

  1. Navigate to the AWS Lambda function created by the stack.
  2. Use the “Test” option to trigger a manual execution.
  3. Capture the Amazon Athena Execution ID from the function logs.

To validate:

  1. Open Amazon Athena > Query Editor.
  2. Look for the recent query using the captured Execution ID.
  3. You should see a temporary table (e.g., temp_table_<timestamp>) created with the filtered results.

The results will also appear in the designated S3 bucket in Parquet format.
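
To spot-check the output directly in Amazon Athena, you can preview the temporary table from your run. The table name below is illustrative; substitute the one shown in the Query Editor:

    -- Preview a few rows of the exported, filtered events
    SELECT *
    FROM temp_table_20250101_000000
    LIMIT 10;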

Step 4: Sample SQL Queries

Here are additional examples to customize your demo.sql file:

Example 1: Filter by service (e.g., DynamoDB)
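
A possible version of this filter, assuming the same federated database and table naming used earlier (verify the names in your account):

    -- Sketch: restrict the export to DynamoDB API activity
    SELECT eventTime, eventName, awsRegion, sourceIPAddress,
           userIdentity.arn AS user_arn, requestParameters
    FROM "aws:cloudtrail"."<your-event-data-store-id>"
    WHERE eventSource = 'dynamodb.amazonaws.com';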

Example 2: Track users disabling MFA
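
One way to express this, again as a sketch with assumed table naming; DeactivateMFADevice and DeleteVirtualMFADevice are the IAM event names most commonly associated with turning off MFA:

    -- Sketch: surface users deactivating or deleting MFA devices
    SELECT eventTime, eventName, sourceIPAddress,
           userIdentity.arn AS user_arn, requestParameters
    FROM "aws:cloudtrail"."<your-event-data-store-id>"
    WHERE eventSource = 'iam.amazonaws.com'
      AND eventName IN ('DeactivateMFADevice', 'DeleteVirtualMFADevice');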

Step 5: Cleanup

To avoid incurring additional costs, delete the following once testing is complete:

  • AWS CloudFormation stack and associated resources
  • AWS IAM roles and policies
  • Amazon S3 buckets used in the deployment
  • AWS CloudTrail Lake event data store if no longer needed

Conclusion

This solution presents a powerful and scalable method to export and analyze specific AWS CloudTrail Lake data subsets. By automating the filtering and export process, organizations can drastically reduce the noise in their logs and gain targeted insights into API activity.

Whether you’re a cloud architect, security analyst, or DevOps engineer, this setup lets you focus on the most relevant events while maintaining access to complete contextual data. It streamlines security investigations, compliance reporting, and operational monitoring without the burden of processing full datasets. With this approach, your AWS environment becomes easier to manage, more secure, and better aligned with organizational goals.

Drop a query if you have any questions regarding AWS CloudTrail, and we will get back to you quickly.


About CloudThat

CloudThat is a leading provider of Cloud Training and Consulting services with a global presence in India, the USA, Asia, Europe, and Africa. Specializing in AWS, Microsoft Azure, GCP, VMware, Databricks, and more, the company serves mid-market and enterprise clients, offering comprehensive expertise in Cloud Migration, Data Platforms, DevOps, IoT, AI/ML, and more.

CloudThat is the first Indian Company to win the prestigious Microsoft Partner 2024 Award and is recognized as a top-tier partner with AWS and Microsoft, including the prestigious ‘Think Big’ partner award from AWS and the Microsoft Superstars FY 2023 award in Asia & India. Having trained 850k+ professionals in 600+ cloud certifications and completed 500+ consulting projects globally, CloudThat is an official AWS Advanced Consulting Partner, Microsoft Gold Partner, AWS Training Partner, AWS Migration Partner, AWS Data and Analytics Partner, AWS DevOps Competency Partner, AWS GenAI Competency Partner, Amazon QuickSight Service Delivery Partner, Amazon EKS Service Delivery Partner, AWS Microsoft Workload Partner, Amazon EC2 Service Delivery Partner, Amazon ECS Service Delivery Partner, AWS Glue Service Delivery Partner, Amazon Redshift Service Delivery Partner, AWS Control Tower Service Delivery Partner, AWS WAF Service Delivery Partner, Amazon CloudFront Service Delivery Partner, Amazon OpenSearch Service Delivery Partner, AWS DMS Service Delivery Partner, AWS Systems Manager Service Delivery Partner, Amazon RDS Service Delivery Partner, AWS CloudFormation Service Delivery Partner, AWS Config, Amazon EMR and many more.

FAQs

1. How frequently are the Amazon Athena queries executed?

ANS: – Query execution is scheduled using Amazon EventBridge. You can configure the cron expression in the AWS CloudFormation template to set your desired frequency, such as hourly, daily, etc.

2. What happens if the Amazon Athena query fails during execution?

ANS: – The AWS Lambda function includes basic error handling. You can inspect AWS Lambda logs in Amazon CloudWatch to troubleshoot the failure. Common causes include incorrect SQL syntax, missing permissions, or configuration errors.

WRITTEN BY Rachana Kampli
