Automating Auditd and Amazon CloudWatch Log Stream Verification for Amazon EC2 with AWS Step Functions

Introduction

In regulated or security-sensitive environments, it is critical to ensure that operating system–level auditing is always active. On Linux systems, auditd provides deep visibility into security events, while streaming those logs to Amazon CloudWatch ensures centralized monitoring, retention, and alerting.

But here’s the operational reality.

When you run dozens or hundreds of Amazon EC2 instances across environments, manually verifying whether:

  • auditd is running, and
  • logs are actually being shipped to Amazon CloudWatch

becomes nearly impossible.

We faced the same challenge and built a serverless, scalable verification framework using AWS Step Functions, AWS Lambda, and Amazon DynamoDB. The workflow automatically inspects every instance, validates the configuration, and stores audit evidence.

Problem Statement

Security policies might say:

  • auditd must be enabled
  • log streaming must be active
  • proof should be available

However, infrastructure drifts.

New instances may be launched from outdated AMIs, agents may fail, or configurations might change over time. Without continuous verification, teams only discover problems during incidents or audits, which is too late.

We needed a solution that:

  • works across large fleets
  • runs automatically
  • scales safely
  • provides traceable evidence
  • avoids manual SSH checks

Overview of the Solution

The system uses orchestration and batching to inspect instances in a controlled way.

High-level flow:

  1. AWS Step Functions starts the execution.
  2. A Lambda function discovers Amazon EC2 instances and divides them into batches.
  3. Each instance ID is sent to a second AWS Lambda function.
  4. That function connects to the instance via SSH.
  5. It validates auditd status and Amazon CloudWatch agent/log streaming.
  6. Results are stored in Amazon DynamoDB.

At the end, we have a centralized compliance record for the entire fleet.

AWS Services Used

Amazon EC2 – Hosts workloads we need to validate.

AWS Lambda

  • AWS Lambda 1: inventory and batching
  • AWS Lambda 2: instance verification

AWS Step Functions – Controls sequencing, retries, and progress.

Amazon DynamoDB – Stores results as audit evidence.

Amazon VPC – Required so Lambda can reach private instances.

AWS IAM – Provides secure permissions.

Architecture & Workflow

Step 1 – Orchestration Begins

The AWS Step Functions execution is triggered manually or on a schedule.

Step 2 – Instance Discovery

The first AWS Lambda function queries Amazon EC2 and retrieves the list of running instances.
To prevent overload, it splits them into smaller batches.

This is important because connecting to too many servers at once can exhaust network or authentication resources.
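
A minimal Python sketch of the discovery and batching step, assuming discovery goes through the boto3 describe_instances paginator. The function names, the batch_size event key, and the default batch size of 10 are illustrative assumptions, not taken from the repository.

```python
from typing import Iterable, List


def running_instance_ids(pages: Iterable[dict]) -> List[str]:
    """Collect instance IDs from describe_instances response pages."""
    return [
        instance["InstanceId"]
        for page in pages
        for reservation in page.get("Reservations", [])
        for instance in reservation.get("Instances", [])
    ]


def make_batches(instance_ids: List[str], batch_size: int = 10) -> List[List[str]]:
    """Split the ID list into consecutive groups of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        instance_ids[start:start + batch_size]
        for start in range(0, len(instance_ids), batch_size)
    ]


def lambda_handler(event, context):
    # Imported here so the helpers above can be exercised without the AWS SDK.
    import boto3

    paginator = boto3.client("ec2").get_paginator("describe_instances")
    pages = paginator.paginate(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )
    # "batch_size" is an assumed, optional key in the execution input.
    batch_size = int(event.get("batch_size", 10))
    return {"batches": make_batches(running_instance_ids(pages), batch_size)}
```

Returning the batches as a list of lists lets the workflow iterate over them directly; a smaller batch size trades total run time for lower load on the network and authentication path.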

Step 3 – Per-Instance Verification

For each instance, the second AWS Lambda:

  • SSHs into the machine
  • checks if the auditd service is active
  • validates whether logs are configured to stream to Amazon CloudWatch
  • captures the outcome
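
The checks above could be sketched in Python as follows. This is an illustration under stated assumptions rather than the repository's code: the SSH transport is abstracted as a hypothetical run(command) callable returning (exit_code, stdout), the agent is assumed to be the unified CloudWatch agent running as the amazon-cloudwatch-agent systemd unit, and the config path and audit log path are assumptions to adjust for your AMIs.

```python
from typing import Callable, Tuple

# Hypothetical transport: run(command) returns (exit_code, stdout).
# In the Lambda it would wrap whichever SSH client you use.
Runner = Callable[[str], Tuple[int, str]]

AUDITD_CHECK = "systemctl is-active auditd"
AGENT_CHECK = "systemctl is-active amazon-cloudwatch-agent"
# Assumption: the agent config lives at this path and ships audit.log.
AGENT_CONFIG = "/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json"
CONFIG_CHECK = f"grep -q /var/log/audit/audit.log {AGENT_CONFIG}"


def is_active(result: Tuple[int, str]) -> bool:
    """systemctl is-active exits 0 and prints 'active' for a running unit."""
    exit_code, stdout = result
    return exit_code == 0 and stdout.strip() == "active"


def verify_instance(instance_id: str, run: Runner) -> dict:
    """Run the read-only checks and summarise the outcome for one instance."""
    try:
        auditd_ok = is_active(run(AUDITD_CHECK))
        agent_ok = is_active(run(AGENT_CHECK))
        config_ok = run(CONFIG_CHECK)[0] == 0  # grep -q exits 0 on a match
    except Exception as exc:
        # Connection or authentication failures are recorded, not raised.
        return {"instance_id": instance_id, "status": "UNREACHABLE", "error": str(exc)}
    streaming_ok = agent_ok and config_ok
    return {
        "instance_id": instance_id,
        "status": "PASS" if auditd_ok and streaming_ok else "FAIL",
        "auditd_active": auditd_ok,
        "cloudwatch_configured": streaming_ok,
    }
```

All three commands only read state, which keeps the workflow a pure verification step. Note that a running agent plus a matching config entry shows that streaming is configured, not that log events are arriving.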

Step 4 – Evidence Storage

The result (pass/fail, timestamps, instance metadata) is written into Amazon DynamoDB.
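
As an illustration, an evidence record could be shaped and written like this in Python. The attribute names, and the choice of instance_id as partition key with checked_at as sort key, are assumptions for the sketch; the table object is expected to be a boto3 DynamoDB Table resource.

```python
from datetime import datetime, timezone
from typing import Optional


def build_evidence_item(result: dict, checked_at: Optional[datetime] = None) -> dict:
    """Turn one verification result into a DynamoDB item."""
    checked_at = checked_at or datetime.now(timezone.utc)
    item = {
        "instance_id": result["instance_id"],   # assumed partition key
        "checked_at": checked_at.isoformat(),   # assumed sort key (ISO 8601, UTC)
        "status": result["status"],
    }
    # Carry over optional details when the verifier supplied them.
    for key in ("auditd_active", "cloudwatch_configured", "error"):
        if key in result:
            item[key] = result[key]
    return item


def store_evidence(table, result: dict, checked_at: Optional[datetime] = None) -> dict:
    """Write the item with Table.put_item and return what was stored."""
    item = build_evidence_item(result, checked_at)
    table.put_item(Item=item)
    return item
```

Keying on instance and timestamp means every run adds a new row instead of overwriting the last one, which is what makes the table usable as a historical record.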

Step 5 – Progression

The AWS Step Functions workflow continues until all batches are complete.
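
One way to express this flow is a state machine with nested Map states, sketched here as a Python dict that serializes to Amazon States Language. The state names, placeholder ARNs, retry values, and the choice to run one batch at a time are assumptions; the published workflow may be structured differently.

```python
import json

# Placeholder ARNs; replace REGION and ACCOUNT_ID with real values.
DISCOVER_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:discover-instances"
VERIFY_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:verify-instance"

verify_task = {
    "Type": "Task",
    "Resource": VERIFY_ARN,
    # Assumed retry policy for transient failures.
    "Retry": [{
        "ErrorEquals": ["States.TaskFailed"],
        "IntervalSeconds": 5,
        "MaxAttempts": 2,
        "BackoffRate": 2.0,
    }],
    "End": True,
}

per_instance_map = {
    "Type": "Map",
    "ItemsPath": "$",  # the state's input is one batch: a list of instance IDs
    "ItemSelector": {"instance_id.$": "$$.Map.Item.Value"},
    "ItemProcessor": {
        "ProcessorConfig": {"Mode": "INLINE"},
        "StartAt": "VerifyInstance",
        "States": {"VerifyInstance": verify_task},
    },
    "End": True,
}

definition = {
    "StartAt": "DiscoverInstances",
    "States": {
        "DiscoverInstances": {
            "Type": "Task",
            "Resource": DISCOVER_ARN,
            "Next": "ForEachBatch",
        },
        "ForEachBatch": {
            "Type": "Map",
            "ItemsPath": "$.batches",
            "MaxConcurrency": 1,  # process batches one after another
            "ItemProcessor": {
                "ProcessorConfig": {"Mode": "INLINE"},
                "StartAt": "ForEachInstance",
                "States": {"ForEachInstance": per_instance_map},
            },
            "End": True,
        },
    },
}

asl_json = json.dumps(definition, indent=2)
```

With MaxConcurrency set to 1 on the outer Map, the batch size becomes the upper bound on simultaneous SSH sessions.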

Important Networking Requirement

One key design condition:

The verification AWS Lambda function must run inside the same Amazon VPC as the Amazon EC2 instances, or in a network that has a route to them.

Since many workloads are private, public connectivity is not available.
By attaching the Lambda to the appropriate subnets and security groups, secure internal access is achieved.

This is the step people most often miss.
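
For illustration, attaching an existing function to a VPC with boto3 might look like the sketch below. The helper name is hypothetical, and the subnet and security group IDs are whatever fits your network; update_function_configuration and its VpcConfig parameter are part of the boto3 Lambda client.

```python
from typing import Sequence


def attach_to_vpc(lambda_client, function_name: str,
                  subnet_ids: Sequence[str],
                  security_group_ids: Sequence[str]) -> dict:
    """Attach a Lambda function to private subnets and security groups."""
    if not subnet_ids or not security_group_ids:
        raise ValueError("at least one subnet and one security group are required")
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        VpcConfig={
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
    )
```

In a deployment, lambda_client would be boto3.client("lambda"). The instances' security groups also need to allow inbound SSH (TCP 22) from the function's security group, and the function's execution role needs permission to manage network interfaces.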

Why Amazon DynamoDB?

Auditors and security teams usually ask:

“Can you prove that monitoring was enabled?”

Amazon DynamoDB becomes that proof.

You get:

  • historical record
  • searchable data
  • integration capability with dashboards
  • ability to trigger remediation later

Without a persistent store, verification has no long-term value.

Implementation Details

To keep this article focused and readable, we’ve published the complete implementation on GitHub.

Inside the repository, you’ll find:

  • AWS Step Functions workflow
  • Instance discovery & batching AWS Lambda
  • SSH validation AWS Lambda
  • Amazon DynamoDB schema ideas
  • Permission setup

GitHub Repository:
https://github.com/DeepakRao121/POC/tree/main/ec2AuditMonitor

You can directly reuse or adapt it for your environment.

Operational Impact

After implementing this automation, we achieved:

  • continuous compliance verification
  • removal of manual server checks
  • faster audit readiness
  • clear visibility of unhealthy nodes
  • scalable governance

Instead of reacting during reviews, we now know the state at any time.

Cost Perspective

Because the solution is serverless, there is no always-on infrastructure to pay for, so costs stay low.

You mainly pay for:

  • AWS Lambda runtime
  • AWS Step Functions state transitions
  • small Amazon DynamoDB writes

Compared to manual effort or compliance penalties, this is extremely efficient.

Possible Enhancements

This framework can easily evolve into:

  • auto-remediation if auditd is stopped
  • Slack/email alerts
  • dashboards for compliance percentage
  • environment or application filtering
  • integration with patch pipelines

It can become the backbone of OS-level governance.

Conclusion

Ensuring that audit logging is active across every server is fundamental to security and compliance. Yet in fast-moving cloud environments, maintaining that assurance manually is unrealistic.

By combining AWS Step Functions, AWS Lambda, and Amazon DynamoDB, we can continuously validate the health of auditd and Amazon CloudWatch streaming in a structured, repeatable, and scalable manner. Batching prevents operational strain, orchestration provides reliability, and centralized evidence delivers accountability.

What used to require ad-hoc verification and last-minute preparation for audits becomes an always-available, automated control.

This is how DevOps and security should work together: proactive, observable, and resilient.

Drop a query if you have any questions regarding Amazon CloudWatch and we will get back to you quickly.

FAQs

1. Does this solution modify anything on the EC2 instances?

ANS: – No. The workflow is designed for verification and evidence collection. It only checks the status of auditd and Amazon CloudWatch log streaming. However, the same framework can be extended later to perform automatic remediation if required.

2. How does the Lambda function connect to private Amazon EC2 instances?

ANS: – The verification Lambda runs inside the same Amazon VPC (or a peered network) with appropriate subnet routing and security group permissions. This allows secure SSH access without exposing instances to the internet.

3. What happens if an instance is unreachable?

ANS: – The AWS Lambda function captures the failure and records it in Amazon DynamoDB. AWS Step Functions can retry based on defined policies, and unreachable hosts become immediately visible for investigation.

WRITTEN BY Deepak S

Deepak S is a Senior Research Associate at CloudThat, specializing in AWS services. He is passionate about exploring new technologies in cloud and is also an automobile enthusiast.
