Introduction
In regulated or security-sensitive environments, it is critical to ensure that operating system-level auditing is always active. On Linux systems, auditd provides deep visibility into security events, while streaming those logs to Amazon CloudWatch ensures centralized monitoring, retention, and alerting.
But here’s the operational reality.
When you run dozens or hundreds of Amazon EC2 instances across environments, manually verifying whether:
- auditd is running, and
- logs are actually being shipped to Amazon CloudWatch
becomes nearly impossible.
We faced the same challenge and built a serverless, scalable verification framework using AWS Step Functions, AWS Lambda, and Amazon DynamoDB. The workflow automatically inspects every instance, validates the configuration, and stores audit evidence.
Problem Statement
Security policies might say:
- auditd must be enabled
- log streaming must be active
- proof should be available
However, infrastructure drifts.
New instances may be launched from outdated AMIs, agents may fail, or configurations might change over time. Without continuous verification, teams only discover problems during incidents or audits, which is too late.
We needed a solution that:
- works across large fleets
- runs automatically
- scales safely
- provides traceable evidence
- avoids manual SSH checks
Overview of the Solution
The system uses orchestration and batching to inspect instances in a controlled way.
High-level flow:
- AWS Step Functions starts the execution.
- The first AWS Lambda function discovers Amazon EC2 instances and divides them into batches.
- Each instance ID is passed to a second AWS Lambda function.
- The second function connects to the instance via SSH.
- It validates the auditd status and the Amazon CloudWatch agent and log streaming configuration.
- Results are stored in Amazon DynamoDB.
At the end, we have a centralized compliance record for the entire fleet.
AWS Services Used
Amazon EC2 – Hosts workloads we need to validate.
AWS Lambda – Runs the two functions in the workflow:
- AWS Lambda 1: inventory and batching
- AWS Lambda 2: instance verification
AWS Step Functions – Controls sequencing, retries, and progress.
Amazon DynamoDB – Stores results as audit evidence.
Amazon VPC – Required so Lambda can reach private instances.
AWS IAM – Provides secure permissions.
Architecture & Workflow
Step 1 – Orchestration Begins
An AWS Step Functions execution is triggered manually or on a schedule.
Step 2 – Instance Discovery
The first AWS Lambda function queries Amazon EC2 and retrieves the list of running instances.
To prevent overload, it splits them into smaller groups.
This is important because connecting to too many servers at once can exhaust network or authentication resources.
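The discovery and batching step can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the function names and the batch size are assumptions, and `ec2_client` is expected to be a boto3 EC2 client (for example `boto3.client("ec2")`) created by the caller.

```python
from typing import Any


def list_running_instance_ids(ec2_client: Any) -> list[str]:
    """Return the IDs of all running instances visible to the given EC2 client."""
    # describe_instances is paginated, so walk every page of results.
    paginator = ec2_client.get_paginator("describe_instances")
    pages = paginator.paginate(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )
    instance_ids = []
    for page in pages:
        for reservation in page.get("Reservations", []):
            for instance in reservation.get("Instances", []):
                instance_ids.append(instance["InstanceId"])
    return instance_ids


def split_into_batches(items: list[str], batch_size: int) -> list[list[str]]:
    """Split a list into consecutive groups of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
```

Inside the first Lambda, the handler would call `list_running_instance_ids` and return something like `{"batches": split_into_batches(ids, 10)}` for the state machine to iterate over. The batch size of 10 is only a placeholder; the right value depends on how many concurrent SSH sessions your network and authentication setup can tolerate.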
Step 3 – Per-Instance Verification
For each instance, the second AWS Lambda:
- SSHs into the machine
- checks if the auditd service is active
- validates whether logs are configured to stream to Amazon CloudWatch
- captures the outcome
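One way the checks above could be expressed is shown below. It is a sketch under several assumptions: `run` is a hypothetical callable that executes a shell command on the target over an already-open SSH session and returns its standard output; the CloudWatch agent is assumed to run as the `amazon-cloudwatch-agent` systemd service; and the agent's JSON configuration is assumed to live at the path in `AGENT_CONFIG_PATH`, which varies with how the agent was installed. The repository may perform these checks differently.

```python
import json
from typing import Callable

AUDIT_LOG_PATH = "/var/log/audit/audit.log"
# Assumed location of the agent's JSON config; adjust for your installation.
AGENT_CONFIG_PATH = "/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json"


def audit_log_is_collected(agent_config_text: str) -> bool:
    """True if the agent config lists the auditd log file for collection.

    Only an exact path match is recognized; wildcard entries are not expanded.
    """
    try:
        node = json.loads(agent_config_text)
    except json.JSONDecodeError:
        return False
    # Walk logs -> logs_collected -> files -> collect_list defensively.
    for key in ("logs", "logs_collected", "files", "collect_list"):
        if not isinstance(node, dict):
            return False
        node = node.get(key)
    if not isinstance(node, list):
        return False
    return any(
        isinstance(entry, dict) and entry.get("file_path") == AUDIT_LOG_PATH
        for entry in node
    )


def verify_instance(run: Callable[[str], str]) -> dict:
    """Run the three checks through `run` and summarize them as PASS or FAIL."""
    # `systemctl is-active` prints "active" only when the unit is running.
    auditd_state = run("systemctl is-active auditd").strip()
    agent_state = run("systemctl is-active amazon-cloudwatch-agent").strip()
    config_text = run(f"cat {AGENT_CONFIG_PATH}")
    checks = {
        "auditd_active": auditd_state == "active",
        "cloudwatch_agent_active": agent_state == "active",
        "audit_log_collected": audit_log_is_collected(config_text),
    }
    return {"status": "PASS" if all(checks.values()) else "FAIL", **checks}
```

Keeping the SSH transport behind `run` means the check logic can be exercised with a stub, without touching a real server. Note that this only confirms the agent is configured to collect the audit log; it does not prove that events are arriving in Amazon CloudWatch.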
Step 4 – Evidence Storage
The result (pass/fail, timestamps, instance metadata) is written into Amazon DynamoDB.
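A possible shape for such an evidence record is sketched below. The attribute names and the key design (instance ID as partition key, check timestamp as sort key) are assumptions for illustration, not the schema from the repository.

```python
from datetime import datetime, timezone
from typing import Optional


def build_evidence_item(
    instance_id: str,
    status: str,
    details: dict,
    checked_at: Optional[datetime] = None,
) -> dict:
    """Build one audit-evidence record for a single verification run."""
    checked_at = checked_at or datetime.now(timezone.utc)
    return {
        "instance_id": instance_id,            # assumed partition key
        "checked_at": checked_at.isoformat(),  # assumed sort key, ISO 8601 in UTC
        "status": status,                      # "PASS" or "FAIL"
        "details": details,                    # per-check outcomes
    }
```

With a boto3 DynamoDB `Table` resource, a record like this could then be written with `table.put_item(Item=item)`. Using the timestamp as a sort key keeps every run as a separate row, which is what makes the table usable as historical evidence rather than only a current-state snapshot.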
Step 5 – Progression
The AWS Step Functions state machine continues until all batches are complete.
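To make the orchestration concrete, here is a rough sketch of a state machine definition in Amazon States Language, built as a Python dictionary. It is an illustration under assumptions, not the workflow from the repository: the state names, the `$.discovery.batches` path, the concurrency limits, and the retry values are all hypothetical, and the two ARNs are supplied by the caller.

```python
import json


def build_state_machine(discover_arn: str, verify_arn: str, max_parallel: int = 5) -> dict:
    """Return a definition that processes batches one at a time."""
    verify_task = {
        "Type": "Task",
        "Resource": verify_arn,
        # Retry transient task failures a couple of times with backoff.
        "Retry": [{
            "ErrorEquals": ["States.TaskFailed"],
            "IntervalSeconds": 5,
            "MaxAttempts": 2,
            "BackoffRate": 2.0,
        }],
        "End": True,
    }
    # Inner Map: fan out over the instance IDs of one batch.
    verify_batch = {
        "Type": "Map",
        "ItemsPath": "$",
        "MaxConcurrency": max_parallel,
        "ItemProcessor": {
            "ProcessorConfig": {"Mode": "INLINE"},
            "StartAt": "VerifyInstance",
            "States": {"VerifyInstance": verify_task},
        },
        "End": True,
    }
    return {
        "Comment": "Verify auditd and CloudWatch log streaming across the fleet",
        "StartAt": "DiscoverInstances",
        "States": {
            "DiscoverInstances": {
                "Type": "Task",
                "Resource": discover_arn,
                "ResultPath": "$.discovery",  # assumes output like {"batches": [[...], ...]}
                "Next": "ProcessBatches",
            },
            # Outer Map: MaxConcurrency 1 means batches run sequentially.
            "ProcessBatches": {
                "Type": "Map",
                "ItemsPath": "$.discovery.batches",
                "MaxConcurrency": 1,
                "ItemProcessor": {
                    "ProcessorConfig": {"Mode": "INLINE"},
                    "StartAt": "VerifyBatch",
                    "States": {"VerifyBatch": verify_batch},
                },
                "End": True,
            },
        },
    }
```

Calling `json.dumps(build_state_machine(discover_arn, verify_arn))` yields the JSON document that can be supplied as the state machine definition. Running batches sequentially while limiting parallelism inside each batch is one way to bound the number of simultaneous SSH connections.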
Important Networking Requirement
One key design condition:
The verification AWS Lambda must run inside the same Amazon VPC as the Amazon EC2 instances, or in a network that has routing access to them.
Since many workloads are private, public connectivity is not available.
By attaching the Lambda to the appropriate subnets and security groups, secure internal access is achieved.
This is the step people most often miss.
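A quick way to confirm this requirement before debugging anything else is a plain TCP probe from inside the Lambda to the instance's SSH port. The helper below is a hypothetical sketch using only the Python standard library; the function name, the default port 22, and the timeout are assumptions.

```python
import socket


def can_reach_ssh(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        # create_connection raises OSError on refusal, timeout, or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If this returns False for an instance's private IP address, the cause is most likely subnet routing or a security group rule rather than SSH credentials. Recording that distinction in the result makes unreachable hosts easier to triage.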
Why Amazon DynamoDB?
Auditors and security teams usually ask:
“Can you prove that monitoring was enabled?”
Amazon DynamoDB becomes that proof.
You get:
- historical record
- searchable data
- integration capability with dashboards
- ability to trigger remediation later
Without a persistent store, verification has no long-term value.
Implementation Details
To keep this article focused and readable, we’ve published the complete implementation on GitHub.
Inside the repository, you’ll find:
- AWS Step Functions workflow
- Instance discovery & batching AWS Lambda
- SSH validation AWS Lambda
- Amazon DynamoDB schema ideas
- Permission setup
GitHub Repository:
https://github.com/DeepakRao121/POC/tree/main/ec2AuditMonitor
You can directly reuse or adapt it for your environment.
Operational Impact
After implementing this automation, we achieved:
- continuous compliance verification
- removal of manual server checks
- faster audit readiness
- clear visibility of unhealthy nodes
- scalable governance
Instead of reacting during reviews, we now know the compliance state of the fleet at any time.
Cost Perspective
Because the solution is serverless, costs remain minimal.
You mainly pay for:
- AWS Lambda runtime
- AWS Step Functions state transitions
- small Amazon DynamoDB writes
Compared to manual effort or compliance penalties, this is extremely efficient.
Possible Enhancements
This framework can easily evolve into:
- auto-remediation if auditd is stopped
- Slack/email alerts
- dashboards for compliance percentage
- environment or application filtering
- integration with patch pipelines
It can become the backbone of OS-level governance.
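As an example of the dashboard idea, a compliance percentage could be derived from the stored records along these lines. This is a sketch only: it assumes each record is a dictionary with hypothetical `instance_id`, `checked_at` (ISO 8601 string in a single timezone), and `status` fields, and it counts only the most recent record per instance.

```python
def compliance_percentage(records: list[dict]) -> float:
    """Percentage of instances whose latest record has status PASS."""
    latest: dict[str, dict] = {}
    for record in records:
        current = latest.get(record["instance_id"])
        # ISO 8601 strings in the same timezone sort chronologically.
        if current is None or record["checked_at"] > current["checked_at"]:
            latest[record["instance_id"]] = record
    if not latest:
        return 0.0
    passed = sum(1 for record in latest.values() if record["status"] == "PASS")
    return 100.0 * passed / len(latest)
```

Returning 0.0 for an empty fleet is a design choice made for this sketch; a dashboard might prefer to show "no data" instead.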
Conclusion
Ensuring that audit logging is active across every server is fundamental to security and compliance. Yet in fast-moving cloud environments, maintaining that assurance manually is unrealistic.
With this framework, what used to require ad-hoc verification and last-minute preparation for audits becomes an always-available, automated control.
This is how DevOps and security should work together: proactive, observable, and resilient.
FAQs
1. Does this solution modify anything on the EC2 instances?
ANS: – No. The workflow is designed for verification and evidence collection. It only checks the status of auditd and Amazon CloudWatch log streaming. However, the same framework can be extended later to perform automatic remediation if required.
2. How does the Lambda function connect to private Amazon EC2 instances?
ANS: – The verification Lambda runs inside the same Amazon VPC (or a peered network) with appropriate subnet routing and security group permissions. This allows secure SSH access without exposing instances to the internet.
3. What happens if an instance is unreachable?
ANS: – The AWS Lambda captures the failure and records it in Amazon DynamoDB. AWS Step Functions can retry based on defined policies, and unreachable hosts become immediately visible for investigation.
WRITTEN BY Deepak S
Deepak S is a Senior Research Associate at CloudThat, specializing in AWS services. He is passionate about exploring new technologies in cloud and is also an automobile enthusiast.
February 12, 2026