Amazon Athena: A Serverless, Interactive Query Service by Amazon

Overview

Amazon Athena is an interactive query service that makes it easy to analyze data directly in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries you run. It is also easy to use: point Athena at the data in an S3 bucket, define the schema, and start querying with standard SQL. Most results are delivered within seconds.

Athena analyzes data that is already stored in Amazon S3. It works with structured, semi-structured, and unstructured data in formats such as CSV, JSON, and ORC. If you want to run interactive, ad hoc SQL queries against data stored in Amazon S3, Athena is one of the simplest ways to do it.

Athena queries the data in place in Amazon S3, so there is no need to load or reformat it first. For example, if you need to quickly check web server logs to investigate a problem with your website, Athena can be helpful.

Benefits of Athena

Flexible
Serverless
Cost-Effective
Widely accessible
Fast performance
Secure
Easy integrations with other AWS services

Workflow of Athena

[Figure: Athena workflow diagram]

Demo on Athena

Step 1: Open the AWS S3 console.

Step 2: Click Create bucket and enter a bucket name.

Step 3: Confirm by clicking Create bucket.

Step 4: Create a folder named Test in the bucket and upload a CSV file to it.

Step 5: The CSV file used in this demo has 3 columns of data.
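
To follow along without the screenshot, a small stand-in file can be generated locally. The sketch below is only an illustration: the Name column and the value Kashyap come from this demo, while the other column names, the sample rows, the file name demo.csv, and the bucket name are made up. The upload helper relies on boto3's put_object and needs AWS credentials, so it is defined here but not called.

```python
import csv
import io

# Hypothetical layout: only "Name" appears in the demo query; "Id" and "City" are assumed.
COLUMNS = ["Id", "Name", "City"]
ROWS = [
    [1, "Kashyap", "Ahmedabad"],   # made-up sample rows
    [2, "Asha", "Pune"],
    [3, "Ravi", "Mysuru"],
]


def build_csv(columns, rows) -> str:
    """Return the CSV text with a header row followed by the data rows."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()


def object_key(folder: str, filename: str) -> str:
    """Build the S3 key for a file inside a folder (prefix) such as Test."""
    return f"{folder.strip('/')}/{filename}"


def upload_csv(bucket: str, key: str, body: str) -> None:
    """Upload the CSV text to S3; requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the helpers above work without AWS access

    s3 = boto3.client("s3")
    s3.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))


if __name__ == "__main__":
    print(object_key("Test", "demo.csv"))
    print(build_csv(COLUMNS, ROWS))
    # upload_csv("my-demo-bucket", object_key("Test", "demo.csv"), build_csv(COLUMNS, ROWS))
```

Because this file keeps its header row, the Athena table that reads it would need to skip the first line, for example with the skip.header.line.count table property.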

Step 6: Open the Amazon Athena console.

Step 7: Click Settings and, for the query result location, select the S3 bucket created in the previous steps.

Step 8: Run a CREATE DATABASE query to create a database, then select that database.
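
The exact query from the original screenshot is not reproduced here, so the following is a minimal sketch of what such a statement could look like. The database name demo_db is a placeholder, and the name check is a deliberately conservative assumption (lowercase letters, digits, and underscores), not a full statement of Athena's naming rules.

```python
import re


def create_database_sql(name: str) -> str:
    """Build a CREATE DATABASE statement for the Athena query editor."""
    # Conservative check (an assumption): allow only lowercase letters, digits, underscores.
    if not re.fullmatch(r"[a-z0-9_]+", name):
        raise ValueError(f"unsupported database name: {name!r}")
    return f"CREATE DATABASE IF NOT EXISTS {name}"


if __name__ == "__main__":
    # demo_db is a hypothetical name used only for illustration.
    print(create_database_sql("demo_db"))
```

The printed statement can be pasted into the query editor; IF NOT EXISTS makes it safe to run more than once.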

Step 9: The dropdown for creating a table offers several options. Select S3 bucket data.

Step 10: Enter the table name and choose our existing database.

Step 11: Select the location where your data is stored.

Step 12: Select the input file format of the file uploaded to the bucket.

Step 13: Enter each column name and select its column type. This demo has only 3 columns; if your file has many columns, you can use the bulk column feature.

Step 14: Click Create table.
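
The form filled in over Steps 9 to 14 amounts to a CREATE EXTERNAL TABLE statement. The sketch below builds a roughly equivalent statement for a comma-delimited text file; it is not the exact DDL the console generates. The database name, bucket name, and the id and city columns are assumptions, and the header-skipping property only applies if the CSV file has a header row.

```python
def create_table_sql(database, table, columns, location, skip_header=True):
    """Build CREATE EXTERNAL TABLE DDL for a comma-delimited text file in S3.

    columns is a list of (name, type) pairs; location must be an S3 folder.
    """
    # Athena tables point at a folder (prefix), not at a single file.
    if not (location.startswith("s3://") and location.endswith("/")):
        raise ValueError("location must look like s3://bucket/folder/")
    cols = ",\n".join(f"  `{name}` {ctype}" for name, ctype in columns)
    props = "\nTBLPROPERTIES ('skip.header.line.count' = '1')" if skip_header else ""
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"{cols}\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
        f"{props}"
    )


if __name__ == "__main__":
    # All names below are placeholders; only the table name demo and the
    # name column come from the demo query.
    print(
        create_table_sql(
            "demo_db",
            "demo",
            [("id", "int"), ("name", "string"), ("city", "string")],
            "s3://my-demo-bucket/Test/",
        )
    )
```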

Step 15: Now query the data in the file using standard SQL.

The query below returns only the rows where Name is Kashyap.
SELECT * FROM demo WHERE Name = 'Kashyap';
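
The same query can also be run outside the console. The following is a minimal sketch using the boto3 Athena client calls start_query_execution, get_query_execution, and get_query_results. The database name and the result location are placeholders, the polling interval is arbitrary, and only the first page of results is read, so treat it as a starting point rather than a complete client.

```python
import time

TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def rows_to_dicts(rows):
    """Convert Athena result rows into dicts keyed by column name.

    For SELECT queries the first returned row holds the column names.
    """
    table = [[cell.get("VarCharValue") for cell in row["Data"]] for row in rows]
    if not table:
        return []
    header, body = table[0], table[1:]
    return [dict(zip(header, values)) for values in body]


def run_query(athena, sql, database, output_location, poll_seconds=1.0):
    """Start a query, wait until it finishes, and return the first page of rows."""
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] in TERMINAL_STATES:
            break
        time.sleep(poll_seconds)
    if status["State"] != "SUCCEEDED":
        reason = status.get("StateChangeReason", "")
        raise RuntimeError(f"query {query_id} ended as {status['State']}: {reason}")
    result = athena.get_query_results(QueryExecutionId=query_id)
    return rows_to_dicts(result["ResultSet"]["Rows"])


if __name__ == "__main__":
    import boto3  # needs AWS credentials and a configured region

    client = boto3.client("athena")
    # demo_db and the results location are hypothetical placeholders.
    for row in run_query(
        client,
        "SELECT * FROM demo WHERE Name = 'Kashyap'",
        "demo_db",
        "s3://my-demo-bucket/athena-results/",
    ):
        print(row)
```

Passing the client in as an argument keeps the polling logic easy to exercise with a stub object, without touching AWS.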

Conclusion

As this blog shows, Amazon Athena is not a complex service. It is easy to use and makes our workflow simpler: we just need to write correct queries to get accurate results within seconds. This post has covered the basics of Amazon Athena. If you want to learn more, you can refer to the official Amazon Athena documentation: https://docs.aws.amazon.com/athena/

FAQs

1. What can be done with Amazon Athena?

ANS: – Amazon Athena lets you analyze data stored in Amazon S3. You can run interactive analytics with ANSI SQL without aggregating the data or loading it into Athena. Athena can process unstructured, semi-structured, and structured data sets, in formats such as CSV, JSON, and Avro as well as columnar formats like Apache Parquet and Apache ORC. For simple visualization, Athena integrates with Amazon QuickSight. You can also connect to Athena through an ODBC or JDBC driver to generate reports or analyze data with SQL clients or business intelligence software.

2. Are there any additional charges associated with Amazon Athena?

ANS: – Amazon Athena reads data directly from Amazon S3, executes the query, and stores the results in the S3 bucket of your choice. You are therefore billed at standard S3 rates for these result sets. Use lifecycle policies to limit how long result data is kept in S3.
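
As an illustration of such a lifecycle policy, the sketch below builds a rule that expires objects under a results prefix after a number of days and applies it with boto3's put_bucket_lifecycle_configuration. The bucket name, the athena-results/ prefix, the rule ID, and the 30-day window are all assumptions chosen for the example. Note that this call replaces any lifecycle configuration already set on the bucket.

```python
def results_lifecycle(prefix: str, days: int) -> dict:
    """Build an S3 lifecycle configuration that expires objects under prefix."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": "expire-athena-results",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_lifecycle(bucket: str, config: dict) -> None:
    """Apply the configuration; requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so results_lifecycle works without AWS access

    s3 = boto3.client("s3")
    # Replaces the bucket's existing lifecycle configuration, if any.
    s3.put_bucket_lifecycle_configuration(Bucket=bucket, LifecycleConfiguration=config)


if __name__ == "__main__":
    config = results_lifecycle("athena-results/", 30)
    print(config)
    # apply_lifecycle("my-demo-bucket", config)
```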

3. What data formats does Amazon Athena support?

ANS: – Amazon Athena supports a wide range of data formats, including CSV, TSV, JSON, and text files. It also supports open-source columnar formats like Apache ORC and Apache Parquet, as well as data compressed with Snappy, Zlib, LZO, and GZIP. You can improve performance and reduce costs by partitioning your data, compressing it, and using columnar formats.
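
One way to move from CSV to a partitioned columnar layout is a CREATE TABLE AS SELECT (CTAS) query. The sketch below builds such a statement as a rough illustration. The table names, the city partition column, and the target location are hypothetical, and the helper moves the partition columns to the end of the SELECT list, which is where Athena expects them.

```python
def ctas_parquet_sql(source, target, columns, partition_by, external_location):
    """Build a CTAS statement that rewrites source as Parquet, optionally partitioned."""
    # Partition columns go last in the SELECT list.
    ordered = [c for c in columns if c not in partition_by] + list(partition_by)
    props = [
        "  format = 'PARQUET'",
        f"  external_location = '{external_location}'",
    ]
    if partition_by:
        parts = ", ".join(f"'{c}'" for c in partition_by)
        props.append(f"  partitioned_by = ARRAY[{parts}]")
    return (
        f"CREATE TABLE {target}\n"
        "WITH (\n"
        + ",\n".join(props)
        + "\n) AS\n"
        f"SELECT {', '.join(ordered)}\n"
        f"FROM {source}"
    )


if __name__ == "__main__":
    # demo_parquet, city, and the S3 location are placeholders for illustration.
    print(
        ctas_parquet_sql(
            "demo",
            "demo_parquet",
            ["id", "city", "name"],
            ["city"],
            "s3://my-demo-bucket/parquet/",
        )
    )
```

Queries that filter on the partition column can then read only the matching folders, which is where the cost and speed benefit comes from.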

WRITTEN BY Kashyap Nitinbhai Shani

Kashyap Nitinbhai Shani is a Research Associate at CloudThat. He is interested in learning advanced technologies and gaining insights into new and upcoming cloud services. He likes writing tech blogs and learning new languages.