

What is ‘BIG DATA’? And Big data analytics?


What is Big data?

It is definitely the most talked-about ‘new kid on the block’ in the analytics fraternity. Everyone seems to be talking about it. So, what exactly is Big data?

Big data consists of data sets that grow so large that they become difficult or awkward to work with using existing database management systems (Oracle, Sybase, MySQL, Teradata etc.). The difficulties include capture, storage, search, sharing, and mining the data for analytics. With an explosion in sources of data (internet forms, cookies, sensors, mobile applications, satellite data etc.), the quantum of data is growing and will continue to grow at an astronomical pace. The cost of storing data is falling steeply too: a 4 GB pen drive now costs 10% of what it did a couple of years ago. Together, these two trends will fuel the growth in the quantum of data that we will have access to.

The world’s technological per capita capacity to store information has roughly doubled every 40 months (a little over 3 years) since the 1980s. Some people say it is now going to double every 1.5 years. Every day, 2.5 quintillion bytes of data are created.
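To get a feel for what these doubling periods imply, here is a small illustrative calculation. It assumes smooth exponential growth, which is a simplification, and the function name `growth_factor` is made up for this sketch.

```python
def growth_factor(months_elapsed: float, doubling_period_months: float) -> float:
    """Return how many times capacity multiplies over the elapsed period,
    assuming it doubles once every `doubling_period_months`."""
    return 2 ** (months_elapsed / doubling_period_months)


decade = 120  # months in ten years

# Doubling every 40 months: three doublings in a decade.
slow = growth_factor(decade, 40)

# Doubling every 1.5 years (18 months): between six and seven doublings.
fast = growth_factor(decade, 18)

print(f"40-month doubling over 10 years: x{slow:.0f}")
print(f"18-month doubling over 10 years: x{fast:.0f}")
```

Under these assumptions, a 40-month doubling period gives roughly an 8-fold increase in ten years, while an 18-month period gives roughly a 100-fold increase.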

As you can visualize, these new sources will mostly produce non-relational data, stored in non-relational DBMSs. Thus, transactional data within an organization will, by necessity, stay in the traditional relational database, while the maximum growth will be in this ‘other’ data. This ‘other’ data will need to be mined and put into MIS and reports, analyzed for trends and used to create probability equations.

This ‘other’ data is generally called Big data.


Connecting Big Data

Figure: Big data refinery (Source: https://connollyshaun.blogspot.in/)

As systems and processes stabilize and mature on capture and storage of Big Data, the focus is shifting to ‘WHAT NEXT’?

Logically, the next step is to mine the data for information – business intelligence and Analytics.

I will specifically look at Apache Hadoop in this context. Hadoop MapReduce is a software framework for easily writing applications which process vast amounts of data (multi-terabyte datasets) in parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner.
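The MapReduce idea can be sketched in a few lines of plain Python. This is only a single-machine illustration of the map, shuffle and reduce steps using the classic word-count example; it does not use Hadoop, and the function names and sample documents are made up for this sketch. On a real cluster, the framework distributes these steps across nodes and handles failures.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on an in-memory list."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input: each string stands in for one input file.
documents = ["big data is big", "data analytics on big data"]
counts = word_count(documents)
print(counts)
```

Because each document is mapped independently and each key is reduced independently, both steps can in principle run in parallel, which is what makes the model suitable for large clusters.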

Many vendors have caught the Hadoop bug and released their own versions of the software, such as Cloudera, Hortonworks, and Microsoft (with HDInsight).

Figure: Cloudera’s Hadoop

Sounds simple, but for a data analyst with no Java coding skills, it is all Latin and Greek. Until you take a look at Pig, a high-level platform for creating MapReduce programs used with Hadoop. The language for this platform is called Pig Latin. It abstracts the programming from the Java MapReduce idiom into a notation which makes MapReduce programming high level, similar to what SQL does for RDBMS systems. A high-level language reads much closer to natural language than low-level code does, which makes it far more approachable for the non-coder.
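As a rough analogy only (this is Python, not Pig Latin), the snippet below shows what "high level" means in practice: a group-and-count that would need separate map and reduce functions in the MapReduce idiom becomes a single declarative step. In Pig Latin, the same kind of task is typically expressed with operators such as GROUP and FOREACH ... GENERATE. The sample records and variable names here are hypothetical.

```python
from collections import Counter

# Hypothetical sample records: (customer_id, product) pairs.
transactions = [
    ("c1", "loan"),
    ("c2", "card"),
    ("c1", "card"),
    ("c3", "loan"),
]

# "Group by customer and count" expressed in one high-level step,
# instead of hand-written map and reduce functions.
purchases_per_customer = Counter(customer for customer, _ in transactions)

print(purchases_per_customer.most_common())
```

The analyst describes what result is wanted, and the platform works out how to compute it.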

Wow, this makes life so much better for data analysts. It also lets us analysts look forward to many more projects where we will effectively crunch ‘Big Data’.

One competitor to Hadoop is Google’s BigQuery, and the comparison between these two giants is a story for another day. But in the real world out there, Hadoop is the current favorite Big Data management and analysis system.

Interesting aside: Apache Hadoop is an open-source software framework that supports data-intensive distributed applications. Hadoop is written in the Java programming language. It was created by Doug Cutting and Mike Cafarella, and Doug named it after his son’s toy elephant! Many parts of the Hadoop framework have names commonly found in a zoo.

From 2002 onwards, Subhashini has a decade of experience across roles in Analytics in Retail Finance and Banking. These roles have been across Risk Management, Collections strategy, Fraud Control and Marketing in GE Money, Standard Chartered Bank, Tata Motors Finance and Citi GDM. Her area of interest is the integration of results / outputs of Analytics with Business Decisions – Tactics and Strategy.

She is currently active in the Analytics Training and Consulting arena.

(Link to LinkedIn profile – https://in.linkedin.com/pub/subhashini-s-tripathi/3/405/77b )



WRITTEN BY CloudThat
