Introduction
I am sure that you have already heard about the next generation of MapReduce proposed by Hadoop. It's popularly called MapReduce2 or MR2.
MR2 introduces many enhancements, the most important being a new component called YARN (Yet Another Resource Negotiator).
With the current MapReduce implementation, there is just one Job Tracker that takes care of two critical functions:
- Manage resources across the cluster and schedule jobs using that information.
- Keep track of job execution. This includes rerunning failed tasks, job checkpointing, etc.
In YARN, the Job Tracker goes away, and these two functions are handed to two different components.
Resource Manager
There is one Resource Manager per Hadoop cluster, and it is responsible for scheduling jobs. It holds state information about all the nodes, so it can make smarter scheduling decisions.
Application Master
There is one Application Master (AM) per job. The Resource Manager schedules one AM per job, and once that is done, it is the AM's responsibility to complete the execution of the job successfully. This takes a lot of responsibility away from the Resource Manager, so it can scale to many more nodes in the cluster and many more jobs.
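To make the split of responsibilities concrete, here is a toy sketch in Python. It is not Hadoop code and does not use any YARN API: the class names mirror the components described above, while the slot counts, the `flaky` parameter, and the retry rule are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ApplicationMaster:
    """Toy stand-in for a per-job Application Master (not a YARN class)."""
    job_id: str
    tasks: list
    completed: set = field(default_factory=set)

    def run(self, flaky=frozenset()):
        # The AM, not the Resource Manager, tracks and retries its own tasks.
        # Assumption for the demo: tasks named in `flaky` fail exactly once.
        attempts = {}
        for task in self.tasks:
            attempts[task] = 0
            while True:
                attempts[task] += 1
                failed = task in flaky and attempts[task] == 1
                if not failed:
                    self.completed.add(task)
                    break
        return attempts


@dataclass
class ResourceManager:
    """Toy stand-in for the single cluster-wide Resource Manager."""
    free_slots: dict  # node name -> free container slots (hypothetical unit)
    masters: dict = field(default_factory=dict)

    def submit(self, job_id, tasks):
        # Use cluster-wide state to pick the node with the most free slots.
        node = max(self.free_slots, key=self.free_slots.get)
        if self.free_slots[node] == 0:
            raise RuntimeError("no free capacity for a new Application Master")
        self.free_slots[node] -= 1
        # One AM per job; from here on the job is the AM's responsibility.
        am = ApplicationMaster(job_id, list(tasks))
        self.masters[job_id] = am
        return node, am
```

For example, `ResourceManager({"node-a": 2, "node-b": 4}).submit("job-1", ["map-0", "map-1"])` places the AM on `node-b`, and calling `run(flaky={"map-1"})` on the returned AM retries `map-1` without involving the Resource Manager again.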
A few points to know about this new architecture:
- The Application Master aggressively writes checkpoint state to HDFS, which increases the load on HDFS. This state is used for job recovery: if the Application Master fails, the job can be restarted from the last checkpoint.
- The TaskTracker is replaced by the NodeManager (more about this in a future blog post). There is an option to write NodeManager logs to HDFS, so logs can go to a central place and debugging becomes easier. This will further stress the HDFS cluster.
- YARN is not limited to MapReduce. It can work with other distributed computing platforms.
- For example, Yahoo open-sourced an integration that runs Storm, a real-time distributed computing platform, on YARN.
- This version also introduces web services for Hadoop cluster status. You no longer need to scrape web pages to automate tasks.
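As an illustration of the last point, here is a minimal Python sketch that reads cluster status from the Resource Manager's REST web service instead of scraping HTML. It assumes the cluster metrics endpoint at `/ws/v1/cluster/metrics` on the default web port 8088; the host name, the helper function names, and the sample payload values are made up for the example, so check the field names against your Hadoop version.

```python
import json
from urllib.request import urlopen


def cluster_metrics_url(rm_host, rm_port=8088):
    # Assumption: the Resource Manager web service listens on port 8088.
    return f"http://{rm_host}:{rm_port}/ws/v1/cluster/metrics"


def fetch_metrics(rm_host, rm_port=8088, timeout=5):
    # Plain HTTP GET; the response body is a JSON document.
    with urlopen(cluster_metrics_url(rm_host, rm_port), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def summarize_metrics(response_text):
    # Pull a few fields out of the "clusterMetrics" object.
    metrics = json.loads(response_text)["clusterMetrics"]
    total = metrics["totalMB"]
    used_pct = 100.0 * metrics["allocatedMB"] / total if total else 0.0
    return {
        "active_nodes": metrics["activeNodes"],
        "apps_running": metrics["appsRunning"],
        "memory_used_pct": round(used_pct, 1),
    }


# Made-up sample payload so the parsing can be tried without a cluster.
SAMPLE_RESPONSE = json.dumps({
    "clusterMetrics": {
        "activeNodes": 4,
        "appsRunning": 2,
        "allocatedMB": 2048,
        "totalMB": 8192,
    }
})
```

Against a live cluster you would call `summarize_metrics(fetch_metrics("rm.example.com"))`, where `rm.example.com` is a placeholder host; with the sample payload above, `summarize_metrics(SAMPLE_RESPONSE)` works offline.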
Please share this article if you liked it, and let me know your thoughts in the comments below.

WRITTEN BY Bhavesh Goswami
Bhavesh Goswami is the Founder & CEO of CloudThat Technologies. He is a leading expert in the Cloud Computing space with over a decade of experience. He was in the initial development team of Amazon Simple Storage Service (S3) at Amazon Web Services (AWS) in Seattle and has been working in the Cloud Computing and Big Data fields for over 12 years now. He is a public speaker and has been the Keynote Speaker at the ‘International Conference on Computer Communication and Informatics’. He has also authored numerous research papers and patents in various fields.