Explain Hadoop's core components.

There are three core components of Hadoop:

1. For computational processing, i.e., MapReduce:

MapReduce is the data processing layer of Hadoop. It is a software framework for writing applications that process the vast amounts of structured and unstructured data stored in the Hadoop Distributed File System (HDFS). It processes huge volumes of data in parallel by dividing a submitted job into a set of independent sub-tasks.

In Hadoop, MapReduce works by breaking the processing into two phases: Map and Reduce. Map is the first phase, where the complex logic, business rules, and expensive processing are specified. Reduce is the second phase, where lightweight processing such as aggregation or summation is specified.
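
To make the two phases concrete, here is a minimal word-count sketch against the standard org.apache.hadoop.mapreduce API: the Map phase emits a (word, 1) pair for every word in the input, and the Reduce phase sums the counts per word. The input and output directories are supplied as command-line arguments.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // Map phase: split each input line into words and emit (word, 1).
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Reduce phase: aggregate the counts emitted for each word.
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // local pre-aggregation
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input dir
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output dir
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```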

2. For storage purposes, i.e., HDFS:

HDFS is an acronym for Hadoop Distributed File System, and its basic purpose is storage. It follows a master-slave pattern: the NameNode acts as the master and stores the metadata about the DataNodes, while the DataNodes act as slaves and store the actual data blocks on their local disks in parallel.
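
A minimal sketch of an HDFS client using the org.apache.hadoop.fs API, assuming a cluster reachable at a placeholder NameNode address (hdfs://namenode-host:8020) and a hypothetical file path. The client contacts the NameNode for metadata, while the bytes themselves are streamed to and from DataNodes.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // fs.defaultFS normally comes from core-site.xml; the
        // hostname/port here is a placeholder for your cluster.
        conf.set("fs.defaultFS", "hdfs://namenode-host:8020");
        FileSystem fs = FileSystem.get(conf);

        Path path = new Path("/user/demo/hello.txt"); // hypothetical path

        // Write: the NameNode records the file's metadata; the data
        // blocks are replicated across DataNodes.
        try (FSDataOutputStream out = fs.create(path, true)) {
            out.write("Hello, HDFS!".getBytes(StandardCharsets.UTF_8));
        }

        // Read the file back.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(fs.open(path), StandardCharsets.UTF_8))) {
            System.out.println(in.readLine());
        }
    }
}
```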

3. For resource management, i.e., YARN:

YARN (Yet Another Resource Negotiator) is the resource management layer of Hadoop. It allocates cluster resources and allows multiple data processing engines, such as real-time streaming, data science, and batch processing, to handle data stored on a single platform.
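
A minimal sketch of pointing a MapReduce job at YARN through configuration. The property names are standard Hadoop configuration keys; the ResourceManager hostname and the memory sizes are placeholder assumptions, and on a real cluster they usually come from mapred-site.xml and yarn-site.xml rather than code.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class YarnJobConfig {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Run on YARN rather than the local runner.
        conf.set("mapreduce.framework.name", "yarn");
        // ResourceManager address (placeholder hostname).
        conf.set("yarn.resourcemanager.address", "rm-host:8032");
        // Container sizes YARN allocates per map/reduce task.
        conf.set("mapreduce.map.memory.mb", "1024");
        conf.set("mapreduce.reduce.memory.mb", "2048");

        Job job = Job.getInstance(conf, "yarn demo");
        // Configure mapper, reducer, and input/output paths as in the
        // WordCount example above before calling job.waitForCompletion(true).
        System.out.println("Framework: "
                + job.getConfiguration().get("mapreduce.framework.name"));
    }
}
```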
