Tuesday, February 6, 2018

Counters in MapReduce


A counter is generally used to keep track of occurrences of events. In the Hadoop framework, whenever a MapReduce job is executed, the framework initializes counters to track job statistics such as the number of records read and the number of records written as output.
These are the built-in counters of the Hadoop framework. Additionally, we can create and use our own custom counters.
Typical statistics tracked by Hadoop counters include:
  • The number of mappers and reducers launched
  • The number of bytes read and written
  • The number of tasks launched and successfully completed
  • Whether the amount of CPU and memory consumed is appropriate for your job and cluster nodes

By default, MapReduce provides many built-in counters to track all of these details, and it also gives us the freedom to create our own counters.
Custom counters come into the picture when we want to track statistics about the records processed by our own logic in the mappers and reducers.
Another use of custom counters is in debugging, where they can be used to determine the number of bad records.
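As a minimal sketch of how the built-in counters are read, the driver below runs a job and then queries two of the standard task counters after completion. The input/output paths and the job name are illustrative; the mapper and reducer classes are omitted, so any simple job (for example word count) can be plugged in.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.TaskCounter;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CounterReport {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "counter report");
        job.setJarByClass(CounterReport.class);
        // Mapper/Reducer classes omitted here; any word-count style job works.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        if (job.waitForCompletion(true)) {
            // Built-in task counters become available once the job finishes.
            Counters counters = job.getCounters();
            long mapInputRecords =
                counters.findCounter(TaskCounter.MAP_INPUT_RECORDS).getValue();
            long reduceOutputRecords =
                counters.findCounter(TaskCounter.REDUCE_OUTPUT_RECORDS).getValue();
            System.out.println("Map input records:     " + mapInputRecords);
            System.out.println("Reduce output records: " + reduceOutputRecords);
        }
    }
}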

Built-In counters

Built-in counters are of three types:
  • MapReduce task counters
  • File system counters
  • Job counters
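All three groups can be inspected the same way, since the Counters object is iterable over its counter groups. The helper below is a small sketch that dumps every group and counter of a completed job; it assumes a Job reference like the one from the driver above.

import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.CounterGroup;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;

public class BuiltInCounterDump {
    // Assumes 'job' has already completed (see the driver sketch above).
    static void dumpCounters(Job job) throws Exception {
        Counters counters = job.getCounters();
        // Iterates over every counter group: the task counters, file system
        // counters, and job counters all appear here with their values.
        for (CounterGroup group : counters) {
            System.out.println("Group: " + group.getDisplayName());
            for (Counter counter : group) {
                System.out.println("  " + counter.getDisplayName()
                        + " = " + counter.getValue());
            }
        }
    }
}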

Custom Counters

1. Introduction:
Apart from the built-in counters, MapReduce allows us to create our own set of counters, which can be incremented as desired in the mapper or reducer for statistical analysis.
Counters are defined by a Java enum, which serves to group related counters.
For a full custom counter implementation, see https://acadgild.com/blog/counters-in-mapreduce/
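A minimal sketch of the pattern is shown below: the enum RecordQuality and the rule that a line with fewer than 3 comma-separated fields counts as bad are purely hypothetical, but the counter calls (context.getCounter(...).increment(1)) are the standard API.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class RecordQualityMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

    // A Java enum defines the custom counters; the enum groups them together.
    public enum RecordQuality {
        GOOD,
        BAD
    }

    private static final IntWritable ONE = new IntWritable(1);

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Hypothetical rule: a record with fewer than 3 comma-separated
        // fields is treated as BAD and skipped.
        String[] fields = value.toString().split(",");
        if (fields.length < 3) {
            context.getCounter(RecordQuality.BAD).increment(1);
            return;
        }
        context.getCounter(RecordQuality.GOOD).increment(1);
        context.write(new Text(fields[0]), ONE);
    }
}

The counts accumulated this way appear alongside the built-in counters in the job's counter report, under a group named after the enum class.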
