Sunday, June 14, 2015

WHAT ARE INPUTSPLITS?

12:47 AM - By ajay desai


An input split is the portion of the input data that is given to a single mapper. Input splits are logical divisions computed from a file's HDFS blocks, i.e. the input splits are created from HDFS blocks.

By default, the size of an input split is equal to the HDFS block size. It can be configured to be larger or smaller than a block.

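If the file size is less than the block size, say 30 MB, the file does not take up a full 64 MB block. It is stored in a single block that holds only 30 MB of data, and its input split is also 30 MB. The 64 MB figure is the default block size (in Hadoop 1.x), which is the upper limit on how much data one block can hold.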

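If we want to change the size of the input splits, we need to change the split size values in mapred-site.xml. The values are given in bytes, so 100 MB is written as 104857600. To make splits larger than a block, raise the minimum split size; the maximum split size (mapred.max.split.size) only has an effect when it is set below the block size. In Hadoop 2.x and later the minimum split size property is named mapreduce.input.fileinputformat.split.minsize.


mapred-site.xml

<property>
   <name>mapred.min.split.size</name>
   <value>104857600</value>
</property>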
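
How the minimum and maximum split size settings combine with the block size can be sketched in a few lines of Java. This is a minimal sketch that assumes the rule used by Hadoop's FileInputFormat, split size = max(minSize, min(maxSize, blockSize)). The class name SplitSizeSketch and the sample values are made up for illustration and are not part of Hadoop.

```java
// Illustrative sketch only: SplitSizeSketch is a hypothetical class, not a Hadoop API.
final class SplitSizeSketch {
    static final long MB = 1024L * 1024L;

    // Assumed rule: the block size is capped by maxSize, then raised to at least minSize.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long block = 64 * MB;

        // No limits configured (min = 1 byte, max = unlimited): one split per block.
        System.out.println(computeSplitSize(block, 1L, Long.MAX_VALUE) / MB);

        // Minimum raised to 100 MB: splits become larger than a block.
        System.out.println(computeSplitSize(block, 100 * MB, Long.MAX_VALUE) / MB);

        // Maximum lowered to 32 MB: splits become smaller than a block.
        System.out.println(computeSplitSize(block, 1L, 32 * MB) / MB);

        // Maximum set to 100 MB only: the 64 MB block size still wins.
        System.out.println(computeSplitSize(block, 1L, 100 * MB) / MB);
    }
}
```

Under this rule, a maximum above the block size changes nothing, which is why the minimum is the value to raise when splits should span more than one block.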

Let us consider a file of size 320 MB. HDFS divides it into 5 blocks of 64 MB each. With a split size of 100 MB, the number of input splits will be 4: three of 100 MB and a last one of 20 MB.
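
The split count for the 320 MB file can be checked with a small calculation. The sketch below assumes the splitting loop behaves like the one in Hadoop's FileInputFormat, including its 10% slack factor (a tail no larger than 1.1 times the split size is kept as one split instead of being cut again). The class SplitCountSketch and the method splitLengths are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: SplitCountSketch is a hypothetical class, not a Hadoop API.
final class SplitCountSketch {
    static final long MB = 1024L * 1024L;

    // Assumed slack factor: stop cutting once the remainder is at most 1.1 split sizes.
    static final double SPLIT_SLOP = 1.1;

    // Returns the length in bytes of each split for a file of the given length.
    static List<Long> splitLengths(long fileLength, long splitSize) {
        List<Long> lengths = new ArrayList<>();
        long remaining = fileLength;
        // Cut full-size splits while clearly more than one split's worth remains.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            lengths.add(splitSize);
            remaining -= splitSize;
        }
        // Whatever is left becomes the final split.
        if (remaining != 0) {
            lengths.add(remaining);
        }
        return lengths;
    }

    public static void main(String[] args) {
        // 320 MB file with a 100 MB split size.
        for (long length : splitLengths(320 * MB, 100 * MB)) {
            System.out.println(length / MB + " MB");
        }
    }
}
```

Calling splitLengths(320 * MB, 64 * MB) instead gives five splits of 64 MB, one per HDFS block, which is the default behaviour when no split size is configured.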


