Tuesday, June 2, 2015

HOW THE CLIENT READS A FILE FROM HDFS

8:41 AM - By ajay desai

1) To read a file, the client calls the open( ) method on the DFS (Distributed File System) object.

2) The Distributed File System calls the name node using an RPC (Remote Procedure Call) to find the locations of the first few blocks of the file.

3) Because the name node holds the metadata, including the block locations, it returns the addresses of the data nodes that hold a copy of each requested block. Once it has those addresses, the DFS returns an FSDataInputStream to the client.

4) The client then calls the read( ) method on the stream.

5) The input stream, which holds the addresses of the data nodes storing the requested blocks, connects to the closest data node that has the first block and reads from it. It passes the data to the client and, when the end of the block is reached, closes its connection to that data node.

6) The FSDataInputStream then connects to the nearest data node that holds the next block and reads it. This process repeats until the entire file has been read and passed to the client.

7) When the client has received the complete file, it calls the close( ) method on the FSDataInputStream.
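
The lookup and data node selection in the steps above can be sketched with a small toy model. This is not the Hadoop API: the names ToyNameNode, Replica, plan_read, the batch size, and the distance values are all assumptions made up for illustration. It only shows the idea that the name node hands out block locations in batches and the reader picks the closest data node for each block.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Replica:
    datanode: str   # hypothetical data node name
    distance: int   # toy network distance from the client (smaller is closer)

class ToyNameNode:
    """Holds only metadata: which data nodes store each block of a file."""
    def __init__(self, block_map):
        # path -> list of replica lists, one list per block
        self.block_map = block_map

    def get_block_locations(self, path, start, count):
        # Return locations for a batch of blocks, nearest replica first.
        blocks = self.block_map[path][start:start + count]
        return [sorted(replicas, key=lambda r: r.distance) for replicas in blocks]

def plan_read(namenode, path, batch_size=2):
    """Return the data node chosen for each block, asking for locations in batches."""
    chosen = []
    start = 0
    while True:
        batch = namenode.get_block_locations(path, start, batch_size)
        if not batch:
            break  # no more blocks: the whole file has been covered
        for replicas in batch:
            chosen.append(replicas[0].datanode)  # closest replica for this block
        start += len(batch)
    return chosen

namenode = ToyNameNode({
    "/logs/app.log": [
        [Replica("dn3", 4), Replica("dn1", 0), Replica("dn2", 2)],
        [Replica("dn2", 2), Replica("dn4", 6)],
        [Replica("dn4", 6), Replica("dn3", 4)],
    ]
})
read_plan = plan_read(namenode, "/logs/app.log")
print(read_plan)  # ['dn1', 'dn2', 'dn3']
```

Here the first block is read from dn1 even though dn3 is listed first, because the model sorts replicas by distance before choosing one.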

This entire process is hidden from the client; from its point of view, it is simply reading data from one continuous input stream.
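
The "continuous stream" idea can be illustrated with a minimal sketch. ToyBlockStream is a hypothetical class, not part of Hadoop, and the blocks here are just in-memory byte strings standing in for data held on different data nodes. The point is only that a single read( ) call can cross a block boundary without the caller noticing.

```python
class ToyBlockStream:
    """Presents a list of blocks (bytes objects) as one continuous stream."""
    def __init__(self, blocks):
        self.blocks = blocks
        self.block_index = 0   # block currently being read
        self.offset = 0        # position inside that block
        self.closed = False

    def read(self, n):
        """Return up to n bytes, moving on to the next block when one is exhausted."""
        if self.closed:
            raise ValueError("read on closed stream")
        out = bytearray()
        while len(out) < n and self.block_index < len(self.blocks):
            block = self.blocks[self.block_index]
            chunk = block[self.offset:self.offset + n - len(out)]
            out += chunk
            self.offset += len(chunk)
            if self.offset == len(block):
                # End of block: a real client would switch data nodes here.
                self.block_index += 1
                self.offset = 0
        return bytes(out)

    def close(self):
        self.closed = True

stream = ToyBlockStream([b"hello ", b"from ", b"hdfs"])
first = stream.read(8)    # spans the first two blocks
rest = stream.read(100)   # reads whatever is left
stream.close()
print(first, rest)  # b'hello fr' b'om hdfs'
```

The caller only ever sees read( ) and close( ); which block a given byte came from stays inside the stream object.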

About the Author

I am Azeheruddin Khan, with more than 6 years of experience in C#, ASP.NET and MS SQL. My work comprises medium and enterprise-level projects using ASP.NET and other Microsoft .NET technologies. Please feel free to contact me with any queries by posting comments on my blog; I will try to reply as early as possible. Follow me @fresher2programmer