Friday, June 5, 2015


4:53 AM - By ajay desai

1) The HDFS client creates a file by calling the create() method on a DistributedFileSystem (DFS) object.

2) The DFS makes an RPC (Remote Procedure Call) to the name node, asking it to create the new file.

3) The name node checks whether the file already exists. If it does not, the name node then checks whether the client has permission to create the file. If either check fails, the name node returns an IOException to the DFS. If both checks pass, the name node creates a record of the file in its namespace and sends a positive acknowledgement to the DFS.

4) After receiving the positive acknowledgement, the DFS returns an FSDataOutputStream to the client.

5) The client calls the write() method to write data to this stream. The stream splits the data into packets and places them in an internal queue called the data queue.

6) The FSDataOutputStream asks the name node to allocate data nodes for storing the blocks.

7) The name node allocates data nodes based on the replication factor, i.e. the number of copies required for each block, and returns the list of allocated data nodes to the stream. If the replication factor is 3, three data nodes are allocated. Let us assume these three data nodes are D1, D2 and D3.

8) The three data nodes now form a pipeline. The FSDataOutputStream sends each data packet to the first data node, D1, which stores it and forwards a copy to the next node, D2. D2 in turn stores the packet and forwards a copy to D3. The data nodes store these packets as blocks.
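
The name node's part in steps 3 and 7 can be sketched as a small toy model. This is only an illustration in plain Python, not the Hadoop API: the class `ToyNameNode`, its `writers` set and its "first N data nodes" choice are all assumptions made for the example. A real name node applies its own permission model and block placement policy.

```python
class ToyNameNode:
    """Toy stand-in for the name node's bookkeeping (hypothetical, not Hadoop code)."""

    def __init__(self, datanodes, writers):
        self.datanodes = list(datanodes)  # names of the known data nodes
        self.writers = set(writers)       # users allowed to create files (assumed model)
        self.namespace = {}               # path -> list of per-block data node lists

    def create(self, path, user):
        # Step 3: both checks must pass before the file is recorded.
        if path in self.namespace:
            raise IOError(f"file already exists: {path}")
        if user not in self.writers:
            raise IOError(f"permission denied for {user}: {path}")
        self.namespace[path] = []
        return True  # the positive acknowledgement

    def allocate_block(self, path, replication=3):
        # Step 7: one data node per replica. Taking the first N nodes is a
        # simplification chosen for this sketch.
        if path not in self.namespace:
            raise IOError(f"no such file: {path}")
        targets = self.datanodes[:replication]
        self.namespace[path].append(targets)
        return targets


nn = ToyNameNode(["D1", "D2", "D3", "D4"], writers={"alice"})
nn.create("/logs/a.txt", "alice")
print(nn.allocate_block("/logs/a.txt"))  # ['D1', 'D2', 'D3']
```

Calling `create()` a second time for the same path, or with a user outside `writers`, raises `IOError`, which mirrors the failure case described in step 3.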

Apart from the data queue, the FSDataOutputStream maintains a second internal queue called the ack queue, which holds the packets that have been sent to the pipeline but not yet acknowledged. A packet is removed from the ack queue only after all the data nodes in the pipeline have acknowledged receiving it.
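
The interplay of the data queue, the pipeline and the ack queue can be sketched as follows. This is a simplified, synchronous toy model in plain Python, not Hadoop code: the function names, the tiny `packet_size` and the in-memory `stored` dictionary are assumptions made for illustration. In this sketch a packet moves from the data queue to the ack queue when it is sent, and leaves the ack queue once every data node has acknowledged it.

```python
from collections import deque


def split_into_packets(data, packet_size):
    """Cut a byte string into fixed-size packets (the last one may be shorter)."""
    return [data[i:i + packet_size] for i in range(0, len(data), packet_size)]


def write_through_pipeline(data, pipeline, packet_size=4):
    """Send packets through the data nodes in order and track acknowledgements."""
    data_queue = deque(split_into_packets(data, packet_size))
    ack_queue = deque()                       # sent but not yet acknowledged
    stored = {node: [] for node in pipeline}  # what each data node has received

    while data_queue:
        packet = data_queue.popleft()
        ack_queue.append(packet)

        acks = 0
        for node in pipeline:                 # D1 forwards to D2, D2 to D3
            stored[node].append(packet)
            acks += 1

        # Drop the packet only when every data node has acknowledged it.
        if acks == len(pipeline):
            ack_queue.popleft()

    return stored, ack_queue


stored, pending = write_through_pipeline(b"hello hdfs!", ["D1", "D2", "D3"])
print(stored["D3"])   # [b'hell', b'o hd', b'fs!']
print(len(pending))   # 0
```

In a real client the sending and the acknowledgements happen asynchronously, so several packets can sit in the ack queue at once. The loop above processes one packet at a time only to keep the example short.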
