What is a JournalNode?
– JournalNodes store the recent edits to HDFS and keep the Active and Passive (Standby) NameNodes in sync: the Active NameNode writes its edits to the JournalNodes, and the Standby NameNode reads them from there. It is usually recommended to run at least 3 JournalNodes so that the cluster is not affected when one of them is lost.
– All 3 JournalNodes hold the same edit information, and they sync with each other at regular intervals.
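As background on why 3 is the usual number: in HDFS's quorum-based journal design, an edit counts as written once a majority of JournalNodes has acknowledged it, so N JournalNodes can tolerate (N - 1) / 2 failures. The sketch below only works through that arithmetic; the function names are made up for illustration and are not part of any Hadoop API.

```python
def quorum_size(journal_nodes: int) -> int:
    """Smallest majority of a JournalNode set of the given size."""
    if journal_nodes < 1:
        raise ValueError("need at least one JournalNode")
    return journal_nodes // 2 + 1


def tolerated_failures(journal_nodes: int) -> int:
    """How many JournalNodes can be down while a majority is still reachable."""
    if journal_nodes < 1:
        raise ValueError("need at least one JournalNode")
    return (journal_nodes - 1) // 2


# Compare a few cluster sizes: nodes, majority needed, failures tolerated.
for n in (1, 2, 3, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

With 3 JournalNodes the majority is 2, so one node can fail without stopping edits. A 4th node adds no extra tolerance, which is why odd counts are normally chosen.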
Now imagine a situation where one of the JournalNodes fails and is unable to recover its data on that host. In this case we need to recover the JournalNode manually.
On the non-working JournalNode (with the JournalNode role stopped):
- Identify the “Journalnode directories” setting in Cloudera Manager, or the dfs.journalnode.edits.dir property in hdfs-site.xml
- Move the contents of the “current” directory to /tmp or any other backup location, and verify that the backup is complete
- Copy the contents of the “current” directory from a working JournalNode into the same path on the non-working JournalNode
- Start the JournalNode
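The steps above can be sketched in Python as follows. This is a minimal illustration, not a supported tool. It assumes that the healthy node's “current” directory has already been transferred to the local host (for example with scp or rsync) and that the JournalNode role on this host is stopped. The function names and all paths are hypothetical.

```python
import shutil
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def journalnode_edits_dir(hdfs_site: Path) -> str:
    """Return the value of dfs.journalnode.edits.dir from an hdfs-site.xml file."""
    root = ET.parse(hdfs_site).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "dfs.journalnode.edits.dir":
            return prop.findtext("value", "").strip()
    raise KeyError(f"dfs.journalnode.edits.dir not found in {hdfs_site}")


def restore_current(broken_current: Path, healthy_copy: Path, backup_root: Path) -> Path:
    """Back up the broken 'current' directory, then refill it from a healthy copy.

    Assumption: the JournalNode on this host is stopped while this runs.
    Returns the backup directory that now holds the old 'current' folder.
    """
    # Create a fresh, uniquely named backup folder so earlier backups survive.
    backup_root.mkdir(parents=True, exist_ok=True)
    backup = Path(tempfile.mkdtemp(prefix="jn-backup-", dir=backup_root))

    # Move the old content aside (it ends up in <backup>/current).
    if broken_current.exists():
        shutil.move(str(broken_current), str(backup))

    # Copy the healthy node's files into the original path.
    shutil.copytree(healthy_copy, broken_current)
    return backup
```

A call could look like `restore_current(Path(edits_dir) / "<nameservice>" / "current", Path("/tmp/healthy_current"), Path("/tmp"))`, where `edits_dir` comes from `journalnode_edits_dir` and the other paths are placeholders. Note that `shutil.copytree` does not preserve file ownership, so the copied files may need to be re-owned by the user that runs the JournalNode before it is started.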
The above steps bring the recovered JournalNode back in sync with the other JournalNodes and allow it to start up healthy again.
Check here to learn more about recovering the NameNode during a host failure.
Good luck with your learning! Give us a like if this blog was helpful.