Set Up a Hadoop Cluster Using Ansible Playbooks
I am setting up my complete Hadoop cluster on AWS EC2 instances. In my case, I am going to set up one name node, one data node, and one client.
These EC2 instances are my target nodes, so in my inventory file I list the target systems' IPs and the private SSH key used to reach them.
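For illustration, here is a minimal sketch of what such an inventory could look like in Ansible's YAML inventory format. The file name, the group names, the IP addresses (documentation-range placeholders), the `ec2-user` login, and the key path are all assumptions, not values from my setup.

```yaml
# inventory.yml (hypothetical file name)
all:
  vars:
    ansible_user: ec2-user                            # assumed login user of the AMI
    ansible_ssh_private_key_file: ~/keys/hadoop.pem   # placeholder path to the EC2 private key
  children:
    namenode:            # assumed group names, one group per role
      hosts:
        203.0.113.10:    # placeholder public IP
    datanode:
      hosts:
        203.0.113.11:
    client:
      hosts:
        203.0.113.12:
```

Grouping the hosts by role lets each playbook target its own node with a `hosts:` line instead of a hard-coded address.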
To set up the complete Hadoop cluster, I have to run three playbooks:
→ namenode_setup.yml
→ datanode_setup.yml
→ client_setup.yml
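The order matters, because the data node and the client both connect to the name node. If you prefer a single entry point over three separate runs, one possible approach is a small wrapper playbook. The `site.yml` file below is a hypothetical addition and not part of my setup.

```yaml
# site.yml (hypothetical wrapper playbook)
- import_playbook: namenode_setup.yml   # name node first, so it is up before the others connect
- import_playbook: datanode_setup.yml   # data node next, so it can register with the name node
- import_playbook: client_setup.yml     # client last, so the connection check has a cluster to reach
```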
Name Node setup:
Before running the tasks on the name node, the playbook asks for these inputs:
- Namenode Public IP
- Namenode Private IP
- Namenode Port
- Namenode folder
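Inputs like these can be collected with Ansible's `vars_prompt`. The following is a minimal sketch, assuming an inventory group called `namenode`. The variable names (`nn_public_ip`, `nn_private_ip`, `nn_port`, `nn_dir`) are hypothetical and chosen only for this example.

```yaml
# Prompt section of a play such as namenode_setup.yml (sketch)
- hosts: namenode                  # assumed inventory group name
  vars_prompt:
    - name: nn_public_ip           # hypothetical variable names throughout
      prompt: "Namenode Public IP"
      private: false               # show the value while it is typed
    - name: nn_private_ip
      prompt: "Namenode Private IP"
      private: false
    - name: nn_port
      prompt: "Namenode Port"
      private: false
    - name: nn_dir
      prompt: "Namenode folder"
      private: false
  tasks:
    - name: show the collected inputs
      ansible.builtin.debug:
        msg: "hdfs://{{ nn_private_ip }}:{{ nn_port }} with metadata in {{ nn_dir }}"
```

Ansible skips the prompt for any variable that is already passed on the command line with `-e`, which is handy for unattended runs.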
Tasks we have to perform on the name node target:
step:1 “downloading JDK”
step:2 “checking the JDK installed or not”
step:3 “installing JDK”
step:4 “downloading Hadoop”
step:5 “checking the Hadoop installed or not”
step:6 “installing Hadoop”
step:7 “making directory for name node”
step:8 “From controller node copying N_hdfs-site.xml file to name node /etc/hadoop/hdfs-site.xml”
- this is the N_hdfs-site.xml file at the controller node.
step:9 “From controller node copying N_core-site.xml file to name node /etc/hadoop/core-site.xml”
- this is the N_core-site.xml file at the controller node
step:10 “formatting the name node”
step:11 “starting name node service”
step:12 “see the status of service”
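The steps above can be sketched as an Ansible task list. This is not a copy of my playbook but a minimal illustration under several assumptions: Hadoop 1.x and the JDK are installed from RPM packages (which matches the /etc/hadoop configuration path), the download URLs come from the hypothetical variables `jdk_rpm_url` and `hadoop_rpm_url`, the two XML files are rendered with the `template` module so that prompted values can be filled in, and `nn_dir`, `nn_private_ip`, and `nn_port` are hypothetical variable names for the inputs. The property names `dfs.name.dir` and `fs.default.name` are the Hadoop 1.x names.

```yaml
- name: downloading JDK
  ansible.builtin.get_url:
    url: "{{ jdk_rpm_url }}"            # placeholder, supply your own URL
    dest: /root/jdk.rpm

- name: checking the JDK installed or not
  ansible.builtin.command: java -version
  register: jdk_check
  ignore_errors: true                   # a missing JDK must not stop the play
  changed_when: false

- name: installing JDK
  ansible.builtin.command: rpm -ivh /root/jdk.rpm
  when: jdk_check is failed

- name: downloading Hadoop
  ansible.builtin.get_url:
    url: "{{ hadoop_rpm_url }}"         # placeholder, supply your own URL
    dest: /root/hadoop.rpm

- name: checking the Hadoop installed or not
  ansible.builtin.command: hadoop version
  register: hadoop_check
  ignore_errors: true
  changed_when: false

- name: installing Hadoop
  ansible.builtin.command: rpm -ivh /root/hadoop.rpm
  when: hadoop_check is failed

- name: making directory for name node
  ansible.builtin.file:
    path: "{{ nn_dir }}"
    state: directory

# N_hdfs-site.xml is assumed to contain a property such as:
#   <property>
#     <name>dfs.name.dir</name>
#     <value>{{ nn_dir }}</value>
#   </property>
- name: copying N_hdfs-site.xml to the name node
  ansible.builtin.template:
    src: N_hdfs-site.xml
    dest: /etc/hadoop/hdfs-site.xml

# N_core-site.xml is assumed to contain a property such as:
#   <property>
#     <name>fs.default.name</name>
#     <value>hdfs://{{ nn_private_ip }}:{{ nn_port }}</value>
#   </property>
- name: copying N_core-site.xml to the name node
  ansible.builtin.template:
    src: N_core-site.xml
    dest: /etc/hadoop/core-site.xml

- name: formatting the name node
  ansible.builtin.command:
    cmd: hadoop namenode -format
    creates: "{{ nn_dir }}/current"     # skip when the folder is already formatted

- name: starting name node service
  ansible.builtin.command: hadoop-daemon.sh start namenode

- name: see the status of service
  ansible.builtin.command: jps
  register: jps_out
  changed_when: false

- name: print the running Java processes
  ansible.builtin.debug:
    var: jps_out.stdout_lines
```

The `creates` argument keeps the format step from running a second time, so a repeated playbook run does not try to re-format a name node that already holds metadata.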
These are the screenshots of the playbook namenode_setup.yml…
After running the playbook namenode_setup.yml:
The name node was set up successfully.
Data Node setup:
Before running the tasks on the data node, the playbook asks for these inputs:
- Datanode Public IP
- Namenode Public IP
- Namenode Port
- Datanode Folder
Tasks we have to perform on the data node target:
Step:1 “downloading JDK”
Step:2 “checking the JDK installed or not”
Step:3 “installing JDK”
Step:4 “downloading Hadoop”
Step:5 “checking the Hadoop installed or not”
Step:6 “installing Hadoop”
Step:7 “making directory for data node”
Step:8 “From the controller node copying D_hdfs-site.xml file to data node /etc/hadoop/hdfs-site.xml”
- this is the D_hdfs-site.xml file
Step:9 “From the controller node copying D_core-site.xml file to data node /etc/hadoop/core-site.xml”
- this is the D_core-site.xml file
Step:10 “starting data node service”
Step:11 “see the status of service”
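Steps 1 to 6 are the same JDK and Hadoop installation as on the name node, so the sketch below covers only steps 7 to 11, where the data node differs. It is a minimal illustration, not my actual playbook. It assumes Hadoop 1.x property names (`dfs.data.dir`, `fs.default.name`), the `template` module for the two XML files, and the hypothetical variables `dn_dir`, `nn_public_ip`, and `nn_port` for the inputs.

```yaml
- name: making directory for data node
  ansible.builtin.file:
    path: "{{ dn_dir }}"
    state: directory

# D_hdfs-site.xml is assumed to contain a property such as:
#   <property>
#     <name>dfs.data.dir</name>
#     <value>{{ dn_dir }}</value>
#   </property>
- name: copying D_hdfs-site.xml to the data node
  ansible.builtin.template:
    src: D_hdfs-site.xml
    dest: /etc/hadoop/hdfs-site.xml

# D_core-site.xml is assumed to point at the name node:
#   <property>
#     <name>fs.default.name</name>
#     <value>hdfs://{{ nn_public_ip }}:{{ nn_port }}</value>
#   </property>
- name: copying D_core-site.xml to the data node
  ansible.builtin.template:
    src: D_core-site.xml
    dest: /etc/hadoop/core-site.xml

- name: starting data node service
  ansible.builtin.command: hadoop-daemon.sh start datanode

- name: see the status of service
  ansible.builtin.command: jps
  register: jps_out
  changed_when: false

- name: print the running Java processes
  ansible.builtin.debug:
    var: jps_out.stdout_lines
```

There is no format step here: only the name node is formatted, while the data node just starts its service and registers with the address given in core-site.xml.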
These are the screenshots of datanode_setup.yml…
After running the playbook datanode_setup.yml:
The data node was set up successfully.
Client Setup:
Before running the tasks on the client, the playbook asks for these inputs:
- Client Public IP
- Namenode Public IP
- Namenode Port
Tasks we have to perform on the client target:
Step:1 “downloading JDK”
Step:2 “checking the JDK installed or not”
Step:3 “installing JDK”
Step:4 “downloading Hadoop”
Step:5 “checking the Hadoop installed or not”
Step:6 “installing Hadoop”
Step:7 “From controller node copying C_core-site.xml file to client node /etc/hadoop/core-site.xml”
Note: On the client side, we only configure core-site.xml, because the client never contributes storage to the cluster; it only uploads and retrieves data.
- this is the C_core-site.xml file
Step:8 “checking status of connection”
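Steps 7 and 8 can be sketched as follows. This is a minimal illustration, not my actual playbook. It assumes Hadoop 1.x, the `template` module for the XML file, the hypothetical variables `nn_public_ip` and `nn_port` for the inputs, and `hadoop dfsadmin -report` as the connection check, which is one common way to confirm that the name node answers.

```yaml
# C_core-site.xml is assumed to point at the name node:
#   <property>
#     <name>fs.default.name</name>
#     <value>hdfs://{{ nn_public_ip }}:{{ nn_port }}</value>
#   </property>
- name: copying C_core-site.xml to the client
  ansible.builtin.template:
    src: C_core-site.xml
    dest: /etc/hadoop/core-site.xml

- name: checking status of connection
  ansible.builtin.command: hadoop dfsadmin -report   # assumed check command
  register: report
  changed_when: false

- name: print the cluster report
  ansible.builtin.debug:
    var: report.stdout_lines
```

No hdfs-site.xml task appears here, since the client stores nothing locally and only needs to know where the name node listens.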
These are the screenshots of client_setup.yml…
After running the playbook client_setup.yml:
The client node was set up successfully.
My complete Hadoop cluster was set up successfully using ansible-playbook…