
Towards building a robust Hadoop cluster


We have been using AWS to build our Hadoop infrastructure. AWS is a pretty interesting cloud environment, and I do feel it is a great tool for any startup. However, we have been overwhelmed with numerous issues in getting our cluster to run robustly. A colleague (Swatz, check out her blog) and I have identified some simple parameters that magically help produce an uber cluster:

1) Check your ulimits: Hadoop opens numerous files. The default soft limit on open files (ulimit -n) is 1024 on most Linux distributions, which is far too low. We increased it to 65536 (the nofile parameter in /etc/security/limits.conf; a sketch of the change follows the listing below). Check which users hold the most open files using:

sudo lsof | awk '{if(NR>1) print $3}' | sort | uniq -c | sort -nr
   1858 mapred
    946 root
    241 hdfs
     37 syslog
     33 ganglia
     28 messagebus
     16 daemon
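For reference, the limits.conf change is just a couple of lines like the following. This is a minimal sketch: we assume the daemons run as the hdfs and mapred users, so adjust the user names to your install, and re-login (or restart the daemons) for the new limit to take effect.

# /etc/security/limits.conf -- raise the open-file limit for the Hadoop users
hdfs    -    nofile    65536
mapred  -    nofile    65536

# verify from a fresh shell as the daemon user
$ ulimit -n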

2) Use the ephemeral disks for storage of Hadoop temporary files and HDFS data. They are pretty fast and don't have the network lag of EBS (a config sketch follows).
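As a rough sketch of what that looks like in the config, assuming the ephemeral volumes are mounted at /mnt and /mnt2 (the mount points depend on your AMI, and these are the MRv1 property names):

<!-- hdfs-site.xml: HDFS block storage on the ephemeral disks -->
<property>
  <name>dfs.data.dir</name>
  <value>/mnt/hdfs/data,/mnt2/hdfs/data</value>
</property>

<!-- mapred-site.xml: map/reduce spill and shuffle space on the same disks -->
<property>
  <name>mapred.local.dir</name>
  <value>/mnt/mapred/local,/mnt2/mapred/local</value>
</property>

<!-- core-site.xml: keep Hadoop temporary files off the root volume -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/mnt/hadoop-tmp</value>
</property>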
3) Swap files are always a good idea. We generally add 2-4 swap files of 1GB each (example commands below).
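Creating a 1GB swap file only takes a minute; something along these lines (the path is just an example, we put ours on the ephemeral disks):

$ sudo dd if=/dev/zero of=/mnt/swapfile1 bs=1M count=1024
$ sudo chmod 600 /mnt/swapfile1
$ sudo mkswap /mnt/swapfile1
$ sudo swapon /mnt/swapfile1
$ swapon -s    # confirm the swap file is active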
4) Don’t be scared if a couple of EC2 machines go down once in a while. Hardware failures are pretty common and can occur on Amazon AWS as well. If it happens consistently, though, you need to check your configuration.
5) Use an instance-store AMI. EBS is slow and has a network overhead that, in my opinion, is not worth it. The link works if you follow the instructions :).
6) Logs are written for a reason. Check the Hadoop TaskTracker, DataNode, NameNode and JobTracker logs when stuff blows up, and check your instance log files for configuration issues such as incorrectly mounted partitions (a few example commands follow).
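The exact log locations depend on how Hadoop was installed, but something like the following is a reasonable starting point (the paths are typical defaults, not guaranteed for your distribution):

# Hadoop daemon logs
$ tail -n 200 /var/log/hadoop/hadoop-*-datanode-*.log
$ tail -n 200 /var/log/hadoop/hadoop-*-tasktracker-*.log
# instance-level issues, e.g. partitions that never got mounted
$ df -h
$ grep -i mount /var/log/syslog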
7) Use the right instance type on Amazon. Remember Hadoop needs high I/O bandwidth (network and disk), plenty of RAM and decent CPU. Rule of thumb: “First IOPS, then FLOPS!”.
8) Don’t put too many mappers and reducers on a compute node; try to find the sweet spot for your setup. We found that 6 mappers and 4 reducers per node, with each mapper getting 1200MB of RAM and each reducer 1600MB, works best for our needs (a config sketch with these settings follows below).

A good rule of thumb is a 4:3 ratio of map to reduce slots per core.

Using more reducers than needed is not recommended. In the general case:

– we usually choose the number of reducers to be at most the number of reduce slots available in the cluster (or a little fewer, to leave room to catch up when nodes fail),
– which also creates the fewest output files possible,
– and each task should run for between 5 and 15 minutes.

Overloading the number of reducers will:

– hurt performance because of the extra network transfers with various InputFormats,
– cause shuffle errors,
– affect the workflow system if you have dependent jobs queued next (e.g. in Oozie),
– and effectively DDoS (distributed denial of service) the shuffle phase, which becomes a problem for the next read in the pipeline.
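For what it is worth, the slot counts and heap sizes above translate into roughly the following mapred-site.xml settings (MRv1 property names; on versions that only have the combined mapred.child.java.opts, set that instead, and treat the numbers as a starting point for your own tuning):

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>6</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
</property>
<!-- per-task JVM heap: ~1200MB for mappers, ~1600MB for reducers -->
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx1200m</value>
</property>
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1600m</value>
</property>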

We ran the TeraSort benchmark to stress test our AWS cluster. First, generate the input data for TeraSort (about a terabyte):

$ hadoop jar hadoop-*examples*.jar teragen -Dmapred.map.tasks=10000 10000000000 /user/use_name/tera-in/

Note that the argument is 10 billion rather than 1 trillion because teragen takes the number of records as input; each record is 100 bytes, so the output works out to 1 trillion bytes (roughly 1TB). Also note that the -D generic options have to come before the positional arguments.
Next, run the TeraSort job:

$ hadoop jar hadoop-*examples*.jar terasort -Dmapred.reduce.tasks=5000 /user/use_name/tera-in/ /user/use_name/tera-out/

Finally, validate the output using:

$ hadoop jar hadoop-*examples*.jar teravalidate /user/use_name/tera-out/ /user/use_name/tera-validate/

We did find that the changes above resulted in a very robust cluster. Some graphs of our cluster during the run:

CPU usage of the cluster over 1 hour

Note that the first half of the CPU graph corresponds to the map phase and the second half to the reduce phase. The load per processor is shown below.

Load on processors for TeraSort

The memory usage of the cluster is shown below:

Memory usage of the cluster for TeraSort

Lastly, the network I/O graph:

Network I/O during TeraSort

Overall, our changes have helped create a much more robust cluster. 🙂


Written by anujjaiswal

May 2, 2013 at 1:02 pm

Posted in AWS, EC2, Hadoop, Hardware, HDFS, Linux, MapRed
