The results and ramblings of research

phewww!

Stress Testing our AWS Hadoop Deployment – Some more results


To continue from our previous post: we have been working to deploy a stable, robust Hadoop cluster on AWS. We had experienced numerous issues with our earlier deployments; however, as we outlined in that post, a few parameter changes gave us, almost magically, a pretty stable cluster. The obvious next step was to stress test it. To do so, we ran our tests on a small 10+1 node Hadoop cluster (10 DataNode and TaskTracker nodes plus 1 NameNode and JobTracker node). We ran the following:

Test 1 – DFSIO Tests

1) Write Tests – We first ran the DFSIO write test using

$ hadoop jar hadoop-mapreduce-client-jobclient-2.0.0-cdh4.1.2-tests.jar TestDFSIO -write -nrFiles 100 -fileSize 1000

The above command runs a write test that generates 100 output files of 1 GB each (-fileSize is given in MB), for a total of 100 GB.

2) Read Tests – Next we ran the DFSIO read test using

$ hadoop jar hadoop-mapreduce-client-jobclient-2.0.0-cdh4.1.2-tests.jar TestDFSIO -read -nrFiles 100 -fileSize 1000

Note, the default directory for the outputs is /benchmarks/TestDFSIO.

Lastly, clean the output folders (delete test files) using

$ hadoop jar hadoop-mapreduce-client-jobclient-2.0.0-cdh4.1.2-tests.jar TestDFSIO -clean

Please download the jar to run the test.
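
The three steps above can be chained in a small wrapper script. This is only a minimal sketch, not necessarily how the tests were run: the function names `dfsio_cmd` and `run_dfsio` and the `DRY_RUN` switch are illustrative assumptions, and only the jar name and the TestDFSIO flags come from the commands above. By default it prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: run the TestDFSIO write, read, and clean steps in sequence.
# DRY_RUN, dfsio_cmd, and run_dfsio are illustrative names, not part of Hadoop.
set -u

JAR="hadoop-mapreduce-client-jobclient-2.0.0-cdh4.1.2-tests.jar"
NR_FILES=100       # number of files, as in the commands above
FILE_SIZE_MB=1000  # size of each file in MB

# Print the command line for one TestDFSIO phase (write, read, or clean).
dfsio_cmd() {
  local phase="$1"
  if [ "$phase" = "clean" ]; then
    echo "hadoop jar $JAR TestDFSIO -clean"
  else
    echo "hadoop jar $JAR TestDFSIO -$phase -nrFiles $NR_FILES -fileSize $FILE_SIZE_MB"
  fi
}

# Run all phases in order; with DRY_RUN=1 (the default here) only print them.
run_dfsio() {
  local phase cmd
  for phase in write read clean; do
    cmd="$(dfsio_cmd "$phase")"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "$cmd"
    else
      $cmd || return 1   # stop at the first failing phase
    fi
  done
}

run_dfsio
```

Setting `DRY_RUN=0` before calling the script would execute the commands for real, which assumes `hadoop` is on the PATH and the jar is in the current directory.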

Outputs:

Write Test
13/05/06 16:16:50 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
13/05/06 16:16:50 INFO fs.TestDFSIO: Date & time: Mon May 06 16:16:50 PDT 2013
13/05/06 16:16:50 INFO fs.TestDFSIO: Number of files: 100
13/05/06 16:16:50 INFO fs.TestDFSIO: Total MBytes processed: 100000.0
13/05/06 16:16:50 INFO fs.TestDFSIO: Throughput mb/sec: 6.0895988251458775
13/05/06 16:16:50 INFO fs.TestDFSIO: Average IO rate mb/sec: 6.641181468963623
13/05/06 16:16:50 INFO fs.TestDFSIO: IO rate std deviation: 1.9043254369666331
13/05/06 16:16:50 INFO fs.TestDFSIO: Test exec time sec: 390.825
13/05/06 16:16:50 INFO fs.TestDFSIO:

Read Test
13/05/06 16:23:01 INFO fs.TestDFSIO: ----- TestDFSIO ----- : read
13/05/06 16:23:02 INFO fs.TestDFSIO: Date & time: Mon May 06 16:23:01 PDT 2013
13/05/06 16:23:02 INFO fs.TestDFSIO: Number of files: 100
13/05/06 16:23:02 INFO fs.TestDFSIO: Total MBytes processed: 100000.0
13/05/06 16:23:02 INFO fs.TestDFSIO: Throughput mb/sec: 18.524110055442662
13/05/06 16:23:02 INFO fs.TestDFSIO: Average IO rate mb/sec: 20.380735397338867
13/05/06 16:23:02 INFO fs.TestDFSIO: IO rate std deviation: 6.731484273400149
13/05/06 16:23:02 INFO fs.TestDFSIO: Test exec time sec: 171.871
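
A note on reading these numbers: as far as we understand TestDFSIO, the "Throughput mb/sec" line is computed per map task (total MB divided by the summed I/O time of all tasks), so it is not the aggregate rate of the whole cluster. A rough aggregate estimate is total MB divided by the wall-clock execution time. The sketch below does that arithmetic on the summary lines above; `aggregate_rate` is an illustrative helper name, and the estimate includes job startup overhead.

```shell
#!/usr/bin/env bash
# Sketch: derive a rough aggregate rate (total MB / wall-clock seconds)
# from TestDFSIO summary lines. aggregate_rate is an illustrative helper.

# Reads TestDFSIO summary lines on stdin, prints MB/s with two decimals.
aggregate_rate() {
  awk -F': ' '
    /Total MBytes processed/ { mb  = $NF }
    /Test exec time sec/     { sec = $NF }
    END { if (sec > 0) printf "%.2f\n", mb / sec }
  '
}

# Summary lines copied from the write and read outputs above.
write_log='13/05/06 16:16:50 INFO fs.TestDFSIO: Total MBytes processed: 100000.0
13/05/06 16:16:50 INFO fs.TestDFSIO: Test exec time sec: 390.825'

read_log='13/05/06 16:23:02 INFO fs.TestDFSIO: Total MBytes processed: 100000.0
13/05/06 16:23:02 INFO fs.TestDFSIO: Test exec time sec: 171.871'

echo "write: $(printf '%s\n' "$write_log" | aggregate_rate) MB/s"
echo "read:  $(printf '%s\n' "$read_log" | aggregate_rate) MB/s"
```

On the figures above this works out to roughly 256 MB/s for writes and 582 MB/s for reads across the cluster.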

Test 2 – 50 Chained Jobs

The next part of our test was designed to test the overall infrastructure under load. Our cluster consisted of 15 compute nodes (TT+DN) and 1 JT+NN node. Furthermore, some of the jobs we ran had an extremely large number of mappers (10K-15K and upwards) and a large number of reducers. Overall, we feel that such a test replicates real-world load and gives us a fair idea of how our compute nodes perform. Note that we set the parameter mapred.reduce.slowstart.completed.maps to 0.95. This ensures that the reducers start only after 95% of the mappers have completed, so that the reducers don't time out due to inactivity. Our jobs were executed over a 1-day period, and the graphs below illustrate the cluster utilization.
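
How the slow-start parameter is set depends on the job. One common way, assuming the job driver parses Hadoop's generic options (for example through ToolRunner), is to pass it with -D on the command line. The sketch below only prints such a command: `my-job.jar`, `com.example.MyJob`, and the input and output paths are hypothetical placeholders, and only the property name and the value 0.95 come from the text above.

```shell
#!/usr/bin/env bash
# Sketch: pass the slow-start threshold to a single job with -D.
# Assumes the job driver parses generic options (e.g. via ToolRunner).
# my-job.jar, com.example.MyJob, and the paths are hypothetical placeholders.

SLOWSTART=0.95   # fraction of maps that must finish before reducers are scheduled

# Build the command line as a single string.
slowstart_cmd() {
  echo "hadoop jar my-job.jar com.example.MyJob" \
       "-D mapred.reduce.slowstart.completed.maps=$SLOWSTART" \
       "/input/path /output/path"
}

# Printed rather than executed, since the jar and paths are placeholders.
slowstart_cmd
```

The same property could instead be set cluster-wide in the MapReduce site configuration, which would make it the default for every job.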

[Figure: Cluster Load Per Processor over 1 day]

[Figure: Cluster Memory utilization over 1 day]

[Figure: Cluster Network over 1 day]

[Figure: Cluster CPU utilization over 1 day]


Written by anujjaiswal

May 17, 2013 at 10:44 am

Posted in AWS, Hadoop, HDFS
