Improving Map Reduce Performance in Heterogeneous Distributed System using HDFS Environment-A Review


Shraddha Thakkar, Prof. Sanjay Patel

Abstract

Hadoop is a Java-based programming framework that supports storing and processing big data in a distributed computing environment. It uses HDFS to store data and Map Reduce to process it. Map Reduce has become an important distributed processing model for large-scale, data-intensive applications such as data mining and web indexing, and it is widely used for short jobs that require low response time. The current Hadoop implementation assumes that the computing nodes in a cluster are homogeneous. Unfortunately, neither the homogeneity assumption nor the data locality assumption holds in virtualized data centers, so Hadoop’s scheduler can cause severe performance degradation in heterogeneous environments. We observe that the Longest Approximate Time to End (LATE) scheduling algorithm is highly robust to heterogeneity. LATE can improve Hadoop response times by a factor of 2 in heterogeneous clusters.
DOI: 10.17762/ijritcc2321-8169.150301

Article Details

How to Cite
Thakkar, S., & Patel, S. (2015). Improving Map Reduce Performance in Heterogeneous Distributed System using HDFS Environment-A Review. International Journal on Recent and Innovation Trends in Computing and Communication, 3(3), 903–910. https://doi.org/10.17762/ijritcc.v3i3.3934