Chapter 2. MapReduce

MapReduce is a programming model for data processing. The model is simple, yet not to simple to express useful programs in. Hadoop can run MapReduce programs written i various languages; in this chapter, we look at the same program expressed in Java, Ruby and Python. Most importantly, MapReduce programs are inherently parallel, thus puttin very large-scale data analysis into the hands of anyone with enough machines at thei disposal. MapReduce comes into its own for large datasets, so let’s start by looking at one.

results matching ""

    No results matching ""