18–29 Aug 2014
KTH main campus
Europe/Stockholm timezone

Abstractions for processing large data sets

29 Aug 2014, 11:15
45m

Speaker

Jonas Yngvesson (Google Inc.)

Description

At Google there is often a need to process very large data sets across many machines. Building efficient parallel processing programs is not trivial, and to spare each engineer from reinventing the wheel, Google has created several programming models that abstract away the complexities of parallelism and let programmers concentrate on the core of their processing problem. I will talk mainly about the MapReduce model: the conceptual model the programmer gets to work with and how it is implemented under the hood. I will also mention a few other models in use at Google, such as Pregel, Dremel and Flume.
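
As a rough illustration of the conceptual model mentioned above, the following is a minimal single-process sketch in Python, not Google's implementation and not taken from the talk. The names `map_reduce`, `word_count_mapper` and `word_count_reducer` are hypothetical; a real system would distribute the map, shuffle and reduce phases across many machines and handle failures, which this sketch ignores.

```python
from collections import defaultdict


def map_reduce(inputs, mapper, reducer):
    """Run a toy, in-memory MapReduce over an iterable of input records."""
    # Map phase: each record yields zero or more (key, value) pairs.
    # Shuffle phase: values are grouped by key (here, in a local dict).
    groups = defaultdict(list)
    for record in inputs:
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce phase: each key and its grouped values collapse to one result.
    return {key: reducer(key, values) for key, values in groups.items()}


def word_count_mapper(line):
    """Hypothetical mapper: emit (word, 1) for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def word_count_reducer(word, counts):
    """Hypothetical reducer: sum the partial counts for one word."""
    return sum(counts)


counts = map_reduce(
    ["the quick fox", "the lazy dog"],
    word_count_mapper,
    word_count_reducer,
)
print(counts)
```

The point of the sketch is the division of labour: the programmer writes only the mapper and reducer, while grouping by key (and, in a real deployment, the parallelism) is left to the framework.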

Presentation materials

There are no materials yet.