Tuesday, 3 February 2009

BigTable, MapReduce and Hadoop

This post won't make much sense as this is just a reminder to myself.

Over the next few weeks, I want to spend some time looking at MapReduce and how it can be applied with Windows Azure (big thanks to Gary Short for mentioning this).

I also want to look at how BigTable compares to Windows Azure Table Storage Services, and finally I want to look into Hadoop.

Sorry, this is just a list of things I want to look at, but I will expand on it later.

I'm also wondering how this could relate to Silverlight. Perhaps you could have multiple Silverlight clients acting as MapReduce nodes, talking back to Windows Azure (via a proxy web service) alongside queues and worker roles.


Anonymous said...

Hi chrishayuk,
Did you think about using Microsoft Robotics instead of Hadoop?

chrishayuk said...

I'm not really planning to use Hadoop, I'm just planning to get an understanding of it.

I'm curious as to why you are comparing Robotics to Hadoop. I would love you to expand on your thoughts.

mtc3b said...

I'm looking at the same thing. We looked at Hadoop and Disco, and now I'm tasked with writing something similar to the Word Count MapReduce example in Azure. Have you made any interesting progress?

chrishayuk said...

Yeah, I played with MapReduce using LINQ, and got an understanding of how it relates to Hadoop.
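For anyone after the Word Count example, the overall shape is roughly the following. This is only a minimal single-process sketch in Python, not the LINQ code mentioned above and not anything Azure-specific; the function names (map_words, shuffle, reduce_counts, word_count) are made up for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group every emitted value under its key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word, counts):
    """Reduce step: collapse one word's list of 1s into a total."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_words(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    print(word_count(["the quick brown fox", "the lazy dog", "The end"]))
```

In a real cluster the map and reduce calls would run on different machines and the shuffle would move data across the network, which is exactly where the data transfer question comes in.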

I need to give that area some more thought with regard to Azure.

The real issue I see is data transfer and data locality. Hadoop does some very clever stuff here: it is aware of the network topology (which rack or subnet each node sits on) and tries to run the work close to where the data lives, so traffic stays behind the same switch / router where it can.

That level of support just isn't there with Windows Azure.
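To make the locality point a bit more concrete, here is a rough Python sketch of the idea: given the node that holds a block of data, prefer a free worker on that same node, then one on the same rack, and only then anything else. This is not Hadoop's actual scheduler; the rack_of mapping and the function names are assumptions made up for illustration.

```python
def locality(data_node, worker, rack_of):
    """Score a worker: 0 = same node, 1 = same rack, 2 = anywhere else."""
    if worker == data_node:
        return 0
    if rack_of[worker] == rack_of[data_node]:
        return 1
    return 2


def pick_worker(data_node, free_workers, rack_of):
    """Choose the free worker that is closest to the data."""
    return min(free_workers, key=lambda w: locality(data_node, w, rack_of))


if __name__ == "__main__":
    # Hypothetical three-node cluster spread over two racks.
    rack_of = {"n1": "rack-a", "n2": "rack-a", "n3": "rack-b"}
    print(pick_worker("n1", ["n3", "n2"], rack_of))
```

Without any way to ask where the data physically sits, there is nothing to feed into a function like this, which is the gap I mean.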

I still need to put more thought into it, though.