One challenge with 10 solutions

The technologies we use for data analytics have evolved a lot recently. Good old relational database systems are becoming less popular every day. Now we have to find our way through several new technologies that can handle big (and streaming) data, preferably in distributed environments. Python is all the rage now, but of course there are lots of alternatives as well. SQL will always shine, and some other oldies-but-goldies, which we should never underestimate, are still out there.

So there is really a wide range of alternatives. Let's ramble through some of them, shall we? I'll define a simple challenge in this post and provide ten solutions written in ten different technologies: Awk, Perl, Bash, SQL, Python, MapReduce, Pig, Hive, Scala, and MongoDB. Together they span the last 30+ years!

Using these technologies, we'll list the 10 favorite movies, using the two CSV datasets provided by the GroupLens website.

The dataset

We'll use Mov