Tag Archives: Apache Mahout

Amazon Elastic Map Reduce to compute recommendations with Apache Mahout

Apache Mahout is a “scalable machine learning library” that, among other things, contains implementations of various single-node and distributed recommendation algorithms. In my last blog post I described how to implement an on-line recommender system that processes data on a single node. What if the data is too large to fit into memory (>100M preference data points)? Then we have no choice, ...

Creating an on-line recommender system with Apache Mahout

Recently we’ve been implementing a recommender system for Yap.TV: you can see it in action after installing the app and going to the “Just for you” tab. We’re using Apache Mahout as the base for our recommendations. Mahout is a “scalable machine learning library” and contains both local and distributed implementations of user- and item-based recommenders using collaborative filtering ...

What is Big Data – Theory to Implementation

What is Big Data, you may ask, and more importantly, why is it the latest trend in nearly every business domain? Is it just hype, or is it here to stay? As a matter of fact, “Big Data” is a pretty straightforward term: it is just what it says, a very large data set. How large? The exact answer is ...

Mahout and Scalding for poker collusion detection

While I was reading a very good book on Mahout, Mahout in Action (which is also a great hands-on intro to machine learning), one of the examples caught my attention. The authors of the book were using the well-known K-means clustering algorithm to find similar players on stackoverflow.com, where the criterion of similarity was the set of the authors of ...

Apache Mahout: Build a spam filter server

Something quite interesting has happened with Lucene. It started as a library, then its developers began adding new projects based on it. They developed another open source project that would add crawling features (among other features) to Lucene. Nutch is in fact a full-featured web search engine that anyone can use or modify. Inspired by some famous papers from ...

Apache Mahout: Getting started

Recently I got an interesting problem to solve: how to automatically classify text from different sources. Some time ago I read about a project that does this, as well as many other text analysis tasks: Apache Mahout. Though it’s not very mature (the current version is 0.4), it’s very powerful and scalable. Built on top of ...
