Fast Matrix Factorization in R

This article wraps up our series on collaborative filtering techniques and how to apply them to large datasets in R. In a previous post, we covered the nearest neighbors method. Here, the focus is on a model-based algorithm: matrix factorization. Compared to the neighbors method, it often gives better prediction accuracy and takes less time to train.

The idea behind matrix factorization is to capture patterns in rating data in order to learn certain characteristics, also known as latent factors, that describe users and items. In the picture below, these factors are stored in two matrices: P (user factors) and Q (item factors). Let's imagine that items are movies that users have rated. For movies, those factors might measure obvious dimensions such as the amount of action or comedy or the orientation towards children; less well-defined dimensions such as depth of character development or quirkiness; or completely uninterpretable dimensions. For users, each factor measures how much the user likes movies that score high on the corresponding movie factor.
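To make the roles of P and Q concrete, here is a minimal base R sketch on a made-up ratings matrix. It rests on assumptions that this section does not spell out: a predicted rating is taken to be the dot product of a user's row in P and a movie's row in Q, and the factors are fitted with plain stochastic gradient descent plus L2 regularization. The toy data, the number of factors `k`, `learning_rate`, and `lambda` are all illustrative choices, not values from this series.

```r
# Toy ratings matrix: 4 users x 5 movies, NA marks an unrated movie.
# All values are made up for illustration.
R <- matrix(c( 5,  4, NA,  1, NA,
               4, NA,  5,  1,  2,
               1,  1, NA,  5,  4,
              NA,  2,  1,  4,  5),
            nrow = 4, byrow = TRUE)

set.seed(42)
k <- 2  # number of latent factors (assumed)

# Small random starting values for user factors (P) and item factors (Q).
P <- matrix(runif(nrow(R) * k, 0, 0.5), nrow(R), k)
Q <- matrix(runif(ncol(R) * k, 0, 0.5), ncol(R), k)

# (row, col) positions of the ratings that are actually observed.
observed <- which(!is.na(R), arr.ind = TRUE)

# Root mean squared error over observed ratings only.
rmse <- function(R, P, Q) {
  err <- R - P %*% t(Q)
  sqrt(mean(err[!is.na(R)]^2))
}

rmse_before <- rmse(R, P, Q)

learning_rate <- 0.01  # assumed step size
lambda <- 0.02         # assumed L2 regularization strength

for (epoch in 1:2000) {
  for (idx in seq_len(nrow(observed))) {
    u <- observed[idx, 1]
    i <- observed[idx, 2]
    # Prediction error for this (user, movie) pair.
    e <- R[u, i] - sum(P[u, ] * Q[i, ])
    p_old <- P[u, ]
    # Gradient steps on the regularized squared error.
    P[u, ] <- P[u, ] + learning_rate * (e * Q[i, ] - lambda * P[u, ])
    Q[i, ] <- Q[i, ] + learning_rate * (e * p_old - lambda * Q[i, ])
  }
}

rmse_after <- rmse(R, P, Q)

# Full matrix of predicted ratings, including the previously missing cells.
predicted <- P %*% t(Q)
```

After training, `predicted[u, i]` gives an estimate for every user and movie pair, so the cells that were `NA` in `R` can serve as recommendation scores. A loop like this is only meant to show the mechanics; it would be far too slow for the large datasets this series targets.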



May 19, 2017 at 06:09AM

