Fast Matrix Factorization in R
This article wraps up our series on collaborative filtering techniques and how to apply them to large datasets in R. In a previous post, we covered the nearest neighbors method. Here, the focus is on a model-based algorithm: matrix factorization. Compared with the neighbors method, it is often better in both prediction accuracy and the time needed to train the model.
The idea behind matrix factorization is to capture patterns in rating data in order to learn certain characteristics, also known as latent factors, that describe users and items. In the picture below, these factors are stored in two matrices: P (user factors) and Q (item factors). Let’s imagine that the items are movies that users have rated. For movies, those factors might measure obvious dimensions such as the amount of action or comedy or the orientation towards children; less well-defined dimensions such as depth of character development or quirkiness; or completely uninterpretable dimensions. For users, each factor measures how much the user likes movies that score high on the corresponding movie factor.
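To make the roles of P and Q concrete, here is a minimal sketch in base R. It assumes the standard formulation in which a predicted score is the dot product of a user's factor vector and a movie's factor vector. The factor values, the factor labels ("action", "comedy"), and the helper `predict_rating` are all made up for illustration; in practice the factors are learned from the rating data and are usually not interpretable.

```r
# Toy example with hypothetical values: 3 users, 4 movies, 2 latent factors.

# User factors P: one row per user, one column per latent factor.
P <- matrix(c(0.9, 0.1,
              0.2, 0.8,
              0.5, 0.5),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("user", 1:3), c("action", "comedy")))

# Item factors Q: one row per movie, same latent factors as in P.
Q <- matrix(c(1.0, 0.0,
              0.0, 1.0,
              0.7, 0.3,
              0.2, 0.9),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("movie", 1:4), c("action", "comedy")))

# Predicted score for a single user-movie pair:
# the dot product of the user's and the movie's factor vectors.
predict_rating <- function(P, Q, user, item) {
  sum(P[user, ] * Q[item, ])
}

# Predicted scores for every user-movie pair at once (users x movies).
R_hat <- P %*% t(Q)
```

Calling `predict_rating(P, Q, "user1", "movie3")` gives the same value as `R_hat["user1", "movie3"]`. A user whose factor vector leans towards the first factor gets higher predicted scores for movies that also score high on that factor.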
via DZone.com Feed https://dzone.com
May 19, 2017 at 06:09AM