class pyspark.mllib.recommendation.ALS
Alternating Least Squares matrix factorization
New in version 0.9.0.
Methods
train(ratings, rank[, iterations, lambda_, …])
Train a matrix factorization model given an RDD of ratings by users for a subset of products.
trainImplicit(ratings, rank[, iterations, …])
Train a matrix factorization model given an RDD of ‘implicit preferences’ of users for a subset of products.
Methods Documentation
classmethod train(ratings, rank, iterations=5, lambda_=0.01, blocks=-1, nonnegative=False, seed=None)
Train a matrix factorization model given an RDD of ratings by users for a subset of products. The ratings matrix is approximated as the product of two lower-rank matrices of a given rank (number of features). To solve for these features, ALS is run iteratively with a configurable level of parallelism.
Parameters
ratings : pyspark.RDD
    RDD of Rating objects or (userID, productID, rating) tuples.
rank
    Number of features to use (also referred to as the number of latent factors).
iterations
    Number of iterations of ALS. (default: 5)
lambda_
    Regularization parameter. (default: 0.01)
blocks
    Number of blocks used to parallelize the computation. A value of -1 will use an auto-configured number of blocks. (default: -1)
nonnegative
    A value of True will solve least-squares with nonnegativity constraints. (default: False)
seed
    Random seed for the initial matrix factorization model. A value of None will use system time as the seed. (default: None)
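The alternating scheme described for train can be illustrated with a small single-machine sketch. This is not Spark's implementation: it is plain Python with no parallelism, no blocks and no nonnegativity option, and the names solve, update_factors, als_sketch and predict are hypothetical helpers introduced only for this illustration. It assumes lambda_ is positive so that each small linear system is well conditioned.

```python
import random


def solve(a, b):
    """Solve the small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def update_factors(observed_by_key, other, rank, lambda_):
    """Re-solve one side's factors while the other side is held fixed."""
    factors = {}
    for key, observed in observed_by_key.items():
        # Regularized normal equations: (sum v v^T + lambda_ * I) x = sum rating * v
        a = [[lambda_ if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for other_key, rating in observed:
            v = other[other_key]
            for i in range(rank):
                b[i] += rating * v[i]
                for j in range(rank):
                    a[i][j] += v[i] * v[j]
        factors[key] = solve(a, b)
    return factors


def als_sketch(ratings, rank, iterations=5, lambda_=0.01, seed=None):
    """Factor (user, product, rating) tuples into user and product feature vectors."""
    rng = random.Random(seed)
    by_user, by_product = {}, {}
    for user, product, rating in ratings:
        by_user.setdefault(user, []).append((product, rating))
        by_product.setdefault(product, []).append((user, rating))
    # Random initial product factors; user factors are solved in the first pass.
    product_f = {p: [rng.random() for _ in range(rank)] for p in by_product}
    user_f = {}
    for _ in range(iterations):
        user_f = update_factors(by_user, product_f, rank, lambda_)
        product_f = update_factors(by_product, user_f, rank, lambda_)
    return user_f, product_f


def predict(user_f, product_f, user, product):
    """Predicted rating is the dot product of the two feature vectors."""
    return sum(u * p for u, p in zip(user_f[user], product_f[product]))


ratings = [(1, 1, 5.0), (1, 2, 1.0), (2, 1, 5.0), (2, 2, 1.0)]
user_f, product_f = als_sketch(ratings, rank=1, iterations=10, seed=42)
print(predict(user_f, product_f, 1, 1))  # close to the observed rating of 5.0
```

With a SparkContext available (here assumed to be named sc), the corresponding call would take an RDD of the same tuples, for example ALS.train(sc.parallelize(ratings), rank=1, iterations=10, seed=42), and the returned model's predict(user, product) plays the role of the predict helper above.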
classmethod trainImplicit(ratings, rank, iterations=5, lambda_=0.01, blocks=-1, alpha=0.01, nonnegative=False, seed=None)
Train a matrix factorization model given an RDD of ‘implicit preferences’ of users for a subset of products. The ratings matrix is approximated as the product of two lower-rank matrices of a given rank (number of features). To solve for these features, ALS is run iteratively with a configurable level of parallelism.
Parameters
alpha
    A constant used in computing confidence. (default: 0.01)
The remaining parameters have the same meaning and defaults as for train.
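To make the role of the confidence constant concrete, the sketch below assumes the commonly used implicit-feedback formulation, in which each raw observation r is split into a binary preference and a confidence weight that grows linearly as 1 + alpha * |r|. The helper name implicit_confidence is hypothetical, and the exact formula Spark applies is not stated in this reference, so treat this as an illustration of the idea only.

```python
def implicit_confidence(raw, alpha=0.01):
    """Return (preference, confidence) for one raw implicit observation.

    Assumed formulation: preference is 1 when any positive interaction was
    observed, and confidence grows linearly with the size of the observation.
    """
    preference = 1.0 if raw > 0 else 0.0
    confidence = 1.0 + alpha * abs(raw)
    return preference, confidence


# No interaction: preference 0 with the baseline confidence of 1.
print(implicit_confidence(0))
# Twenty interactions with a larger alpha: preference 1 and a higher confidence.
print(implicit_confidence(20, alpha=0.5))
```

Under this assumption a larger alpha makes the factorization trust frequently observed user-product pairs more strongly relative to unobserved ones.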