Collaborative Filtering
ALSSink
@Plugin(type = SparkSink.PLUGIN_TYPE)
@Name("ALSSink")
@Description("A building stage for an Apache Spark ML Collaborative Filtering model. "
+ "This technique is commonly used for recommender systems and aims to fill in the "
+ "missing entries of a User-Item association matrix. "
+ "Spark ML uses the ALS (Alternating Least Squares) algorithm.")
public class ALSSink extends RecommenderSink {
...
}
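As a rough guide to what the description above means by filling in the missing entries of the User-Item matrix, the textbook regularized matrix-factorization objective can be written as below. This is a sketch of the standard formulation, not a statement of the plugin's or Spark's exact implementation (Spark may weight the regularization term differently). The symbols are assumptions introduced here: $\Omega$ is the set of observed (user, item) pairs, $r_{ui}$ an observed rating, $\mathbf{p}_u, \mathbf{q}_i \in \mathbb{R}^k$ the user and item factor vectors, $k$ the Factorization Rank and $\lambda$ the Regularization Parameter.

```latex
\min_{P,\,Q} \;
\sum_{(u,i) \in \Omega} \left( r_{ui} - \mathbf{p}_u^{\top} \mathbf{q}_i \right)^2
\; + \; \lambda \left( \sum_{u} \lVert \mathbf{p}_u \rVert^2 + \sum_{i} \lVert \mathbf{q}_i \rVert^2 \right)
```

ALS alternates between fixing $Q$ and solving a least-squares problem for $P$, then fixing $P$ and solving for $Q$, for up to the configured Maximum Iterations.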
Parameters
Model Name | The unique name of the recommendation model. |
User Field | The name of the input field that defines the user identifiers. The values must be within the integer value range. |
Item Field | The name of the input field that defines the item identifiers. The values must be within the integer value range. |
Rating Field | The name of the input field that defines the item ratings. The values must be within the integer value range. |
Data Split | The split of the dataset into training and test data, e.g. 80:20. Default is 70:30. |
Model Configuration | |
Factorization Rank | A positive integer that defines the rank of the matrix factorization. Default is 10. |
Nonnegative Constraints | Whether to apply nonnegativity constraints to the least-squares problems. Supported values are 'true' and 'false'. Default is 'false'. |
Maximum Iterations | The maximum number of iterations to train the ALS model. Default is 10. |
Regularization Parameter | The nonnegative regularization parameter. Default is 0.1. |
User Blocks | The number of user blocks. Default is 10. |
Item Blocks | The number of item blocks. Default is 10. |
Implicit Preference | Whether to use implicit preference. Supported values are 'true' and 'false'. Default is 'false'. |
Alpha Parameter | The nonnegative alpha parameter in the implicit preference formulation. Default is 1.0. |
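To illustrate how the Implicit Preference and Alpha Parameter settings above interact, the following minimal sketch uses the commonly cited implicit-feedback formulation, in which a raw interaction value is turned into a binary preference and a confidence weight of 1 + alpha * |r|. The class and method names are hypothetical and are not part of the plugin; the sketch assumes this standard formulation rather than reproducing the plugin's internals.

```java
// Hypothetical helper, not part of the ALSSink plugin: a sketch of the
// standard implicit-feedback formulation, assuming confidence = 1 + alpha * |r|.
public class ImplicitPreferenceSketch {

    /** Binary preference: 1.0 if a positive interaction was observed, else 0.0. */
    public static double preference(double rating) {
        return rating > 0 ? 1.0 : 0.0;
    }

    /** Confidence weight that grows linearly with the interaction strength. */
    public static double confidence(double rating, double alpha) {
        if (alpha < 0) {
            throw new IllegalArgumentException("alpha must be nonnegative");
        }
        return 1.0 + alpha * Math.abs(rating);
    }

    public static void main(String[] args) {
        double alpha = 1.0; // the documented default of the Alpha Parameter
        double[] interactions = {0.0, 1.0, 5.0}; // e.g. click or view counts
        for (double r : interactions) {
            System.out.println("r=" + r
                + " preference=" + preference(r)
                + " confidence=" + confidence(r, alpha));
        }
    }
}
```

Under this formulation, a larger alpha makes frequently observed interactions count more heavily relative to unobserved ones, while the preference itself stays binary.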
ALSPredictor
@Plugin(type = SparkCompute.PLUGIN_TYPE)
@Name("ALSPredictor")
@Description("A prediction stage that leverages a trained Apache Spark ML ALS recommendation model.")
public class ALSPredictor extends RecommenderCompute {
...
}
Parameters
Model Name | The unique name of the recommendation model. |
User Field | The name of the input field that defines the user identifiers. The values must be within the integer value range. |
Item Field | The name of the input field that defines the item identifiers. The values must be within the integer value range. |
Prediction Field | The name of the field in the output schema that contains the predicted rating. |
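For orientation, the value written to the Prediction Field by a matrix-factorization model is, conceptually, the dot product of the learned user and item factor vectors. The sketch below shows only that arithmetic in plain Java; the class name, method name and factor values are hypothetical, and the actual ALSPredictor delegates this step to the trained Spark ML model.

```java
// Hypothetical illustration, not the plugin's code: the predicted rating of a
// factorization model is the dot product of the user and item factor vectors.
public class AlsPredictionSketch {

    /** Returns the dot product of two factor vectors of equal length (the rank). */
    public static double predict(double[] userFactors, double[] itemFactors) {
        if (userFactors.length != itemFactors.length) {
            throw new IllegalArgumentException("factor vectors must have the same rank");
        }
        double sum = 0.0;
        for (int i = 0; i < userFactors.length; i++) {
            sum += userFactors[i] * itemFactors[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // Made-up factors of rank 2, for illustration only.
        double[] user = {1.0, 2.0};
        double[] item = {3.0, 4.0};
        System.out.println(predict(user, item)); // 1*3 + 2*4 = 11.0
    }
}
```

A user or item identifier that was not present during training has no factor vector, so no meaningful prediction can be computed for it this way.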
Smart Adaptive Recommendations
The Smart Adaptive Recommendations (SAR) algorithm is outside the scope of Apache Spark ML. It is currently part of the commercial offering of Dr. Krusche & Partner and will be open-sourced by mid-2020.