pyspark.mllib.classification.LogisticRegressionWithLBFGS
Train a classification model for Multinomial/Binary Logistic Regression using Limited-memory BFGS.
Standard feature scaling and L2 regularization are used by default.
New in version 1.2.0.
Methods
train(data[, iterations, initialWeights, …])
    Train a logistic regression model on the given data.
Methods Documentation
New in version 1.2.0.
Parameters
data : pyspark.RDD
    The training data, an RDD of pyspark.mllib.regression.LabeledPoint.
iterations : int, optional
    The number of iterations. (default: 100)
initialWeights : pyspark.mllib.linalg.Vector, optional
    The initial weights. (default: None)
regParam : float, optional
    The regularizer parameter. (default: 0.01)
regType : str, optional
    The type of regularizer used for training the model. Supported values:

    - "l1" for L1 regularization
    - "l2" for L2 regularization (default)
    - None for no regularization
intercept : bool, optional
    Whether to use the augmented representation of the training data (i.e., whether bias features are activated). (default: False)
corrections : int, optional
    The number of corrections used in the L-BFGS update. If a known updater is used for binary classification, the ml implementation is called and this parameter has no effect. (default: 10)
tolerance : float, optional
    The convergence tolerance of iterations for L-BFGS. (default: 1e-6)
validateData : bool, optional
    Whether the algorithm should validate the data before training. (default: True)
numClasses : int, optional
    The number of classes (i.e., outcomes) a label can take in multinomial logistic regression. (default: 2)
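The regularizer type, the augmented representation, and data validation can be illustrated without Spark. The following is a minimal pure-Python sketch, not MLlib's implementation. The helper names (``penalty``, ``append_bias``, ``labels_are_valid``) are hypothetical, and the conventions are assumptions: a factor of 0.5 on the L2 term, a constant 1.0 appended as the bias feature, and labels restricted to 0, 1, ..., numClasses - 1.

```python
def penalty(weights, reg_param, reg_type):
    """Regularization term added to the training loss.

    Assumed convention: L1 uses reg_param * sum(|w|),
    L2 uses 0.5 * reg_param * sum(w^2), None adds nothing.
    """
    if reg_type is None:
        return 0.0
    if reg_type == "l1":
        return reg_param * sum(abs(w) for w in weights)
    if reg_type == "l2":
        return 0.5 * reg_param * sum(w * w for w in weights)
    raise ValueError("reg_type must be 'l1', 'l2', or None")


def append_bias(features):
    """Augmented representation: append a constant 1.0 so that the
    last weight plays the role of the intercept (assumed layout)."""
    return list(features) + [1.0]


def labels_are_valid(labels, num_classes):
    """Return True if every label is a whole number in
    {0, ..., num_classes - 1} (assumed validation rule)."""
    return all(
        float(y).is_integer() and 0 <= y < num_classes for y in labels
    )


weights = [3.0, -4.0]
print(penalty(weights, 0.01, "l1"))
print(penalty(weights, 0.01, "l2"))
print(append_bias([0.0, 1.0]))
print(labels_are_valid([0.0, 1.0, 2.0], 2))
```

With two classes, a label of 2.0 falls outside the allowed set, so the last check reports the data as invalid. Raising the class count to 3 would make the same labels acceptable.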
Examples
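>>> from pyspark.mllib.classification import LogisticRegressionWithLBFGS
>>> from pyspark.mllib.regression import LabeledPoint
>>> data = [
...     LabeledPoint(0.0, [0.0, 1.0]),
...     LabeledPoint(1.0, [1.0, 0.0]),
... ]
>>> lrm = LogisticRegressionWithLBFGS.train(sc.parallelize(data), iterations=10)
>>> lrm.predict([1.0, 0.0])
1
>>> lrm.predict([0.0, 1.0])
0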
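To show how a trained binary model could turn a feature vector into the 0/1 predictions above, here is a minimal pure-Python sketch. It assumes the usual logistic rule: compute the margin w·x + b, apply the sigmoid, and compare against a threshold of 0.5. The function name ``predict_binary`` and the weights are hypothetical, chosen only so that the outputs match the example; they are not the values Spark would learn.

```python
import math


def predict_binary(weights, intercept, features, threshold=0.5):
    """Return 1 if sigmoid(w . x + b) exceeds the threshold, else 0."""
    margin = sum(w * x for w, x in zip(weights, features)) + intercept
    # Numerically stable sigmoid: avoid exp() of a large positive number.
    if margin >= 0:
        score = 1.0 / (1.0 + math.exp(-margin))
    else:
        e = math.exp(margin)
        score = e / (1.0 + e)
    return 1 if score > threshold else 0


# Hypothetical weights: positive for the first feature, negative for the second.
weights = [2.0, -2.0]
intercept = 0.0

print(predict_binary(weights, intercept, [1.0, 0.0]))  # 1
print(predict_binary(weights, intercept, [0.0, 1.0]))  # 0
```

A margin of exactly zero gives a score of 0.5, which does not exceed the threshold, so this sketch returns 0 in that case.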