LinearRegressionModel
class pyspark.mllib.regression.LinearRegressionModel(weights: pyspark.mllib.linalg.Vector, intercept: float)
A linear regression model derived from a least-squares fit.
New in version 0.9.0.
Examples
>>> import numpy as np
>>> from pyspark.mllib.linalg import SparseVector
>>> from pyspark.mllib.regression import (
...     LabeledPoint, LinearRegressionModel, LinearRegressionWithSGD)
>>> # `sc` is assumed to be an existing SparkContext.
>>> data = [
...     LabeledPoint(0.0, [0.0]),
...     LabeledPoint(1.0, [1.0]),
...     LabeledPoint(3.0, [2.0]),
...     LabeledPoint(2.0, [3.0])
... ]
>>> lrm = LinearRegressionWithSGD.train(sc.parallelize(data), iterations=10,
...     initialWeights=np.array([1.0]))
>>> abs(lrm.predict(np.array([0.0])) - 0) < 0.5
True
>>> abs(lrm.predict(np.array([1.0])) - 1) < 0.5
True
>>> abs(lrm.predict(SparseVector(1, {0: 1.0})) - 1) < 0.5
True
>>> abs(lrm.predict(sc.parallelize([[1.0]])).collect()[0] - 1) < 0.5
True
>>> import os, tempfile
>>> path = tempfile.mkdtemp()
>>> lrm.save(sc, path)
>>> sameModel = LinearRegressionModel.load(sc, path)
>>> abs(sameModel.predict(np.array([0.0])) - 0) < 0.5
True
>>> abs(sameModel.predict(np.array([1.0])) - 1) < 0.5
True
>>> abs(sameModel.predict(SparseVector(1, {0: 1.0})) - 1) < 0.5
True
>>> from shutil import rmtree
>>> try:
...     rmtree(path)
... except BaseException:
...     pass
>>> data = [
...     LabeledPoint(0.0, SparseVector(1, {0: 0.0})),
...     LabeledPoint(1.0, SparseVector(1, {0: 1.0})),
...     LabeledPoint(3.0, SparseVector(1, {0: 2.0})),
...     LabeledPoint(2.0, SparseVector(1, {0: 3.0}))
... ]
>>> lrm = LinearRegressionWithSGD.train(sc.parallelize(data), iterations=10,
...     initialWeights=np.array([1.0]))
>>> abs(lrm.predict(np.array([0.0])) - 0) < 0.5
True
>>> abs(lrm.predict(SparseVector(1, {0: 1.0})) - 1) < 0.5
True
>>> lrm = LinearRegressionWithSGD.train(sc.parallelize(data), iterations=10, step=1.0,
...     miniBatchFraction=1.0, initialWeights=np.array([1.0]), regParam=0.1, regType="l2",
...     intercept=True, validateData=True)
>>> abs(lrm.predict(np.array([0.0])) - 0) < 0.5
True
>>> abs(lrm.predict(SparseVector(1, {0: 1.0})) - 1) < 0.5
True
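To make "least-squares fit" concrete, the sketch below computes the closed-form least-squares weight for the four training points used in the Examples, assuming a single feature and no intercept term. It is plain Python rather than Spark, and `fit_through_origin` is a hypothetical helper that is not part of pyspark. An SGD-based trainer only approximates this solution iteratively, so its weight will generally differ somewhat.

```python
# Hypothetical helper, not part of pyspark: closed-form least squares for
# one feature and no intercept, i.e. the w that minimizes sum((y - w*x)**2).
def fit_through_origin(xs, ys):
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("all feature values are zero; the weight is undefined")
    return sxy / sxx


# Same (feature, label) pairs as the LabeledPoint data in the Examples.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 3.0, 2.0]

w = fit_through_origin(xs, ys)  # 13 / 14

# Mirrors the tolerance check used in the Examples for the input 1.0.
print(abs(w * 1.0 - 1) < 0.5)
```

The exact least-squares weight lies well inside the 0.5 tolerance used by the doctest checks, which is why those checks can be expected to pass for a reasonably converged model.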
Methods
load(sc, path): Load a LinearRegressionModel.
predict(x): Predict the value of the dependent variable given a vector or an RDD of vectors containing values for the independent variables.
save(sc, path): Save a LinearRegressionModel.
Attributes
intercept: Intercept computed for this model.
weights: Weights computed for every feature.
Methods Documentation
classmethod load(sc: pyspark.context.SparkContext, path: str) → pyspark.mllib.regression.LinearRegressionModel
Load a LinearRegressionModel.
New in version 1.4.0.
predict(x: Union[VectorLike, pyspark.rdd.RDD[VectorLike]]) → Union[float, pyspark.rdd.RDD[float]]
Predict the value of the dependent variable given a vector or an RDD of vectors containing values for the independent variables.
New in version 0.9.0.
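For a single vector, a linear model's prediction is the dot product of the weights and the feature values, plus the intercept. The sketch below illustrates that arithmetic in plain Python. `predict_one` is a hypothetical helper, not the pyspark implementation, and it does not cover the RDD case; the weights and intercept are made-up values for illustration.

```python
# Hypothetical helper, not part of pyspark: prediction for one feature vector
# as dot(weights, x) + intercept.
def predict_one(weights, intercept, x):
    if len(weights) != len(x):
        raise ValueError("weights and x must have the same length")
    return sum(w * xi for w, xi in zip(weights, x)) + intercept


# Illustrative values only; a trained model would supply its own.
weights = [2.0, -1.0]
intercept = 0.5

print(predict_one(weights, intercept, [3.0, 4.0]))  # 2*3 - 1*4 + 0.5 = 2.5
```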
save(sc: pyspark.context.SparkContext, path: str) → None
Save a LinearRegressionModel.
New in version 1.4.0.
Attributes Documentation
intercept
Intercept computed for this model.
New in version 1.0.0.
weights
Weights computed for every feature.
New in version 1.0.0.
-
classmethod