MulticlassMetrics

class pyspark.mllib.evaluation.MulticlassMetrics(predictionAndLabels: pyspark.rdd.RDD[Tuple[float, float]])

Evaluator for multiclass classification.

New in version 1.4.0.

Parameters
predictionAndLabels : pyspark.RDD

an RDD of tuples (prediction, label), each optionally followed by a weight and then a list of per-class probabilities.

Examples

>>> predictionAndLabels = sc.parallelize([(0.0, 0.0), (0.0, 1.0), (0.0, 0.0),
...     (1.0, 0.0), (1.0, 1.0), (1.0, 1.0), (1.0, 1.0), (2.0, 2.0), (2.0, 0.0)])
>>> metrics = MulticlassMetrics(predictionAndLabels)
>>> metrics.confusionMatrix().toArray()
array([[ 2.,  1.,  1.],
       [ 1.,  3.,  0.],
       [ 0.,  0.,  1.]])
>>> metrics.falsePositiveRate(0.0)
0.2...
>>> metrics.precision(1.0)
0.75...
>>> metrics.recall(2.0)
1.0...
>>> metrics.fMeasure(0.0, 2.0)
0.52...
>>> metrics.accuracy
0.66...
>>> metrics.weightedFalsePositiveRate
0.19...
>>> metrics.weightedPrecision
0.68...
>>> metrics.weightedRecall
0.66...
>>> metrics.weightedFMeasure()
0.66...
>>> metrics.weightedFMeasure(2.0)
0.65...
>>> predAndLabelsWithOptWeight = sc.parallelize([(0.0, 0.0, 1.0), (0.0, 1.0, 1.0),
...      (0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 1.0), (1.0, 1.0, 1.0), (1.0, 1.0, 1.0),
...      (2.0, 2.0, 1.0), (2.0, 0.0, 1.0)])
>>> metrics = MulticlassMetrics(predAndLabelsWithOptWeight)
>>> metrics.confusionMatrix().toArray()
array([[ 2.,  1.,  1.],
       [ 1.,  3.,  0.],
       [ 0.,  0.,  1.]])
>>> metrics.falsePositiveRate(0.0)
0.2...
>>> metrics.precision(1.0)
0.75...
>>> metrics.recall(2.0)
1.0...
>>> metrics.fMeasure(0.0, 2.0)
0.52...
>>> metrics.accuracy
0.66...
>>> metrics.weightedFalsePositiveRate
0.19...
>>> metrics.weightedPrecision
0.68...
>>> metrics.weightedRecall
0.66...
>>> metrics.weightedFMeasure()
0.66...
>>> metrics.weightedFMeasure(2.0)
0.65...
>>> predictionAndLabelsWithProbabilities = sc.parallelize([
...      (1.0, 1.0, 1.0, [0.1, 0.8, 0.1]), (0.0, 2.0, 1.0, [0.9, 0.05, 0.05]),
...      (0.0, 0.0, 1.0, [0.8, 0.2, 0.0]), (1.0, 1.0, 1.0, [0.3, 0.65, 0.05])])
>>> metrics = MulticlassMetrics(predictionAndLabelsWithProbabilities)
>>> metrics.logLoss()
0.9682...

Methods

call(name, *a)

Call method of java_model

confusionMatrix()

Returns the confusion matrix. Actual classes are in rows and predicted classes are in columns; both are ordered by ascending class label, as in “labels”.

fMeasure(label[, beta])

Returns f-measure for a given label (category).

falsePositiveRate(label)

Returns false positive rate for a given label (category).

logLoss([eps])

Returns weighted logLoss.

precision(label)

Returns precision for a given label (category).

recall(label)

Returns recall for a given label (category).

truePositiveRate(label)

Returns true positive rate for a given label (category).

weightedFMeasure([beta])

Returns weighted averaged f-measure.

Attributes

accuracy

Returns accuracy (the number of correctly classified instances divided by the total number of instances).

weightedFalsePositiveRate

Returns weighted false positive rate.

weightedPrecision

Returns weighted averaged precision.

weightedRecall

Returns weighted averaged recall.

weightedTruePositiveRate

Returns weighted true positive rate.

Methods Documentation

call(name: str, *a: Any) → Any

Call method of java_model

confusionMatrix() → pyspark.mllib.linalg.Matrix

Returns the confusion matrix. Actual classes are in rows and predicted classes are in columns; both are ordered by ascending class label, as in “labels”.

New in version 1.4.0.
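To illustrate the row and column layout, here is a minimal plain-Python sketch (no Spark required) that tallies a confusion matrix from (prediction, label) tuples, with an optional weight as the third element. The helper name `confusion_matrix` is hypothetical, and taking the class set from the union of predictions and labels is an assumption, not documented library behaviour.

```python
def confusion_matrix(records):
    """Tally a confusion matrix: rows = actual label, columns = prediction."""
    # Assumption: the class set is the union of all predictions and labels.
    classes = sorted({value for rec in records for value in rec[:2]})
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0.0] * len(classes) for _ in classes]
    for rec in records:
        prediction, label = rec[0], rec[1]
        # A missing weight is treated as 1.0 (assumption for this sketch).
        weight = rec[2] if len(rec) > 2 else 1.0
        matrix[index[label]][index[prediction]] += weight
    return classes, matrix


# Same (prediction, label) pairs as in the Examples section.
pairs = [(0.0, 0.0), (0.0, 1.0), (0.0, 0.0), (1.0, 0.0), (1.0, 1.0),
         (1.0, 1.0), (1.0, 1.0), (2.0, 2.0), (2.0, 0.0)]
classes, matrix = confusion_matrix(pairs)
for row in matrix:
    print(row)
```

With these pairs the rows come out as [2, 1, 1], [1, 3, 0] and [0, 0, 1], the same values the doctest in the Examples section shows.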

fMeasure(label: float, beta: Optional[float] = None) → float

Returns f-measure for a given label (category).

New in version 1.4.0.
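As a hedged illustration, the usual F-beta formula, (1 + beta^2) * P * R / (beta^2 * P + R), can be applied to the confusion matrix from the Examples section. The helper names below are hypothetical, and treating an omitted beta as 1.0 is an assumption about the default.

```python
# Confusion matrix from the Examples section: rows = actual, columns = predicted.
CONFUSION = [[2.0, 1.0, 1.0],
             [1.0, 3.0, 0.0],
             [0.0, 0.0, 1.0]]


def precision_of(matrix, i):
    """True positives of class i divided by everything predicted as i."""
    predicted = sum(row[i] for row in matrix)
    return matrix[i][i] / predicted if predicted else 0.0


def recall_of(matrix, i):
    """True positives of class i divided by everything actually labelled i."""
    actual = sum(matrix[i])
    return matrix[i][i] / actual if actual else 0.0


def f_measure_of(matrix, i, beta=1.0):
    """F-beta for class i; beta=1.0 is assumed when not given."""
    p, r = precision_of(matrix, i), recall_of(matrix, i)
    b2 = beta * beta
    denom = b2 * p + r
    return (1 + b2) * p * r / denom if denom else 0.0


f2_label0 = f_measure_of(CONFUSION, 0, beta=2.0)  # 10/19, about 0.526
print(f2_label0)
```

The value agrees with the `metrics.fMeasure(0.0, 2.0)` doctest output of 0.52... in the Examples section.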

falsePositiveRate(label: float) → float

Returns false positive rate for a given label (category).

New in version 1.4.0.
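The false positive rate for a label is the share of instances that do not carry that label but were predicted as it. A minimal sketch over the confusion matrix from the Examples section, with a hypothetical helper name:

```python
# Confusion matrix from the Examples section: rows = actual, columns = predicted.
CONFUSION = [[2.0, 1.0, 1.0],
             [1.0, 3.0, 0.0],
             [0.0, 0.0, 1.0]]


def false_positive_rate(matrix, i):
    """False positives of class i divided by all instances not labelled i."""
    total = sum(sum(row) for row in matrix)
    actual_i = sum(matrix[i])
    # Column sum minus the diagonal: predicted as i, but actually another class.
    false_positives = sum(row[i] for row in matrix) - matrix[i][i]
    negatives = total - actual_i
    return false_positives / negatives if negatives else 0.0


fpr_label0 = false_positive_rate(CONFUSION, 0)  # 1 / 5 = 0.2
print(fpr_label0)
```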

logLoss(eps: float = 1e-15) → float

Returns weighted logLoss.

New in version 3.0.0.
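A weighted log loss can be sketched in plain Python as the weight-averaged negative log of the probability assigned to the true label. The helper name `log_loss` is hypothetical, and clipping each probability to [eps, 1 - eps] is an assumption about how `eps` is used, not a statement of the library's implementation.

```python
import math


def log_loss(records, eps=1e-15):
    """Weighted mean of -log(p_true) over (prediction, label, weight, probs)."""
    weighted_loss = 0.0
    weight_sum = 0.0
    for _prediction, label, weight, probabilities in records:
        p_true = probabilities[int(label)]
        # Assumption: clip to [eps, 1 - eps] so that log(0) never occurs.
        p_true = min(max(p_true, eps), 1.0 - eps)
        weighted_loss += -math.log(p_true) * weight
        weight_sum += weight
    return weighted_loss / weight_sum


# Same records as predictionAndLabelsWithProbabilities in the Examples section.
records = [
    (1.0, 1.0, 1.0, [0.1, 0.8, 0.1]),
    (0.0, 2.0, 1.0, [0.9, 0.05, 0.05]),
    (0.0, 0.0, 1.0, [0.8, 0.2, 0.0]),
    (1.0, 1.0, 1.0, [0.3, 0.65, 0.05]),
]
loss = log_loss(records)  # about 0.9682
print(loss)
```

Under these assumptions the sketch reproduces the 0.9682... value shown by the `metrics.logLoss()` doctest.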

precision(label: float) → float

Returns precision for a given label (category).

New in version 1.4.0.

recall(label: float) → float

Returns recall for a given label (category).

New in version 1.4.0.
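Per-label precision and recall can both be read off the confusion matrix: precision divides the diagonal entry by its column sum, recall divides it by its row sum. A minimal sketch using the matrix from the Examples section, with hypothetical helper names:

```python
# Confusion matrix from the Examples section: rows = actual, columns = predicted.
CONFUSION = [[2.0, 1.0, 1.0],
             [1.0, 3.0, 0.0],
             [0.0, 0.0, 1.0]]


def precision_of(matrix, i):
    """Diagonal entry over the column sum (everything predicted as class i)."""
    predicted = sum(row[i] for row in matrix)
    return matrix[i][i] / predicted if predicted else 0.0


def recall_of(matrix, i):
    """Diagonal entry over the row sum (everything actually labelled i)."""
    actual = sum(matrix[i])
    return matrix[i][i] / actual if actual else 0.0


p1 = precision_of(CONFUSION, 1)  # 3 / 4 = 0.75
r2 = recall_of(CONFUSION, 2)     # 1 / 1 = 1.0
print(p1, r2)
```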

truePositiveRate(label: float) → float

Returns true positive rate for a given label (category).

New in version 1.4.0.

weightedFMeasure(beta: Optional[float] = None) → float

Returns weighted averaged f-measure.

New in version 1.4.0.
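One way to read "weighted averaged" is that each label's f-measure is weighted by that label's share of the actual instances. The sketch below follows that reading; the helper name is hypothetical, and both the weighting scheme and the default beta of 1.0 are assumptions, although the results agree with the doctest values in the Examples section.

```python
# Confusion matrix from the Examples section: rows = actual, columns = predicted.
CONFUSION = [[2.0, 1.0, 1.0],
             [1.0, 3.0, 0.0],
             [0.0, 0.0, 1.0]]


def weighted_f_measure(matrix, beta=1.0):
    """Average of per-class F-beta, weighted by each class's actual count."""
    total = sum(sum(row) for row in matrix)
    b2 = beta * beta
    result = 0.0
    for i, row in enumerate(matrix):
        actual = sum(row)
        predicted = sum(other[i] for other in matrix)
        p = matrix[i][i] / predicted if predicted else 0.0
        r = matrix[i][i] / actual if actual else 0.0
        denom = b2 * p + r
        f_beta = (1 + b2) * p * r / denom if denom else 0.0
        result += f_beta * actual / total
    return result


wf1 = weighted_f_measure(CONFUSION)        # 125/189, about 0.6614
wf2 = weighted_f_measure(CONFUSION, 2.0)   # 677/1026, about 0.6598
print(wf1, wf2)
```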

Attributes Documentation

accuracy

Returns accuracy (the number of correctly classified instances divided by the total number of instances).

New in version 2.0.0.
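In confusion-matrix terms, accuracy is the sum of the diagonal divided by the sum of all entries. A minimal sketch with a hypothetical helper name, using the matrix from the Examples section:

```python
# Confusion matrix from the Examples section: rows = actual, columns = predicted.
CONFUSION = [[2.0, 1.0, 1.0],
             [1.0, 3.0, 0.0],
             [0.0, 0.0, 1.0]]


def accuracy_of(matrix):
    """Correctly classified instances (the diagonal) over all instances."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


acc = accuracy_of(CONFUSION)  # 6 / 9, about 0.667
print(acc)
```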

weightedFalsePositiveRate

Returns weighted false positive rate.

New in version 1.4.0.

weightedPrecision

Returns weighted averaged precision.

New in version 1.4.0.

weightedRecall

Returns weighted averaged recall (equal to accuracy, and hence to micro-averaged precision, recall and f-measure).

New in version 1.4.0.
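The weighted averages of precision and recall can be sketched by weighting each label's score by its share of the actual instances. The helper name is hypothetical and the weighting scheme is an assumption, though the results agree with the doctest values in the Examples section. Note that weighted recall computed this way reduces to accuracy, because the per-label counts cancel.

```python
# Confusion matrix from the Examples section: rows = actual, columns = predicted.
CONFUSION = [[2.0, 1.0, 1.0],
             [1.0, 3.0, 0.0],
             [0.0, 0.0, 1.0]]


def weighted_precision_recall(matrix):
    """Return (weighted precision, weighted recall), weighted by actual counts."""
    total = sum(sum(row) for row in matrix)
    weighted_p = 0.0
    weighted_r = 0.0
    for i, row in enumerate(matrix):
        actual = sum(row)
        predicted = sum(other[i] for other in matrix)
        share = actual / total  # fraction of instances whose true label is i
        weighted_p += share * (matrix[i][i] / predicted if predicted else 0.0)
        weighted_r += share * (matrix[i][i] / actual if actual else 0.0)
    return weighted_p, weighted_r


wp, wr = weighted_precision_recall(CONFUSION)
print(wp)  # 37/54, about 0.685
print(wr)  # 6/9, about 0.667, the same value as accuracy
```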

weightedTruePositiveRate

Returns weighted true positive rate (the same quantity as weighted recall, so equal to accuracy and to micro-averaged precision, recall and f-measure).

New in version 1.4.0.