GaussianMixture
- class pyspark.mllib.clustering.GaussianMixture
Learning algorithm for Gaussian Mixtures using the expectation-maximization algorithm.
New in version 1.3.0.
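To make the expectation-maximization loop concrete, here is a minimal single-machine sketch for one-dimensional data. It is not Spark's implementation: the function `em_gmm_1d`, the helper `normal_pdf`, the `initial_means` argument, and the variance floor are all hypothetical and chosen for illustration. The other parameter names only mirror the roles of `convergenceTol`, `maxIterations`, and `seed` described on this page.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, k, convergence_tol=1e-3, max_iterations=100, seed=None,
              initial_means=None):
    """Illustrative EM for a 1-D Gaussian mixture (hypothetical helper, not Spark code)."""
    rng = random.Random(seed)  # seed=None lets Python pick a system-based seed
    n = len(data)
    weights = [1.0 / k] * k
    # Start from the supplied means, or from k randomly sampled data points.
    means = list(initial_means) if initial_means is not None else rng.sample(list(data), k)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n or 1.0
    variances = [overall_var] * k

    prev_ll = -math.inf
    for _ in range(max_iterations):
        # E-step: responsibilities of each component for each point,
        # accumulating the log-likelihood under the current parameters.
        resp = []
        ll = 0.0
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            ll += math.log(total)
            resp.append([d / total for d in dens])

        # M-step: re-estimate weights, means, and variances from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # assumed floor to avoid a zero variance

        # Stop once the log-likelihood changes by less than the tolerance.
        if abs(ll - prev_ll) < convergence_tol:
            break
        prev_ll = ll
    return weights, means, variances
```

The sketch omits safeguards a production implementation would need, such as handling components that receive no responsibility or points whose density underflows to zero under every component.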
Methods
- train(rdd, k[, convergenceTol, ...]): Train a Gaussian Mixture clustering model.
Methods Documentation
- classmethod train(rdd, k, convergenceTol=0.001, maxIterations=100, seed=None, initialModel=None)
Train a Gaussian Mixture clustering model.
New in version 1.3.0.
- Parameters
- rdd: pyspark.RDD
Training points as an RDD of pyspark.mllib.linalg.Vector or convertible sequence types.
- k: int
Number of independent Gaussians in the mixture model.
- convergenceTol: float, optional
Maximum change in log-likelihood at which convergence is considered to have occurred. (default: 1e-3)
- maxIterations: int, optional
Maximum number of iterations allowed. (default: 100)
- seed: int, optional
Random seed for the initial Gaussian distribution. Set to None to generate a seed based on the system time. (default: None)
- initialModel: GaussianMixtureModel, optional
Initial GMM starting point, bypassing the random initialization. (default: None)
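The parameters above can be exercised with a short call to `GaussianMixture.train`. The following is a minimal sketch, assuming a working PySpark installation and an available `SparkContext`. The helper names `train_gmm` and `demo` and the sample points are hypothetical and used only for illustration. The imports are placed inside the functions so that the helpers can be defined even where PySpark is not installed.

```python
def train_gmm(sc, points, k=2, seed=10):
    """Train a Gaussian mixture on `points` using an existing SparkContext `sc`."""
    # Deferred imports: PySpark is only required when the helper is called.
    from pyspark.mllib.clustering import GaussianMixture
    from pyspark.mllib.linalg import Vectors

    # Training points as an RDD of dense vectors.
    rdd = sc.parallelize([Vectors.dense(p) for p in points])

    # convergenceTol and maxIterations are spelled out at their documented defaults;
    # a fixed seed makes the random initialization repeatable.
    model = GaussianMixture.train(
        rdd, k, convergenceTol=1e-3, maxIterations=100, seed=seed
    )

    # Hard cluster assignment for each training point.
    labels = model.predict(rdd).collect()
    return model, labels


def demo():
    """Run the helper on a few made-up 2-D points (illustrative data only)."""
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()
    sample = [
        (-0.1, -0.05), (-0.01, -0.1), (0.1, 0.05),
        (9.9, 9.95), (10.1, 10.0), (10.05, 9.9),
    ]
    model, labels = train_gmm(sc, sample)
    # Each mixture component has a weight, a mean vector, and a covariance matrix.
    for weight, gaussian in zip(model.weights, model.gaussians):
        print(weight, gaussian.mu, gaussian.sigma)
    print(labels)
    sc.stop()
```

Calling `demo()` in a session where PySpark is installed prints the fitted component parameters and the cluster index assigned to each point. To start from a previously trained model rather than a random initialization, pass it as `initialModel`.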