Spark 1.1.0 Python API Docs
Class Hierarchy
SocketServer.BaseServer: Base class for server classes.
collections.Iterable
pyspark.mllib.util.MLUtils: Helper methods to load, save and pre-process data used in MLlib.
pyspark.mllib.random.RandomRDDs: Generator methods for creating RDDs of i.i.d. samples from some distribution.
pyspark.sql.SQLContext: Main entry point for Spark SQL functionality.
pyspark.storagelevel.StorageLevel: Flags for controlling the storage of an RDD.
object: The most base type
    pyspark.mllib.recommendation.ALS
    pyspark.accumulators.Accumulator: A shared variable that can be accumulated, i.e., has a commutative and associative "add" operation.
    pyspark.accumulators.AccumulatorParam: Helper object that defines how to accumulate values of a given type.
    pyspark.broadcast.Broadcast: A broadcast variable created with SparkContext.broadcast().
    pyspark.sql.DataType: Spark SQL DataType
    pyspark.mllib.tree.DecisionTree: Learning algorithm for a decision tree model for classification or regression.
    pyspark.mllib.tree.DecisionTreeModel: A decision tree model for classification or regression.
    pyspark.mllib.clustering.KMeans
    pyspark.mllib.clustering.KMeansModel: A clustering model derived from the k-means method.
    pyspark.mllib.regression.LabeledPoint: The features and labels of a data point.
    pyspark.mllib.regression.LassoWithSGD
    pyspark.mllib.regression.LinearModel: A linear model that has a vector of coefficients and an intercept.
    pyspark.mllib.regression.LinearRegressionWithSGD
    pyspark.mllib.classification.LogisticRegressionWithSGD
    pyspark.mllib.recommendation.MatrixFactorizationModel: A matrix factorization model trained by regularized alternating least-squares.
    pyspark.mllib.stat.MultivariateStatisticalSummary: Trait for multivariate statistical summary of a data matrix.
    pyspark.mllib.classification.NaiveBayes
    pyspark.mllib.classification.NaiveBayesModel: Model for Naive Bayes classifiers.
    pyspark.rdd.RDD: A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.
    pyspark.mllib.regression.RidgeRegressionWithSGD
    pyspark.mllib.classification.SVMWithSGD
    pyspark.serializers.Serializer
    pyspark.conf.SparkConf: Configuration for a Spark application.
    pyspark.context.SparkContext: Main entry point for Spark functionality.
    pyspark.files.SparkFiles: Resolves paths to files added through SparkContext.addFile().
    pyspark.mllib.linalg.SparseVector: A simple sparse vector class for passing data to MLlib.
    pyspark.statcounter.StatCounter
    pyspark.mllib.stat.Statistics
    pyspark.mllib.linalg.Vectors: Factory methods for working with vectors.
    tuple: tuple() -> empty tuple; tuple(iterable) -> tuple initialized from iterable's items
    type: type(object) -> the object's type; type(name, bases, dict) -> a new type
Generated by Epydoc 3.0.1 on Thu Sep 11 01:19:40 2014 (http://epydoc.sourceforge.net)