Spark 1.1.0 Python API Docs
Class Hierarchy
SocketServer.BaseServer: Base class for server classes.
collections.Iterable
pyspark.mllib.util.MLUtils: Helper methods to load, save and pre-process data used in MLlib.
pyspark.mllib.random.RandomRDDs: Generator methods for creating RDDs of i.i.d. samples from some distribution.
pyspark.sql.SQLContext: Main entry point for Spark SQL functionality.
pyspark.storagelevel.StorageLevel: Flags for controlling the storage of an RDD.
object: The most base type
    pyspark.mllib.recommendation.ALS
    pyspark.accumulators.Accumulator: A shared variable that can be accumulated, i.e., has a commutative and associative "add" operation.
    pyspark.accumulators.AccumulatorParam: Helper object that defines how to accumulate values of a given type.
    pyspark.broadcast.Broadcast: A broadcast variable created with SparkContext.broadcast().
    pyspark.sql.DataType: Spark SQL DataType
    pyspark.mllib.tree.DecisionTree: Learning algorithm for a decision tree model for classification or regression.
    pyspark.mllib.tree.DecisionTreeModel: A decision tree model for classification or regression.
    pyspark.mllib.clustering.KMeans
    pyspark.mllib.clustering.KMeansModel: A clustering model derived from the k-means method.
    pyspark.mllib.regression.LabeledPoint: The features and labels of a data point.
    pyspark.mllib.regression.LassoWithSGD
    pyspark.mllib.regression.LinearModel: A linear model that has a vector of coefficients and an intercept.
    pyspark.mllib.regression.LinearRegressionWithSGD
    pyspark.mllib.classification.LogisticRegressionWithSGD
    pyspark.mllib.recommendation.MatrixFactorizationModel: A matrix factorization model trained by regularized alternating least-squares.
    pyspark.mllib.stat.MultivariateStatisticalSummary: Trait for multivariate statistical summary of a data matrix.
    pyspark.mllib.classification.NaiveBayes
    pyspark.mllib.classification.NaiveBayesModel: Model for Naive Bayes classifiers.
    pyspark.rdd.RDD: A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.
    pyspark.mllib.regression.RidgeRegressionWithSGD
    pyspark.mllib.classification.SVMWithSGD
    pyspark.serializers.Serializer
    pyspark.conf.SparkConf: Configuration for a Spark application.
    pyspark.context.SparkContext: Main entry point for Spark functionality.
    pyspark.files.SparkFiles: Resolves paths to files added through SparkContext.addFile().
    pyspark.mllib.linalg.SparseVector: A simple sparse vector class for passing data to MLlib.
    pyspark.statcounter.StatCounter
    pyspark.mllib.stat.Statistics
    pyspark.mllib.linalg.Vectors: Factory methods for working with vectors.
    tuple: tuple() -> empty tuple; tuple(iterable) -> tuple initialized from iterable's items
    type: type(object) -> the object's type; type(name, bases, dict) -> a new type
Generated by Epydoc 3.0.1 on Thu Sep 11 01:19:40 2014 (http://epydoc.sourceforge.net)