DenseVector

class pyspark.mllib.linalg.DenseVector(ar: Union[bytes, numpy.ndarray, Iterable[float]])

A dense vector represented by a value array. Values are stored in a NumPy array, and arithmetic is delegated to that underlying array.

Examples

>>> v = Vectors.dense([1.0, 2.0])
>>> u = Vectors.dense([3.0, 4.0])
>>> v + u
DenseVector([4.0, 6.0])
>>> 2 - v
DenseVector([1.0, 0.0])
>>> v / 2
DenseVector([0.5, 1.0])
>>> v * u
DenseVector([3.0, 8.0])
>>> u / v
DenseVector([3.0, 2.0])
>>> u % 2
DenseVector([1.0, 0.0])
>>> -v
DenseVector([-1.0, -2.0])
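
The examples above all act element by element, with a scalar operand applied to every entry. As a rough illustration of that behaviour only, the sketch below reproduces three of the results with plain Python lists. The helper name `elementwise_sketch` and its `ValueError` on mismatched lengths are assumptions made for this sketch, not part of the PySpark API.

```python
import operator


def elementwise_sketch(op, left, right):
    """Apply op pairwise; a scalar operand is paired with every element.

    Hypothetical helper for illustration. At least one operand must be a
    list or tuple.
    """
    left_is_seq = isinstance(left, (list, tuple))
    right_is_seq = isinstance(right, (list, tuple))
    if left_is_seq and right_is_seq:
        if len(left) != len(right):
            # Error type chosen for this sketch only.
            raise ValueError("operands must have the same length")
        return [op(x, y) for x, y in zip(left, right)]
    if left_is_seq:
        return [op(x, right) for x in left]
    return [op(left, y) for y in right]


v, u = [1.0, 2.0], [3.0, 4.0]
print(elementwise_sketch(operator.add, v, u))  # [4.0, 6.0]
print(elementwise_sketch(operator.sub, 2, v))  # [1.0, 0.0]
print(elementwise_sketch(operator.mod, u, 2))  # [1.0, 0.0]
```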

Methods

asML()

Convert this vector to the new mllib-local representation.

dot(other)

Compute the dot product of two Vectors.

norm(p)

Calculates the norm of a DenseVector.

numNonzeros()

Number of nonzero elements.

parse(s)

Parse string representation back into the DenseVector.

squared_distance(other)

Squared distance of two Vectors.

toArray()

Returns a numpy.ndarray.

Attributes

values

Returns the values of the vector as a numpy.ndarray.

Methods Documentation

asML() → pyspark.ml.linalg.DenseVector

Convert this vector to the new mllib-local representation. This does NOT copy the data; it copies references.

New in version 2.0.0.

Returns
pyspark.ml.linalg.DenseVector

dot(other: Iterable[float]) → numpy.float64

Compute the dot product of two Vectors. The argument may be a NumPy array, list, SparseVector, or SciPy sparse matrix; a target NumPy array may be either 1- or 2-dimensional. Equivalent to calling numpy.dot on the two vectors.

Examples

>>> dense = DenseVector(array.array('d', [1., 2.]))
>>> dense.dot(dense)
5.0
>>> dense.dot(SparseVector(2, [0, 1], [2., 1.]))
4.0
>>> dense.dot(range(1, 3))
5.0
>>> dense.dot(np.array(range(1, 3)))
5.0
>>> dense.dot([1.,])
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
>>> dense.dot(np.reshape([1., 2., 3., 4.], (2, 2), order='F'))
array([  5.,  11.])
>>> dense.dot(np.reshape([1., 2., 3.], (3, 1), order='F'))
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
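
For the 1-dimensional case, the computation and the size check shown above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: `dot_sketch` is a hypothetical name, and it handles only flat sequences, not SparseVector or 2-dimensional targets.

```python
from array import array


def dot_sketch(a, b):
    """Dot product of two equal-length 1-D sequences (illustration only)."""
    a, b = list(a), list(b)
    # Mirror the documented behaviour: mismatched sizes are rejected.
    assert len(a) == len(b), "dimension mismatch"
    return sum(x * y for x, y in zip(a, b))


dense = array('d', [1.0, 2.0])
print(dot_sketch(dense, dense))        # 5.0
print(dot_sketch(dense, range(1, 3)))  # 5.0
```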

norm(p: NormType) → numpy.float64

Calculates the norm of a DenseVector.

Examples

>>> a = DenseVector([0, -1, 2, -3])
>>> a.norm(2)
3.7...
>>> a.norm(1)
6.0
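
The two results above match the usual p-norm definition: the p-th root of the sum of absolute values raised to the power p. A minimal sketch, assuming a finite p >= 1 and plain Python sequences; `p_norm_sketch` is a hypothetical helper, and the full set of values accepted by `NormType` is not covered here.

```python
def p_norm_sketch(values, p):
    """p-norm of a 1-D sequence for finite p >= 1 (illustration only)."""
    return sum(abs(x) ** p for x in values) ** (1.0 / p)


a = [0, -1, 2, -3]
print(p_norm_sketch(a, 1))  # 6.0
print(p_norm_sketch(a, 2))  # square root of 14, roughly 3.74
```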

numNonzeros() → int

Number of nonzero elements. This scans all active values and counts the nonzero ones.
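
For a dense vector every stored entry is an active value, so the count amounts to a single pass over the array. A small sketch of that idea, using a hypothetical helper name and a plain list in place of the vector:

```python
def num_nonzeros_sketch(values):
    """Count entries that are not equal to zero (illustration only)."""
    return sum(1 for x in values if x != 0)


print(num_nonzeros_sketch([0.0, -1.0, 2.0, 0.0]))  # 2
```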

static parse(s: str) → pyspark.mllib.linalg.DenseVector

Parse string representation back into the DenseVector.

Examples

>>> DenseVector.parse(' [ 0.0,1.0,2.0,  3.0]')
DenseVector([0.0, 1.0, 2.0, 3.0])
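
As the example suggests, surrounding and interior whitespace is tolerated. The sketch below shows one way such a string could be turned into a list of floats. It is not the library's parser: `parse_dense_sketch` is a hypothetical name, and raising `ValueError` for malformed input is an assumption of this sketch.

```python
def parse_dense_sketch(s):
    """Parse '[v0, v1, ...]' into a list of floats (illustration only)."""
    s = s.strip()
    if not (s.startswith('[') and s.endswith(']')):
        # Error handling chosen for this sketch only.
        raise ValueError("expected a string of the form '[v0, v1, ...]'")
    body = s[1:-1].strip()
    if not body:
        return []
    # float() ignores whitespace around each token.
    return [float(token) for token in body.split(',')]


print(parse_dense_sketch(' [ 0.0,1.0,2.0,  3.0]'))  # [0.0, 1.0, 2.0, 3.0]
```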

squared_distance(other: Iterable[float]) → numpy.float64

Squared distance of two Vectors.

Examples

>>> dense1 = DenseVector(array.array('d', [1., 2.]))
>>> dense1.squared_distance(dense1)
0.0
>>> dense2 = np.array([2., 1.])
>>> dense1.squared_distance(dense2)
2.0
>>> dense3 = [2., 1.]
>>> dense1.squared_distance(dense3)
2.0
>>> sparse1 = SparseVector(2, [0, 1], [2., 1.])
>>> dense1.squared_distance(sparse1)
2.0
>>> dense1.squared_distance([1.,])
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
>>> dense1.squared_distance(SparseVector(1, [0,], [1.,]))
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
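
The values above are consistent with the squared Euclidean distance, that is, the sum of squared element-wise differences. A minimal sketch for flat sequences follows; `squared_distance_sketch` is a hypothetical helper and does not handle SparseVector inputs.

```python
def squared_distance_sketch(a, b):
    """Sum of squared differences of two equal-length sequences (illustration only)."""
    a, b = list(a), list(b)
    # Mirror the documented behaviour: mismatched sizes are rejected.
    assert len(a) == len(b), "dimension mismatch"
    return sum((x - y) ** 2 for x, y in zip(a, b))


print(squared_distance_sketch([1.0, 2.0], [1.0, 2.0]))  # 0.0
print(squared_distance_sketch([1.0, 2.0], [2.0, 1.0]))  # 2.0
```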

toArray() → numpy.ndarray

Returns a numpy.ndarray.

Attributes Documentation

values

Returns the values of the vector as a numpy.ndarray.