DenseVector

class pyspark.mllib.linalg.DenseVector(ar: Union[bytes, numpy.ndarray, Iterable[float]])

A dense vector represented by a value array. Storage uses a NumPy array, and arithmetic is delegated to that underlying array.

Examples

>>> v = Vectors.dense([1.0, 2.0])
>>> u = Vectors.dense([3.0, 4.0])
>>> v + u
DenseVector([4.0, 6.0])
>>> 2 - v
DenseVector([1.0, 0.0])
>>> v / 2
DenseVector([0.5, 1.0])
>>> v * u
DenseVector([3.0, 8.0])
>>> u / v
DenseVector([3.0, 2.0])
>>> u % 2
DenseVector([1.0, 0.0])
>>> -v
DenseVector([-1.0, -2.0])
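
Because arithmetic is delegated to a NumPy array, each operator above acts elementwise. The following is a minimal sketch that uses plain NumPy arrays in place of DenseVector (an assumption made so the snippet runs without a Spark installation); it reproduces the same values:

```python
import numpy as np

# Plain NumPy arrays standing in for the DenseVector storage (illustrative only).
v = np.array([1.0, 2.0])
u = np.array([3.0, 4.0])

added = v + u        # elementwise sum
product = v * u      # elementwise product, not a dot product
remainder = u % 2    # elementwise modulo
reflected = 2 - v    # a scalar on the left is also applied elementwise
```

Note that `*` is the elementwise product; use dot for the inner product.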

Methods

`asML`()	Convert this vector to the new mllib-local representation.
`dot`(other)	Compute the dot product of two Vectors.
`norm`(p)	Calculates the norm of a DenseVector.
`numNonzeros`()	Number of nonzero elements.
`parse`(s)	Parse string representation back into the DenseVector.
`squared_distance`(other)	Squared distance of two Vectors.
`toArray`()	Returns a numpy.ndarray.

Attributes

values

Returns the vector's values as a numpy.ndarray.

Methods Documentation

asML() → pyspark.ml.linalg.DenseVector

Convert this vector to the new mllib-local representation. This does NOT copy the data; it copies references.

New in version 2.0.0.

Returns

pyspark.ml.linalg.DenseVector

dot(other: Iterable[float]) → numpy.float64

Compute the dot product of two Vectors. The other operand may be a NumPy array, a list, a SparseVector, or a SciPy sparse matrix; a NumPy array operand may be 1- or 2-dimensional. Equivalent to calling numpy.dot on the two vectors.

Examples

>>> dense = DenseVector(array.array('d', [1., 2.]))
>>> dense.dot(dense)
5.0
>>> dense.dot(SparseVector(2, [0, 1], [2., 1.]))
4.0
>>> dense.dot(range(1, 3))
5.0
>>> dense.dot(np.array(range(1, 3)))
5.0
>>> dense.dot([1.,])
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
>>> dense.dot(np.reshape([1., 2., 3., 4.], (2, 2), order='F'))
array([  5.,  11.])
>>> dense.dot(np.reshape([1., 2., 3.], (3, 1), order='F'))
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
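
Since dot is described as equivalent to numpy.dot, the 1-D and 2-D cases can be sketched with NumPy alone. This is an illustration on plain arrays, not the DenseVector code path itself, and it assumes the dimension check compares the vector length with the first axis of the target, as the examples above suggest:

```python
import numpy as np

dense = np.array([1.0, 2.0])

# 1-D target: lengths must match, and the result is a scalar.
scalar = np.dot(dense, np.array([1.0, 2.0]))

# 2-D target: the number of rows must equal the vector length,
# and the result has one entry per column.
matrix = np.reshape([1.0, 2.0, 3.0, 4.0], (2, 2), order="F")
per_column = np.dot(dense, matrix)
```

With order='F' the matrix is filled column by column, so its columns are [1, 2] and [3, 4].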

norm(p: NormType) → numpy.float64

Calculates the norm of a DenseVector.

Examples

>>> a = DenseVector([0, -1, 2, -3])
>>> a.norm(2)
3.7...
>>> a.norm(1)
6.0
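
For finite p, the values above follow the usual p-norm formula, (sum of |x_i|^p)^(1/p). A minimal sketch, where p_norm is a hypothetical helper written for illustration and not part of PySpark:

```python
import numpy as np

def p_norm(values, p):
    """Hypothetical helper: (sum |x_i|^p)^(1/p) for finite p >= 1."""
    arr = np.abs(np.asarray(values, dtype=float))
    return float(np.sum(arr ** p) ** (1.0 / p))

a = [0, -1, 2, -3]
l2 = p_norm(a, 2)   # sqrt(0 + 1 + 4 + 9)
l1 = p_norm(a, 1)   # 0 + 1 + 2 + 3
```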

numNonzeros() → int

Number of nonzero elements. This scans all active values and counts the nonzeros.
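
As an illustration of the count on a plain NumPy array (assumed here as a stand-in for the vector's storage), numpy.count_nonzero gives the same kind of result:

```python
import numpy as np

values = np.array([0.0, -1.0, 0.0, 3.0])
# Counts the entries that are not equal to zero.
nonzeros = int(np.count_nonzero(values))
```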

static parse(s: str) → pyspark.mllib.linalg.DenseVector

Parse string representation back into the DenseVector.

Examples

>>> DenseVector.parse(' [ 0.0,1.0,2.0,  3.0]')
DenseVector([0.0, 1.0, 2.0, 3.0])
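
The example shows that whitespace around the brackets and between values is tolerated. A rough pure-Python sketch of that kind of parsing follows; parse_dense is a hypothetical function for illustration, returns a plain list, and is not presented as the library's implementation:

```python
def parse_dense(s):
    """Hypothetical parser: extract floats from a '[a, b, ...]' string."""
    start = s.find("[")
    end = s.find("]")
    if start == -1 or end == -1 or end < start:
        raise ValueError("expected a bracketed list such as '[0.0, 1.0]'")
    body = s[start + 1:end]
    # float() ignores surrounding whitespace, so '  3.0' parses cleanly.
    return [float(tok) for tok in body.split(",") if tok.strip()]

parsed = parse_dense(" [ 0.0,1.0,2.0,  3.0]")
```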

squared_distance(other: Iterable[float]) → numpy.float64

Squared distance of two Vectors.

Examples

>>> dense1 = DenseVector(array.array('d', [1., 2.]))
>>> dense1.squared_distance(dense1)
0.0
>>> dense2 = np.array([2., 1.])
>>> dense1.squared_distance(dense2)
2.0
>>> dense3 = [2., 1.]
>>> dense1.squared_distance(dense3)
2.0
>>> sparse1 = SparseVector(2, [0, 1], [2., 1.])
>>> dense1.squared_distance(sparse1)
2.0
>>> dense1.squared_distance([1.,])
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
>>> dense1.squared_distance(SparseVector(1, [0,], [1.,]))
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
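
The squared distance is the sum of squared elementwise differences. A minimal sketch on plain NumPy arrays; the squared_distance function below is a hypothetical stand-in for illustration, and it raises ValueError on mismatched sizes where the method above raises AssertionError:

```python
import numpy as np

def squared_distance(a, b):
    """Hypothetical helper: sum of squared elementwise differences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("dimension mismatch")
    diff = a - b
    return float(np.dot(diff, diff))

d_same = squared_distance([1.0, 2.0], [1.0, 2.0])
d_swap = squared_distance([1.0, 2.0], [2.0, 1.0])
```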

toArray() → numpy.ndarray

Returns a numpy.ndarray.

Attributes Documentation

values

Returns the vector's values as a numpy.ndarray.
