pyspark.mllib.linalg.DenseVector
A dense vector represented by a value array. A NumPy array is used for storage, and arithmetic is delegated to the underlying NumPy array.
Examples
>>> v = Vectors.dense([1.0, 2.0])
>>> u = Vectors.dense([3.0, 4.0])
>>> v + u
DenseVector([4.0, 6.0])
>>> 2 - v
DenseVector([1.0, 0.0])
>>> v / 2
DenseVector([0.5, 1.0])
>>> v * u
DenseVector([3.0, 8.0])
>>> u / v
DenseVector([3.0, 2.0])
>>> u % 2
DenseVector([1.0, 0.0])
>>> -v
DenseVector([-1.0, -2.0])
Methods
asML()
Convert this vector to the new mllib-local representation.
dot(other)
Compute the dot product of two Vectors.
norm(p)
Calculates the norm of a DenseVector.
numNonzeros()
Number of nonzero elements.
parse(s)
Parse string representation back into the DenseVector.
squared_distance(other)
Squared distance of two Vectors.
toArray()
Returns a numpy.ndarray.
Attributes
values
Returns the values of the vector (the underlying NumPy array).
Methods Documentation
Convert this vector to the new mllib-local representation. This does NOT copy the data; it copies references.
New in version 2.0.0.
Returns: pyspark.ml.linalg.DenseVector
Compute the dot product of two Vectors. The other operand may be a NumPy array, list, SparseVector, or SciPy sparse matrix; a NumPy array operand may be either 1- or 2-dimensional. Equivalent to calling numpy.dot on the two vectors.
>>> dense = DenseVector(array.array('d', [1., 2.]))
>>> dense.dot(dense)
5.0
>>> dense.dot(SparseVector(2, [0, 1], [2., 1.]))
4.0
>>> dense.dot(range(1, 3))
5.0
>>> dense.dot(np.array(range(1, 3)))
5.0
>>> dense.dot([1.,])
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
>>> dense.dot(np.reshape([1., 2., 3., 4.], (2, 2), order='F'))
array([ 5., 11.])
>>> dense.dot(np.reshape([1., 2., 3.], (3, 1), order='F'))
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
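To make the arithmetic behind the dot examples concrete, here is a minimal pure-Python sketch. It is not PySpark's implementation: the helper names `dot_dense_dense` and `dot_dense_sparse` are hypothetical, and the sparse operand is assumed to be described by the same (size, indices, values) triple passed to SparseVector in the example.

```python
def dot_dense_dense(a, b):
    """Dot product of two equal-length sequences of numbers."""
    # Mirror the dimension check shown in the doctest examples.
    assert len(a) == len(b), "dimension mismatch"
    return sum(x * y for x, y in zip(a, b))


def dot_dense_sparse(dense, size, indices, values):
    """Dot product of a dense sequence with a sparse vector
    given as (size, indices, values). Hypothetical helper."""
    assert len(dense) == size, "dimension mismatch"
    # Only the stored (active) sparse entries contribute to the sum.
    return sum(dense[i] * v for i, v in zip(indices, values))


result_dense = dot_dense_dense([1.0, 2.0], [1.0, 2.0])
result_sparse = dot_dense_sparse([1.0, 2.0], 2, [0, 1], [2.0, 1.0])
```

Under these assumptions, `result_dense` and `result_sparse` reproduce the 5.0 and 4.0 shown for the dense and SparseVector operands.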
>>> a = DenseVector([0, -1, 2, -3])
>>> a.norm(2)
3.7...
>>> a.norm(1)
6.0
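The values in the norm example can be checked by hand. The sketch below assumes the usual p-norm definition, (sum of |x|^p)^(1/p), with the maximum absolute value for p = infinity; `p_norm` is a hypothetical helper, not part of PySpark.

```python
import math


def p_norm(values, p):
    """p-norm of a sequence of numbers (hypothetical helper)."""
    if p == math.inf:
        # The infinity norm is the largest absolute component.
        return max(abs(x) for x in values)
    return sum(abs(x) ** p for x in values) ** (1.0 / p)


a = [0, -1, 2, -3]
l2 = p_norm(a, 2)  # square root of 0 + 1 + 4 + 9
l1 = p_norm(a, 1)  # 0 + 1 + 2 + 3
```

Here `l2` is the square root of 14, which matches the truncated 3.7... in the example, and `l1` is 6.0.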
Number of nonzero elements. This scans all active values and counts the nonzeros.
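As a rough illustration of the scan described above, the following pure-Python sketch counts the nonzero entries of a plain list. The name `num_nonzeros` is hypothetical and the code is not PySpark's implementation.

```python
def num_nonzeros(values):
    """Count entries that are not equal to zero (hypothetical helper)."""
    # One pass over every stored value.
    return sum(1 for x in values if x != 0)


count = num_nonzeros([0.0, -1.0, 2.0, 0.0])
```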
>>> DenseVector.parse(' [ 0.0,1.0,2.0, 3.0]')
DenseVector([0.0, 1.0, 2.0, 3.0])
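The parse example tolerates stray whitespace around the brackets and values. One way such parsing could work is sketched below: `parse_dense` is a hypothetical stand-in that returns a plain list of floats rather than a DenseVector, and its error message is an assumption, not PySpark's.

```python
def parse_dense(s):
    """Parse a string such as ' [ 0.0,1.0,2.0, 3.0]' into a list of floats.
    Hypothetical helper for illustration only."""
    start = s.find("[")
    end = s.find("]")
    if start == -1 or end == -1 or end < start:
        raise ValueError("expected a bracketed list such as '[1.0, 2.0]'")
    body = s[start + 1:end]
    # float() ignores surrounding whitespace; skip empty tokens.
    return [float(tok) for tok in body.split(",") if tok.strip()]


parsed = parse_dense(" [ 0.0,1.0,2.0, 3.0]")
```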
>>> dense1 = DenseVector(array.array('d', [1., 2.]))
>>> dense1.squared_distance(dense1)
0.0
>>> dense2 = np.array([2., 1.])
>>> dense1.squared_distance(dense2)
2.0
>>> dense3 = [2., 1.]
>>> dense1.squared_distance(dense3)
2.0
>>> sparse1 = SparseVector(2, [0, 1], [2., 1.])
>>> dense1.squared_distance(sparse1)
2.0
>>> dense1.squared_distance([1.,])
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
>>> dense1.squared_distance(SparseVector(1, [0,], [1.,]))
Traceback (most recent call last):
    ...
AssertionError: dimension mismatch
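The squared distance in these examples is the sum of squared component-wise differences. A minimal pure-Python sketch follows, assuming both operands are plain sequences of equal length; the function is a hypothetical stand-in, not PySpark's implementation.

```python
def squared_distance(a, b):
    """Sum of squared component-wise differences (hypothetical helper)."""
    # Mirror the dimension check shown in the doctest examples.
    assert len(a) == len(b), "dimension mismatch"
    return sum((x - y) ** 2 for x, y in zip(a, b))


d_same = squared_distance([1.0, 2.0], [1.0, 2.0])
d_diff = squared_distance([1.0, 2.0], [2.0, 1.0])
```

With these inputs, `d_same` is 0.0 and `d_diff` is (1 - 2)^2 + (2 - 1)^2 = 2.0, in line with the doctest output.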
Attributes Documentation