Series.corr
Compute correlation with other Series, excluding missing values.
pearson : standard correlation coefficient
spearman : Spearman rank correlation
Notes
There are behavior differences between pandas-on-Spark and pandas:
The method argument only accepts ‘pearson’ and ‘spearman’.
The data should not contain NaNs. If it does, pandas-on-Spark will return an error.
pandas-on-Spark does not support the following argument:
min_periods
Examples
>>> df = ps.DataFrame({'s1': [.2, .0, .6, .2],
...                    's2': [.3, .6, .0, .1]})
>>> s1 = df.s1
>>> s2 = df.s2
>>> s1.corr(s2, method='pearson')
-0.851064...
>>> s1.corr(s2, method='spearman')
-0.948683...
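As a cross-check on the two values above, the following is a minimal pure-Python sketch of both coefficients. It is not how pandas-on-Spark computes them internally. The helper names (`pearson`, `average_ranks`, `spearman`) are hypothetical, and the tie handling (tied values share the average of their ranks) is an assumption, chosen because it reproduces the documented output for this data.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences of floats."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Root sums of squared deviations (unnormalised standard deviations).
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def average_ranks(values):
    """1-based ranks; tied values share the average rank (assumed rule)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Same data as the example above, as plain Python lists.
s1 = [.2, .0, .6, .2]
s2 = [.3, .6, .0, .1]

print(round(pearson(s1, s2), 6))   # -0.851064
print(round(spearman(s1, s2), 6))  # -0.948683
```

The sketch assumes the inputs contain no NaNs, in line with the Notes section; it does no missing-value handling of its own.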