DataFrame.
quantile
Return value at the given quantile.
Note
Unlike pandas’, the quantile in pandas-on-Spark is an approximated quantile based upon approximate percentile computation because computing quantile across a large dataset is extremely expensive.
0 <= q <= 1, the quantile(s) to compute.
Can only be set to 0 at the moment.
If False, the quantile of datetime and timedelta data will be computed as well. Can only be set to True at the moment.
Default accuracy of approximation. Larger value means better accuracy. The relative error can be deduced by 1.0 / accuracy.
If q is an array, a DataFrame will be returned where the index is q, the columns are the columns of self, and the values are the quantiles. If q is a float, a Series will be returned where the index is the columns of self and the values are the quantiles.
Examples
>>> psdf = ps.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [6, 7, 8, 9, 0]}) >>> psdf a b 0 1 6 1 2 7 2 3 8 3 4 9 4 5 0
>>> psdf.quantile(.5) a 3.0 b 7.0 Name: 0.5, dtype: float64
>>> psdf.quantile([.25, .5, .75]) a b 0.25 2.0 6.0 0.50 3.0 7.0 0.75 4.0 8.0