pyspark.sql.functions.width_bucket

pyspark.sql.functions.width_bucket(v: ColumnOrName, min: ColumnOrName, max: ColumnOrName, numBucket: Union[ColumnOrName, int]) → pyspark.sql.column.Column

Returns the bucket number into which the value of this expression would fall after being evaluated. Note that the input arguments must satisfy the conditions listed below; otherwise, the method returns null.
New in version 3.5.0.
- Parameters
v : Column or str
value in the histogram for which to compute the bucket number
min : Column or str
minimum value of the histogram
max : Column or str
maximum value of the histogram
numBucket : Column, str or int
the number of buckets
- Returns
Column
the bucket number into which the value would fall after being evaluated
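For the ascending case (min < max), the result follows the usual equal-width rule: the range [min, max) is divided into numBucket intervals, values below the range map to bucket 0, and values at or above max map to numBucket + 1. Below is a minimal Python sketch of that rule, an illustrative reimplementation rather than Spark's actual code; it omits the descending min > max case exercised by the last example row below.

import math

def width_bucket_py(v, lo, hi, n):
    # Illustrative equal-width bucketing for the ascending case (lo < hi).
    # Mirrors the semantics shown in the Examples section; not Spark's code.
    if n <= 0 or lo == hi:
        return None                # invalid inputs yield null in Spark
    if v < lo:
        return 0                   # below the histogram range
    if v >= hi:
        return n + 1               # at or above the histogram range
    return int(math.floor((v - lo) / ((hi - lo) / n))) + 1

assert width_bucket_py(5.3, 0.2, 10.6, 5) == 3   # first example row below
assert width_bucket_py(-2.1, 1.3, 3.4, 3) == 0   # below range -> bucket 0
assert width_bucket_py(8.1, 0.0, 5.7, 4) == 5    # above range -> n + 1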
Examples
>>> df = spark.createDataFrame([
...     (5.3, 0.2, 10.6, 5),
...     (-2.1, 1.3, 3.4, 3),
...     (8.1, 0.0, 5.7, 4),
...     (-0.9, 5.2, 0.5, 2)],
...     ['v', 'min', 'max', 'n'])
>>> df.select(width_bucket('v', 'min', 'max', 'n')).show()
+----------------------------+
|width_bucket(v, min, max, n)|
+----------------------------+
|                           3|
|                           0|
|                           5|
|                           3|
+----------------------------+
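The signature also accepts a plain Python int for numBucket, and, per the note above, invalid inputs such as a non-positive bucket count yield null rather than an error. A short sketch of both behaviors, assuming the same SparkSession spark as above; the expected outputs are inferred from the semantics described here, not copied from Spark's reference output:

>>> from pyspark.sql.functions import width_bucket
>>> df2 = spark.createDataFrame([(5.3, 0.2, 10.6)], ['v', 'min', 'max'])
>>> df2.select(width_bucket('v', 'min', 'max', 5).alias('b')).first()
Row(b=3)
>>> df2.select(width_bucket('v', 'min', 'max', 0).alias('b')).first()
Row(b=None)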