pyspark.sql.functions.median

pyspark.sql.functions.median(col: ColumnOrName) → pyspark.sql.column.Column

Returns the median of the values in a group.
New in version 3.4.0.
Parameters
col : Column or str
    Target column to compute on.
Returns
Column
    The median of the values in a group.
Notes
Supports Spark Connect.
Examples
>>> from pyspark.sql.functions import median
>>> df = spark.createDataFrame([
...     ("Java", 2012, 20000), ("dotNET", 2012, 5000),
...     ("Java", 2012, 22000), ("dotNET", 2012, 10000),
...     ("dotNET", 2013, 48000), ("Java", 2013, 30000)],
...     schema=("course", "year", "earnings"))
>>> df.groupby("course").agg(median("earnings")).show()
+------+----------------+
|course|median(earnings)|
+------+----------------+
|  Java|         22000.0|
|dotNET|         10000.0|
+------+----------------+
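
To make the grouped result above easier to check by hand, the following plain-Python sketch recomputes the per-course median of the same rows with the standard library's statistics.median. It only illustrates the semantics and makes no claim about how Spark evaluates the aggregate. The helper name group_medians is hypothetical and not part of PySpark.

```python
from collections import defaultdict
from statistics import median as py_median

# Same rows as the example above: (course, year, earnings).
rows = [
    ("Java", 2012, 20000), ("dotNET", 2012, 5000),
    ("Java", 2012, 22000), ("dotNET", 2012, 10000),
    ("dotNET", 2013, 48000), ("Java", 2013, 30000),
]


def group_medians(rows):
    """Hypothetical helper: median of earnings per course, as floats."""
    groups = defaultdict(list)
    for course, _year, earnings in rows:
        groups[course].append(earnings)
    # Convert to float to mirror the double-typed column shown above.
    return {course: float(py_median(values)) for course, values in groups.items()}


medians = group_medians(rows)
print(medians)  # {'Java': 22000.0, 'dotNET': 10000.0}
```

Each course has three rows here, so the median is simply the middle value once the earnings are sorted. The sketch does not exercise even-sized groups, where statistics.median averages the two middle values.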