pyspark.sql.functions.window_time

pyspark.sql.functions.window_time(windowColumn: ColumnOrName) → pyspark.sql.column.Column

Computes the event time from a window column. Window column values are produced by window aggregating operators and are of type STRUCT<start: TIMESTAMP, end: TIMESTAMP>, where start is inclusive and end is exclusive. For records produced by window aggregating operators, window_time(window) returns the event time window.end - 1 microsecond (microsecond being the minimal supported event time precision). The window column must be one produced by a window aggregating operator.

New in version 3.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
windowColumn : Column

The window column produced by a window aggregating operator.

Returns
Column

the computed event time column.

Examples

>>> import datetime
>>> from pyspark.sql.functions import window, window_time, sum
>>> df = spark.createDataFrame(
...     [(datetime.datetime(2016, 3, 11, 9, 0, 7), 1)],
... ).toDF("date", "val")

Group the data into 5-second time windows and aggregate the values as a sum.

>>> w = df.groupBy(window("date", "5 seconds")).agg(sum("val").alias("sum"))

Extract the window event time using the window_time function.

>>> w.select(
...     w.window.end.cast("string").alias("end"),
...     window_time(w.window).cast("string").alias("window_time"),
...     "sum"
... ).collect()
[Row(end='2016-03-11 09:00:10', window_time='2016-03-11 09:00:09.999999', sum=1)]
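As a sanity check (not part of the original example), the relationship described above can be verified directly: window_time should equal window.end minus one microsecond. The comparison below is an illustrative sketch of that equivalence, and the alias "matches" is hypothetical.

>>> from pyspark.sql.functions import expr
>>> # Illustrative check: window_time(window) == window.end - 1 microsecond
>>> w.select(
...     (window_time(w.window) == expr("window.end - interval 1 microsecond")).alias("matches")
... ).collect()
[Row(matches=True)]

A typical use of window_time is to feed the event time of one window aggregation into a second window, i.e. chained time window aggregations. The following is a minimal batch sketch of that pattern, reusing w from above; the names w2 and "total" are hypothetical, and the output shown follows from the documented semantics (the event time 09:00:09.999999 falls in the 10-second window ending at 09:00:10).

>>> # Re-window the aggregated results into 10-second windows using their event time
>>> w2 = w.groupBy(window(window_time("window"), "10 seconds")).agg(sum("sum").alias("total"))
>>> w2.select(w2.window.end.cast("string").alias("end"), "total").collect()
[Row(end='2016-03-11 09:00:10', total=1)]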