pyspark.sql.UDTFRegistration.register
UDTFRegistration.register(name: str, f: pyspark.sql.udtf.UserDefinedTableFunction) → pyspark.sql.udtf.UserDefinedTableFunction
Register a Python user-defined table function as a SQL table function.
New in version 3.5.0.
Parameters
    name : str
        The name of the user-defined table function in SQL statements.
    f : function or pyspark.sql.functions.udtf()
        The user-defined table function.
Returns
    function
        The registered user-defined table function.
Notes
Spark uses the return type of the given user-defined table function as the return type of the registered user-defined table function.
To register a nondeterministic Python table function, first build a nondeterministic user-defined table function, then register it as a SQL function.
Examples
>>> from pyspark.sql.functions import udtf
>>> @udtf(returnType="c1: int, c2: int")
... class PlusOne:
...     def eval(self, x: int):
...         yield x, x + 1
...
>>> _ = spark.udtf.register(name="plus_one", f=PlusOne)
>>> spark.sql("SELECT * FROM plus_one(1)").collect()
[Row(c1=1, c2=2)]
Use it with a lateral join:
>>> spark.sql("SELECT * FROM VALUES (0, 1), (1, 2) t(x, y), LATERAL plus_one(x)").collect()
[Row(x=0, y=1, c1=0, c2=1), Row(x=1, y=2, c1=1, c2=2)]
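To make the row-by-row behavior of the two queries above easier to follow, here is a minimal plain-Python sketch of what they compute. It does not use Spark and is not how Spark executes a UDTF. The `PlusOne` class is an undecorated copy of the handler from the example, and `lateral_apply` is a hypothetical helper introduced only for illustration. It assumes rows are plain tuples and that a single handler instance serves all rows.

```python
from typing import Iterator, List, Tuple


class PlusOne:
    """Undecorated copy of the handler class from the example above."""

    def eval(self, x: int) -> Iterator[Tuple[int, int]]:
        # Each yielded tuple corresponds to one output row (c1, c2).
        yield x, x + 1


def lateral_apply(rows: List[tuple], handler_cls: type, arg_index: int) -> List[tuple]:
    """Hypothetical helper: emulate `FROM t, LATERAL fn(col)` on tuples.

    For every input row, call `eval` with the chosen column and append
    each produced tuple to a copy of that input row.
    """
    handler = handler_cls()  # assumption: one instance for all rows
    out = []
    for row in rows:
        for produced in handler.eval(row[arg_index]):
            out.append(row + tuple(produced))
    return out


# Equivalent of: SELECT * FROM plus_one(1)
single = list(PlusOne().eval(1))
print(single)  # [(1, 2)]

# Equivalent of: SELECT * FROM VALUES (0, 1), (1, 2) t(x, y), LATERAL plus_one(x)
table = [(0, 1), (1, 2)]
result = lateral_apply(table, PlusOne, arg_index=0)
print(result)  # [(0, 1, 0, 1), (1, 2, 1, 2)]
```

Each result tuple lines up with the `Row(x, y, c1, c2)` values shown in the lateral join output: the input columns come first, followed by the columns the table function yields for that row.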