pyspark.sql.Row

class pyspark.sql.Row

A row in a DataFrame. Its fields can be accessed:

like attributes (row.key)
like dictionary values (row[key])

The expression key in row checks whether key is one of the row's field names.

Row can be used to create a row object from named arguments. A named argument cannot be omitted to indicate that its value is None or missing; in that case the argument must be set to None explicitly.

Changed in version 3.0.0: Rows created from named arguments no longer have their field names sorted alphabetically; fields keep the order in which they were entered.

Examples

>>> from pyspark.sql import Row
>>> row = Row(name="Alice", age=11)
>>> row
Row(name='Alice', age=11)
>>> row['name'], row['age']
('Alice', 11)
>>> row.name, row.age
('Alice', 11)
>>> 'name' in row
True
>>> 'wrong_key' in row
False

Row can also be used to create a Row-like class by passing only the field names. That class can then be called with values to create Row objects, for example:

>>> Person = Row("name", "age")
>>> Person
<Row('name', 'age')>
>>> 'name' in Person
True
>>> 'wrong_key' in Person
False
>>> Person("Alice", 11)
Row(name='Alice', age=11)

Calling Row with positional values only creates a row with unnamed fields, which is compared as a tuple of values.

>>> row1 = Row("Alice", 11)
>>> row2 = Row(name="Alice", age=11)
>>> row1 == row2
True

Methods

`asDict`([recursive])	Return the row as a dict.
`count`(value, /)	Return number of occurrences of value.
`index`(value[, start, stop])	Return first index of value.
