pyspark.sql.DataFrame.fillna¶
-
DataFrame.
fillna
(value, subset=None)[source]¶ Replace null values, alias for
na.fill()
.DataFrame.fillna()
andDataFrameNaFunctions.fill()
are aliases of each other.New in version 1.3.1.
- Parameters
- valueint, float, string, bool or dict
Value to replace null values with. If the value is a dict, then subset is ignored and value must be a mapping from column name (string) to replacement value. The replacement value must be an int, float, boolean, or string.
- subsetstr, tuple or list, optional
optional list of column names to consider. Columns specified in subset that do not have matching data type are ignored. For example, if value is a string, and subset contains a non-string column, then the non-string column is simply ignored.
Examples
>>> df4.na.fill(50).show() +---+------+-----+ |age|height| name| +---+------+-----+ | 10| 80|Alice| | 5| 50| Bob| | 50| 50| Tom| | 50| 50| null| +---+------+-----+
>>> df5.na.fill(False).show() +----+-------+-----+ | age| name| spy| +----+-------+-----+ | 10| Alice|false| | 5| Bob|false| |null|Mallory| true| +----+-------+-----+
>>> df4.na.fill({'age': 50, 'name': 'unknown'}).show() +---+------+-------+ |age|height| name| +---+------+-------+ | 10| 80| Alice| | 5| null| Bob| | 50| null| Tom| | 50| null|unknown| +---+------+-------+