pyspark.pandas.DataFrame.first_valid_index¶
-
DataFrame.
first_valid_index
() → Union[int, float, bool, str, bytes, decimal.Decimal, datetime.date, datetime.datetime, None, Tuple[Union[int, float, bool, str, bytes, decimal.Decimal, datetime.date, datetime.datetime, None], …]]¶ Retrieves the index of the first valid value.
- Returns
- scalar, tuple, or None
Examples
Support for DataFrame
>>> psdf = ps.DataFrame({'a': [None, 2, 3, 2], ... 'b': [None, 2.0, 3.0, 1.0], ... 'c': [None, 200, 400, 200]}, ... index=['Q', 'W', 'E', 'R']) >>> psdf a b c Q NaN NaN NaN W 2.0 2.0 200.0 E 3.0 3.0 400.0 R 2.0 1.0 200.0
>>> psdf.first_valid_index() 'W'
Support for MultiIndex columns
>>> psdf.columns = pd.MultiIndex.from_tuples([('a', 'x'), ('b', 'y'), ('c', 'z')]) >>> psdf a b c x y z Q NaN NaN NaN W 2.0 2.0 200.0 E 3.0 3.0 400.0 R 2.0 1.0 200.0
>>> psdf.first_valid_index() 'W'
Support for Series.
>>> s = ps.Series([None, None, 3, 4, 5], index=[100, 200, 300, 400, 500]) >>> s 100 NaN 200 NaN 300 3.0 400 4.0 500 5.0 dtype: float64
>>> s.first_valid_index() 300
Support for MultiIndex
>>> midx = pd.MultiIndex([['lama', 'cow', 'falcon'], ... ['speed', 'weight', 'length']], ... [[0, 0, 0, 1, 1, 1, 2, 2, 2], ... [0, 1, 2, 0, 1, 2, 0, 1, 2]]) >>> s = ps.Series([None, None, None, None, 250, 1.5, 320, 1, 0.3], index=midx) >>> s lama speed NaN weight NaN length NaN cow speed NaN weight 250.0 length 1.5 falcon speed 320.0 weight 1.0 length 0.3 dtype: float64
>>> s.first_valid_index() ('cow', 'weight')