pyspark.pandas.DataFrame.last_valid_index#
- DataFrame.last_valid_index()#
Return index for last non-NA/null value.
- Returns
- scalar, tuple, or None
Notes
This API only works with PySpark >= 3.0.
Examples
Support for DataFrame
>>> psdf = ps.DataFrame({'a': [1, 2, 3, None], ... 'b': [1.0, 2.0, 3.0, None], ... 'c': [100, 200, 400, None]}, ... index=['Q', 'W', 'E', 'R']) >>> psdf a b c Q 1.0 1.0 100.0 W 2.0 2.0 200.0 E 3.0 3.0 400.0 R NaN NaN NaN
>>> psdf.last_valid_index() 'E'
Support for MultiIndex columns
>>> psdf.columns = pd.MultiIndex.from_tuples([('a', 'x'), ('b', 'y'), ('c', 'z')]) >>> psdf a b c x y z Q 1.0 1.0 100.0 W 2.0 2.0 200.0 E 3.0 3.0 400.0 R NaN NaN NaN
>>> psdf.last_valid_index() 'E'
Support for Series.
>>> s = ps.Series([1, 2, 3, None, None], index=[100, 200, 300, 400, 500]) >>> s 100 1.0 200 2.0 300 3.0 400 NaN 500 NaN dtype: float64
>>> s.last_valid_index() 300
Support for MultiIndex
>>> midx = pd.MultiIndex([['lama', 'cow', 'falcon'], ... ['speed', 'weight', 'length']], ... [[0, 0, 0, 1, 1, 1, 2, 2, 2], ... [0, 1, 2, 0, 1, 2, 0, 1, 2]]) >>> s = ps.Series([250, 1.5, 320, 1, 0.3, None, None, None, None], index=midx) >>> s lama speed 250.0 weight 1.5 length 320.0 cow speed 1.0 weight 0.3 length NaN falcon speed NaN weight NaN length NaN dtype: float64
>>> s.last_valid_index() ('cow', 'weight')