pyspark.pandas.DataFrame.xs

DataFrame.xs(key: Union[Any, Tuple[Any, …]], axis: Union[int, str] = 0, level: Optional[int] = None) → Union[DataFrame, Series]

Return a cross-section from the DataFrame.

This method takes a key argument to select data at a particular level of a MultiIndex.

Parameters
key : label or tuple of label

Label contained in the index, or partially in a MultiIndex.

axis : 0 or 'index', default 0

Axis to retrieve the cross-section on. Currently, only 0 or 'index' is supported.

level : object, defaults to first n levels (n=1 or len(key))

If the key is only partially contained in a MultiIndex, indicates which levels are used. Levels can be referred to by label or position.

Returns
DataFrame or Series

Cross-section from the original DataFrame corresponding to the selected index levels.
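
As a hedged illustration of the two return types, the sketch below uses plain pandas rather than pandas-on-Spark so that it runs without a Spark session. It assumes pandas is installed and that pandas-on-Spark behaves the same way for these calls; the names partial and full are illustrative only.

```python
import pandas as pd

# Same sample data as in the Examples section.
# Assumption: plain pandas stands in for pyspark.pandas here.
df = pd.DataFrame({
    'num_legs': [4, 4, 2, 2],
    'num_wings': [0, 0, 2, 2],
    'class': ['mammal', 'mammal', 'mammal', 'bird'],
    'animal': ['cat', 'dog', 'bat', 'penguin'],
    'locomotion': ['walks', 'walks', 'flies', 'walks'],
}).set_index(['class', 'animal', 'locomotion'])

# A key covering only some index levels leaves the remaining
# level(s) as the index, so the result is a DataFrame.
partial = df.xs(('mammal', 'dog'))

# A key covering every index level identifies a single row,
# which comes back as a Series indexed by column name.
full = df.xs(('mammal', 'dog', 'walks'))

print(type(partial).__name__)
print(type(full).__name__)
```

Checking the result type before further processing can be useful when the key length is not known in advance.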

See also

DataFrame.loc

Access a group of rows and columns by label(s) or a boolean array.

DataFrame.iloc

Purely integer-location based indexing for selection by position.

Examples

>>> d = {'num_legs': [4, 4, 2, 2],
...      'num_wings': [0, 0, 2, 2],
...      'class': ['mammal', 'mammal', 'mammal', 'bird'],
...      'animal': ['cat', 'dog', 'bat', 'penguin'],
...      'locomotion': ['walks', 'walks', 'flies', 'walks']}
>>> df = ps.DataFrame(data=d)
>>> df = df.set_index(['class', 'animal', 'locomotion'])
>>> df  
                           num_legs  num_wings
class  animal  locomotion
mammal cat     walks              4          0
       dog     walks              4          0
       bat     flies              2          2
bird   penguin walks              2          2

Get values at specified index

>>> df.xs('mammal')  
                   num_legs  num_wings
animal locomotion
cat    walks              4          0
dog    walks              4          0
bat    flies              2          2

Get values at several indexes

>>> df.xs(('mammal', 'dog'))  
            num_legs  num_wings
locomotion
walks              4          0
>>> df.xs(('mammal', 'dog', 'walks'))  
num_legs     4
num_wings    0
Name: (mammal, dog, walks), dtype: int64

Get values at specified index and level

>>> df.xs('cat', level=1)  
                   num_legs  num_wings
class  locomotion
mammal walks              4          0
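
The level parameter can also target a level other than the second one. The following is a minimal sketch in plain pandas (an assumption made so that it runs without a Spark session; pandas-on-Spark is expected, but not verified here, to give the same rows). It refers to levels by integer position only, in line with the Optional[int] annotation in the signature; the names by_animal and walkers are illustrative.

```python
import pandas as pd

# Same sample data as in the examples above.
# Assumption: plain pandas stands in for pyspark.pandas here.
df = pd.DataFrame({
    'num_legs': [4, 4, 2, 2],
    'num_wings': [0, 0, 2, 2],
    'class': ['mammal', 'mammal', 'mammal', 'bird'],
    'animal': ['cat', 'dog', 'bat', 'penguin'],
    'locomotion': ['walks', 'walks', 'flies', 'walks'],
}).set_index(['class', 'animal', 'locomotion'])

# Level 1 is 'animal': keep rows whose animal is 'cat'.
by_animal = df.xs('cat', level=1)

# Level 2 is 'locomotion': keep rows whose locomotion is 'walks'.
# The matched level is dropped from the resulting index.
walkers = df.xs('walks', level=2)

print(by_animal)
print(walkers)
```

In both calls the level that was matched no longer appears in the index of the result, while the other levels are retained.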