DataFrame.
pivot_table
Create a spreadsheet-style pivot table as a DataFrame. The levels in the pivot table will be stored in MultiIndex objects (hierarchical indexes) on the index and columns of the result DataFrame.
They should be either a list less than three or a string.
If an array is passed, it must be the same length as the data. The list should contain string.
Columns used in the pivot operation. Only one column is supported and it should be a string.
If dict is passed, the key is column to aggregate and value is function or list of functions.
Value to replace missing values with.
Examples
>>> df = ps.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo", ... "bar", "bar", "bar", "bar"], ... "B": ["one", "one", "one", "two", "two", ... "one", "one", "two", "two"], ... "C": ["small", "large", "large", "small", ... "small", "large", "small", "small", ... "large"], ... "D": [1, 2, 2, 3, 3, 4, 5, 6, 7], ... "E": [2, 4, 5, 5, 6, 6, 8, 9, 9]}, ... columns=['A', 'B', 'C', 'D', 'E']) >>> df A B C D E 0 foo one small 1 2 1 foo one large 2 4 2 foo one large 2 5 3 foo two small 3 5 4 foo two small 3 6 5 bar one large 4 6 6 bar one small 5 8 7 bar two small 6 9 8 bar two large 7 9
This first example aggregates values by taking the sum.
>>> table = df.pivot_table(values='D', index=['A', 'B'], ... columns='C', aggfunc='sum') >>> table.sort_index() C large small A B bar one 4.0 5 two 7.0 6 foo one 4.0 1 two NaN 6
We can also fill missing values using the fill_value parameter.
>>> table = df.pivot_table(values='D', index=['A', 'B'], ... columns='C', aggfunc='sum', fill_value=0) >>> table.sort_index() C large small A B bar one 4 5 two 7 6 foo one 4 1 two 0 6
We can also calculate multiple types of aggregations for any given value column.
>>> table = df.pivot_table(values=['D'], index =['C'], ... columns="A", aggfunc={'D': 'mean'}) >>> table.sort_index() D A bar foo C large 5.5 2.000000 small 5.5 2.333333
The next example aggregates on multiple values.
>>> table = df.pivot_table(index=['C'], columns="A", values=['D', 'E'], ... aggfunc={'D': 'mean', 'E': 'sum'}) >>> table.sort_index() D E A bar foo bar foo C large 5.5 2.000000 15 9 small 5.5 2.333333 17 13