pyspark.pandas.DataFrame.hist
- DataFrame.hist(bins=10, **kwds)
Draw one histogram of the DataFrame’s columns. A histogram is a representation of the distribution of data. This function calls plotting.backend.plot() on each series in the DataFrame, resulting in one histogram per column.
- Parameters
- bins : integer or sequence, default 10
Number of histogram bins to be used. If an integer is given, bins + 1 bin edges are calculated and returned. If bins is a sequence, it gives bin edges, including left edge of first bin and right edge of last bin. In this case, bins are returned unmodified.
- **kwds
All other plotting keyword arguments to be passed to plotting backend.
- Returns
plotly.graph_objs.Figure
Return a custom object when backend!=plotly. Return an ndarray when subplots=True (matplotlib-only).
Examples
Basic plot.
For Series:
>>> import pyspark.pandas as ps
>>> s = ps.Series([1, 3, 2])
>>> s.plot.hist()
For DataFrame:
>>> import numpy as np
>>> import pandas as pd
>>> import pyspark.pandas as ps
>>> df = pd.DataFrame(
...     np.random.randint(1, 7, 6000),
...     columns=['one'])
>>> df['two'] = df['one'] + np.random.randint(1, 7, 6000)
>>> df = ps.from_pandas(df)
>>> df.plot.hist(bins=12, alpha=0.5)
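How bins is interpreted:

The two forms of the bins parameter can be illustrated with a small pure-Python sketch. The helper name bin_edges is hypothetical and is not part of pyspark.pandas. Equal-width spacing between the minimum and maximum value is an assumption made for illustration; the actual edge computation is left to the library.

```python
def bin_edges(values, bins=10):
    """Hypothetical helper: return the bin edges implied by `bins`."""
    if isinstance(bins, int):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins
        # An integer N yields N + 1 edges (assumed equally spaced here).
        edges = [lo + i * width for i in range(bins)]
        edges.append(hi)  # pin the last edge to the maximum value
        return edges
    # A sequence already lists the edges, including the left edge of the
    # first bin and the right edge of the last bin; it is kept unmodified.
    return list(bins)


edges_from_count = bin_edges([1, 3, 2], bins=4)
edges_from_seq = bin_edges([1, 3, 2], bins=[1, 2, 3])
```

Under these assumptions, bins=4 gives five edges for four bins, while the sequence [1, 2, 3] is used as given and defines two bins.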