freqItems {SparkR} | R Documentation |
Finds frequent items for columns, possibly with false positives, using the frequent element count algorithm described in http://dx.doi.org/10.1145/762471.762473, proposed by Karp, Schenker, and Papadimitriou.
## S4 method for signature 'SparkDataFrame,character'
freqItems(x, cols, support = 0.01)
x | A SparkDataFrame. |
cols | A vector of column names in which to search for frequent items. |
support | (Optional) The minimum frequency for an item to be considered 'frequent'. Should be greater than 1e-4. Default support = 0.01. |
a local R data.frame with the frequent items in each column
freqItems since 1.6.0
Other stat functions: approxQuantile, approxQuantile,SparkDataFrame,character,numeric,numeric-method; corr, corr,Column-method, corr,SparkDataFrame-method; cov, cov,SparkDataFrame-method, cov,characterOrColumn-method, covar_samp, covar_samp,characterOrColumn,characterOrColumn-method; crosstab, crosstab,SparkDataFrame,character,character-method; sampleBy, sampleBy,SparkDataFrame,character,list,numeric-method
## Not run:
##D df <- read.json("/path/to/file.json")
##D fi <- freqItems(df, c("title", "gender"))
## End(Not run)
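
The description above says the result may contain false positives. The following plain-R sketch illustrates why, using a counter-based frequent element summary in the spirit of the cited algorithm. It does not use Spark, and it is not SparkR's implementation: the helper name `frequent_candidates` is hypothetical, and the counter budget of `floor(1 / support)` is an assumption made for illustration.

```r
# Hypothetical helper, not part of SparkR: a single-pass, counter-based
# frequent element summary over a local vector.
frequent_candidates <- function(values, support = 0.01) {
  stopifnot(support > 0, support <= 1)
  max_counters <- floor(1 / support)  # assumed counter budget
  counts <- integer(0)                # named integer vector: item -> counter
  for (v in as.character(values)) {
    if (v %in% names(counts)) {
      # Item already tracked: increment its counter.
      counts[v] <- counts[v] + 1L
    } else if (length(counts) < max_counters) {
      # Free counter available: start tracking the item.
      counts[v] <- 1L
    } else {
      # No free counter: decrement every counter and drop those at zero.
      counts <- counts - 1L
      counts <- counts[counts > 0L]
    }
  }
  # Items still tracked are the candidates; some may be false positives.
  names(counts)
}

items <- c(rep("a", 6), rep("b", 3), "c")
frequent_candidates(items, support = 0.5)
```

In this input, "a" makes up 60% of the values and is retained, as any item with relative frequency of at least `support` must be under this scheme. "b" makes up only 30% but can still be returned because its counter never reaches zero, which is the kind of false positive the description refers to. A second pass that counts the candidates exactly would be needed to remove such items.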