spark.lapply {SparkR} | R Documentation |
Applies a function to each element of a list, in a manner similar to doParallel or lapply. The computations are distributed using Spark. It is conceptually the same as the following code: lapply(list, func)
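The equivalence with lapply can be checked locally without a Spark cluster. The following is a minimal sketch in base R only; `inputs` and `double_it` are hypothetical names chosen for illustration, and the commented-out distributed call assumes a Spark context `sc` that is not created here.

```r
# Local, base-R illustration of the semantics that spark.lapply mirrors.
inputs <- as.list(1:5)             # hypothetical input list
double_it <- function(x) 2 * x     # a function that takes one argument

# Apply the function to every element; the result is a list.
local_result <- lapply(inputs, double_it)

# With an initialised Spark context `sc` (assumed, not created here),
# the distributed call would have the same shape:
# spark_result <- spark.lapply(sc, inputs, double_it)

print(unlist(local_result))
```

Running the function through plain lapply first is a cheap way to check that it works on a single element before distributing it.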
Known limitations:
- variable scoping and capture: compared to R's rich support for variable resolutions, the
- loading external packages: To use a package, you need to load it inside the closure. For example, if you rely on the MASS package, here is how you would use it:
## Not run:
train <- function(hyperparam) {
  library(MASS)
  model <- lm.ridge(y ~ x + z, data, lambda = hyperparam)
  model
}
## End(Not run)
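The package-loading pattern can be tried locally before involving Spark. The sketch below is an illustration under stated assumptions, not the package's own example: the data frame is synthetic and made up for the demonstration, `lambdas` and `fits` are hypothetical names, and base lapply stands in for the distributed call. It assumes the MASS package is installed.

```r
# Hypothetical training function: everything it needs is loaded or
# created inside the closure, so it does not rely on the caller's session.
train <- function(hyperparam) {
  library(MASS)                    # load the package inside the closure
  # Synthetic data, invented purely for illustration.
  set.seed(42)
  data <- data.frame(x = rnorm(50), z = rnorm(50))
  data$y <- 1 + 2 * data$x - data$z + rnorm(50, sd = 0.1)
  model <- lm.ridge(y ~ x + z, data = data, lambda = hyperparam)
  coef(model)                      # return plain numeric coefficients
}

lambdas <- list(0, 0.1, 1)         # hypothetical hyperparameter grid

# Local stand-in; with a Spark context `sc` (assumed) this would be
# spark.lapply(sc, lambdas, train)
fits <- lapply(lambdas, train)
```

Returning the coefficient vector rather than the fitted model object keeps each result small and simple, which is a design choice of this sketch rather than a requirement stated in the documentation.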
spark.lapply(sc, list, func)
sc | Spark Context to use |
list | the list of elements |
func | a function that takes one argument. |
a list of results (the exact type being determined by the function)
## Not run:
doubled <- spark.lapply(sc, 1:10, function(x) { 2 * x })
## End(Not run)