sparkR.session {SparkR}    R Documentation

Get the existing SparkSession or initialize a new SparkSession.

Description

Additional Spark properties can be set (...), and these named parameters take priority over values in master, appName, and named lists of sparkConfig.
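For instance, a property supplied both in sparkConfig and as a named parameter resolves in favor of the named parameter. A minimal sketch (assumes a local Spark installation is available via SPARK_HOME):

```r
library(SparkR)

# spark.executor.memory appears both in sparkConfig and as a named
# parameter; per the precedence rule above, the named parameter ("4g")
# takes priority over the "2g" value in sparkConfig.
sparkR.session(master = "local[2]",
               sparkConfig = list(spark.executor.memory = "2g"),
               spark.executor.memory = "4g")
```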

Usage

sparkR.session(master = "", appName = "SparkR",
  sparkHome = Sys.getenv("SPARK_HOME"), sparkConfig = list(),
  sparkJars = "", sparkPackages = "", enableHiveSupport = TRUE, ...)

Arguments

master

The Spark master URL

appName

Application name to register with cluster manager

sparkHome

Spark Home directory

sparkConfig

Named list of Spark configuration to set on worker nodes

sparkJars

Character vector of jar files to pass to the worker nodes

sparkPackages

Character vector of packages from spark-packages.org

enableHiveSupport

Enable support for Hive; falls back gracefully if Spark was not built with Hive support. Once set, this cannot be turned off on an existing session

Details

For details on how to initialize and use SparkR, refer to the SparkR programming guide at http://spark.apache.org/docs/latest/sparkr.html#starting-up-sparksession.

Note

sparkR.session since 2.0.0

Examples

## Not run: 
##D sparkR.session()
##D df <- read.json(path)
##D 
##D sparkR.session("local[2]", "SparkR", "/home/spark")
##D sparkR.session("yarn-client", "SparkR", "/home/spark",
##D                list(spark.executor.memory="4g"),
##D                c("one.jar", "two.jar", "three.jar"),
##D                c("com.databricks:spark-avro_2.10:2.0.1"))
##D sparkR.session(spark.master = "yarn-client", spark.executor.memory = "4g")
## End(Not run)

[Package SparkR version 2.0.0 Index]