Class LocalHiveContext
SQLContext --+
             |
   HiveContext --+
                 |
                LocalHiveContext
Starts up an instance of Hive where metadata is stored locally.
An in-process metastore database is created, with its data stored in ./metadata.
Warehouse data is stored in ./warehouse.
>>> import os
>>> hiveCtx = LocalHiveContext(sc)
>>> try:
...     suppress = hiveCtx.sql("DROP TABLE src")
... except Exception:
...     pass
>>> kv1 = os.path.join(os.environ["SPARK_HOME"],
...                    'examples/src/main/resources/kv1.txt')
>>> suppress = hiveCtx.sql(
...     "CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
>>> suppress = hiveCtx.sql("LOAD DATA LOCAL INPATH '%s' INTO TABLE src"
...                        % kv1)
>>> results = hiveCtx.sql("FROM src SELECT value"
...                       ).map(lambda r: int(r.value.split('_')[1]))
>>> num = results.count()
>>> reduce_sum = results.reduce(lambda x, y: x + y)
>>> num
500
>>> reduce_sum
130091
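The map and reduce steps in the example above can be tried without a running Spark or Hive instance. The following is a minimal sketch, not part of the library: `Row` is a hypothetical namedtuple stand-in for a result row, `parse_suffix` is an illustrative helper name, and the three sample values are made up, assuming each value has the form `val_<n>`.

```python
from collections import namedtuple
from functools import reduce

# Hypothetical stand-in for a result row; real rows come from hiveCtx.sql(...).
Row = namedtuple("Row", ["value"])


def parse_suffix(row):
    """Return the integer after the underscore in a value such as 'val_238'."""
    return int(row.value.split("_")[1])


# Made-up sample rows, assuming the 'val_<n>' shape.
rows = [Row("val_238"), Row("val_86"), Row("val_311")]

# Same per-row function and the same summing lambda as in the doctest,
# applied to a plain list instead of an RDD.
parsed = [parse_suffix(r) for r in rows]
total = reduce(lambda x, y: x + y, parsed)

print(parsed)
print(total)
```

Applying the same function through `.map(...)` and the same lambda through `.reduce(...)` on the query result is what yields `num` and `reduce_sum` in the doctest.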
Inherited from HiveContext: hiveql, hql
Inherited from SQLContext: applySchema, cacheTable, inferSchema, jsonFile, jsonRDD, parquetFile, registerFunction, registerRDDAsTable, sql, table, uncacheTable
__init__(self, sparkContext, sqlContext=None)
(Constructor)
Create a new HiveContext.
- Parameters:
sparkContext - The SparkContext to wrap.
hiveContext - An optional JVM Scala HiveContext. If set, we do not instantiate a
new HiveContext in the JVM; instead, we make all calls to this
object.
- Overrides:
SQLContext.__init__
- (inherited documentation)