pyspark.sql.Catalog
User-facing catalog API, accessible through SparkSession.catalog.
This is a thin wrapper around its Scala implementation org.apache.spark.sql.catalog.Catalog.
Methods
cacheTable(tableName)
Caches the specified table in memory.
clearCache()
Removes all cached tables from the in-memory cache.
createExternalTable(tableName[, path, …])
Creates a table based on the dataset in a data source.
createTable(tableName[, path, source, …])
Creates a table based on the dataset in a data source.
currentDatabase()
Returns the current default database in this session.
dropGlobalTempView(viewName)
Drops the global temporary view with the given view name in the catalog.
dropTempView(viewName)
Drops the local temporary view with the given view name in the catalog.
isCached(tableName)
Returns true if the table is currently cached in memory.
listColumns(tableName[, dbName])
Returns a list of columns for the given table/view in the specified database.
listDatabases()
Returns a list of databases available across all sessions.
listFunctions([dbName])
Returns a list of functions registered in the specified database.
listTables([dbName])
Returns a list of tables/views in the specified database.
recoverPartitions(tableName)
Recovers all the partitions of the given table and updates the catalog.
refreshByPath(path)
Invalidates and refreshes all the cached data (and the associated metadata) for any DataFrame that contains the given data source path.
refreshTable(tableName)
Invalidates and refreshes all the cached data and metadata of the given table.
registerFunction(name, f[, returnType])
An alias for spark.udf.register().
setCurrentDatabase(dbName)
Sets the current default database in this session.
uncacheTable(tableName)
Removes the specified table from the in-memory cache.