DataFrameWriter.orc(path, mode=None, partitionBy=None, compression=None)
Saves the content of the DataFrame in ORC format at the specified path.
New in version 1.5.0.
Parameters
path: the path in any Hadoop-supported file system
mode: specifies the behavior of the save operation when data already exists.
append: Append contents of this DataFrame to existing data.
overwrite: Overwrite existing data.
ignore: Silently ignore this operation if data already exists.
error or errorifexists (default case): Throw an exception if data already exists.
partitionBy: names of partitioning columns
compression: compression codec to use when saving to file. This can be one of the known case-insensitive short names (none, snappy, zlib, and lzo). This overrides orc.compress and spark.sql.orc.compression.codec. If None is set, the value specified in spark.sql.orc.compression.codec is used.
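The codec precedence described for compression can be modeled in plain Python. This is a minimal sketch, not Spark's implementation: `resolve_orc_codec` and its argument names are hypothetical, and placing a writer-level orc.compress option between the explicit argument and the session setting is an assumption, since the description only states that compression overrides both.

```python
# Short codec names listed in the compression parameter description.
KNOWN_CODECS = {"none", "snappy", "zlib", "lzo"}


def resolve_orc_codec(compression=None, writer_options=None, session_conf=None):
    """Return the lower-cased codec name that would take effect, or None.

    Hypothetical helper that models the documented precedence:
    1. the explicit ``compression`` argument,
    2. the ``orc.compress`` writer option (assumed to rank second),
    3. the ``spark.sql.orc.compression.codec`` session setting.
    """
    writer_options = writer_options or {}
    session_conf = session_conf or {}
    candidates = (
        compression,
        writer_options.get("orc.compress"),
        session_conf.get("spark.sql.orc.compression.codec"),
    )
    for value in candidates:
        if value is None:
            continue  # not set at this level, fall through to the next one
        name = value.lower()  # codec names are case-insensitive
        if name not in KNOWN_CODECS:
            raise ValueError("unknown ORC compression codec: %r" % value)
        return name
    return None
```

Under this model, passing compression="ZLIB" wins over any value of orc.compress or spark.sql.orc.compression.codec, while leaving compression as None lets the lower-priority settings decide.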
Examples
>>> import os
>>> import tempfile
>>> orc_df = spark.read.orc('python/test_support/sql/orc_partitioned')
>>> orc_df.write.orc(os.path.join(tempfile.mkdtemp(), 'data'))
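The optional parameters can be combined in a single call. The following is an illustrative sketch rather than part of the official examples: `write_orc_examples`, the `events` directory name, and the `partition_cols` argument are hypothetical, and `df` is assumed to be a DataFrame that contains the named partition columns.

```python
import os


def write_orc_examples(df, base_path, partition_cols):
    """Write `df` as ORC several times to exercise mode, partitionBy, and compression.

    `df` is assumed to be a DataFrame containing the columns named in
    `partition_cols`. Returns the directory that was written.
    """
    target = os.path.join(base_path, "events")  # hypothetical output directory

    # Default mode: raises an exception if `target` already holds data.
    df.write.orc(target, partitionBy=partition_cols, compression="zlib")

    # append: adds the rows of `df` to the data written above.
    df.write.orc(target, mode="append", partitionBy=partition_cols)

    # overwrite: replaces the existing data, this time with snappy compression.
    df.write.orc(target, mode="overwrite", partitionBy=partition_cols,
                 compression="snappy")

    # ignore: does nothing here, because `target` already contains data.
    df.write.orc(target, mode="ignore")

    return target
```

With a live session, the function could be called with a DataFrame such as `orc_df` from the example above, a fresh directory from `tempfile.mkdtemp()`, and a list of column names that exist in that DataFrame.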