pyspark.sql.functions.schema_of_json
pyspark.sql.functions.schema_of_json(json: ColumnOrName, options: Optional[Dict[str, str]] = None) → pyspark.sql.column.Column

Parses a JSON string and infers its schema in DDL format.
New in version 2.4.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters
  - json : Column or str
    a JSON string or a foldable string column containing a JSON string.
  - options : dict, optional
    options to control parsing. Accepts the same options as the JSON data source. See Data Source Option for the version you use.
    Changed in version 3.0.0: It accepts the options parameter to control schema inferring.
- Returns
  - Column
    a string representation of a StructType parsed from the given JSON.
Examples
>>> from pyspark.sql.functions import schema_of_json, lit
>>> df = spark.range(1)
>>> df.select(schema_of_json(lit('{"a": 0}')).alias("json")).collect()
[Row(json='STRUCT<a: BIGINT>')]
>>> schema = schema_of_json('{a: 1}', {'allowUnquotedFieldNames':'true'})
>>> df.select(schema.alias("json")).collect()
[Row(json='STRUCT<a: BIGINT>')]
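A common follow-up is to pass the inferred DDL string to from_json to parse an entire JSON column. The sketch below is illustrative rather than part of the official examples: the DataFrame, its value column, and the sampling of a single row to obtain a foldable string are all assumptions.

from pyspark.sql import SparkSession
from pyspark.sql.functions import schema_of_json, from_json, lit, col

spark = SparkSession.builder.getOrCreate()

# Illustrative data: one string column holding raw JSON documents
df = spark.createDataFrame(
    [('{"a": 1, "b": "x"}',), ('{"a": 2, "b": "y"}',)], ["value"]
)

# schema_of_json requires a foldable (literal) string, so infer the schema
# from one sample row rather than from the column itself
sample = df.head()[0]
ddl = df.select(schema_of_json(lit(sample)).alias("s")).head()[0]
# ddl is a DDL string such as 'STRUCT<a: BIGINT, b: STRING>'

# Use the inferred DDL string as the schema for from_json
parsed = df.select(from_json(col("value"), ddl).alias("data"))
parsed.select("data.a", "data.b").show()

Collecting the inferred schema to the driver is the usual pattern here, because schema_of_json only accepts a JSON string or a foldable string column, not an arbitrary data column.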