pyspark.sql.functions.from_json
pyspark.sql.functions.from_json(col, schema, options=None)
Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType with the specified schema. Returns null in the case of an unparseable string.
New in version 2.1.0.
- Parameters
  - col : Column or str
    string column in JSON format.
  - schema : DataType or str
    a StructType or ArrayType of StructType to use when parsing the JSON column.
    Changed in version 2.3: the DDL-formatted string is also supported for schema.
  - options : dict, optional
    options to control parsing. Accepts the same options as the JSON datasource. See Data Source Option in the version you use.
Examples
>>> from pyspark.sql.types import *
>>> from pyspark.sql.functions import from_json, schema_of_json, lit
>>> data = [(1, '''{"a": 1}''')]
>>> schema = StructType([StructField("a", IntegerType())])
>>> df = spark.createDataFrame(data, ("key", "value"))
>>> df.select(from_json(df.value, schema).alias("json")).collect()
[Row(json=Row(a=1))]
>>> df.select(from_json(df.value, "a INT").alias("json")).collect()
[Row(json=Row(a=1))]
>>> df.select(from_json(df.value, "MAP<STRING,INT>").alias("json")).collect()
[Row(json={'a': 1})]
>>> data = [(1, '''[{"a": 1}]''')]
>>> schema = ArrayType(StructType([StructField("a", IntegerType())]))
>>> df = spark.createDataFrame(data, ("key", "value"))
>>> df.select(from_json(df.value, schema).alias("json")).collect()
[Row(json=[Row(a=1)])]
>>> schema = schema_of_json(lit('''{"a": 0}'''))
>>> df.select(from_json(df.value, schema).alias("json")).collect()
[Row(json=Row(a=None))]
>>> data = [(1, '''[1, 2, 3]''')]
>>> schema = ArrayType(IntegerType())
>>> df = spark.createDataFrame(data, ("key", "value"))
>>> df.select(from_json(df.value, schema).alias("json")).collect()
[Row(json=[1, 2, 3])]