I have JSON events with the following structure:
{
  'atlas_ts': '',
  'ev': {
    '_a': { 'build': '', 'name': '', 'version': '' },
    '_abtest': '',
    '_c': {
      'referrer': '',
      'utm_campaign': '',
      'utm_content': '',
      'utm_medium': '',
      'utm_source': '',
      'utm_term': ''
    }
  }
}
I want to rename the columns of the resulting DataFrame: ev._a to ev.app, ev._c to ev.campaign, and ev._a.name to ev.app_name. How can I do this?
Answer by Miklos · Dec 04, 2015 at 05:50 PM
You can specify a schema when creating the DataFrame if you want to retain the structure of the JSON records. If you're planning to flatten the records, you can use the .withColumnRenamed() API.
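As a rough illustration of the flattening route, here is a minimal PySpark sketch. It assumes `df` is the DataFrame already loaded from the JSON events. The helper names `flatten_exprs` and `flatten`, the `RENAMES` mapping, and the target names `app_version` and `campaign_utm_source` are made up for this example and are not part of any Spark API. Only `DataFrame.selectExpr` and `DataFrame.withColumnRenamed` are actual Spark calls. Backticks are used around each name as a precaution for fields such as `_a` that start with an underscore.

```python
def flatten_exprs(renames):
    """Turn {'ev._a.name': 'app_name'} into SQL strings for selectExpr.

    Each path segment is backtick-quoted so that names like _a are
    treated as plain identifiers.
    """
    exprs = []
    for path, alias in renames.items():
        quoted = ".".join("`{}`".format(part) for part in path.split("."))
        exprs.append("{} AS `{}`".format(quoted, alias))
    return exprs


def flatten(df, renames):
    """Select the nested fields as top-level columns under their new names."""
    return df.selectExpr(*flatten_exprs(renames))


# Hypothetical mapping: nested path -> flat column name.
RENAMES = {
    "atlas_ts": "atlas_ts",
    "ev._a.name": "app_name",
    "ev._a.version": "app_version",
    "ev._c.utm_source": "campaign_utm_source",
}
```

With a loaded DataFrame this might be used as `flat = flatten(df, RENAMES)`. Once the columns are top-level, a further rename such as `flat.withColumnRenamed("app_name", "application")` should work, since `.withColumnRenamed()` operates on top-level column names rather than on fields nested inside a struct.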