Hi,
I have created a DataFrame that contains a SQL query, and I want to run that query through sqlContext. My code is below:
query = auto.select('column1')
sqlContext.sql(query)
But it throws an error.
Could someone guide me on this? Thanks in advance!
Answer by Balavigneshnv · Feb 05, 2018 at 01:30 PM
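This works fine:
# Collect the rows to the driver and join the first column's values into one SQL string
qvar = ''.join(row[0] for row in query.collect())
sqlContext.sql(qvar)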
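For context on why the original call fails: sqlContext.sql expects the query text as a string, while auto.select('column1') returns a DataFrame. Below is a minimal sketch of one way to bridge the two, assuming the query text sits in the first column of the first row. The helper name run_stored_query is hypothetical and is not part of Spark; it only relies on DataFrame.first() and the sql() method.

```python
def run_stored_query(sql_runner, query_df):
    """Run the SQL text stored in the first cell of query_df.

    sql_runner is assumed to be a SQLContext or SparkSession (anything
    exposing .sql(text)); query_df is assumed to be a DataFrame whose
    first column holds the query string.
    """
    # first() returns the first Row, or None when the DataFrame is empty
    row = query_df.first()
    if row is None:
        raise ValueError("the query DataFrame is empty")
    # A Row can be indexed by position; position 0 is the first column
    query_text = row[0]
    return sql_runner.sql(query_text)
```

With the names from the question, this would be called as run_stored_query(sqlContext, auto.select('column1')) and should return the resulting DataFrame.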
Answer by careenjoseph · Feb 06, 2018 at 09:44 AM
Spark supports SQL queries on top of RDDs and DataFrames. For an application that requires complex data processing, SQL may well be the best way to process the data. However, queries that span multiple lines can make Spark code less readable and harder to debug (we had a tough time with this in our project). Alternatively, we can do the following:
Answer by toxic wapour · Feb 06, 2018 at 10:58 AM
Just use the code below. First create a view from the DataFrame, then use the SparkSession object to query the view. This returns a DataFrame.
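import org.apache.spark.sql.{Dataset, Row}
// Register the DataFrame as a temporary view so it can be queried by name
dataset.createOrReplaceTempView("dataset")
// Query the view through the SparkSession; the result is a DataFrame (Dataset[Row])
val result: Dataset[Row] = spark.sql("select * from dataset")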
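Since the question uses Python, here is a rough PySpark counterpart of the same idea. It is a sketch under assumptions: the helper name query_through_temp_view is hypothetical, and it assumes Spark 2.x, where createOrReplaceTempView is available. On Spark 1.x with sqlContext, registerTempTable is the older equivalent.

```python
def query_through_temp_view(spark, df, view_name, sql_text):
    """Register df as a temporary view, then run sql_text against it.

    spark is assumed to be a SparkSession (or any object exposing
    .sql(text)); df is assumed to be a Spark DataFrame.
    """
    # Make the DataFrame addressable by name inside SQL statements
    df.createOrReplaceTempView(view_name)
    # sql() returns a new DataFrame holding the query result
    return spark.sql(sql_text)
```

For example, query_through_temp_view(spark, dataset, "dataset", "select * from dataset") mirrors the Scala snippet above.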