Dataset

All Posts

where's my data

0 Answers

0 Votes

4 Views

published by David Katz 2 days ago
dataframe·dataset

Improve performance of groupByKey for a large dataset

1 Answer

0 Votes

1.5k Views

published by Valery Khamenya on Jun 19, '18
dataset·reducebykey·groupbykey

Why is there no reduceByKey in Dataset?

0 Answers

0 Votes

292 Views

asked by Sukumaar Mane on May 16, '18
spark·dataset·reducebykey·groupbykey·paired rdd

How to maintain large hive query in Spark

0 Answers

0 Votes

55 Views

published by michellema on Apr 10, '18
spark sql·spark-sql·dataset

How to maintain large hive query in Spark

0 Answers

0 Votes

53 Views

asked by michellema on Apr 10, '18
spark sql·spark-sql·dataset

Maximum fields allowed in a dataset that comes from a Kafka topic

0 Answers

0 Votes

36 Views

asked by Harindar on Mar 19, '18
dataset·when i try to read a kafka topic that has 500 fields i am not able to write it to console

window function for DataSet

0 Answers

0 Votes

109 Views

asked by mans4singh on Dec 25, '17
structured streaming·dataset·window functions

What are the alternatives to collect_set in Datasets for better performance?

0 Answers

0 Votes

597 Views

edited by panos on Dec 4, '17
spark sql·performance·dataset·groupby

CSV data source does support array data type, but not when downloading full results

0 Answers

1 Vote

2.4k Views

commented by yoderj on Aug 16, '17
csv·dataset·export·download
austin.powell

using DataSet.repartition in Spark 2 - several tasks handle more than one partition

0 Answers

1 Vote

468 Views

asked by eeldor on Jul 31, '17
spark streaming·dataset·repartitioning·javardd
Pravin Agrawal

In Pyspark how do we differentiate Dataset from DataFrame?

1 Answer

0 Votes

241 Views

answered by jules on Jul 28, '17
pyspark·dataframe·dataset

How to split/merge records using sparksql/dataframe/dataset

0 Answers

0 Votes

413 Views

asked by jxoneplus on Jun 26, '17
spark sql·dataframe·sparksql·dataset

Kryo Serialization for Dataset

0 Answers

0 Votes

242 Views

asked by yasin on Jun 24, '17
dataset·kryo·spark 2.x

The data set size for SVM

0 Answers

0 Votes

114 Views

published by Bigevilking123 on Jun 22, '17
dataset·sklearn·svm

Read a Sorted File

0 Answers

0 Votes

121 Views

asked by yasin on Jun 18, '17
csv·dataset·hdfs·repartitioning·sorting
51 Posts
33 Users
3 Followers

Topic Experts

bizworld
0 Points

Related Topics

spark sql·dataframe·spark·reducebykey·repartitioning·scala·csv·groupby·encoder·apache spark·datasets·java·spark-sql·groupbykey·table·example·dataframes·spark streaming·notebook·word count.·spark 1.6.1·javardd·spark-1.6·export·case class