All Questions
Tagged with scala, apache-spark · 24,266 questions
0 votes · 1 answer · 33 views
Spark-Scala vs Pyspark Dag is different?
I am converting a PySpark job to Scala; the jobs execute on EMR. The parameters, data, and code are the same. However, the run time is different, and so is the DAG that gets created. Here I ...
1 vote · 0 answers · 19 views
Encrypt Spark Libsvm Dataframe
I have a libsvm file that I want to load into Spark and then encrypt it. I want to iterate over every element in the features to apply my encrypt function, but there doesn't seem to be any way to ...
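A minimal sketch of one way to approach this, assuming a hypothetical element-wise encrypt function (standing in for the questioner's real routine): load the libsvm file, then apply a UDF that maps encrypt over every value of each feature vector.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

val spark = SparkSession.builder().appName("encrypt-libsvm").getOrCreate()
import spark.implicits._

// Hypothetical placeholder for the real encryption function.
def encrypt(x: Double): Double = x + 1.0

// UDF that applies encrypt to every element of a feature vector.
val encryptVec = udf { v: Vector => Vectors.dense(v.toArray.map(encrypt)) }

val df = spark.read.format("libsvm").load("data.libsvm")
val encrypted = df.withColumn("features", encryptVec($"features"))
```

Note that encrypted values generally break the sparsity and numeric semantics the ML vector type assumes, so storing the result as an array column may be more appropriate than a dense Vector.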
0 votes · 1 answer · 17 views
Adding new Rows to Spark Partition while using forEachPartition
I am trying to add a new Row to each Partition in my Spark Job. I am using the following code to achieve this:
StructType rowType = new StructType();
rowType.add(DataTypes.createStructField("...
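Two things are worth noting here. In the Java API, StructType.add returns a new StructType rather than mutating in place, so its result must be reassigned. More fundamentally, foreachPartition is an action and cannot add rows to the output; mapPartitions, a transformation, can. A sketch in Scala, assuming the goal is to emit one extra row per partition:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()
import spark.implicits._

val ds = spark.range(0, 10, 1, numPartitions = 2).map(_.toString)

// Append a sentinel row at the end of each partition's iterator.
val withExtra = ds.mapPartitions(it => it ++ Iterator("EXTRA_ROW"))
```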
0 votes · 0 answers · 21 views
Scala Spark Dataframe creation from Seq of tuples doesn't work in Scala 3, but does in Scala 2
When trying to test something locally with Scala Spark, I noticed the following problem and was wondering what causes it, and whether there exists a workaround.
Consider the following build ...
-1 votes · 0 answers · 42 views
Using spark 3.4.1 lib in Java when extending StringRegexExpression to a java class
I am using Spark 3.4.1 in a Maven project where I have also configured Scala (2.13.8). I am trying to create a class Like.java in the project by extending Spark's StringRegexExpression
package com....
1 vote · 1 answer · 27 views
Can I use same SparkSession in different threads
In my Spark app I use many temp views to read datasets and then use them in a huge SQL expression, like this:
for (view <- cfg.views)
spark.read.format(view.format).load(view.path).createTempView(view....
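The short answer is yes: SparkSession is thread-safe, and temp views are scoped to the session, so views registered from one thread are visible to all others sharing that session. A sketch, with a hypothetical ViewCfg standing in for cfg.views:

```scala
import org.apache.spark.sql.SparkSession
import scala.concurrent.{Await, Future}
import scala.concurrent.duration.Duration
import scala.concurrent.ExecutionContext.Implicits.global

val spark = SparkSession.builder().getOrCreate()

// Hypothetical config type standing in for the questioner's cfg.views.
case class ViewCfg(name: String, format: String, path: String)
val views = Seq(ViewCfg("a", "parquet", "/data/a"),
                ViewCfg("b", "parquet", "/data/b"))

// Register views concurrently; the shared session sees them all.
val futures = views.map { v =>
  Future {
    spark.read.format(v.format).load(v.path).createOrReplaceTempView(v.name)
  }
}
Await.result(Future.sequence(futures), Duration.Inf)
```

createOrReplaceTempView avoids failures if two threads race on the same view name.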
1 vote · 0 answers · 27 views
Spark scala transformations
I have a Spark input dataframe like below.

Emp_ID  Cricket  Chess  Swim
11      Y        N      N
12      Y        Y      Y
13      N        N      Y

I need an output dataframe like below.

Hobbies  Emp_id_list
Cricket  11,12
Chess    12
Swim     12,13
Any way to ...
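One way to sketch this, assuming the three hobby columns shown: unpivot them into (Hobbies, flag) rows with the stack generator, keep the "Y" rows, and collect the ids per hobby.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{collect_list, concat_ws, expr}

val spark = SparkSession.builder().getOrCreate()
import spark.implicits._

val df = Seq(
  (11, "Y", "N", "N"),
  (12, "Y", "Y", "Y"),
  (13, "N", "N", "Y")
).toDF("Emp_ID", "Cricket", "Chess", "Swim")

// Unpivot hobby columns to rows, filter to "Y", group by hobby,
// and join the employee ids into a comma-separated string.
val out = df
  .select($"Emp_ID", expr(
    "stack(3, 'Cricket', Cricket, 'Chess', Chess, 'Swim', Swim) as (Hobbies, flag)"))
  .filter($"flag" === "Y")
  .groupBy($"Hobbies")
  .agg(concat_ws(",", collect_list($"Emp_ID")).as("Emp_id_list"))
```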
-1 votes · 0 answers · 26 views
udf to transform a json string into multiple rows based on first level of nesting
I am trying to transform a df based on the first level of nesting in the JSON string.
input dataframe
+------+------------------------------------+---------------------------------------------------------...
0 votes · 1 answer · 51 views
spark.sql() giving error : org.apache.spark.sql.catalyst.parser.ParseException: Syntax error at or near '('(line 2, pos 52)
I have a class LowerCaseColumn.scala in which one function is defined as below:
override def registerSQL(): Unit = spark.sql(
"""
|CREATE OR REPLACE TEMPORARY ...
0 votes · 1 answer · 59 views · +50 bounty
How to create data-frame on rocks db (SST files)
We hold our documents in RocksDB and will be syncing the RocksDB SST files to S3. I would like to create a dataframe on the SST files and later run SQL. When I googled, I was not able to find any ...
0 votes · 0 answers · 22 views
Flattening nested json with back slash in apache spark scala Dataframe
{
"messageBody": "{\"task\":{\"taskId\":\"c6d9fb0e-42ba-4a3e-bd39-f2a32a6958c1\",\"serializedTaskData\":\"{\\\"clientId\\\":\\\&...
0 votes · 0 answers · 33 views
Spark : Read special characters from the content of dat file without corrupting it in scala
I have to read all the special characters in a dat file (e.g. testdata.dat) without corrupting them and load them into a dataframe in Scala using Spark.
I have one dat file (eg - testdata.dat),...
1 vote · 0 answers · 31 views
Creating a custom aggregator in spark with window rowsBetween?
What I'm trying to do is use a window function to get the last and current row and do some computation on a couple of the columns with a custom aggregator. I have time series data with points that are ...
1 vote · 0 answers · 34 views
More Parallelism Than Expected in Glue ETL Spark Job
I am using Glue ETL Spark jobs to run some tests. I am trying to understand why I am getting more parallel processing than the available cores on a single executor.
Here's my job config:
I am setting ...
0 votes · 2 answers · 38 views
Determine if a condition is ever true in an aggregated dataset with Scala spark sql library
I'm trying to aggregate a dataset and determine if a condition is ever true for a row in the dataset.
Suppose I have a dataset with these values
cust_id  travel_type  distance_travelled
1        car          10
1        ...