Transformation Errors¶
Error: Caused by Already closed files for partition: month¶
Message
...62 more
Caused by: java.lang.IllegalStateException: Already closed files for partition: month=2019-04
at org.apache.iceberg.io.PartitionedWriter.write(PartitionedWriter.java:69)
at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.$anonfun$run$1(WriteT.....
What went wrong?
This error occurs when the table is partitioned on the wrong column, or when the incoming data is not sorted by the partition column. Iceberg's PartitionedWriter closes a file once the partition value changes, so if unsorted rows revisit a partition that has already been closed, the write fails. For large datasets, sort the data by the partition column before writing.
Solution
To rectify the issue, partition on the date/time column and sort the data by that column, as shown in the following output configuration:
saveMode: overwrite
sort:
  mode: partition
  columns:
    - name: gcd_modified_utc
      order: asc
iceberg:
  properties:
    overwrite-mode: dynamic
    write.format.default: parquet
    write.metadata.compression-codec: gzip
  partitionSpec:
    - type: day
      column: gcd_modified_utc
      name: day
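For reference, the same fix can be illustrated at the Spark level: sorting the rows by the partition column within each task lets Iceberg's PartitionedWriter finish writing one partition before moving on to the next. A minimal Scala sketch, assuming a hypothetical source view and target Iceberg table identifier (the actual write is driven by the Flare configuration above):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().appName("sorted-iceberg-write").getOrCreate()

// Hypothetical source; gcd_modified_utc is the day-partition column from the example above.
val df = spark.table("source_view")

df.sortWithinPartitions(col("gcd_modified_utc"))    // cluster rows by the partition column within each task
  .writeTo("catalog.gcdcore_bronze.target_table")   // hypothetical Iceberg table identifier
  .overwritePartitions()                            // dynamic partition overwrite, matching overwrite-mode: dynamic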
Error: Caused by Cannot write incompatible data to table¶
Message
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.sql.AnalysisException: Cannot write incompatible data to table
/gcdcore_bronze/gcdcore_fiscal_period':
- Cannot safely cast 'start_date': string to timestamp
- Cannot safely cast 'end_date': string to timestamp
- Cannot safely cast 'gcd_modified_utc': string to timestamp
at org.apache.spark.sql.errors.QueryCompilationErrors$.cannotWriteIncompatibleDataToTable...
What went wrong?
The incoming start_date, end_date, and gcd_modified_utc columns are strings, while the target table defines them as timestamps. Spark will not implicitly cast string to timestamp because the cast is not guaranteed to be safe, so the write is rejected.
Solution
Cast the date/time columns (start_date, end_date, and gcd_modified_utc) to timestamp before writing, so that the data matches the target table schema.
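The equivalent transformation in Spark, as a minimal sketch: the column names are taken from the error message, while the source view and the write call are assumptions for illustration.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_timestamp}

val spark = SparkSession.builder().appName("cast-to-timestamp").getOrCreate()

// Hypothetical source view whose date/time columns arrive as strings.
val raw = spark.table("source_view")

// Cast each date/time column to timestamp so the schema matches the target table.
val fixed = raw
  .withColumn("start_date", to_timestamp(col("start_date")))
  .withColumn("end_date", to_timestamp(col("end_date")))
  .withColumn("gcd_modified_utc", to_timestamp(col("gcd_modified_utc")))

fixed.writeTo("gcdcore_bronze.gcdcore_fiscal_period").append()   // table name taken from the error message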
Error: Job finished with error = java.lang.String cannot be cast to java.lang.Boolean¶
Message
Stopping Spark Application
Flare: Job finished with error=java.lang.String cannot be cast to java.lang.Boolean
...os.flare.exceptions.FlareException: java.lang.String cannot be cast to java.lang.Boolean
..ingContext.error(ProcessingContext.scala:87)
..addShutdownHook$1(Flare.scala:79)
..on$1.run(ShutdownHookThread.scala:37)
<-0] o.a.s.u.ShutDownHookManager: Shutdown hook called
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala)
Caused by: java.lang.ClassCastException: **java.lang.String cannot be cast to java.lang.Boolean**
at scala.runtime.BoxesRunTime.unboxToBoolean(BoxesRunTime.java:87)
at io.dataos.flare.configurations.job.input.File.getReader(File.scala:42)
at io.dataos.flare.configurations.job.input.DatasetInput.getReader(Input.scala:167)
at io.dataos.flare.configurations.job.input.Input.getReader(Input.scala:61)
at io.dataos.flare.configurations.job.JobConfiguration.$anonfun$readers$1(JobConfiguration...
What went wrong?
The value false was misspelled for a Boolean option in the batch-mode configuration, so Flare read it as a String and failed when casting it to a Boolean.