[Big Data] spark.sql.autoBroadcastJoinThreshold

Time: 2023-09-27 14:29:28

spark.sql.autoBroadcastJoinThreshold

22/05/25 08:06:47 INFO cluster.YarnClusterScheduler: Removed TaskSet 15.0, whose tasks have all completed, from pool
22/05/25 08:06:47 INFO scheduler.DAGScheduler: ResultStage 15 (run at ThreadPoolExecutor.java:1149) finished in 0.924 s
22/05/25 08:06:47 INFO scheduler.DAGScheduler: Job 11 finished: run at ThreadPoolExecutor.java:1149, took 59.881642 s
22/05/25 08:06:47 INFO memory.MemoryStore: Block broadcast_29 stored as values in memory (estimated size 81 MB, free 3.0 GB)
22/05/25 08:06:47 INFO memory.MemoryStore: Block broadcast_29_piece0 stored as bytes in memory (estimated size 544 KB, free 3.0 GB)
22/05/25 08:06:47 INFO storage.BlockManagerInfo: Added broadcast_29_piece0 in memory on 102513551:34057 (size: 544 KB, free: 3.0 GB)
22/05/25 08:06:47 INFO spark.SparkContext: Created broadcast 29 from run at ThreadPoolExecutor.java:1149
22/05/25 08:06:47 INFO codegen.CodeGenerator: Code generated in 43.295293 ms
22/05/25 08:06:47 ERROR datasources.FileFormatWriter: Aborting job null.
org.apache.spark.SparkException: Exception thrown in awaitResult:
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
    at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:123)
    at org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:248)


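The documentation describes this setting as the maximum size, in bytes, of a table that will be broadcast to all worker nodes when performing a join. Setting the value to -1 disables broadcasting. The exception in the log is thrown from BroadcastExchangeExec.doExecuteBroadcast, that is, while a table was being broadcast for a join, which is why this setting is the one to look at.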

So I adjusted this setting, and the job then ran to completion.
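
As a rough illustration of what "adjusting this setting" can look like, here is a minimal Python sketch. The post does not say which value was used, so the -1 (broadcast disabled) below is an assumption, and the helper names broadcast_conf, to_submit_args and apply_to_session are hypothetical names made up for this example. The only Spark API it relies on is spark.conf.set on an existing SparkSession; the rest is plain Python, so it runs without a cluster.

```python
from typing import Dict, List

# The configuration key discussed in this post.
BROADCAST_KEY = "spark.sql.autoBroadcastJoinThreshold"


def broadcast_conf(threshold_bytes: int) -> Dict[str, str]:
    """Build the override for the broadcast-join threshold.

    -1 disables automatic broadcast joins; any non-negative value is the
    maximum table size in bytes that Spark may broadcast.
    """
    if threshold_bytes < -1:
        raise ValueError("use -1 to disable, or a non-negative byte count")
    return {BROADCAST_KEY: str(threshold_bytes)}


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render the overrides as spark-submit style '--conf key=value' arguments."""
    args: List[str] = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


def apply_to_session(spark, conf: Dict[str, str]) -> None:
    """Apply the overrides to a live SparkSession through spark.conf.set."""
    for key, value in conf.items():
        spark.conf.set(key, value)


if __name__ == "__main__":
    # Assumption: disable broadcasting entirely, as the documentation allows.
    print(" ".join(to_submit_args(broadcast_conf(-1))))
```

Passing the printed arguments to spark-submit changes the setting for the whole job, while apply_to_session(spark, broadcast_conf(-1)) changes it only for queries planned afterwards in that session. Disabling broadcast makes Spark fall back to a non-broadcast join strategy, which may be slower for small tables, so a smaller positive threshold is another option worth trying.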