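Spark Source Code Analysis - The spark-submit Submission Process
Using Spark 3.2.0 as the baseline, a Spark application in yarn-cluster mode is submitted as follows.
How the Client Submits the AM to YARN
After a Spark job is submitted through spark-submit, the client side submits the ApplicationMaster to YARN as follows.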
Entry point: org.apache.spark.deploy.SparkSubmit#main:1052 (the number after each method name is its source line number)
sequenceDiagram
participant SparkSubmit
participant YarnClusterApplication
participant Client as org.apache.spark.deploy.yarn.Client
SparkSubmit->>SparkSubmit: main(1052)
SparkSubmit->>SparkSubmit: doSubmit(1043)
SparkSubmit->>SparkSubmit: doSubmit(90)
SparkSubmit->>SparkSubmit: submit(203)
SparkSubmit->>SparkSubmit: submit(165)
SparkSubmit->>SparkSubmit: runMain(898)
SparkSubmit->>SparkSubmit: prepareSubmitEnvironment(748)
SparkSubmit->>SparkSubmit: runMain(939)
SparkSubmit->>SparkSubmit: runMain(955)
Note right of SparkSubmit: Start YarnClusterApplication
SparkSubmit->>YarnClusterApplication: start(1675)
YarnClusterApplication->>Client: run(1268)
Client->>Client: submitApplication(203)
Client->>Client: createApplicationSubmissionContext(1032)
Client->>Client: submitApplication(207)
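The key decision in this chain is which class runMain ends up starting. The following is a minimal Python model of that dispatch, not Spark code: `SubmitArgs`, `prepare_submit_environment` and `run_main` are hypothetical names made up for illustration, and the `--class` / `--arg` layout of the child arguments reflects my reading of the 3.2.0 source, so treat it as an assumption. The idea it tries to show is that in yarn-cluster mode the user's main class is not run on the client; it is only passed along as an argument to YarnClusterApplication.

```python
from dataclasses import dataclass, field

# Child main class used for YARN cluster mode (the class shown in the diagram).
YARN_CLUSTER_SUBMIT_CLASS = "org.apache.spark.deploy.yarn.YarnClusterApplication"


@dataclass
class SubmitArgs:
    """Hypothetical stand-in for the parsed spark-submit arguments."""
    master: str
    deploy_mode: str
    main_class: str
    app_args: list = field(default_factory=list)


def prepare_submit_environment(args):
    """Toy model: return (child main class, child args) for the given mode."""
    if args.master == "yarn" and args.deploy_mode == "cluster":
        # Assumption: the user class and its arguments are forwarded as options.
        child_args = ["--class", args.main_class]
        for arg in args.app_args:
            child_args += ["--arg", arg]
        return YARN_CLUSTER_SUBMIT_CLASS, child_args
    # Any other mode in this toy model: run the user class directly.
    return args.main_class, list(args.app_args)


def run_main(args):
    """Toy model of runMain: record the calls that would follow, in order."""
    trace = []
    child_main_class, child_args = prepare_submit_environment(args)
    trace.append("start " + child_main_class)
    if child_main_class == YARN_CLUSTER_SUBMIT_CLASS:
        # Same order as the tail of the sequence diagram.
        trace.append("Client.run")
        trace.append("Client.submitApplication")
        trace.append("Client.createApplicationSubmissionContext")
        # Assumption: the last step hands the context to the YARN client.
        trace.append("YarnClient.submitApplication")
    return trace, child_args


if __name__ == "__main__":
    # com.example.MyApp is a made-up user class.
    steps, forwarded = run_main(
        SubmitArgs("yarn", "cluster", "com.example.MyApp", ["input.txt"])
    )
    for step in steps:
        print(step)
    print(forwarded)
```

In this sketch, any mode other than yarn + cluster falls through to starting the user class directly, which is a simplification: the real prepareSubmitEnvironment handles many more cluster managers and resource types.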
Driver Startup Process
Entry point: org.apache.spark.deploy.yarn.ApplicationMaster#main:913
sequenceDiagram
participant ApplicationMaster
ApplicationMaster->>ApplicationMaster: main(913)
ApplicationMaster->>ApplicationMaster: run(273)
ApplicationMaster->>ApplicationMaster: runDriver(501)
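The diagram stops at runDriver, so here is a rough Python model of what I understand that step to do: run the user's main method on a separate thread named "Driver", wait until the user code has created its context, register with the resource manager, and only then let the user code continue. This is a sketch under those assumptions, not the actual Scala implementation; `run_driver`, `create_context` and the event strings are all hypothetical.

```python
import threading
from concurrent.futures import Future


def run_driver(user_main, wait_seconds=5.0):
    """Toy model of the ApplicationMaster running user code in cluster mode."""
    events = []
    context_ready = Future()    # stands in for the promise the AM waits on
    resume = threading.Event()  # lets the AM release the paused user thread

    def create_context(name):
        # Called from user code: publish the context, then pause until the
        # AM has finished its own setup (assumed behaviour).
        context_ready.set_result(name)
        resume.wait()
        return name

    def driver_body():
        events.append("user main started")
        user_main(create_context)
        events.append("user main finished")

    # The user class runs on its own thread, named "Driver".
    driver_thread = threading.Thread(target=driver_body, name="Driver")
    driver_thread.start()

    # Block until the user code has created its context, or time out.
    ctx = context_ready.result(timeout=wait_seconds)
    events.append("AM registered for " + ctx)

    resume.set()          # let the user code carry on
    driver_thread.join()  # the AM stays alive until the user code ends
    return events


if __name__ == "__main__":
    def my_app(create_context):
        # Made-up user program: creating the context is its first step.
        create_context("my-spark-context")

    for event in run_driver(my_app):
        print(event)
```

If the user code never creates a context, `context_ready.result` raises a timeout error in this model, which mirrors the idea that the ApplicationMaster does not wait forever for the user program to initialize.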
From this point on, the user program runs on the Driver and calls Spark APIs to do the actual work, such as SparkSession#sql and DataFrame#show.
https://jszero.github.io/2025/08/09/Spark源码解析-spark-submit提交过程/