Spark Execution Flow:
Below are the three stages of the Spark execution model:
1. Create a DAG of RDDs to represent the computation (RDD lineage creation).
2. Create a logical execution plan for the DAG. The plan is split into "stages" wherever the data needs to be reorganized (shuffled). For the example job sketched after this list:
   Stage 1: HadoopRDD → map()
   Stage 2: groupBy() → mapValues() → collect()
3. Schedule and execute the individual tasks. Each stage is split into tasks:
• A task is data + computation: one partition of the data plus the operations to run on it.
• All tasks within a stage run before the next stage starts.
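
The minimal Scala sketch below ties the three steps together using the same pipeline as the stage breakdown above. The input path and the "group names by first letter" logic are illustrative assumptions chosen to match the groupBy() example, not something from the original post:

    import org.apache.spark.{SparkConf, SparkContext}

    object ExecutionFlowDemo {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("execution-flow-demo").setMaster("local[*]"))

        // Stage 1: HadoopRDD (from textFile) -> map()
        val names = sc.textFile("hdfs:///demo/names.txt") // placeholder path
          .filter(_.nonEmpty)                             // guard charAt(0) below
          .map(name => (name.charAt(0), name))            // key each name by its first letter

        // groupByKey() needs all values of a key in one place, so Spark inserts
        // a shuffle here -- this is the stage boundary from step 2.
        // Stage 2: groupByKey() -> mapValues() -> collect()
        val counts = names.groupByKey()
          .mapValues(ns => ns.toSet.size) // distinct names per first letter

        // Step 1 in action: print the RDD lineage; the indentation in the
        // output marks the shuffle (stage) boundary.
        println(counts.toDebugString)

        counts.collect().foreach(println)
        sc.stop()
      }
    }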
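
Continuing with the same sc, the task split of step 3 can be seen from the partition count: each stage runs one task per partition. The minimum-partition value below is an arbitrary illustration:

    // Asking textFile for at least 4 partitions gives Stage 1 at least 4 tasks;
    // each task applies map() to its own partition of the data.
    val parts = sc.textFile("hdfs:///demo/names.txt", 4) // placeholder path
    println(parts.getNumPartitions) // one task per partition within the stage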
For more info, refer to Aaron Davidson's Spark Summit 2014 talk "A Deeper Understanding of Spark Internals": https://spark-summit.org/2014/wp-content/uploads/2014/07/A-Deeper-Understanding-of-Spark-Internals-Aaron-Davidson.pdf