Spark Operator with YuniKorn
Introduction
The EKS cluster design for this Data on EKS blueprint is optimized for running Spark applications with the Spark Operator and Apache YuniKorn as the batch scheduler. The blueprint demonstrates two options for scaling Spark workloads: Cluster Autoscaler and Karpenter. AWS for Fluent Bit handles logging, and a combination of Prometheus, Amazon Managed Service for Prometheus, and open-source Grafana provides observability. Additionally, the Spark History Server and live Spark UI are configured for monitoring running Spark jobs through a Network Load Balancer (NLB) and the NGINX ingress controller.
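To illustrate how a job submitted through the Spark Operator can hand pod scheduling to YuniKorn, here is a minimal sketch of a SparkApplication manifest. The namespace, service account, queue name, and image tag are assumptions for this example, and the `batchScheduler` field requires a Spark Operator build with YuniKorn support:

```yaml
# Hypothetical example: namespace, queue, image, and sizes are placeholders.
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: pyspark-pi
  namespace: spark-team-a
spec:
  type: Python
  mode: cluster
  image: apache/spark:3.5.1
  mainApplicationFile: local:///opt/spark/examples/src/main/python/pi.py
  sparkVersion: "3.5.1"
  batchScheduler: yunikorn        # delegate driver/executor scheduling to YuniKorn
  batchSchedulerOptions:
    queue: root.default           # YuniKorn queue name (assumption)
  driver:
    cores: 1
    memory: 512m
    serviceAccount: spark
  executor:
    instances: 2
    cores: 1
    memory: 512m
```

With this in place, YuniKorn performs gang-style, queue-aware scheduling of the driver and executor pods instead of the default kube-scheduler.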
- Spark workloads with Karpenter
- Spark workloads with Cluster Autoscaler and managed node groups
- NVMe SSD instance storage for Spark shuffle data
- Spark Operator
- Deploying the solution
- Execute a sample Spark job with Karpenter
- Execute a sample Spark job with Cluster Autoscaler and managed node groups
- Example TPC-DS benchmark test
- Cleanup
Caution: To avoid unwanted charges to your AWS account, delete all the AWS resources created during this deployment.
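Since the blueprint is provisioned with Terraform, cleanup is typically the reverse of deployment. A minimal sketch, assuming the directory layout and cleanup script of the data-on-eks repository (verify the path and script name against the version you deployed before running):

```shell
# Sketch only: directory and script name are assumptions based on the
# data-on-eks repository layout. The script wraps 'terraform destroy'
# to remove the cluster, add-ons, and supporting AWS resources.
cd data-on-eks/analytics/terraform/spark-k8s-operator
./cleanup.sh
```

After the script completes, check the AWS console for any remaining resources (for example, load balancers or EBS volumes created outside Terraform) that may still incur charges.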