Introduction
Data on Amazon EKS (DoEKS) is a tool for building AWS-managed and self-managed scalable data platforms on Amazon EKS. With DoEKS, you have access to:
- Robust Deployment Infrastructure as Code (IaC) Templates using Terraform and AWS CDK, among others
- Best Practices for Deploying Data Solutions on Amazon EKS
- Detailed Performance Benchmark Reports
- Hands-on Samples of Apache Spark/ML Jobs and various other frameworks
- In-depth Reference Architectures and Data Blogs to keep you ahead of the curve
Architecture
The diagram shows the open source data tools, Kubernetes operators, and frameworks running on Kubernetes that DoEKS covers. It also shows how AWS Data Analytics managed services integrate with the Data on EKS open source tools.
Main Features
🚀 Custom Kubernetes Schedulers (e.g., Apache YuniKorn, Volcano)
🚀 Job Schedulers (e.g., Apache Airflow, Argo Workflows)
🚀 AI/ML on Kubernetes (e.g., Kubeflow, MLflow, TensorFlow, PyTorch)
🚀 Distributed Databases (e.g., Cassandra, CockroachDB, MongoDB)
🚀 Streaming Platforms (e.g., Apache Kafka, Apache Flink, Apache Beam)
Getting Started
Check out the documentation for each section to deploy infrastructure and run sample Spark/ML jobs.