Running a Kubernetes Job in Amazon EKS on AWS Fargate Using AWS Step Functions

In a previous AWS blog post, I shared an application orchestration process to run Amazon ECS tasks using AWS Step Functions. This post is a continuation of that one, but here we run the same application on Amazon EKS as a Kubernetes job on Fargate, using Step Functions.

Amazon EKS provides the flexibility to develop many container use cases, such as long-running jobs, web applications, microservices architectures, on-demand job execution, batch processing, and machine learning applications, with seamless integration with other AWS services. Kubernetes is an open source container orchestration engine for automating the deployment, scaling, and management of containerized applications. The open source project is hosted by the Cloud Native Computing Foundation (CNCF).

Read more here.


An example of running Amazon ECS tasks on AWS Fargate – Terraform (By HashiCorp)

AWS Fargate is a serverless compute engine that supports several common container use cases, like running microservices architecture applications, batch processing, machine learning applications, and migrating on-premises applications to the cloud without having to manage servers or clusters of Amazon EC2 instances.

In this blog, we will walk you through a use case of running an Amazon ECS task on AWS Fargate that can be initiated using AWS Step Functions. We will use Terraform to model the AWS infrastructure. The example solution leverages Amazon ECS, a scalable, high-performance container management service that supports Docker containers; the containers are provisioned by Fargate, which automatically scales, load balances, and manages scheduling of your containers for availability. For defining the infrastructure, you can use AWS CloudFormation, AWS CDK, or Terraform by HashiCorp. In the solution presented in this post, we use Terraform by HashiCorp, an AWS Partner Network (APN) Advanced Technology Partner and member of the AWS DevOps Competency.

Read More here


AWS Certified Advanced Networking Specialty!

Excited to start this year with AWS Certified Advanced Networking Specialty!

As per a couplet in the Tamil language, “katrathu kaiman alavu kallathathu ulagalavu”, a rough translation could be something like
“known is a droplet and unknown is an ocean”.

Started this year with this droplet in my pursuit of learning something new!

Refer my notes and learnings @

Thanks to


AWS Lambda PowerMockito – mocking static methods & DI

I thought of writing an article about mocking static methods in Java (there are quite a few available on the web). This one simply plugs in the combination of DI using Dagger and PowerMockito.

It is an example implementation of uploading and downloading (get) files from AWS S3, and of saving and reading rows from DynamoDB. The example provides JUnit tests using Mockito and PowerMockito.

1. Two sample lambdas (save and get) - to S3 and DynamoDB
2. Terraform implementation for deploying the lambda
3. JUnit for S3 and DynamoDB testing
4. Sample CLI commands to test

Read more


Amazon S3 Expiration

Deleting large S3 files using AWS Lambda & Amazon S3 Expiration policies reliably

Deleting large objects in AWS S3 can be cumbersome and sometimes demands repetition and retries, possibly because of the size (and number) of the files. One such scenario is handled today using hourly EMR job operations (one master and multiple nodes) on these S3 files. Because of the size and number of files, the process runs for long hours and occasionally has to be retried due to failures.

To make the deletion process more reliable and cheaper, we leveraged shorter compute execution using AWS Lambda in combination with AWS S3 lifecycle policies. The Lambda can be scheduled using a CloudWatch Events rule (or using AWS Step Functions, Apache Airflow, etc.). In this situation the SLA for the deletion of the files can be up to 48 hours.

This example provides infrastructure and sample Java code (for the Lambda) to delete the S3 files using a lifecycle policy that leverages object expiration.
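
As a rough illustration of the expiration approach, here is a minimal Python sketch that builds a lifecycle rule (the sample for this post is in Java; Python is used here only for brevity). The rule ID, the prefix, the bucket name and the one-day expiry are assumptions made for the example, not values from the original project. The dict follows the shape that boto3's put_bucket_lifecycle_configuration accepts as LifecycleConfiguration.

```python
def build_expiration_rule(prefix, days=1, rule_id="expire-large-objects"):
    """Build an S3 lifecycle configuration that expires objects under a prefix.

    rule_id and the default of 1 day are illustrative assumptions.
    """
    if days < 1:
        raise ValueError("S3 expiration needs at least 1 day")
    return {
        "Rules": [
            {
                "ID": rule_id,
                # Only objects whose key starts with this prefix are affected
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # S3 removes matching objects some time after they reach this age
                "Expiration": {"Days": days},
            }
        ]
    }


# Hypothetical prefix holding the files to be removed
lifecycle = build_expiration_rule("data/2020/10/15/")

# Applying the rule needs AWS credentials, so the call is left commented out:
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-example-bucket", LifecycleConfiguration=lifecycle)
```

The Lambda then only has to register the rule, which takes very little compute time; S3 performs the actual deletion asynchronously, which is why this fits a relaxed deletion SLA.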

Please refer for more details


AWS Lambda Alias Routing

Lambda Version Alias

This example shows how alias routing can be used in API Gateway and in Kinesis Streams (lambda event mapping to alias)

High level Steps

  • The solution has “src” and “templates”. src has the sample Java lambda. templates has the .tf files.
  • Upon “mvn clean package”, the “target” folder will have the generated jar.
  • “” has the AWS CLI/bash commands to deploy the lambda function and modify the alias accordingly
  1. Initial Setup
    • Build the Java source
    • Run the Terraform templates to create the API Gateway/Lambda. This pushes the above jar to the lambda
    • The created API points to the lambda:active alias. “red” and “black” aliases are also created at this point
  2. Modify the code
    • Run the “”. This is similar to Git pipelines deploying the function
    • At this point the CLI sets the current version as red and the new code being deployed as black
    • The CLI points the routing to black and adds a routing config weight for the fallback/red
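
The last step can be sketched in Python as follows. This is only an illustration, not the script from the repository: the function name, the version numbers and the 10% fallback weight are assumptions. The dict holds the keyword arguments that boto3's Lambda update_alias call accepts, where RoutingConfig.AdditionalVersionWeights sends a fraction of traffic to a second version.

```python
def build_alias_update(function_name, black_version, red_version,
                       red_weight=0.1, alias="active"):
    """Build update_alias arguments: main traffic to black, a share to red.

    black_version is the newly published version, red_version the previous
    one that is kept as fallback. red_weight=0.1 is an assumed value.
    """
    if not 0.0 <= red_weight <= 1.0:
        raise ValueError("red_weight must be between 0.0 and 1.0")
    return {
        "FunctionName": function_name,
        "Name": alias,
        # The alias points at the new (black) version...
        "FunctionVersion": black_version,
        # ...while a fraction of invocations still goes to the old (red) one
        "RoutingConfig": {"AdditionalVersionWeights": {red_version: red_weight}},
    }


# Hypothetical function name and version numbers
update = build_alias_update("sample-lambda", black_version="2", red_version="1")

# Calling AWS needs credentials, so the call is left commented out:
# import boto3
# boto3.client("lambda").update_alias(**update)
```

Setting red_weight to 0.0 would send all traffic to black once the new version is trusted.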

AWS Certified Data Analytics – Specialty

whoooh.. Knocked out one more certification. A real tough one, with a lot of practical streaming and ETL

Captured my learning here:

Thanks to Jayendra Patil, Siddharth Mehta, Tom Carpenter for their invaluable posts and lessons (Udemy)


AWS Certified Database – Specialty

whooh. Did my AWS Certified Database – Specialty Certificate on 10/15/2020. That’s quite an exam!!!


CDK & cdk8s

Build and Deploy .Net Core WebAPI Container to Amazon EKS using CDK & cdk8s

On Sep 4 2020, I wrote a blog where we leveraged the development capabilities of the CDK for Kubernetes framework, also known as cdk8s, along with the AWS Cloud Development Kit (AWS CDK) framework to provision infrastructure through AWS CloudFormation.

cdk8s is an open-source software development framework for defining Kubernetes applications and reusable abstractions using familiar programming languages and rich object-oriented APIs. cdk8s apps synthesize into standard Kubernetes manifests which can be applied to any Kubernetes cluster. cdk8s lets you define applications using TypeScript, JavaScript, and Python. In this blog we will use Python.

The AWS CDK is an open source software development framework to model and provision your cloud application resources using familiar programming languages, including TypeScript, JavaScript, Python, C# and Java.

To read more:


AWS Database (Purpose Built)

AWS offers many purpose-built databases. Note that this is an immensely wide topic. I thought of writing this for my own refresher.

AWS RDS – Relational Database:
– data is actually relational, ACID (atomicity, consistency, isolation, durability) compliant
– referential integrity
– static and unchanging
– ubiquitous – easy and available in many flavors. can handle different types of workloads
– RDS read replicas available for 6/6 DB engines
ex: shopper reading order history. query from the read replica instead of the main db

Anti pattern: can run a lot of workloads, but not a good fit for JSON objects or when there is no well-defined schema

DynamoDB – NoSQL
– fully managed, multi region
– multi primary, NoSQL
– built in security, backup and restore
– low latency key based queries. fast performance, handles throughput while maintaining consistency
– ex: product desc

for hot data -
    DAX caching - DAX serves cached items in a matter of microseconds
    reduces response times of eventually consistent read workloads from single digit milliseconds to microseconds
    devs need not modify their app logic

    ElastiCache for Redis and ElastiCache for Memcached

Amazon Redshift – Data Warehouse
for cold data – for analytics, columnar database services help
columnar storage tech improves I/O efficiency and parallelizes queries across multiple nodes for fast query performance
Analytics, trend monitoring and gaming insights to know what’s going on in the system
ex: daily report, allows pulling from the data lake
uses standard SQL.

- AQUA - Advanced Query Accelerator. a cache that does a substantial share of data processing in place on the cache,
  enabling Redshift to run up to 10x faster

Other options/flavors cover
financial requirements, document storage requirements or mobile game specific communities

Below are some more purpose-built database service options available in AWS

Amazon QLDB – ledger database with a transparent, cryptographically verifiable transaction log. Immutable
when you need to keep track of financial activity in an organization

Amazon DocumentDB (with MongoDB compatibility) for storage of JSON – fault tolerant, self healing storage that supports automatic data scaling
allows customers to scale from 10GB all the way up to 64TB per database cluster

Amazon Neptune – Graph database option
Property Graph and W3C’s RDF along with their respective query languages
ex: Games – connectivity between each player
Supports Apache TinkerPop Gremlin and SPARQL

Amazon Keyspaces – AWS managed Cassandra compatible service
if the customer needs a wide column store –
automatically supports three replicas that can be distributed across different AZs

Amazon Timestream – Time series database. need to analyze billions/trillions of data points
ex: user activity log. ex: trivago
automates rollups, retention, tiering and compression of data
ex: data arriving constantly from device sensors

Data patterns

-read heavy
    lot of incoming data, with more query processing on it. there is read contention because a lot of records are
    queried over and over again
    Option - a materialized view, a database object that stores precomputed query results

-micro service
    avoid sharing data between micro services

- break monolith into small domains which could be microservice or a collection of microservices

- isolate bounded context - part of domain driven design - back your data based on functionality

- Event sourcing and CQRS (command query responsibility segregation)
    changing the parts of the app to define which is responsible for reading vs writing

- SAGA Pattern - sequence of events that happen in a distributed transaction
    orchestrate/choreograph using orchestrators like Step Functions to coordinate activities that perform/trigger a transactional boundary to commit/rollback across a distributed design
    only works when we can embrace eventual consistency
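
To make the saga idea concrete, here is a small in-memory sketch of an orchestrated saga in Python. It is not Step Functions code: the step names, the inventory dict and the failing payment are made up for the example. Each step pairs an action with a compensation, and when a step fails the orchestrator undoes the already completed steps in reverse order.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    On the first failure, run the compensations of the completed steps
    in reverse order. Returns (succeeded, log).
    """
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            # Roll back what already happened, newest first
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"undone:{done_name}")
            return False, log
        log.append(f"done:{name}")
        completed.append((name, compensate))
    return True, log


# Hypothetical order flow: each service owns its own data
inventory = {"widgets": 5}

def reserve_inventory():
    inventory["widgets"] -= 1

def release_inventory():
    inventory["widgets"] += 1

def charge_payment():
    raise RuntimeError("card declined")  # simulated failure

def refund_payment():
    pass  # nothing to refund in this sketch

ok, log = run_saga([
    ("reserve_inventory", reserve_inventory, release_inventory),
    ("charge_payment", charge_payment, refund_payment),
])
```

Between the reservation and its compensation the inventory is temporarily out of step with the payment service, which is the eventual consistency the pattern asks us to accept.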

AWS Database Migrations

Consider a source (on prem) –> RDS / EC2 migration.

Types of migration

  • like to like (homogeneous) – moving between database instances using the same db tech
  • like to unlike (heterogeneous) – migrating to a completely different database technology

Options available for migration

  • DMS – Database Migration Service
  • third party
  • partner
  • backup and restore

for like to like – DMS uses a replication instance and keeps the source available/online. DB engines are compatible

for like to unlike pattern – SCT (Schema Conversion Tool) + DMS can help
if the data structures are not the same they need to be converted to a compatible type
ex: one-off migration, source to target, consolidate databases, dev and test

Snowball Edge:
if the bandwidth or infrastructure is not available, Snowball Edge can be used to copy the data into S3, from where it can be migrated