Categories
AWS

Run a data processing job on Amazon EMR Serverless with AWS Step Functions

Update, Feb 2023: AWS Step Functions has added direct integrations for 35 services, including Amazon EMR Serverless. In the current version of this blog, we use an AWS Lambda function to submit the job to EMR Serverless; now, you can submit an EMR Serverless job by invoking the APIs directly from a Step Functions workflow. Read more about this here.

There are several infrastructure as code (IaC) frameworks available today to help you define your infrastructure, such as the AWS Cloud Development Kit (AWS CDK) or Terraform by HashiCorp. Terraform, an AWS Partner Network (APN) Advanced Technology Partner and member of the AWS DevOps Competency, is an IaC tool similar to AWS CloudFormation that allows you to create, update, and version your AWS infrastructure. Terraform provides friendly syntax (similar to AWS CloudFormation) along with other features like planning (visibility into changes before they are applied), graphing, and the ability to break infrastructure configurations into smaller, reusable templates, which allows better maintainability and reusability. We use these capabilities and features of Terraform to build this solution’s infrastructure on AWS. Let’s get started!

In this post, we showcase how to build and orchestrate a Scala Spark application using Amazon EMR Serverless, AWS Step Functions, and Terraform. In this end-to-end solution, we run a Spark job on EMR Serverless that processes sample clickstream data in an Amazon Simple Storage Service (Amazon S3) bucket and stores the aggregation results in Amazon S3.

With EMR Serverless, you don’t have to configure, optimize, secure, or operate clusters to run applications. You continue to get the benefits of Amazon EMR, such as open source compatibility, concurrency, and optimized runtime performance for popular data frameworks. EMR Serverless is suitable for customers who want the ease of operating applications built on open-source frameworks. It offers quick job startup, automatic capacity management, and straightforward cost controls.
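As a hedged sketch of what the Lambda-based job submission boils down to (the application ID, role ARN, S3 paths, and class name below are placeholders, not values from the post), here is an EMR Serverless StartJobRun call using the AWS SDK for Java v2:

```java
import software.amazon.awssdk.services.emrserverless.EmrServerlessClient;
import software.amazon.awssdk.services.emrserverless.model.JobDriver;
import software.amazon.awssdk.services.emrserverless.model.SparkSubmit;
import software.amazon.awssdk.services.emrserverless.model.StartJobRunRequest;
import software.amazon.awssdk.services.emrserverless.model.StartJobRunResponse;

public class SubmitSparkJob {
    public static void main(String[] args) {
        try (EmrServerlessClient emr = EmrServerlessClient.create()) {
            // Spark driver: the Scala Spark JAR plus its input/output arguments (placeholders)
            SparkSubmit sparkSubmit = SparkSubmit.builder()
                    .entryPoint("s3://my-bucket/jars/clickstream-aggregator.jar")
                    .entryPointArguments("s3://my-bucket/input/", "s3://my-bucket/output/")
                    .sparkSubmitParameters("--class com.example.ClickstreamApp")
                    .build();

            // Submit the job run to an existing EMR Serverless application
            StartJobRunResponse response = emr.startJobRun(StartJobRunRequest.builder()
                    .applicationId("00fexampleapplid")                                          // placeholder
                    .executionRoleArn("arn:aws:iam::123456789012:role/emr-serverless-job-role") // placeholder
                    .jobDriver(JobDriver.builder().sparkSubmit(sparkSubmit).build())
                    .build());

            System.out.println("Started job run: " + response.jobRunId());
        }
    }
}
```

The direct Step Functions integration mentioned in the update above performs this same StartJobRun call without the intermediate Lambda function.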

Please read more on the blog here.

Categories
AWS

Build and Deploy a Microsoft .NET Core Web API application to AWS App Runner using CloudFormation

Container workload management tasks, such as managing deployments, scaling infrastructure, or keeping it updated, can get cumbersome. AWS App Runner is a great alternative for customers without any prior container or infrastructure experience, as it is a fully managed service that takes care of building and deploying your application, load balancing traffic, and autoscaling up or down per your application’s needs. App Runner retrieves your source code from GitHub or your source image from an Amazon ECR repository in your AWS account, and creates and maintains a running web service for you in the AWS Cloud.

In this blog we show you how to build a Microsoft .NET Web API application backed by an Amazon Aurora database and deploy it using AWS App Runner. AWS App Runner makes it easy for developers to quickly deploy containerized web applications and APIs, and lets us start from either source code or a container image.
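The post itself deploys the service with CloudFormation; purely as an illustrative sketch of the same CreateService operation through the AWS SDK for Java v2 (the service name, image URI, port, and role ARN are placeholders), it might look like this:

```java
import software.amazon.awssdk.services.apprunner.AppRunnerClient;
import software.amazon.awssdk.services.apprunner.model.AuthenticationConfiguration;
import software.amazon.awssdk.services.apprunner.model.CreateServiceRequest;
import software.amazon.awssdk.services.apprunner.model.CreateServiceResponse;
import software.amazon.awssdk.services.apprunner.model.ImageConfiguration;
import software.amazon.awssdk.services.apprunner.model.ImageRepository;
import software.amazon.awssdk.services.apprunner.model.ImageRepositoryType;
import software.amazon.awssdk.services.apprunner.model.SourceConfiguration;

public class CreateAppRunnerService {
    public static void main(String[] args) {
        try (AppRunnerClient appRunner = AppRunnerClient.create()) {
            CreateServiceResponse resp = appRunner.createService(CreateServiceRequest.builder()
                    .serviceName("dotnet-web-api") // placeholder
                    .sourceConfiguration(SourceConfiguration.builder()
                            .imageRepository(ImageRepository.builder()
                                    // Placeholder ECR image URI for the containerized .NET Web API
                                    .imageIdentifier("123456789012.dkr.ecr.us-east-1.amazonaws.com/dotnet-api:latest")
                                    .imageRepositoryType(ImageRepositoryType.ECR)
                                    .imageConfiguration(ImageConfiguration.builder().port("80").build())
                                    .build())
                            .autoDeploymentsEnabled(true) // redeploy automatically on new image pushes
                            .authenticationConfiguration(AuthenticationConfiguration.builder()
                                    .accessRoleArn("arn:aws:iam::123456789012:role/AppRunnerECRAccessRole") // placeholder
                                    .build())
                            .build())
                    .build());
            System.out.println("Service URL: https://" + resp.service().serviceUrl());
        }
    }
}
```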

Read more here

Categories
AWS

AWS Batch Application Orchestration using AWS Fargate

Many customers prefer to use Docker images with AWS Batch and AWS CloudFormation for cost-effective and faster processing of complex jobs. To run batch workloads in the cloud, customers have to consider various orchestration needs, such as queueing workloads, submitting them to a compute resource, prioritizing jobs, handling dependencies and retries, scaling compute, and tracking utilization and resource management. While AWS Batch simplifies all the queuing, scheduling, and lifecycle management for customers, and even provisions and manages compute in the customer account, customers continue to look for even more time-efficient and simpler workflows to get their application jobs up and running in minutes.
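Once the job queue and job definition exist, submitting work is a single API call. Here is a minimal sketch using the AWS SDK for Java v2; the job, queue, and job definition names are placeholders, not values from the post:

```java
import software.amazon.awssdk.services.batch.BatchClient;
import software.amazon.awssdk.services.batch.model.SubmitJobRequest;
import software.amazon.awssdk.services.batch.model.SubmitJobResponse;

public class SubmitBatchJob {
    public static void main(String[] args) {
        try (BatchClient batch = BatchClient.create()) {
            // Submit a job to a Fargate-backed job queue; Batch handles queueing,
            // scheduling, retries, and compute lifecycle (names are placeholders)
            SubmitJobResponse resp = batch.submitJob(SubmitJobRequest.builder()
                    .jobName("sample-batch-job")
                    .jobQueue("fargate-job-queue")
                    .jobDefinition("sample-job-definition")
                    .build());
            System.out.println("Submitted job: " + resp.jobId());
        }
    }
}
```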

Read more here.

Categories
AWS

Running a Kubernetes Job in Amazon EKS on AWS Fargate Using AWS Step Functions

In a previous AWS blog post, I shared an application orchestration process to run Amazon ECS tasks using AWS Step Functions. This blog is a continuation of that post, but here we run the same application on Amazon EKS as a Kubernetes job on Fargate, using Step Functions.

Amazon EKS provides the flexibility to support many container use cases, such as long-running jobs, web applications, microservices architectures, on-demand job execution, batch processing, and machine learning applications, with seamless integration with other AWS services. Kubernetes is an open-source container orchestration engine for automating the deployment, scaling, and management of containerized applications. The open-source project is hosted by the Cloud Native Computing Foundation (CNCF).
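As a hedged sketch of the workflow’s entry point (the state machine ARN and input below are placeholders; the post drives the Kubernetes job through Step Functions), starting the execution with the AWS SDK for Java v2 looks like this:

```java
import software.amazon.awssdk.services.sfn.SfnClient;
import software.amazon.awssdk.services.sfn.model.StartExecutionRequest;
import software.amazon.awssdk.services.sfn.model.StartExecutionResponse;

public class StartK8sJobWorkflow {
    public static void main(String[] args) {
        try (SfnClient sfn = SfnClient.create()) {
            // Start the state machine that submits the Kubernetes job to EKS on Fargate
            StartExecutionResponse resp = sfn.startExecution(StartExecutionRequest.builder()
                    .stateMachineArn("arn:aws:states:us-east-1:123456789012:stateMachine:eks-k8s-job") // placeholder
                    .input("{\"jobName\": \"sample-k8s-job\"}") // placeholder input
                    .build());
            System.out.println("Execution started: " + resp.executionArn());
        }
    }
}
```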

Read more here.

Categories
AWS

An example of running Amazon ECS tasks on AWS Fargate – Terraform (By HashiCorp)

AWS Fargate is a serverless compute engine that supports several common container use cases, like running microservices architecture applications, batch processing, machine learning applications, and migrating on-premises applications to the cloud, without having to manage servers or clusters of Amazon EC2 instances.

In this blog, we will walk you through a use case of running an Amazon ECS task on AWS Fargate that can be initiated using AWS Step Functions. We will use Terraform to model the AWS infrastructure. The example solution leverages Amazon ECS, a scalable, high-performance container management service that supports Docker containers, provisioned by Fargate to automatically scale, load balance, and manage the scheduling of your containers for availability. For defining the infrastructure, you can use AWS CloudFormation, AWS CDK, or Terraform by HashiCorp. In the solution presented in this post, we use Terraform by HashiCorp, an AWS Partner Network (APN) Advanced Technology Partner and member of the AWS DevOps Competency.
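Under the hood, the Step Functions task performs an ECS RunTask call. Purely as an illustrative sketch (the post models this in Terraform and Step Functions rather than application code; the cluster, task definition, and subnet values are placeholders), the equivalent call with the AWS SDK for Java v2 is:

```java
import software.amazon.awssdk.services.ecs.EcsClient;
import software.amazon.awssdk.services.ecs.model.AssignPublicIp;
import software.amazon.awssdk.services.ecs.model.AwsVpcConfiguration;
import software.amazon.awssdk.services.ecs.model.LaunchType;
import software.amazon.awssdk.services.ecs.model.NetworkConfiguration;
import software.amazon.awssdk.services.ecs.model.RunTaskRequest;
import software.amazon.awssdk.services.ecs.model.RunTaskResponse;

public class RunFargateTask {
    public static void main(String[] args) {
        try (EcsClient ecs = EcsClient.create()) {
            RunTaskResponse resp = ecs.runTask(RunTaskRequest.builder()
                    .cluster("sample-cluster")       // placeholder cluster
                    .taskDefinition("sample-task:1") // placeholder task definition and revision
                    .launchType(LaunchType.FARGATE)  // Fargate provisions the compute
                    .networkConfiguration(NetworkConfiguration.builder()
                            .awsvpcConfiguration(AwsVpcConfiguration.builder()
                                    .subnets("subnet-0example") // placeholder subnet
                                    .assignPublicIp(AssignPublicIp.ENABLED)
                                    .build())
                            .build())
                    .build());
            resp.tasks().forEach(t -> System.out.println("Started task: " + t.taskArn()));
        }
    }
}
```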

Read more here

Categories
AWS

AWS Certified Advanced Networking Specialty!

Excited to start this year with AWS Certified Advanced Networking Specialty!

As per a couplet in the Tamil language, “katrathu kaiman alavu kallathathu ulagalavu”, a rough translation could be something like
“known is a droplet and unknown is an ocean”.

I started this year with this droplet in my pursuit of learning something new!

Refer to my notes and learnings at https://www.cloudopsguru.com/#/Network

Thanks to

Categories
AWS

AWS Lambda PowerMockito – mocking static methods & DI

I thought of writing an article about mocking static methods in Java (there are quite a few available on the web), plugging in the combination of DI using Dagger and PowerMockito.

The example implements uploading and downloading/getting files from AWS S3, and saving and reading rows from DynamoDB. It provides JUnit testing using Mockito & PowerMockito:

1. Two sample lambdas (save and get) - to S3 and DynamoDB
2. Terraform implementation for deploying the lambda
3. JUnit for S3 and DynamoDB testing
4. Sample CLI commands to test
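As a minimal sketch of the core technique (the utility class and method here are hypothetical, not from the repo), mocking a static method with PowerMockito under JUnit 4 looks like this:

```java
import static org.junit.Assert.assertEquals;
import static org.mockito.Mockito.when;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.powermock.api.mockito.PowerMockito;
import org.powermock.core.classloader.annotations.PrepareForTest;
import org.powermock.modules.junit4.PowerMockRunner;

// Hypothetical utility with a static method (not from the repo)
class S3KeyUtil {
    static String bucketFor(String tenant) {
        return "prod-" + tenant; // real logic we want to bypass in tests
    }
}

@RunWith(PowerMockRunner.class)  // PowerMock's JUnit 4 runner
@PrepareForTest(S3KeyUtil.class) // instrument the class whose statics are mocked
public class S3KeyUtilTest {

    @Test
    public void staticMethodCanBeStubbed() {
        PowerMockito.mockStatic(S3KeyUtil.class);
        when(S3KeyUtil.bucketFor("acme")).thenReturn("test-bucket");

        assertEquals("test-bucket", S3KeyUtil.bucketFor("acme"));
    }
}
```

In the repo, the same idea is combined with Dagger-provided dependencies so the S3 and DynamoDB clients can be swapped for mocks in the JUnit tests.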

Read more at https://github.com/shivaramani/aws-lambda-mockito

Categories
AWS

Amazon S3 Expiration

Reliably deleting large S3 files using AWS Lambda & Amazon S3 expiration policies

Deleting large objects in Amazon S3 can be cumbersome and sometimes demands repetition and retries, owing to the large size (and number) of the files. One such scenario is handled today using hourly EMR jobs (one master and multiple nodes) operating on these S3 files. Due to the large size and number of files, the process runs for long hours and occasionally has to be retried after failures.

To make the deletion process more reliable and cheaper, we leveraged short compute executions using AWS Lambda in combination with S3 lifecycle policies. The Lambda function can be scheduled using a CloudWatch Events rule (or AWS Step Functions, Apache Airflow, etc.). In this situation, the SLA for the deletion of the files can be up to 48 hours.

This example provides the infrastructure and sample Java code (for the Lambda function) to delete the S3 files using a lifecycle policy, leveraging object-expiration techniques.
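The repo contains the actual Lambda code; as a hedged sketch of the approach (the bucket name, prefix, and expiry below are placeholders), applying such an expiration rule with the AWS SDK for Java v2 could look like this:

```java
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.BucketLifecycleConfiguration;
import software.amazon.awssdk.services.s3.model.ExpirationStatus;
import software.amazon.awssdk.services.s3.model.LifecycleExpiration;
import software.amazon.awssdk.services.s3.model.LifecycleRule;
import software.amazon.awssdk.services.s3.model.LifecycleRuleFilter;
import software.amazon.awssdk.services.s3.model.PutBucketLifecycleConfigurationRequest;

public class ApplyExpirationPolicy {
    public static void main(String[] args) {
        try (S3Client s3 = S3Client.create()) {
            // Expire (delete) objects under a prefix one day after creation;
            // bucket name and prefix are placeholders
            LifecycleRule rule = LifecycleRule.builder()
                    .id("expire-staged-files")
                    .filter(LifecycleRuleFilter.builder().prefix("to-delete/").build())
                    .expiration(LifecycleExpiration.builder().days(1).build())
                    .status(ExpirationStatus.ENABLED)
                    .build();

            s3.putBucketLifecycleConfiguration(PutBucketLifecycleConfigurationRequest.builder()
                    .bucket("my-bucket")
                    .lifecycleConfiguration(BucketLifecycleConfiguration.builder().rules(rule).build())
                    .build());
        }
    }
}
```

S3 then deletes the matching objects asynchronously, which is why the SLA mentioned above can be up to 48 hours.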

Please refer to https://github.com/shivaramani/lambda-s3-lifecycle-deletion-process for more details

Categories
AWS

AWS Lambda Alias Routing

Lambda Version Alias

This example shows how alias routing can be used in API Gateway and in Kinesis Data Streams (Lambda event source mapping to an alias).

https://github.com/shivaramani/lambdaversion

High-level steps

  • The solution has “src” and “templates”. src has the sample Java Lambda; templates has the .tf files.
  • Upon “mvn clean package”, the “target” folder will have the generated jar.
  • “exec.sh” has the AWS CLI/bash commands to deploy the Lambda function and modify the alias accordingly.
  1. Initial setup
    • Build the Java source.
    • Run the Terraform templates to create the API Gateway and Lambda. This pushes the above jar to the Lambda.
    • The created API points to the lambda:active alias. The “red” and “black” aliases are also created at this point.
  2. Modify the code
    • Run “exec.sh”. This is similar to Git pipelines deploying the function.
    • At this point the CLI sets the current version as “red” and the newly deployed code as “black”.
    • The CLI sets the routing to point to “black”, with an additional routing-config weight for the fallback (“red”), as sketched below.
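The repo’s exec.sh performs that final weighted-routing step with the AWS CLI; as a hedged Java rendering of the same UpdateAlias call via the AWS SDK for Java v2 (the function name and version numbers are placeholders), it could look like:

```java
import java.util.Map;

import software.amazon.awssdk.services.lambda.LambdaClient;
import software.amazon.awssdk.services.lambda.model.AliasRoutingConfiguration;
import software.amazon.awssdk.services.lambda.model.UpdateAliasRequest;

public class ShiftAliasTraffic {
    public static void main(String[] args) {
        try (LambdaClient lambda = LambdaClient.create()) {
            // Point the alias at the new version ("black"), keeping 10% of traffic
            // on the previous version ("red") as a fallback; names/versions are placeholders
            lambda.updateAlias(UpdateAliasRequest.builder()
                    .functionName("sample-function")
                    .name("active")
                    .functionVersion("2") // new "black" version
                    .routingConfig(AliasRoutingConfiguration.builder()
                            .additionalVersionWeights(Map.of("1", 0.10)) // old "red" version
                            .build())
                    .build());
        }
    }
}
```

Shifting the weight gradually toward the new version gives a simple canary-style rollout, with “red” as the fallback.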
Categories
AWS

AWS Certified Data Analytics – Specialty

Whoooh… knocked out one more certification. A real tough one, and practical: streaming and ETL-ing.

Captured my learning here: https://www.cloudopsguru.com/#/DataAnalytics

Thanks to Jayendra Patil, Siddharth Mehta, Tom Carpenter for their invaluable posts and lessons (Udemy)

www.jayendrapatil.com

https://www.udemy.com/course/aws-serverless-glue-redshift-spectrum-athena-quicksight-training/

https://www.udemy.com/course/total-aws-certified-database-specialty-exam-prep-dbs-c01