Howdy! I was able to crack the AWS Certified AI Practitioner (AIF-C01) certification recently. Here are the resources I used:
- Udemy course by Stephane Maarek – https://www.udemy.com/course/aws-ai-practitioner-certified
- AWS Skill Builder exam prep plan for the AI Practitioner (AIF-C01) – https://explore.skillbuilder.aws/learn/public/learning_plan/view/2194/enhanced-exam-prep-plan-aws-certified-ai-practitioner-aif-c01
- AWS documentation – nothing comes closer than the AWS service overviews and FAQs:
  - https://aws.amazon.com/sagemaker/faqs
  - https://aws.amazon.com/q/faqs/
  - https://aws.amazon.com/polly/faqs/
  - https://aws.amazon.com/transcribe/faqs
  - https://aws.amazon.com/textract/faqs/
  - https://aws.amazon.com/kendra/faqs/
  - https://aws.amazon.com/personalize/faqs/
  - https://aws.amazon.com/rekognition/faqs
  - https://aws.amazon.com/lex
  - https://aws.amazon.com/comprehend/faqs/
  - https://aws.amazon.com/translate/faqs/
AWS Services Notes
| # | Service | Details | Comments |
| --- | --- | --- | --- |
| 1 | SageMaker | Prepare data and build, train, and deploy machine learning (ML) models | End-to-end managed service |
| | SageMaker Studio | Single, web-based visual interface to perform all ML development steps | Prepare data; build, train, and tune models; upload data; create notebooks; move back and forth between steps to adjust experiments; compare results; deploy models to production |
| | SageMaker Data Wrangler | Data preparation, transformation, and feature engineering. Preps tabular and image data for ML. Single interface for data selection, cleansing, exploration, visualization, and processing. SQL support and a Data Quality tool | Use case: music dataset with song ratings and listening durations |
| | SageMaker Canvas | No-code, visual drag-and-drop interface to build/tune/train models. Build your own custom model using AutoML; leverages Data Wrangler | Lets business analysts build ML models and generate accurate predictions without writing code or requiring ML expertise |
| | SageMaker Clarify | Evaluate foundation models (compare Model A vs. Model B) using built-in datasets or your own, built-in metrics and algorithms, or human evaluators (your own employees or AWS-provided ones). Model explainability: debug predictions to increase trust | Detects potential bias |
| | SageMaker Feature Store | Store, share, and manage features of ML models across multiple teams during model development; ingests features from a variety of sources | |
| | SageMaker Ground Truth / Ground Truth Plus | Labeling: identify raw data such as images, text files, and videos, and add informative labels to create high-quality training datasets for your ML model. Also used for RLHF (reinforcement learning from human feedback) and model review, customization, and evaluation | |
| | SageMaker Studio Notebooks | Jupyter notebooks in SageMaker for the complete ML development cycle | |
| | SageMaker Studio Lab | ML development environment that provides the compute, storage (up to 15 GB), and security | |
| | SageMaker HyperPod | Train models; purpose-built to accelerate foundation model (FM) training | |
| | SageMaker Experiments | Organize and track iterations of ML models | |
| | SageMaker Debugger | Captures real-time metrics during training | Monitors CPUs, GPUs, network, and memory |
| | SageMaker Serverless Inference | Deploy and scale ML models without managing the underlying infrastructure | |
| | SageMaker Edge Manager | Optimize, secure, monitor, and maintain ML models on fleets of edge devices | Smart cameras, robots, personal computers, and mobile devices |
| | SageMaker Neo | After training, compile the model with Neo: train once, run anywhere in the cloud | Supports the most popular DL models (AlexNet, ResNet, VGG, Inception, MobileNet, SqueezeNet, and DenseNet) trained in MXNet and TensorFlow, plus classification and Random Cut Forest models trained in XGBoost |
| | SageMaker Model Monitor | Monitors the quality of SageMaker ML models (data and model drift) and raises alerts | Continuous (real-time endpoints, batch transform jobs) or scheduled (asynchronous batch transform jobs) |
| | SageMaker Model Registry | Centralized repository to track, manage, and version models | Catalog models, manage model versions, associate metadata with a model, manage approval status, automate model deployment, share models |
| | SageMaker Pipelines | CI/CD for ML: automates the process of building, training, and deploying models | Step types: processing, training, tuning, AutoML, model, ClarifyCheck, QualityCheck |
| | SageMaker Model Cards | Model documentation, not feature management | Use case: intended uses, risk ratings, and training details |
| | SageMaker JumpStart | ML hub of pretrained foundation models, computer vision models, and NLP models for quickly deploying and consuming a foundation model (FM) within a team's VPC | Models can be fully customized, or prebuilt solutions can be deployed as-is; designed to simplify and accelerate deployment of ML models, including foundation models |
| | SageMaker Role Manager | Define roles for personas | Ex: data scientist, analyst |
| 2 | Amazon Bedrock | Fully managed service that provides access to foundation models from Amazon and leading AI companies through a single API, for building generative AI applications | |
| 3 | Amazon Q | Generative AI–powered assistant for accelerating software development and leveraging companies' internal data | |
| | Amazon Q Developer | Coding, testing, and upgrading applications; diagnosing errors; performing security scanning and fixes; optimizing AWS resources | |
| | Amazon Q Business | Answers questions, provides summaries, generates content, and securely completes tasks based on data and information in your enterprise systems | |
| | Amazon Q in QuickSight | Unified business intelligence (BI): multi-visual Q&A responses and AI-driven insights | Gives customers a generative BI assistant: business analysts can use natural language to build BI dashboards in minutes, including visualizations and complex calculations |
| | Amazon Q in Connect | Uses the real-time conversation with the customer, along with relevant company content, to automatically recommend what to say or what actions an agent should take to better assist customers | |
| | Amazon Q in Supply Chain | Inventory managers and supply/demand planners can ask intelligent questions about what is happening in their supply chain, why it is happening, and what actions to take; they can also explore what-if scenarios to understand the trade-offs between different supply chain choices | |
| 4 | Amazon Comprehend | NLP: detects the language, extracts key phrases; custom classifiers organize documents into categories; analyzes text using tokenization | Text and documents. Ex: analyze emails, group articles, sentiment analysis |
| 5 | Amazon Translate | Natural and accurate language translation; custom terminology via CSV/TSV/TMX files | Text and documents. Use cases: localize websites and applications for international users |
| 6 | Amazon Textract | Extracts text, handwriting, and data from scanned documents using AI/ML | Text and documents. Use cases: financial services, healthcare, public sector |
| 7 | Amazon Rekognition | Finds objects, people, text, and scenes in images and videos. Custom Labels: identify specific objects. Content moderation: detect inappropriate or unwanted content. Custom Moderation Adapters: extend Rekognition's capabilities with your own data | Vision. Use cases: labeling, content moderation, text detection, face detection, filtering out harmful images |
| 8 | Amazon Kendra | Document search service: extracts answers from docs (text/PDF/HTML/PPT/Word, etc.); natural-language search capabilities; builds a knowledge index powered internally by ML | Search |
| 9 | Amazon Lex | Conversational AI using voice and text, in multiple languages; integrates with Lambda, Connect, Comprehend, Kendra | Chatbots |
| 10 | Amazon Polly | Converts text to speech: lexicons, SSML, multiple voice engines (including generative), speech marks | Speech |
| 11 | Amazon Transcribe | Converts speech to text using a deep learning process called automatic speech recognition. Removes PII using redaction; supports automatic language identification for multilingual audio. Custom vocabularies capture domain-specific/non-standard terms; custom language models add domain-specific context | Speech |
| 12 | Amazon Personalize | Real-time personalized recommendations | Recommendations. Ex: retail stores, media and entertainment |
| 13 | AWS DeepRacer | Console to train and evaluate deep reinforcement learning models | |
| 14 | Amazon Forecast | ML to deliver highly accurate forecasts | Use cases: predict future sales, product demand planning, financial planning, resource planning |
| 15 | Amazon Mechanical Turk | Crowdsourcing marketplace: a distributed virtual workforce; integrates with Amazon A2I, SageMaker Ground Truth, etc. | Use cases: label 1,000,000 images; data collection; business processing |
| 16 | Amazon Augmented AI (A2I) | Human oversight of machine learning predictions in production | Reviewers can be your own employees or AWS contractors |
| 17 | Amazon Comprehend Medical / Amazon Transcribe Medical | Extract information from unstructured medical text and transcribe medical speech, respectively | |
| 18 | Amazon's hardware for AI | AWS Trainium (Trn1 instances) for training; AWS Inferentia, an ML chip built to deliver inference; plus EC2, EBS, EFS, ELB, ASG | GPU-based EC2 instances: P3, P4, P5, … and G3 … G6 |
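To make a few of these service boundaries concrete, here is a minimal boto3 sketch that chains Comprehend (sentiment), Translate, and Polly (speech). It assumes AWS credentials and a default region are already configured; the sample text and the voice ID are arbitrary illustrations, not anything from the notes above.

```python
import boto3

# Assumes AWS credentials and a default region are already configured.
comprehend = boto3.client("comprehend")
translate = boto3.client("translate")
polly = boto3.client("polly")

text = "The new checkout flow is fantastic, but shipping was slow."

# Amazon Comprehend (NLP): detect the dominant sentiment of the text.
sentiment = comprehend.detect_sentiment(Text=text, LanguageCode="en")
print(sentiment["Sentiment"], sentiment["SentimentScore"])

# Amazon Translate: translate the same text into Spanish.
translated = translate.translate_text(
    Text=text, SourceLanguageCode="en", TargetLanguageCode="es"
)
print(translated["TranslatedText"])

# Amazon Polly: synthesize the translated text to speech (MP3 bytes).
speech = polly.synthesize_speech(
    Text=translated["TranslatedText"], OutputFormat="mp3", VoiceId="Lucia"
)
with open("greeting.mp3", "wb") as f:
    f.write(speech["AudioStream"].read())
```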
Machine Learning Notes
| Type | Name | Details | Use cases | Comments |
| --- | --- | --- | --- | --- |
| Supervised | Linear regression | Models the relationship between one or more input features and one output variable (the target) | Historical sales data, output: number of units to produce; predict house prices, stock prices, sales volume, etc. | |
| Supervised | Binary classification | Predicts a binary outcome: yes/no, true/false, +/− | | |
| Supervised | Time-series prediction | Forecasts future values based on past and present data | | |
| Supervised | Regression | Estimates a continuous numerical value based on the input features | | |
| Supervised | Recurrent neural network (RNN) | Type of neural network that can process sequential data; suited for predicting future events based on past observations. NOTE: CNNs are for images, RNNs for time series | Forecasting engine failures based on sensor readings | TensorFlow, PyTorch, Keras, MXNet |
| Supervised | Convolutional neural network (CNN) | Classifies an object among a group. NOTE: CNNs are for images, RNNs for time series | Takes an animal image as input and outputs a probability distribution over 10 types of animals | The softmax function transforms arbitrary real values into the range (0, 1). TensorFlow, PyTorch, Keras, MXNet |
| | WaveNet | Generative model for raw audio: a deep autoregressive CNN with stacked layers of dilated convolutions, used for generating speech. To deliver a more human-like voice, WaveNet models the raw waveform of the audio signal, making the voice sound more natural and expressive; the CNN takes a raw signal as input and synthesizes output one sample at a time | | |
| Supervised (classification/regression) | K-nearest neighbors (KNN) | Finds the k most similar instances in the training data to a given query instance, then predicts the output based on the average or majority of the outputs of the k nearest neighbors; can handle time-series data | Ex: predict air quality for the next 2 days based on the last 2 years of data; identify whether an image contains a logo from a larger group | Can perform both classification and regression tasks |
| Unsupervised | Latent Dirichlet Allocation (LDA) | Suitable for topic-modeling tasks (in NLP): discovers the hidden topics and their proportions in a collection of text documents | News articles, tweets, reviews, etc. | Gensim, Scikit-learn, Mallet. Not valid for images |
| | Factorization Machines (FM) | Used for tasks dealing with high-dimensional sparse datasets | | |
| Unsupervised | Topic modeling | Statistical modeling that uses unsupervised ML to identify clusters or groups of similar words within a body of text | | |
| | BERT-based models | Google developed BERT as a bidirectional transformer model that examines words within text by considering both left-to-right and right-to-left contexts | Predicting missing words in text | |
| Unsupervised | Principal component analysis (PCA) | Reduces the dimensionality (number of features) of a dataset while retaining as much information as possible, by finding a new set of features called components | | Used when the features are highly correlated with each other |
| Unsupervised | Random Cut Forest (RCF) | Assigns an anomaly score to each data point based on how different it is from the rest of the data | Ex: real-time ingestion; identify anomalous/malicious events | |
| Unsupervised | Anomaly detection | Identifies outliers or abnormal patterns in the data | | |
| Unsupervised | K-means clustering | Randomly assigns data points to a number of clusters, then iteratively updates the cluster centers and reassigns the points until the clusters are stable | Exploratory data analysis, data compression, anomaly detection, feature extraction | |
| Metric (regression) | RMSE (root mean square error) | Measures the average difference between predicted and actual values when the goal is to predict a continuous value | Price of a house, temperature of a city | Good for regression, NOT for classification |
| Metric (regression) | MAPE (mean absolute percentage error) | Used for regression | | |
| Metric (classification) | ROC curve | Receiver operating characteristic curve: shows how different classification thresholds impact the model's performance, i.e., the trade-off between the true positive rate (TPR) and the false positive rate (FPR) | Predict whether or not a person will order a pizza | |
| Metric (binary classification) | AUC (area under the ROC curve) | Compares/evaluates ML models. AUC is calculated from the ROC curve, a plot of the trade-off between TPR and FPR as the decision threshold is varied. The TPR is also known as recall or sensitivity | Credit card transactions: separate 99k valid from 1k fraudulent | |
| Metric (regression) | Residual plots | Show whether a regression model is more frequently overestimating or underestimating the target | | |
| Metric (classification) | Confusion matrix | Table showing the counts of true positives, false positives, true negatives, and false negatives for each class; indicates the accuracy, precision, recall, and F1-score of the model per class | | Only applicable to classification models, not regression; cannot show the magnitude or direction of the errors |
| Metric (classification) | Precision | Proportion of predicted positive cases that are actually positive: Precision = TP / (TP + FP). Useful when the cost of a false positive is high. (Accuracy is not a good metric for imbalanced classification problems.) | Fraudulent transactions, spam detection, medical diagnosis | |
| Metric (classification) | Recall | Same as TPR (true positive rate): Recall = TP / (TP + FN). Useful when the cost of a false negative is high | Fraud detection, cancer diagnosis | |
| Supervised (multi-class classification) | XGBoost | Can handle multiple features and multiple classes | Categorize new products when a dataset of features is provided. Ex: with 15 features (title/weight/price), categorize books/games/movies from a dataset of 1,200 products. Credit card fraud detection (e.g., predict new transactions from a large historical dataset) | Can be used for classification, regression, ranking, and other tasks. Based on gradient boosting, which builds an ensemble of weak learners (usually decision trees) into a strong learner |
| NLP | TF-IDF (term frequency–inverse document frequency) | Assigns a weight to each word in a document based on how important it is to the meaning of the document. NOTE: term frequency (TF) measures how often a word appears in a document; inverse document frequency (IDF) measures how rare a word is across a collection of documents | | |
| NLP | Word2vec | Learns distributed representations of words (word embeddings) from large amounts of text data | When tuning parameters doesn't help much, transfer learning may be the better solution | |
| | Collaborative filtering | Recommends products or services to users based on the ratings or preferences of other users | Customer shopping patterns and preferences based on demographics, past visits, and locality information | |
| Supervised | Decision tree | Performs classification by splitting the data into smaller, purer subsets based on a series of rules or conditions | A binary classifier based on two features: age of account and transaction month | Handles both linear and non-linear data; can capture complex patterns and interactions among the features |
| Preprocessing technique | Data normalization | Scales features to a common range such as (0, 1) or (−1, 1) | | Min-max scaling, z-score standardization, unit-vector normalization |
| Preprocessing technique | Dimensionality reduction | Reduces the number of features | | |
| Preprocessing technique | Model regularization | Adds a penalty term to the cost function to prevent overfitting | | |
| | L1/L2 regularization | Overfitting can be addressed with regularization techniques such as L1/L2 regularization and dropout. Regularization adds a penalty term to the model's cost function, which reduces the model's complexity and prevents it from overfitting the training data; dropout randomly turns off some neurons during training | | |
| Preprocessing technique | Data augmentation | Increases the amount of data by creating synthetic samples | | |
| Distribution | Poisson | Models the number of events that occur in a fixed interval of time or space, given a known average rate of occurrence | Waiting for a bus: a 10-minute interval with an average arrival rate of one bus every 3 minutes | |
| Distribution | Normal | Bell-shaped distribution, symmetric around the mean | | |
| Distribution | Binomial | Number of successes in a fixed number of independent yes/no trials | | |
| Distribution | Uniform | All outcomes within a range are equally likely | | |
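To make the evaluation metrics above concrete, here is a minimal scikit-learn sketch. The labels, scores, and regression targets are made up purely for illustration; the point is only where each metric applies.

```python
import numpy as np
from sklearn.metrics import (confusion_matrix, precision_score, recall_score,
                             roc_auc_score, mean_squared_error)

# Toy binary-classification labels and model outputs (invented for illustration).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
y_score = np.array([0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3])  # probabilities

print(confusion_matrix(y_true, y_pred))  # rows: actual class, columns: predicted
print(precision_score(y_true, y_pred))   # TP / (TP + FP)
print(recall_score(y_true, y_pred))      # TP / (TP + FN), a.k.a. TPR
print(roc_auc_score(y_true, y_score))    # area under the ROC curve

# RMSE for a regression task (continuous targets, e.g., house prices).
y_reg_true = np.array([250_000, 310_000, 180_000])
y_reg_pred = np.array([240_000, 330_000, 200_000])
print(np.sqrt(mean_squared_error(y_reg_true, y_reg_pred)))
```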
Data preprocessing – the process of preparing raw data for machine learning models.
Feature engineering – the manipulation (addition, deletion, combination, mutation) of your data set to
improve machine learning model training, leading to better performance and greater accuracy.
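As a small illustration of feature engineering, here is a minimal pandas sketch on a hypothetical sales dataset (the column names are invented for the example): it adds derived features and deletes a raw column.

```python
import pandas as pd

# Hypothetical sales data, used only to illustrate the idea.
df = pd.DataFrame({
    "order_date": pd.to_datetime(["2024-01-05", "2024-02-14", "2024-03-02"]),
    "price": [19.99, 5.49, 12.00],
    "quantity": [2, 10, 1],
})

# Addition/combination: derive new features from existing columns.
df["revenue"] = df["price"] * df["quantity"]
df["order_month"] = df["order_date"].dt.month

# Deletion: drop a raw column the model no longer needs.
df = df.drop(columns=["order_date"])
print(df)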
Exploratory data analysis (EDA) – used by data scientists to analyze and investigate data sets and
summarize their main characteristics, often employing data visualization methods.
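A minimal EDA sketch with pandas, assuming a hypothetical `housing.csv` dataset; the calls simply summarize shape, statistics, missing values, and correlations.

```python
import pandas as pd

df = pd.read_csv("housing.csv")  # hypothetical dataset path

print(df.shape)                    # number of rows and columns
print(df.describe())               # summary statistics for numeric features
print(df.isna().sum())             # missing values per column
print(df.corr(numeric_only=True))  # pairwise correlations between features
df.hist(figsize=(10, 8))           # quick distribution plots (needs matplotlib)
```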
Hyperparameter tuning – the process of selecting the optimal values for a machine learning model's
hyperparameters.
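A minimal hyperparameter-tuning sketch using scikit-learn's GridSearchCV; the parameter grid values are arbitrary examples.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Candidate hyperparameter values to search over (chosen arbitrarily here).
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}

# 5-fold cross-validated search over every combination in the grid.
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```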
Transfer learning – a strategy for adapting pre-trained models to new, related tasks without creating
models from scratch.
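A minimal transfer-learning sketch in Keras, assuming TensorFlow is installed: a pretrained ImageNet base is frozen and only a new classification head (here, 5 hypothetical target classes) is trained.

```python
import tensorflow as tf

# Pretrained ImageNet base; we reuse its learned features.
base = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet"
)
base.trainable = False  # freeze the pretrained layers

# New task-specific head, trained from scratch on the new task.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```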
Epochs – affect accuracy.
– Increasing the number of epochs lets the model learn from the data over more iterations, potentially
improving accuracy up to a certain point; this is a common lever when trying to reach a specific
accuracy target.
– Decreasing the number of epochs reduces training time but may prevent the model from reaching the
desired accuracy.
Batch size – affects training speed.
– Decreasing the batch size affects training speed and may lead to overfitting, but it does not directly
determine whether a specific accuracy level is reached.
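A minimal Keras sketch showing where the epochs and batch-size knobs live; the synthetic data and layer sizes are arbitrary.

```python
import numpy as np
import tensorflow as tf

# Tiny synthetic dataset, just to show where the training knobs live.
X = np.random.rand(1000, 10).astype("float32")
y = (X.sum(axis=1) > 5).astype("int32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# epochs: how many full passes over the training data.
# batch_size: how many samples per gradient update (affects training speed).
model.fit(X, y, epochs=20, batch_size=32, validation_split=0.2)
```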
Temperature – affects the randomness of predictions.
– Increasing the temperature makes outputs more random; it does not improve model accuracy.
– Decreasing the temperature produces more consistent responses to the same input prompts.
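A minimal Amazon Bedrock sketch showing where temperature is set, using the boto3 Converse API; it assumes Bedrock model access is enabled in your account and region, and the model ID shown is just an example that may differ in yours.

```python
import boto3

# Assumes Bedrock model access is enabled; the model ID is an example.
client = boto3.client("bedrock-runtime")

response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[{"role": "user",
               "content": [{"text": "Name three AWS AI services."}]}],
    inferenceConfig={"temperature": 0.2},  # lower = more consistent output
)
print(response["output"]["message"]["content"][0]["text"])
```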