A Tour of GenAI
🚀

There and back again...

João Galego

$$\left|\text{🧠}\right>$$

Contents 📓

  1. GenAI in a nutshell 🌰
  2. Building GenAI on AWS
  3. GenAI in practice
  4. References

Warning ⚠️

This deck is a work in progress…

and always will be

Feel free to search around 🔎

Cite this presentation 📑

@misc{a-tour-of-genai-jgalego,
    title = {A Tour of GenAI},
    author = {Galego, João},
    howpublished = {\url{jgalego.github.io/GenAI}},
    year = {2023}
}

Note on implementation 👨‍💻

The slides were created using reveal.js

and the presentation is hosted on GitHub Pages

Want to contribute? ✨

Just open an issue/PR for this project

github.com/JGalego/GenAI

GenAI in a nutshell

In the beginning
there was nothing *,
which exploded…

― Terry Pratchett, Lords and Ladies (1992)


💥

* Well, not exactly…

History has a way of repeating itself...

Prompt: The history of AI in hieroglyphs

1960s: The ELIZA Effect

Source: Wikipedia

2022: The ChatGPT Effect

Source: Reddit

Can you spot the differences?

👀

Let's back up a little…

2017: Hello Transformers!

Attention Is All You Need introduces the

Transformer architecture

Source: TechTalks

Motivation

Previous seq2seq models were

SLOW 🐌 and FORGETFUL 🤔

Source: KiKaBeN

Transformers use
an encoder-decoder architecture…

… made of many building blocks 🧱

Source: NLP Course | For You

Zooming in on Attention 🔎

Source: Lilian Weng
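
At its core, attention is just a few matrix products. A minimal NumPy sketch of the scaled dot-product attention used by the Transformer (toy shapes, random inputs):

# Scaled dot-product attention: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)       # query/key similarities
    weights = softmax(scores, axis=-1)    # attention weights sum to 1 per query
    return weights @ V                    # weighted sum of the values

# Toy example: 3 tokens, model dimension 4
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)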

Visualizing Attention 👁️

Source: BertViz

Deconstructing Attention

Source: Medium

Multi-head Attention

Source: BertViz

History of Attention

Source: Adapted from Stanford

Transformers have driven
significant progress in AI

Decoder-only: GPT-1..4

Source: Radford et al. (2018)

Encoder-only: BERT and its progeny

Source: Devlin et al. (2018)

Encoder-decoder: T5, BART

Source: Raffel et al. (2019)

How is this all connected with GenAI?

Let's focus on the 'generative' part

There are 2 main classes of statistical models…

Discriminative models

draw boundaries in data space

Source: Adapted from Medium

Example: Van Gogh or not Van Gogh? 👂🏻

Source: Data Analytics

Generative models

describe how data is placed
throughout the data space

Source: Adapted from Medium

Example: Picture me a 🐴

Since the model is probabilistic,
we can just sample from it to create new data

Source: Data Analytics

We can connect the two using Bayes' Rule

Source: Adapted from UMichigan
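
In symbols, with $x$ the data and $y$ the label: a discriminative model estimates $p(y \mid x)$ directly, while a generative model learns $p(x \mid y)$ and $p(y)$ and can recover the same quantity via

$$p(y \mid x) = \frac{p(x \mid y)\,p(y)}{p(x)}$$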

There are many types of generative models…

Overview of Generative Models

Source: Lilian Weng

2013: Variational Autoencoders (VAE)

Source: LearnOpenCV

2014: Generative Adversarial Networks (GAN)

Source: d2l.ai

GAN Samples

Source: Goodfellow et al. (2014)

2015: Diffusion Models

Source: Sohl-Dickstein et al. (2015)

Diffusion takes a signal and turns it into noise

Signal $\rightarrow$ … $\rightarrow$ Noise

Diffusion models are trained to denoise noisy images

Source: Keras

New images are created by
iteratively denoising pure noise

Noise $\rightarrow$ … $\rightarrow$ Signal

Source: Keras
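
A minimal NumPy sketch of the forward (noising) process in a DDPM-style model; the linear schedule and step count below are common choices, not tied to any particular paper. The network is trained to predict the added noise, so that generation can run this process in reverse:

# Forward diffusion: x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise
import numpy as np

T = 1000                                # number of diffusion steps (assumed)
betas = np.linspace(1e-4, 0.02, T)      # linear noise schedule (a common choice)
alpha_bar = np.cumprod(1.0 - betas)     # cumulative product of (1 - beta_t)

def q_sample(x0, t, rng=np.random.default_rng(0)):
    """Sample the noisy version x_t of a clean signal x0 at step t."""
    noise = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

x0 = np.ones((8, 8))            # a toy "image"
print(q_sample(x0, 10).std())   # early step: mostly signal
print(q_sample(x0, 999).std())  # last step: essentially pure noise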

January 2021: OpenAI releases DALL-E

Source: Dale on AI

DALL-E Samples Comparison

Source: Ramesh et al. (2021)

July 2022: Midjourney enters open beta

Source: Bloomberg

AI-generated paintings as digital art

Théâtre d'Opéra Spatial (Midjourney + Gigapixel AI)

Source: NYTimes

SPOILER ALERT
GenAI and Hollywood 2.0

Learn how Runway helped create the rock scene 🪨 in
'Everything Everywhere All at Once.'

August 2022: Stability AI releases Stable Diffusion

Source: Stability AI

Latent Diffusion Model Architecture

Source: CompVis Lab

Stable Diffusion Components

  1. CLIP (GPT-based) or BERT: Text Encoder

  2. UNet + Scheduler: Image Information Creator

  3. Autoencoder Decoder: Image Decoder
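
In practice, all three components sit behind a single pipeline. A minimal usage sketch with Hugging Face diffusers (the checkpoint name, dtype and device are assumptions):

# Text-to-image with a pre-trained Stable Diffusion checkpoint (requires a GPU)
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # assumed checkpoint; any SD 1.x model works
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("a photo of an astronaut riding a horse on Mars").images[0]
image.save("astronaut.png")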

January 2022: InstructGPT

Learning to follow instructions
from human preferences

Source: Ouyang et al. (2022)

November 2022: OpenAI releases ChatGPT

Source: OpenAI

... and we're back!

So, where are we now...

and what's coming next?

GenAI is the fastest growing trend in AI

📈

The "Cambrian Explosion" of GenAI

Source: Yang et al. (2022)

Developer Adoption

Stable Diffusion accumulated 40k stars
on GitHub in its first 90 days

Source: Twitter

Consumer Adoption

ChatGPT reached the 1M users mark
in just 5 days

Source: Adapted from LinkedIn

GenAI according to GenAI

Prompt: What is Generative AI?

Let's break it down…

# 1

GenAI can generate new content

similar to what a human would produce

Pop Quiz

Which painting was generated with AI?

Source: Tidio

If you answered A, you're in big trouble…

or was it the other way around? 🤔

# 2

GenAI is powered by Foundation Models

or FMs for short

These are really large models…

trained on massive amounts of unlabeled data…

that can be adapted to a wide range of tasks

Traditional vs Foundation Models

Source: Adapted from AWS ML Blog

When dealing with natural language,
we usually talk about Large Language Models

or LLMs for short

Source: Amazon Science

Language modelling has been around for a while...

Source: Kuenzig Books

Word frequency vs Word Order

Source: Shannon (1951)

Series of approximations to English

  • 0th-order approximation: XFOML RXKHRJFFJUJ ALPWXFWJXYJ FFJEYVJCQSGHYD QPAAMKBZAACIBZLKJQD
  • 1st-order approximation: OCRO HLO RGWR NMIELWIS EU LL NBNESEBYA TH EEI ALHENHTTPA OOBTTVA NAH BRL
  • 2nd-order approximation: ON IE ANTSOUTINYS ARE T INCTORE ST BE S DEAMY ACHIN D ILONASIVE TUCOOWE AT TEASONARE FUSO TIZIN ANDY TOBE SEACE CTISBE
  • 1st-order word approximation: REPRESENTING AND SPEEDILY IS AN GOOD APT OR COME CAN DIFFERENT NATURAL HERE HE THE A IN CAME THE TO OF TO EXPERT GRAY COME TO FURNISHES THE LINE MESSAGE HAD BE THESE
  • 2nd-order word approximation: THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN UNEXPECTED

Source: Adapted from Shannon & Weaver (1963)
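
For intuition, a second-order (character bigram) approximation can be built and sampled in a few lines; the toy corpus below just reuses the deck's own example text and is far too small to be representative:

# Toy character-bigram approximation, in the spirit of Shannon's experiments
import random
from collections import defaultdict

corpus = "the head and in frontal attack on an english writer"   # tiny toy corpus
followers = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    followers[a].append(b)            # record which characters follow each character

random.seed(42)
ch, out = random.choice(corpus), []
for _ in range(40):
    out.append(ch)
    ch = random.choice(followers.get(ch, corpus))   # sample the next character
print("".join(out))                   # gibberish with English-like letter statistics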

LLMs are really just a proper subset of FMs

Language modeling is compression

Source: Delétang et al. (2023)

How do these models work?

If the input is text-based, we call it a Prompt

Source: cohere

The Anatomy of a Prompt 💀

Prompts can be as simple as an instruction/question,
or as complex as huge chunks of text.

You can tell it who to be (role),
what it needs to know (context),
what to do (task) and how (instructions),
what it should avoid (constraints),
or you can show it what to do and how (examples)

There are no rules!

Role Prompting

Assign a role to the model
to provide some context

Source: Learn Prompting

Few-Shot Prompting

Show the model a few examples
of what you want it to do

Source: Learn Prompting
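
A made-up illustration of the pattern: two labelled examples followed by the case we want the model to complete.

Classify the sentiment of each review.

Review: "Loved it, would buy again!"   Sentiment: positive
Review: "Arrived broken and late."     Sentiment: negative
Review: "Does exactly what it says."   Sentiment: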

Crafting Prompts: Design Principles

  1. Clear and specific instructions
  2. Simple and clear wording
  3. Avoid complex sentence types
  4. Avoid ambiguity
  5. Use keywords
  6. Consider the audience
  7. Test and refine

Prompt Engineering 👩‍💻

Talking a model into doing
what you want it to do

Prompt Hacking 🐱‍💻

Talking a model into doing
something it's not supposed to
(either good or bad)

The hottest new programming language is English

Andrej Karpathy, AI Researcher

How fluent are these models in other languages? 🗣️

Source: Wired

# 3

GenAI applies to many use cases

Some examples include…

Productivity (Text Generation) 💬

Chat (Virtual Assistant) 💁

Summarization (Text Extraction) 📖

Search 🔎

Code Generation 👨‍💻

Music Creation 🎶

Video Editing 🎥

GenAI is rapidly transforming AI

Source: Reddit

Text-to-Image (txt2img)

Source: 🤗

Image-to-Text (img2txt)

Source: 🤗

Image-to-Image (img2img)

Source: 🤗

Text-to-GIF (txt2gif)

Source: GIFfusion 💥

Text-to-Video (txt2video)

Source: Meta AI

Text-to-Code (txt2code)

Source: AWS News Blog

Generative Agents

Source: Park et al. (2023)

Emergent Abilities: Just a Mirage? 🏝️

"(...) for a fixed task and a fixed model family, the researcher can choose a metric
to create an emergent ability or choose a metric to ablate an emergent ability"

Source: Schaeffer, Miranda & Koyejo (2023)

GenAI is taking over the world

"Every industry that requires
humans to create original work (…)
is up for reinvention."

― Sequoia Capital, Generative AI: A Creative New World

But there are "some" challenges…

Hallucinations 🍄

Source: TheNewStack

Example: Backwards Epigenetic Inheritance

A causally impossible scientific theory
just a prompt away

Source: Extracted from ChatGPT/LLM Errors

"(…) they used to lie and say terrible things.
Now they just lie and that's interesting enough"

― Gary Marcus

Actually…

Philosophically speaking,

LLMs are 🐂💩ers not liars

def. Bullshit

Any statement produced without
particular concern for reality and truth

"Bullshit is a greater enemy of truth than lies are."

― Harry Frankfurt, On Bullshit (2005)

Why do these models hallucinate?

LLMs as "compact" and "lossy"
representations of knowledge

Source: Adapted from Designing with Machine Learning

How can we prevent/reduce hallucinations?

RL with Human Feedback (RLHF)

Human evaluators review the model's responses and pick the most appropriate ones for the users' prompts

Source: 🤗

Can we take RL out of RLHF?

Direct Preference Optimization (DPO) bypasses both explicit reward estimation and RL and optimizes the language model directly using preference data.

Source: Rafailov et al. (2023)
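
The DPO objective (Rafailov et al., 2023) raises the likelihood of the chosen response $y_w$ and lowers that of the rejected one $y_l$, relative to a frozen reference policy $\pi_\text{ref}$:

$$\mathcal{L}_\text{DPO}(\pi_\theta; \pi_\text{ref}) = -\mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}}\left[\log \sigma\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_\text{ref}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_\text{ref}(y_l \mid x)}\right)\right]$$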

Early Detection

Identify hallucinated content and
use it during training

Source: Zhou et al. (2021)

Regularization

Often overlooked, these techniques
can help alleviate overfitting

Source: Xue et al. (2023)

Temperature Tuning 🌡️

Temperature regulates the randomness
or creativity of the responses

$\uparrow T \Rightarrow \uparrow \texttt{Hallucinations}$
Source: Designing with Machine Learning
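
A minimal sketch of what the knob actually does to the next-token distribution (the logits below are made up):

# Temperature-scaled softmax over toy next-token logits
import numpy as np

def softmax_with_temperature(logits, T=1.0):
    z = np.array(logits) / T
    z = z - z.max()                 # numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = [2.0, 1.0, 0.1]            # toy next-token logits (assumed)
print(softmax_with_temperature(logits, T=0.1))   # nearly one-hot: predictable
print(softmax_with_temperature(logits, T=1.0))   # moderate spread
print(softmax_with_temperature(logits, T=2.0))   # flatter: more "creative", riskier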

Chain-of-Thought (CoT) Reasoning 🤔

Using CoT prompting we can improve a model's ability to perform complex reasoning

Source: Wei et al. (2022)
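
An illustrative exemplar in the style of Wei et al. (2022); the worked first answer nudges the model to reason step by step on the second question:

Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
   Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 balls each is 6 balls. 5 + 6 = 11.
   The answer is 11.

Q: The cafeteria had 23 apples. They used 20 to make lunch and bought 6 more.
   How many apples do they have?
A: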

External Data Sources

Provide access to relevant data from a knowledge base

Treat the task as a search problem grounded in data

Retrieval Augmented Generation (RAG)

Dense Passage Retrieval

Hypothetical Document Embedding (HyDE)
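
A minimal, self-contained sketch of the RAG pattern; the toy lexical retriever below just stands in for a real embedding/vector search, and no actual LLM is called:

# Retrieve the most relevant snippets, then ground the prompt in them
from difflib import SequenceMatcher

documents = [
    "Amazon Bedrock provides API-level access to foundation models.",
    "Stable Diffusion is a latent diffusion model for text-to-image generation.",
    "RAG grounds an LLM's answer in passages retrieved from a knowledge base.",
]

def retrieve(question, docs, k=2):
    # Toy lexical similarity stands in for embedding-based search
    scored = sorted(docs, key=lambda d: SequenceMatcher(None, question, d).ratio(), reverse=True)
    return scored[:k]

def build_prompt(question, docs):
    context = "\n".join(f"- {d}" for d in docs)
    return (
        "Answer using ONLY the context below; say 'I don't know' otherwise.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

question = "What does RAG do?"
prompt = build_prompt(question, retrieve(question, documents))
print(prompt)   # this grounded prompt is what gets sent to the LLM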

Security 🛡️

Source: Greshake et al. (2023)

Training LLMs on untrusted data
has become the norm rather than the exception

According to Wan et al. (2023), launching a successful data poisoning attack during instruction tuning

takes only a few hundred 'poisoned apples' 🍎☠️🤢

Source: GitHub

Prompt Attack

Attack that exploits the vulnerabilities of LLMs,
by manipulating their prompts

Source: Learn Prompting

Adversarial Prompts

Source: Zou et al. (2023)

OWASP Top 10 for LLM Applications

Source: OWASP

Sustainability 🌱

Source: Rilling et al. (2023)

Optimize workloads for environmental sustainability

Source: AWS ML Blog

The open source community
has a major role to play...

Let me tell you the story of Llama 🦙

February 24th 2023: Meta releases Llama 🦙

March 3rd 2023: LLaMA leaks 🤫

Source: Vice

March 10th 2023: llama.cpp - initial release

Source: GGML

March 12th 2023: LLaMA runs on a Raspberry Pi

Source: GitHub

March 13th 2023: Stanford releases Alpaca

Source: Stanford

Training recipe

Source: Stanford

Alpaca meets LoRA

Source: GitHub

Low-Rank Adaptation (LoRA)

Source: 🤗
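
A minimal NumPy sketch of the idea: the pre-trained weight W stays frozen and only a low-rank update B·A (plus a scaling factor) is learned; the dimensions and rank below are assumptions:

# LoRA forward pass: h = W x + (alpha / r) * B A x
import numpy as np

d, k, r = 512, 512, 8                      # layer dims and LoRA rank (assumed)
rng = np.random.default_rng(0)

W = rng.normal(size=(d, k))                # frozen pre-trained weight
A = rng.normal(scale=0.01, size=(r, k))    # trainable, r << min(d, k)
B = np.zeros((d, r))                       # trainable, initialized to zero

def lora_forward(x, alpha=16):
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(1, k))
print(lora_forward(x).shape)               # (1, 512)
# Trainable parameters: r*(d+k) = 8,192 vs d*k = 262,144 for full fine-tuning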

Quantization

Rounding off one data type to another

Source: 🤗

Example: int8 absmax quantization

Source: 🤗
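
A minimal NumPy sketch of the absmax scheme: scale by 127 / max|x|, round to int8, and rescale on the way back:

# int8 absmax quantization and dequantization
import numpy as np

def quantize_absmax(x):
    scale = 127.0 / np.max(np.abs(x))            # one scale per tensor
    q = np.clip(np.round(x * scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) / scale

x = np.array([1.2, -0.5, 3.4, -2.8], dtype=np.float32)
q, scale = quantize_absmax(x)
print(q)                      # [  45  -19  127 -105]
print(dequantize(q, scale))   # close to x, up to rounding error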

Quantized LoRA (QLoRA)

Introduces a number of innovations, including Double Quantization and Paged Optimizers, that save memory without compromising performance

Source: Dettmers et al. (2023)

March 14th 2023 🥧: LLaMA runs on a Pixel 6

Source: Twitter

March 19th 2023: LMSYS Org releases Vicuna

Source: LMSYS Org

State of Llama in 2023/Q1

Source: Medium

May 4th 2023 🌌🔫: Moats, moats, moats

Source: Adapted from WebDonuts

July 18th 2023: Meta + Microsoft release Llama 2

Source: Wired

August 24th 2023: Code Llama 👨‍💻🦙

Source: Meta

September 27th 2023: AWS becomes the first managed API partner for Llama 2

Is Llama 2 open source?

No, and we need a new definition of open

No, but that's OK

Yes (citation needed)

NeurIPS LLM Efficiency Challenge

1 LLM 💬 + 1 GPU ⚡🌱 + 1 day ⏳

🐧

"I often compare open source to science. Science took this whole notion of developing ideas in the open and improving on other people's ideas. It made science what it is today and made the incredible advances that we have had possible."

― Linus Torvalds

What comes next?

Building GenAI on AWS

For more information, visit aws.amazon.com/generative-ai

GenAI Workloads

The AWS AI/ML Stack (Redux)

AWS supports GenAI in all layers of the stack

Let's start by looking at the bottom layer…

ML Frameworks & Infrastructure

There's some evidence that large-scale models
lead to better results*

Source: Kaplan et al. (2020)

* Read the fine print!

AI models are getting bigger…

Source: Nature

… a lot bigger!

Source: LifeArchitect.ai

How do we train a large model like, say…

Stable Diffusion?

Let's check out Stability's HPC cluster 🦮

💻 github.com/Stability-AI/stability-hpc

Training large-scale models comes
with a lot of challenges

Hardware 💻

Health Checks 👨‍⚕️⚠️

Orchestration 🎻🎶

Data 💾

Scale 📈

Cost 💰

HPC ML cluster for distributed training

Source: Adapted from AWS re:Invent 2022

1-click HPC

Source: GitHub

Compute: EC2 UltraClusters

Source: AWS

Current NVIDIA A100 GPU Count

Source: State of AI Report 2022

Compute: EC2 Trn1/Trn1n Instances

Source: iThome

Neuron on Trn1 Instances

Source: AWS News Blog

Annapurna Labs

The 'Secret Sauce' behind AWS's success

Source: Amazon Science

Networking: Elastic Fabric Adapter (EFA)

Source: Adapted from Shalev et al. (2020)

Storage: ML training storage hierarchy

Source: Adapted from AWS re:Invent 2022

Orchestration: AWS ParallelCluster

Source: Adapted from AWS re:Invent 2022

How to create a pcluster

pcluster create-cluster -f config.yaml ...
Source: Adapted from AWS re:Invent 2022

Is there a better way?

#1 Train Stable Diffusion on Amazon SageMaker

Source: Medium

Step-by-Step Guide

  1. Prepare Data

  2. Train Model

  3. Evaluate Model

    • 1 epoch on 50M image/text pairs with ~200 GPUs? 15 mins!
  4. Deploy Model
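
As a rough sketch of what steps 2 and 4 can look like with the SageMaker Python SDK; the script name, container versions, instance types, S3 path and hyperparameters below are all assumptions:

# Launch a distributed training job, then deploy the resulting model
import sagemaker
from sagemaker.huggingface import HuggingFace

estimator = HuggingFace(
    entry_point="train.py",              # your training script (hypothetical)
    source_dir="scripts",
    role=sagemaker.get_execution_role(),
    instance_type="ml.p4d.24xlarge",     # GPU instance (illustrative)
    instance_count=2,
    transformers_version="4.28",
    pytorch_version="2.0",
    py_version="py310",
    hyperparameters={"epochs": 1, "train_batch_size": 32},
)

estimator.fit({"training": "s3://my-bucket/train/"})   # hypothetical S3 path
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.g5.2xlarge")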

Amazon SageMaker Studio

Source: Amazon SageMaker Developer Guide

Amazon SageMaker Notebook Instance

Source: Adapted from SQLShack

FAQ: How much compute do I need?

Run the math! 🧮

Transformer FLOPS equation:
$6 \times \text{\#parameters} \times \text{\#tokens}$
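
Plugging in some round numbers (the model size, token count and sustained throughput below are assumptions, chosen only for illustration):

# Back-of-envelope training compute for a hypothetical 7B-parameter model on 1T tokens
params = 7e9
tokens = 1e12
train_flops = 6 * params * tokens                  # ~4.2e22 FLOPs

sustained_flops_per_gpu = 150e12                   # ~150 TFLOP/s sustained (illustrative)
gpu_seconds = train_flops / sustained_flops_per_gpu
print(f"{train_flops:.1e} FLOPs ~ {gpu_seconds / 86400:.0f} GPU-days")   # roughly 3,200 GPU-days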

Training

Transformer Math 101 by EleutherAI

Inference

Transformer Inference Arithmetic by Kipply

Weight FLOPS Equation

Source: Medium

FAQ: Which GPU is right for me?

Look at the whole system

Choose the right "GPU instance" not just the right "GPU"
Source: Medium

Accelerate Transformers on Amazon SageMaker
with AWS Trainium and AWS Inferentia

Inf2 on Amazon SageMaker

Source: AWS ML Blog

Jupyter AI: Bring GenAI to Jupyter Notebooks

Source: Adapted from GitHub

#2 Use a pre-trained FM
from Amazon SageMaker JumpStart

Source: Adapted from AWS

Models on Amazon SageMaker JumpStart can be accessed in 3 ways

Source: Adapted from AWS

Fine-tune Stable Diffusion
with Amazon SageMaker JumpStart

Source: AWS ML Blog

Benchmarking Stable Diffusion Fine-tuning Methods

Source: Reddit

Benefits of pre-trained FMs

  • Pre-trained models for each use case

  • Easy to customize + manage models at scale

  • Data is kept secure and private on AWS

  • Responsible AI support across ML lifecycle

  • Fully integrated with Amazon SageMaker

#3 Call Amazon Bedrock! ⛰️

Source: Stability.AI

Amazon Bedrock

API-level access to FMs

Source: Amazon

For more information, visit
aws.amazon.com/bedrock
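
In code, "API-level access" boils down to a single call. A minimal boto3 sketch; the model ID and the request/response fields are model-specific assumptions (shown here for an Anthropic-style text model):

# Invoke a foundation model through the Bedrock runtime API
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

body = json.dumps({
    "prompt": "\n\nHuman: What is Generative AI?\n\nAssistant:",
    "max_tokens_to_sample": 300,
})

response = bedrock.invoke_model(
    modelId="anthropic.claude-v2",       # assumed model ID
    body=body,
    contentType="application/json",
    accept="application/json",
)
print(json.loads(response["body"].read())["completion"])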

Key Benefits

Source: Adapted from AWS re:Inforce 2023

Bedrock supports a wide range of FMs

Source: AWS

You are always in control of your data 🎛️

Source: Adapted from AWS re:Inforce 2023

Bedrock/LangChain Integration ⛰️🦜🔗

Source: GitHub

Amazon CodeWhisperer

Build apps faster and more securely
with an AI coding companion

Source: AWS News Blog

Open-source reference tracking

Source: AWS News Blog

Security scanning

Source: AWS News Blog

Multiple language and IDE support

Source: AWS

"(…) participants who used CodeWhisperer were 27% more likely to complete tasks successfully and did so an average of 57% faster than those who didn't use CodeWhisperer."

― AWS News Blog

Build GenAI the easy way with managed services

Ready to learn how?

GenAI in Practice

Use Cases, Patterns & Solutions

🛠️

In just a few months, GenAI has exploded…

GenAI Landscape

Source: Medium

By the time you read this,
the last slide will be completely…

Yet, some common patterns are starting to emerge…

Emerging LLM Patterns

Source: Eugene Yan

Retrieval Augmented Generation (RAG)

Source: Amazon SageMaker Developer Guide

Document Summarization

Source: Streamlit

Document Generation with Facts

Source: Streamlit

Emerging architectures for LLM applications

Source: Adapted from Andreessen Horowitz

We can build all of these on AWS

RAG-based LLM-powered Q&A Bot

Source: AWS ML Blog

RAG workflow with Amazon Kendra and LangChain

Source: AWS ML Blog

Conversational Experience

Source: AWS ML Blog

Image-to-Speech app using
Amazon SageMaker and 🤗

Source: AWS ML Blog

Virtual fashion styling
using Amazon SageMaker 👒

Source: AWS ML Blog

Vector Databases

Source: Pinecone

Vector databases are useful for storing embeddings

since they treat vectors as first class citizens

A Quick Primer on Embeddings

A numerical representation of a piece of information

Source: Adapted from Arize and Medium
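
A minimal sketch: turn a few sentences into vectors and compare them with cosine similarity (the model name is the same sentence-transformers checkpoint used in the pgvector example below):

# Embed sentences and measure semantic similarity
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
sentences = ["A horse in a field", "A pony grazing on grass", "A stock market crash"]
embeddings = model.encode(sentences)

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(embeddings[0], embeddings[1]))   # semantically close -> higher score
print(cosine(embeddings[0], embeddings[2]))   # unrelated -> lower score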

Example: Embedding Wikipedia

What if you had the embeddings of ALL of Wikipedia?

Source: Adapted from Cohere

How can AWS support your
vector database?

#1a RDS for PgSQL + pgvector extension

CREATE TABLE test_embeddings(product_id bigint, embeddings vector(3) );
        
INSERT INTO test_embeddings VALUES
(1, '[1, 2, 3]'), (2, '[2, 3, 4]'), (3, '[7, 6, 8]'), (4, '[8, 6, 9]');

SELECT product_id, embeddings, embeddings <-> '[3,1,2]' AS distance
FROM test_embeddings 
ORDER BY embeddings <-> '[3,1,2]';

/*
    product_id | embeddings |     distance
------------+------------+-------------------
            1 | [1,2,3]    | 2.449489742783178
            2 | [2,3,4]    |                 3
            3 | [7,6,8]    | 8.774964387392123
            4 | [8,6,9]    |   9.9498743710662
*/
        

Use Case: Using a similarity search for enhancing
product catalog search in an online retail store

Source: AWS Database Blog

#1b Aurora PgSQL + pgvector extension

# Adapted from
# https://github.com/aws-samples/aurora-postgresql-pgvector/tree/main/apgpgvector-streamlit
import streamlit as st
from dotenv import load_dotenv
from PyPDF2 import PdfReader
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.llms import HuggingFaceHub
from langchain.vectorstores.pgvector import PGVector
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
from htmlTemplates import css, bot_template, user_template
from langchain.text_splitter import RecursiveCharacterTextSplitter
import os

# Load PDFs and split them into chunks
def get_pdf_text(pdf_docs):
    text = ""
    for pdf in pdf_docs:
        pdf_reader = PdfReader(pdf)
        for page in pdf_reader.pages:
            text += page.extract_text()
    return text

def get_text_chunks(text):
    text_splitter = RecursiveCharacterTextSplitter(
        separators=["\n\n", "\n", ".", "!", "?", ",", " ", ""],
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len
        )

    chunks = text_splitter.split_text(text)
    return chunks


# Load the embeddings into Aurora PostgreSQL DB cluster
CONNECTION_STRING = PGVector.connection_string_from_db_params(                                                  
    driver = os.environ.get("PGVECTOR_DRIVER"),
    user = os.environ.get("PGVECTOR_USER"),                                      
    password = os.environ.get("PGVECTOR_PASSWORD"),                                  
    host = os.environ.get("PGVECTOR_HOST"),                                            
    port = os.environ.get("PGVECTOR_PORT"),                                          
    database = os.environ.get("PGVECTOR_DATABASE")                                       
)       

def get_vectorstore(text_chunks):
    embeddings = HuggingFaceInstructEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
    vectorstore = PGVector.from_texts(texts=text_chunks, embedding=embeddings,connection_string=CONNECTION_STRING)
    return vectorstore


# Load the LLM and start a conversation chain
def get_conversation_chain(vectorstore):
    llm = HuggingFaceHub(repo_id="google/flan-t5-xxl", model_kwargs={"temperature":0.5, "max_length":1024})

    memory = ConversationBufferMemory(
        memory_key='chat_history', return_messages=True)
    conversation_chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=vectorstore.as_retriever(),
        memory=memory
    )
    return conversation_chain


# Handle user input and perform Q&A
def handle_userinput(user_question):
    response = st.session_state.conversation({'question': user_question})
    st.session_state.chat_history = response['chat_history']

    for i, message in enumerate(st.session_state.chat_history):
        if i % 2 == 0:
            st.write(user_template.replace(
                "{{MSG}}", message.content), unsafe_allow_html=True)
        else:
            st.write(bot_template.replace(
                "{{MSG}}", message.content), unsafe_allow_html=True)


# Create Streamlit app
def main():
    load_dotenv()
    st.set_page_config(page_title="Streamlit Question Answering App",
                        page_icon=":books::parrot:")
    st.write(css, unsafe_allow_html=True)

    st.sidebar.markdown(
    """
    ### Instructions:
    1. Browse and upload PDF files
    2. Click Process
    3. Type your question in the search bar to get more insights
    """
)

    if "conversation" not in st.session_state:
        st.session_state.conversation = None
    if "chat_history" not in st.session_state:
        st.session_state.chat_history = None

    st.header("GenAI Q&A with pgvector and Amazon Aurora PostgreSQL :books::parrot:")
    user_question = st.text_input("Ask a question about your documents:")
    if user_question:
        handle_userinput(user_question)

    with st.sidebar:
        st.subheader("Your documents")
        pdf_docs = st.file_uploader(
            "Upload your PDFs here and click on 'Process'", accept_multiple_files=True)
        if st.button("Process"):
            with st.spinner("Processing"):
                # get pdf text
                raw_text = get_pdf_text(pdf_docs)

                # get the text chunks
                text_chunks = get_text_chunks(raw_text)

                # create vector store
                vectorstore = get_vectorstore(text_chunks)

                # create conversation chain
                st.session_state.conversation = get_conversation_chain(
                    vectorstore)

if __name__ == '__main__':
    main()
        

Use Case: AI-powered Chatbot

Source: AWS ML Blog

#2a OpenSearch

Source: AWS Big Data Blog

Use Case: Amazon Music

Source: AWS Big Data Blog

#2b OpenSearch Serverless

Source: AWS Big Data Blog

#3 DynamoDB + Faiss

# Adapted excerpt: assumes `model` (a SentenceTransformer), `FAISS_DIR` and
# `self.index` (a FAISS index) are defined elsewhere in the module/class
import faiss
import numpy as np
from pathlib import Path

def save_faiss_model(self, text_list, id_list):
    # Convert the texts (e.g. abstracts) into embedding vectors
    embeddings = model.encode(text_list, show_progress_bar=False)
    # FAISS expects float32
    embeddings32 = np.asarray(embeddings).astype("float32")
    # Add the vectors to the index and keep track of their ID range
    index_start_id = self.index.ntotal  # inclusive
    self.index.add(embeddings32)
    index_end_id = self.index.ntotal    # exclusive
    # Serialize the index to disk
    Path(f"{FAISS_DIR}/{self.content_group}").mkdir(parents=True, exist_ok=True)
    faiss.write_index(self.index, f"{FAISS_DIR}/{self.content_group}/{self.content_group}_faiss_index.bin")
    return (embeddings32, range(index_start_id, index_end_id))
        

Vector Library vs. Vector Database

Vector libraries are used to perform similarity search.

Examples: Facebook FAISS, Spotify Annoy, Google ScaNN, NMSLIB, Hnswlib

Vector databases are used to store and update data.

Examples: Pinecone, Weaviate, Milvus
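
To complete the picture, querying a vector library is a one-liner once the index is built; a minimal FAISS search sketch (the dimension and data are made up):

# Exact L2 nearest-neighbour search with FAISS
import faiss
import numpy as np

d = 768                                       # embedding dimension (assumed)
rng = np.random.default_rng(0)
index = faiss.IndexFlatL2(d)                  # exact (brute-force) L2 index
index.add(rng.normal(size=(1000, d)).astype("float32"))   # add 1,000 vectors

query = rng.normal(size=(1, d)).astype("float32")
distances, ids = index.search(query, 5)       # top-5 nearest neighbours
print(ids[0], distances[0])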

#4 Neptune ML

Source: AWS ML Blog

#5 AWS Marketplace Solution

#6 Other Solutions

Source: Medium

It's really hard to take an application
from prototype to production.

Model Compression

Source: Zhu et al. (2023)

LLM Evaluation

What, when and how to evaluate

Source: Chang et al. (2023)

LLMOps Workflow

Source: Fiddler AI

Monitoring GenAI Models

Source: Fiddler AI

GenAI applications can be very powerful,
but also very vulnerable.

How can we protect users
against manipulation and abuse

while creating a safe and positive experience?

GenAI meets Amazon AI content moderation services

Source: AWS ML Blog

What comes next?

LLM OS

Source: Karpathy (2023)

LLM Compiler

Source: Kim et al. (2023)

What if I want to explore
more use cases?

AI Use Case Explorer

Find the most relevant AI use cases with
related content and guidance to make them real

Generative AI Atlas

Public repository with the newest content
released by AWS on GenAI

github.com/aws-samples/gen-ai-atlas

Machine Learning University 🎓

For a set of fun and interactive explanations
of core ML concepts, check out MLU Explain

Source: Amazon Science

Hello World: Meet Generative AI

Werner Vogels and Swami Sivasubramanian
sit down to discuss GenAI and why it's not just hype

AWS DeepComposer: AI Music Composer

Follow the Music 🎶

Hands-On GenAI with LLMs Course

Learn the fundamentals of how GenAI works and
how to deploy it in real-world applications

AWS GenAI Chatbot Platform

Deploy a multi-LLM and multi-RAG powered chatbot using AWS CDK

AWS Generative AI Accelerator

Accelerate your GenAI startup in 10 weeks

References 📚

General

Transformers 🚗🤖⚔️

Diffusers 🧨

Courses 👩‍🏫

  • ALAFF: Advanced Linear Algebra - Foundations to Frontiers
  • Statistics 110: Probability
  • CS221: Artificial Intelligence - Principles and Techniques
  • CS25: Transformers United
  • COS597G: Understanding Large Language Models
  • CS224N: Natural Language Processing with Deep Learning
  • CS224U: Natural Language Understanding
  • CS324: Large Language Models
  • CS685: Advanced Natural Language Processing
  • 263-5354-00L: Large Language Models

Miscellaneous 👾

Note: The emphasis is on breadth, not depth

Meta ♾️

Disclaimer: I take no responsibility for the content available through these links