Lecture 1: Introduction to CPSC 330

Firas Moosvi (Slides adapted from Varada Kolhatkar)

🤝 Introductions ! 🤝

About your instructor

About my research interests

Group work in this class

This term we will try to work in “Pods” of 3-5 …

Research shows that there are tremendous benefits to students working (and struggling) together!

Students ask better and more insightful questions and engage more deeply with the work, and working together adds a social element to class.

We will try this in CPSC 330 this term!

Group work in this class

Understandably, not everyone is a fan of group work, and that's okay!

You will never be forced to work in groups. If you would like to opt out, move to the far left or far right side of the room so we know you prefer to work individually.

If everyone moves to the side of the room, we will re-evaluate this approach 😂

There are no marks or points associated with these groups, and everyone should work on their own laptops as well.

Group work: Pods

Form a Pod of 3-5 people sitting close to you.

Each person should answer the following questions:

Preferred Name,
Year,
(intended) Major
Why are you taking CPSC 330?

Then, as a group, answer the following question:

What is the most interesting (good or bad) example of Machine Learning in society?

Meet Eva (a fictitious persona)!

Eva is one of you. She has some experience in Python programming. She knows machine learning as a buzzword. During her recent internship, she developed some interest and curiosity in the field. She wants to learn what it is and how to use it. She is a curious person and usually has a lot of questions!

🎯 Learning Outcomes

By the end of this lesson, you will be able to:

Explain the difference between AI, ML, and DL.
Describe what machine learning is and when it is appropriate to use ML-based solutions.
Briefly describe supervised learning.
Differentiate between traditional programming and machine learning.
Evaluate whether a machine learning solution is suitable for your problem or whether a rule-based or human-expert solution is more appropriate.
Navigate the course materials and get familiar with the course syllabus and policies.

About this course

CPSC 330 website

Course Jupyter book: https://ubc-cs.github.io/cpsc330-2025W2
Course GitHub repository: https://github.com/UBC-CS/cpsc330-2025W2

Important

Course website: https://ubc-cs.github.io/cpsc330-2025W2 is the most important link. You can access the course website from Canvas.

Please read everything on there!

You can find the source code for everything we do here: https://ubc-cs.github.io/cpsc330-2025W2.

Important

Make sure you go through the syllabus thoroughly and complete the syllabus quiz before next class!

Asking questions during class

You are welcome to ask questions by raising your hand!

If you would prefer to write notes and ask questions later, you are more than welcome to do that also! Use Ed Discussion.

Registration, waitlist and prerequisites

Important

Please go through this document carefully before contacting your instructors about these issues. Even then, we are very unlikely to be able to help with registration, waitlist or prerequisite issues.

We expect that all students registered on the waitlist have already received, or will soon receive, a notification to join the course!
The waitlist will close on Friday at 3 PM and no more students will be able to register after that!
It is your responsibility to catch up on any missed work.

Lecture format

In person lectures Tuesday and Thursday from 3:30 PM - 5 PM
There will be videos to watch before almost every lecture. You will find the list of pre-watch videos in the schedule on the course webpage.
We will also try to work on some questions and exercises together during the class.
All materials will be posted in this GitHub repository.
- You may attend any tutorial or office hours you want, regardless of which one you're registered in, or whether you're registered in one at all.

Homework assignments

The first homework assignment is due soon; it will be released to you today.
This is a relatively straightforward assignment on Python. If you struggle with this assignment then that could be a sign that you will struggle later on in the course.
You must do the first two homework assignments on your own.

Exams

We’ll have two self-scheduled midterms, each over a window of a few days, and one final exam in the Computer-Based Testing Facility (CBTF).

Course structure

Part I: Introduction, ML fundamentals, preprocessing, midterm 1
Part II: Unsupervised learning, transfer learning, common special cases, midterm 2
Part III: Communication and ethics
- ML skills are not beneficial if you can’t use them responsibly and communicate your results. In this module we’ll talk about these aspects.

Code of conduct

Our main forum for getting help will be Ed Discussion.

Important

Please read this entire document about asking for help. TLDR: Be nice.

Setting up your computer for the course

Tools used in this course

We will use the following tools throughout the course:

Coding: Python, with either Jupyter Lab or VS Code
Version Control: git and GitHub
Assignment Submission: Gradescope
Discussion Forum: Ed Discussion
Exams and Final Grades: PrairieLearn and Canvas
Recommended Browsers: Google Chrome or Mozilla Firefox

Course `conda` environment

Follow the setup instructions here to create a course conda environment on your computer.
If you do not have your computer with you, you can partner up with someone and set up your own computer later.

Python requirements/resources

We will primarily use Python in this course.

Here is the basic Python knowledge you’ll need for the course:

Basic Python programming
Numpy
Pandas
Basic matplotlib

Homework 1 is all about Python.

Note

We do not have time to teach all the Python we need but you can find some useful Python resources here.

Workload

What does a typical week look like?

Before class: Watch pre-lecture videos or preview notes
In class: Two 80-minute lectures with iClicker questions, activities, and live demos
Support: Weekly tutorials and office hours
Practice: Weekly assignments (except exam weeks)

Tips for success:

Attend lectures regularly and ask questions
Start homework early. Hands-on practice is essential
Use Generative AI tools responsibly. No blind copy-pasting
Always question your data, methods, and results — justify your choices

Homework format: Jupyter lab notebooks

Our notes are created in a Jupyter notebook, with file extension .ipynb.
Also, you will complete your homework assignments using Jupyter notebooks.
Confusingly, “Jupyter notebook” is also the name of the original application that opens .ipynb files, which has since been replaced by Jupyter lab.
- I am using Jupyter lab, so some things might not work with the Jupyter notebook application.
- You can also open these files in Visual Studio Code.

Jupyter lab notebooks

Notebooks contain a mix of code, code output, markdown-formatted text (including LaTeX equations), and more.
When you open a Jupyter notebook in one of these apps, the document is “live”, meaning you can run the code.

For example:

1 + 1

x = [1, 2, 3]
x[0] = 9999
x

[9999, 2, 3]

More about Jupyter lab

By default, Jupyter prints out the result of the last line of code, so you don’t need as many print statements.
In addition to the “live” notebooks, Jupyter notebooks can be statically rendered in the web browser, e.g. this.
- This can be convenient for quick read-only access, without needing to launch the Jupyter notebook/lab application.
- But you need to launch the app properly to interact with the notebooks.

Lecture notes

All the lectures from last year are available here.
We cannot promise anything will stay the same from last year to this year, so read them in advance at your own risk.
A “finalized” version will be pushed to GitHub and the Jupyter book right before each class.
Each instructor will have slightly adapted versions of notes to present slides during lectures.
You will find the link to these slides in our repository: https://github.com/UBC-CS/cpsc330-2025W2/tree/main/lectures/103-Firas-lectures

Grades

The grading breakdown is here.
The policy on challenging grades is here.

Setting up your computer for the course

Recommended browser and tools

You can install Chrome here.
You can install Firefox here.

In this course, we will primarily be using Python, git, GitHub, Canvas, Gradescope, Ed Discussion, and PrairieLearn.

Python requirements/resources

We will primarily use Python in this course.

Here is the basic Python knowledge you’ll need for the course:

Basic Python programming
Numpy
Pandas
Basic matplotlib
Sparse matrices

Homework 1 is all about Python.

Note

We do not have time to teach all the Python we need but you can find some useful Python resources here.

CPSC 330 vs. 340

Read https://ubc-cs.github.io/cpsc330-2025W2/docs/330_vs_340.html, which explains the difference between the two courses.

TLDR:

340: how do ML models work?
330: how do I use ML models?
CPSC 340 has many prerequisites.
CPSC 340 goes deeper but has a narrower scope.
I think CPSC 330 will be more useful if you just plan to apply basic ML.

Break

Activity 1

Discuss with your neighbour:

What do you know about machine learning?
What would you like to get out of this course?
Are there any particular topics or aspects of this course that you are especially excited or anxious about? Why?

What is machine learning?

Which cat do you think is AI-generated?

Source

A
B
Both
None

What clues did you use to decide?

What are AI, ML, DL?

Artificial Intelligence (AI): Making computers act smart
- Examples: Deep Blue, early spell checkers
Machine Learning (ML): Learning patterns from data
- Example: Spam filtering in Gmail
Deep Learning (DL): Using neural networks to learn complex patterns
- Examples: Face recognition in your phone, voice assistants

Let’s walk through an example

Have you used search in Google Photos? You can search for “cat” and it will retrieve photos from your libraries containing cats.
This can be done using image classification.

Image classification

Imagine we want a system that can tell cats and foxes apart.
How might we do this with traditional programming? With ML?

Image ID	Whiskers Present	Ear Size	Face Shape	Fur Color	Eye Shape	Label
1	Yes	Large	Round	Mixed	Round	Cat
2	Yes	Medium	Round	Brown	Almond	Cat
3	Yes	Large	Pointed	Red	Narrow	Fox
4	Yes	Large	Pointed	Red	Narrow	Fox
5	Yes	Small	Round	Mixed	Round	Cat
6	Yes	Large	Pointed	Red	Narrow	Fox
7	Yes	Small	Round	Grey	Round	Cat
8	Yes	Small	Round	Black	Round	Cat
9	Yes	Large	Pointed	Red	Narrow	Fox

Traditional programming: example

You hard-code rules. If all of the following conditions are satisfied, it’s a fox.
- pointed face ✅
- red fur ✅
- narrow eyes ✅
This works for typical cases, but what if there are exceptions?
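The hard-coded rule above can be sketched as a small Python function. This is only an illustration: the function name and the grey-fox exception are hypothetical and do not come from the table.

```python
def is_fox(face_shape: str, fur_color: str, eye_shape: str) -> bool:
    """Hard-coded rule: pointed face AND red fur AND narrow eyes means fox."""
    return face_shape == "Pointed" and fur_color == "Red" and eye_shape == "Narrow"


# Image 3 from the table: pointed face, red fur, narrow eyes
typical_fox = is_fox("Pointed", "Red", "Narrow")

# Hypothetical exception (not in the table): a fox with grey fur
grey_fox = is_fox("Pointed", "Grey", "Narrow")

print(typical_fox)  # True
print(grey_fox)     # False, so the rigid rule misses this fox
```

Every exception needs another hand-written rule, which is where this approach starts to break down.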

ML approach: example

We don’t tell the model the exact rule. Instead, we give it many labeled images, and it learns probabilistic patterns across multiple features, not rigid rules.
- If fur is red \(\rightarrow\) 90% chance of Fox.
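As a rough sketch of what "learning probabilistic patterns" can mean, the snippet below estimates the chance of Fox for each fur colour by counting the labelled rows of the table. It is a deliberately simple stand-in, not the model used in the lecture, and the helper name `fox_probability` is made up for illustration. With only nine rows the estimate for red fur comes out at 100% rather than the illustrative 90% above.

```python
from collections import Counter, defaultdict

# (fur colour, label) pairs copied from the nine rows of the table
examples = [
    ("Mixed", "Cat"), ("Brown", "Cat"), ("Red", "Fox"),
    ("Red", "Fox"), ("Mixed", "Cat"), ("Red", "Fox"),
    ("Grey", "Cat"), ("Black", "Cat"), ("Red", "Fox"),
]

# "Training": count how often each label occurs for each fur colour
counts = defaultdict(Counter)
for fur_color, label in examples:
    counts[fur_color][label] += 1


def fox_probability(fur_color: str) -> float:
    """Estimated P(Fox | fur colour) from the counted examples."""
    seen = counts[fur_color]
    total = sum(seen.values())
    # 0.5 stands for "no evidence either way" when the colour was never seen
    return seen["Fox"] / total if total else 0.5


print(fox_probability("Red"))    # 1.0
print(fox_probability("Mixed"))  # 0.0
```

Nobody wrote the rule "red means fox"; it came from the labelled data, and it would change if the data changed.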

DL approach: example

A neural network automatically learns which features to look at (edges \(\rightarrow\) textures \(\rightarrow\) objects).
No need to even specify face shape or fur colour. It learns relevant features on its own.

What is ML?

ML uses algorithms to learn patterns from data and build models.
These models can:
- Make predictions on new data
- Support complex decisions
- Generate new content
ML systems can improve when trained on more data.
There is no one-size-fits-all model. The right choice depends on the problem.

When to use ML?

When the problem can’t be solved with a fixed set of rules
When you have lots of data and complex relationships
When human decision-making is too slow or inconsistent

Approach	Best for
Traditional Programming	Rules are known, data is clean/predictable
Machine Learning	Rules are complex/unknown, data is noisy

When to use Machine Learning (ML) solutions?

Example: Supervised classification

We want to predict liver disease from tabular features:

Age	Total_Bilirubin	Direct_Bilirubin	Alkaline_Phosphotase	Alamine_Aminotransferase	Aspartate_Aminotransferase	Total_Protiens	Albumin	Albumin_and_Globulin_Ratio	Target
40	14.5	6.4	358	50	75	5.7	2.1	0.50	Disease
33	0.7	0.2	256	21	30	8.5	3.9	0.80	Disease
24	0.7	0.2	188	11	10	5.5	2.3	0.71	No Disease
60	0.7	0.2	171	31	26	7.0	3.5	1.00	No Disease
18	0.8	0.2	199	34	31	6.5	3.5	1.16	No Disease

Model training

from lightgbm.sklearn import LGBMClassifier
model = LGBMClassifier(random_state=123, verbose=-1)
model.fit(X_train, y_train)

LGBMClassifier(random_state=123, verbose=-1)

New examples

Given the features of the new patients below, we’ll use this model to predict whether or not these patients have liver disease.

Age	Total_Bilirubin	Direct_Bilirubin	Alkaline_Phosphotase	Alamine_Aminotransferase	Aspartate_Aminotransferase	Total_Protiens	Albumin	Albumin_and_Globulin_Ratio
19	1.4	0.8	178	13	26	8.0	4.6	1.30
12	1.0	0.2	719	157	108	7.2	3.7	1.00
60	5.7	2.8	214	412	850	7.3	3.2	0.78
42	0.5	0.1	162	155	108	8.1	4.0	0.90

Model predictions on new examples

Let’s examine predictions

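import pandas as pd
from IPython.display import HTML
# Put the predictions next to the features of each new patient
pred_df = pd.DataFrame({"Predicted_target": model.predict(X_test).tolist()})
df_concat = pd.concat([pred_df, X_test.reset_index(drop=True)], axis=1)
HTML(df_concat.to_html(index=False))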

Predicted_target	Age	Total_Bilirubin	Direct_Bilirubin	Alkaline_Phosphotase	Alamine_Aminotransferase	Aspartate_Aminotransferase	Total_Protiens	Albumin	Albumin_and_Globulin_Ratio
No Disease	19	1.4	0.8	178	13	26	8.0	4.6	1.30
Disease	12	1.0	0.2	719	157	108	7.2	3.7	1.00
Disease	60	5.7	2.8	214	412	850	7.3	3.2	0.78
Disease	42	0.5	0.1	162	155	108	8.1	4.0	0.90

Example: Supervised regression

Suppose we want to predict housing prices given a number of attributes associated with houses. The target here is continuous and not discrete.

target	bedrooms	bathrooms	sqft_living	sqft_lot	floors	condition	grade	sqft_above	sqft_basement	yr_built	zipcode	lat	long	sqft_living15	sqft_lot15
509000.0	2	1.50	1930	3521	2.0	3	8	1930	0	1989	98007	47.6092	-122.146	1840	3576
675000.0	5	2.75	2570	12906	2.0	3	8	2570	0	1987	98075	47.5814	-122.050	2580	12927
420000.0	3	1.00	1150	5120	1.0	4	6	800	350	1946	98116	47.5588	-122.392	1220	5120
680000.0	8	2.75	2530	4800	2.0	4	7	1390	1140	1901	98112	47.6241	-122.305	1540	4800
357823.0	3	1.50	1240	9196	1.0	3	8	1240	0	1968	98072	47.7562	-122.094	1690	10800

Building a regression model

from lightgbm.sklearn import LGBMRegressor

X_train, y_train = train_df.drop(columns=["target"]), train_df["target"]
X_test, y_test = test_df.drop(columns=["target"]), test_df["target"]

model = LGBMRegressor()
model.fit(X_train, y_train);

Predicting prices of unseen houses

pred_df = pd.DataFrame(
    {"Predicted_target": model.predict(X_test[0:4]).tolist()}
)
df_concat = pd.concat([pred_df, X_test[0:4].reset_index(drop=True)], axis=1)
HTML(df_concat.to_html(index=False))

Predicted_target	bedrooms	bathrooms	sqft_living	sqft_lot	floors	view	condition	grade	sqft_above	sqft_basement	yr_built	zipcode	lat	long	sqft_living15	sqft_lot15
345831.740542	4	2.25	2130	8078	1.0	0	4	7	1380	750	1977	98055	47.4482	-122.209	2300	8112
601042.018745	3	2.50	2210	7620	2.0	0	3	8	2210	0	1994	98052	47.6938	-122.130	1920	7440
311310.186024	4	1.50	1800	9576	1.0	0	4	7	1800	0	1977	98045	47.4664	-121.747	1370	9576
597555.592401	3	2.50	1580	1321	2.0	2	3	8	1080	500	2014	98107	47.6688	-122.402	1530	1357

We are predicting continuous values here, as opposed to the discrete values in the disease vs. no disease example.

Text data

Example: Text classification

Suppose you are given some data with labeled spam and non-spam messages and you want to predict whether a new message is spam or not spam.

import pandas as pd
from sklearn.model_selection import train_test_split

# DATA_DIR is assumed to point at the folder containing spam.csv
sms_df = pd.read_csv(DATA_DIR + "spam.csv", encoding="latin-1")
sms_df = sms_df.drop(columns=["Unnamed: 2", "Unnamed: 3", "Unnamed: 4"])
sms_df = sms_df.rename(columns={"v1": "target", "v2": "sms"})
train_df, test_df = train_test_split(sms_df, test_size=0.10, random_state=42)

target	sms
spam	LookAtMe!: Thanks for your purchase of a video clip from LookAtMe!, you've been charged 35p. Think you can do better? Why not send a video in a MMSto 32323.
ham	Aight, I'll hit you up when I get some cash
ham	Don no da:)whats you plan?
ham	Going to take your babe out ?
ham	No need lar. Jus testing e phone card. Dunno network not gd i thk. Me waiting 4 my sis 2 finish bathing so i can bathe. Dun disturb u liao u cleaning ur room.

Let’s train a model

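from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

X_train, y_train = train_df["sms"], train_df["target"]
X_test, y_test = test_df["sms"], test_df["target"]
clf = make_pipeline(CountVectorizer(max_features=5000), LogisticRegression(max_iter=5000))
clf.fit(X_train, y_train)  # Training the model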

Pipeline(steps=[('countvectorizer', CountVectorizer(max_features=5000)),
                ('logisticregression', LogisticRegression(max_iter=5000))])
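The first step of the pipeline above, CountVectorizer, turns each message into a vector of word counts that the classifier can use as numeric features. The snippet below is a minimal standard-library sketch of that counting idea, not scikit-learn's implementation. The tokenizer is a simplified assumption (lowercase, keep runs of two or more word characters), and the helper names are hypothetical.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Lowercase and keep runs of two or more word characters
    return re.findall(r"\b\w\w+\b", text.lower())


# Two of the sample messages shown earlier
messages = [
    "Aight, I'll hit you up when I get some cash",
    "Going to take your babe out ?",
]

# Vocabulary: every distinct token in the training messages, in sorted order
vocabulary = sorted({tok for msg in messages for tok in tokenize(msg)})


def vectorize(text: str) -> list[int]:
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


print(vocabulary)
print(vectorize("take your cash out out"))
```

Words that never appeared in the training messages are simply ignored at prediction time, which is one reason such a model can struggle with unfamiliar vocabulary.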

Unseen messages

Now use the trained model to predict targets of unseen messages:

	sms
3245	Funny fact Nobody teaches volcanoes 2 erupt, tsunamis 2 arise, hurricanes 2 sway aroundn no 1 teaches hw 2 choose a wife Natural disasters just happens
944	I sent my scores to sophas and i had to do secondary application for a few schools. I think if you are thinking of applying, do a research on cost also. Contact joke ogunrinde, her school is one m...
1044	We know someone who you know that fancies you. Call 09058097218 to find out who. POBox 6, LS15HB 150p
2484	Only if you promise your getting out as SOON as you can. And you'll text me in the morning to let me know you made it in ok.

Predicting on unseen data

The model is accurately predicting labels for the unseen text messages above!

	sms	spam_predictions
3245	Funny fact Nobody teaches volcanoes 2 erupt, tsunamis 2 arise, hurricanes 2 sway aroundn no 1 teaches hw 2 choose a wife Natural disasters just happens	ham
944	I sent my scores to sophas and i had to do secondary application for a few schools. I think if you are thinking of applying, do a research on cost also. Contact joke ogunrinde, her school is one me the less expensive ones	ham
1044	We know someone who you know that fancies you. Call 09058097218 to find out who. POBox 6, LS15HB 150p	spam
2484	Only if you promise your getting out as SOON as you can. And you'll text me in the morning to let me know you made it in ok.	ham

Example: Text classification with LLMs

LLMs = Large Language Models

from transformers import pipeline

# Sentiment analysis pipeline
analyzer = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")
analyzer(["I asked my model to predict my future, and it said '404: Life not found.'",
          '''Machine learning is just like cooking—sometimes you follow the recipe, 
            and other times you just hope for the best!.'''])

[{'label': 'NEGATIVE', 'score': 0.995707631111145},
 {'label': 'POSITIVE', 'score': 0.9994770884513855}]

Zero-shot learning

Now suppose you want to identify the emotion expressed in the text rather than just positive or negative.

['im feeling rather rotten so im not very ambitious right now',
 'im updating my blog because i feel shitty',
 'i never make her separate from me because i don t ever want her to feel like i m ashamed with her',
 'i left with my bouquet of red and yellow tulips under my arm feeling slightly more optimistic than when i arrived',
 'i was feeling a little vain when i did this one',
 'i cant walk into a shop anywhere where i do not feel uncomfortable',
 'i felt anger when at the end of a telephone call',
 'i explain why i clung to a relationship with a boy who was in many ways immature and uncommitted despite the excitement i should have been feeling for getting accepted into the masters program at the university of virginia',
 'i like to have the same breathless feeling as a reader eager to see what will happen next',
 'i jest i feel grumpy tired and pre menstrual which i probably am but then again its only been a week and im about as fit as a walrus on vacation for the summer']

Zero-shot learning for emotion detection

from transformers import pipeline

# Load the pretrained model
model_name = "facebook/bart-large-mnli"
classifier = pipeline("zero-shot-classification", model=model_name)

# `dataset` is assumed to have been loaded earlier (not shown here)
exs = dataset["test"]["text"][10:20]
candidate_labels = ["sadness", "joy", "love", "anger", "fear", "surprise"]
outputs = classifier(exs, candidate_labels)

Zero-shot learning for emotion detection

	sequence	labels	scores
0	i don t feel particularly agitated	[surprise, anger, joy, sadness, fear, love]	[0.3600873053073883, 0.3019026815891266, 0.11901309341192245, 0.11381532996892929, 0.060391392558813095, 0.04479021206498146]
1	i feel beautifully emotional knowing that these women of whom i knew just a handful were holding me and my baba on our journey	[joy, love, surprise, fear, sadness, anger]	[0.36994317173957825, 0.28871580958366394, 0.25607895851135254, 0.042923394590616226, 0.03344894200563431, 0.008889704011380672]
2	i pay attention it deepens into a feeling of being invaded and helpless	[fear, surprise, sadness, anger, joy, love]	[0.3414689302444458, 0.30880674719810486, 0.2561693787574768, 0.07989845424890518, 0.00784482154995203, 0.005811676848679781]
3	i just feel extremely comfortable with the group of people that i dont even need to hide myself	[joy, surprise, love, sadness, anger, fear]	[0.33052244782447815, 0.29472336173057556, 0.15343077480793, 0.07691455632448196, 0.07596749812364578, 0.06844136863946915]
4	i find myself in the odd position of feeling supportive of	[surprise, joy, fear, love, sadness, anger]	[0.8287991881370544, 0.043179433792829514, 0.039773616939783096, 0.03141303360462189, 0.031412284821271896, 0.025422412902116776]
5	i was feeling as heartbroken as im sure katniss was	[sadness, surprise, fear, love, anger, joy]	[0.7667970657348633, 0.18184703588485718, 0.025871269404888153, 0.011756835505366325, 0.00817161239683628, 0.00555615546181798]
6	i feel a little mellow today	[surprise, joy, love, fear, sadness, anger]	[0.4937363266944885, 0.2632198929786682, 0.11367864906787872, 0.06402146071195602, 0.050954896956682205, 0.014388857409358025]
7	i feel like my only role now would be to tear your sails with my pessimism and discontent	[sadness, anger, surprise, fear, joy, love]	[0.6992800235748291, 0.20048724114894867, 0.06185886263847351, 0.03287408500909805, 0.0036468561738729477, 0.0018528576474636793]
8	i feel just bcoz a fight we get mad to each other n u wanna make a publicity n let the world knows about our fight	[anger, surprise, sadness, fear, joy, love]	[0.6029898524284363, 0.19827152788639069, 0.10198832303285599, 0.08116993308067322, 0.010117105208337307, 0.005463324021548033]
9	i feel like reds and purples are just so rich and kind of perfect	[joy, surprise, love, anger, fear, sadness]	[0.3644145131111145, 0.3051210939884186, 0.19462482631206512, 0.055566418915987015, 0.05413556843996048, 0.02613767609000206]
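Each row above lists the candidate labels ordered from highest to lowest score, so the first label is the model's best guess. The sketch below shows one way to pull out that top prediction. The two dictionaries are copied from rows 5 and 6 of the table with scores rounded to three decimals, and `top_prediction` is a hypothetical helper, not part of the transformers library.

```python
# Two results copied from the table, with scores rounded to three decimals
outputs = [
    {
        "sequence": "i was feeling as heartbroken as im sure katniss was",
        "labels": ["sadness", "surprise", "fear", "love", "anger", "joy"],
        "scores": [0.767, 0.182, 0.026, 0.012, 0.008, 0.006],
    },
    {
        "sequence": "i feel a little mellow today",
        "labels": ["surprise", "joy", "love", "fear", "sadness", "anger"],
        "scores": [0.494, 0.263, 0.114, 0.064, 0.051, 0.014],
    },
]


def top_prediction(output: dict) -> tuple[str, float]:
    """Return the highest-scoring label and its score."""
    best = max(range(len(output["scores"])), key=output["scores"].__getitem__)
    return output["labels"][best], output["scores"][best]


for out in outputs:
    label, score = top_prediction(out)
    print(f"{label:>8} ({score:.2f}): {out['sequence']}")
```

The second example has a top score below 0.5, a reminder that the first label is not always a confident prediction.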

Image data

Example: Predicting labels of a given image

Suppose you have a bunch of animal images. You do not have any labels associated with them and you want to predict labels of these images.
We can use machine learning to predict labels of these images using a technique called transfer learning.

                         Class  Probability score
                     tiger cat              0.636
              tabby, tabby cat              0.174
Pembroke, Pembroke Welsh corgi              0.081
               lynx, catamount              0.011
--------------------------------------------------------------

                                     Class  Probability score
         cheetah, chetah, Acinonyx jubatus              0.994
                  leopard, Panthera pardus              0.005
jaguar, panther, Panthera onca, Felis onca              0.001
       snow leopard, ounce, Panthera uncia              0.000
--------------------------------------------------------------

                                   Class  Probability score
                                 macaque              0.885
patas, hussar monkey, Erythrocebus patas              0.062
      proboscis monkey, Nasalis larvatus              0.015
                       titi, titi monkey              0.010
--------------------------------------------------------------

                        Class  Probability score
Walker hound, Walker foxhound              0.582
             English foxhound              0.144
                       beagle              0.068
                  EntleBucher              0.059
--------------------------------------------------------------

Clustering images

Finding groups in food images

K-Means on food dataset

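import torch
from sklearn.cluster import KMeans
from torchvision import models

# get_features_unsup and food_inputs are assumed to be defined elsewhere (not shown here)
densenet = models.densenet121(weights="DenseNet121_Weights.IMAGENET1K_V1")
densenet.classifier = torch.nn.Identity()  # remove the last "classification" layer

Z_food = get_features_unsup(densenet, food_inputs)
k = 5
km = KMeans(n_clusters=k, n_init='auto', random_state=123)
km.fit(Z_food)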

KMeans(n_clusters=5, random_state=123)
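To give a feel for what K-Means does with the feature vectors, here is a tiny standard-library sketch of the idea: repeatedly assign each point to its nearest centre, then move each centre to the mean of its points. It is not scikit-learn's implementation. The 2-D points are hypothetical, and starting from the first k points is a simplification made only for this example.

```python
import math


def kmeans(points, k, n_iter=10):
    """Tiny K-Means: returns (centres, cluster index for each point)."""
    centres = [points[i] for i in range(k)]  # naive initialisation: first k points
    assignments = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centre
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centres[c])) for p in points
        ]
        # Update step: each centre moves to the mean of its assigned points
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centre if a cluster ends up empty
                centres[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centres, assignments


# Hypothetical 2-D feature vectors forming two obvious groups
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centres, assignments = kmeans(points, k=2)
print(assignments)  # [0, 1, 0, 1, 0, 1]
```

No labels are involved at any point; the groups come only from how close the feature vectors are to each other.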

Examining food clusters

for cluster in range(k):
    get_cluster_images(km, Z_food, X_food, cluster, n_img=6)

39
Image indices:  [ 39 197  12  14 138 181]

228
Image indices:  [228  65 128  54 175 260]

138
Image indices:  [138  54 185 278  39  89]

193
Image indices:  [193  39 145 212 169 108]

120
Image indices:  [120 268 244  94  72  87]

Interactive: Is ML appropriate?

❓❓ Questions for you

Select all that apply: Which problems are suitable for ML?

1. Checking if a UBC email address ends with @student.ubc.ca before allowing login
2. Deciding which students should be awarded a scholarship based on their personal essays
3. Predicting which songs you’ll like based on your Spotify listening history
4. Detecting plagiarism by checking if two essays are exactly identical
5. Automatically tagging photos of your friends on Instagram

Summary: When is ML suitable?

Approach	Best Used When…
Machine Learning	The dataset is large and complex, and the decision rules are unknown, fuzzy, or too complex to define explicitly
Rule-based System	The logic is clear and deterministic, and the rules or thresholds are known and stable
Human Expert	The problem involves ethics, creativity, emotion, or ambiguity that can’t be formalized easily
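For the rule-based row of the table, the logic can be written down directly, with no data or training involved. Here is a short sketch for two of the problems listed earlier. The function names and the example addresses are hypothetical.

```python
def is_student_email(address: str) -> bool:
    """Rule: the address must end with the student domain."""
    return address.lower().endswith("@student.ubc.ca")


def essays_identical(essay_a: str, essay_b: str) -> bool:
    """Rule: two essays match only if their text is exactly the same."""
    return essay_a == essay_b


print(is_student_email("eva@student.ubc.ca"))        # True
print(is_student_email("eva@example.com"))           # False
print(essays_identical("ML is fun.", "ML is fun."))  # True
print(essays_identical("ML is fun.", "ML is fun!"))  # False
```

When the rule is this clear and stable, a learned model would add cost and uncertainty without adding value.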

Activity 2

Think of a problem you have come across in the past which could be solved using machine learning.

What would be the input and output?
How do humans solve this now? Are there heuristics or rules?
What kind of data do you have or could you collect?

Types of machine learning

Here are some typical learning problems.

Supervised learning (Gmail spam filtering)
Unsupervised learning (Google News)
Reinforcement learning (AlphaGo)
Generative AI (ChatGPT)
Recommendation systems (Amazon item recommendation system)

What is supervised learning?

Training data comprises a set of observations (\(X\)) and their corresponding targets (\(y\)).
We wish to find a model function \(f\) that relates \(X\) to \(y\).
We use the model function to predict targets of new examples.
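The three steps above can be sketched in a few lines of Python. Here the "model function" simply copies the target of the closest training observation (a 1-nearest-neighbour rule), which is just one simple choice and not the model used elsewhere in this lecture. The toy data and the `fit` helper are hypothetical.

```python
import math


def fit(X, y):
    """Return a model function f that maps a new observation to a predicted target."""
    def f(x_new):
        # Predict the target of the closest training observation (1-nearest neighbour)
        nearest = min(range(len(X)), key=lambda i: math.dist(X[i], x_new))
        return y[nearest]
    return f


# Hypothetical training data: (hours studied, hours slept) -> exam outcome
X_train = [(1.0, 4.0), (2.0, 5.0), (8.0, 7.0), (9.0, 8.0)]
y_train = ["fail", "fail", "pass", "pass"]

f = fit(X_train, y_train)  # learn f from observations and their targets
print(f((7.5, 7.5)))       # predict the target of a new example: pass
```

Libraries such as scikit-learn follow the same two-step pattern with `fit` and `predict`, as in the spam example earlier.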

🤔 Eva’s questions

At this point, Eva has many questions.

How exactly are we “learning” whether a message is spam or ham?
Are we expected to get correct predictions for all possible messages? How does the model predict the label for a message it has not seen before?
What if the model mislabels an unseen example? For instance, what if the model incorrectly labels a non-spam message as spam? What would be the consequences?
How do we measure the success or failure of spam identification?
If you want to use this model in the wild, how do you know how reliable it is?
Would it be useful to know how confident the model is about the predictions rather than just a yes or a no?

It’s great to think about these questions right now. But Eva has to be patient. By the end of this course you’ll know the answers to many of these questions!

What is Machine Learning (ML)?

Spam prediction

Suppose you are given some data with labelled spam and non-spam messages

sms_df = pd.read_csv(DATA_DIR + "spam.csv", encoding="latin-1")
sms_df = sms_df.drop(columns = ["Unnamed: 2", "Unnamed: 3", "Unnamed: 4"])
sms_df = sms_df.rename(columns={"v1": "target", "v2": "sms"})
train_df, test_df = train_test_split(sms_df, test_size=0.10, random_state=42)

target	sms
spam	LookAtMe!: Thanks for your purchase of a video clip from LookAtMe!, you've been charged 35p. Think you can do better? Why not send a video in a MMSto 32323.
ham	Aight, I'll hit you up when I get some cash
ham	Don no da:)whats you plan?
ham	Going to take your babe out ?
ham	No need lar. Jus testing e phone card. Dunno network not gd i thk. Me waiting 4 my sis 2 finish bathing so i can bathe. Dun disturb u liao u cleaning ur room.

Traditional programming vs. ML

Imagine writing a Python program for spam identification, i.e., whether a text message or an email is spam or non-spam.
Traditional programming
- Come up with rules using human understanding of spam messages.
- Time-consuming, and it is hard to come up with a robust set of rules.
Machine learning
- Collect a large amount of labelled spam and non-spam emails and let the machine learning algorithm figure out the rules.
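To see why hand-written rules are hard to get right, consider the sketch below. The keyword list is a hypothetical guess, and the three messages are shortened versions of sample messages from this lecture.

```python
import re

# Hand-written rule: flag a message if it contains any "spammy" keyword.
# The keyword list is a hypothetical guess, which is exactly the weakness.
SPAM_KEYWORDS = {"call", "cash", "prize"}


def is_spam_by_rules(message: str) -> bool:
    words = set(re.findall(r"[a-z0-9]+", message.lower()))
    return bool(words & SPAM_KEYWORDS)


messages = {
    "We know someone who you know that fancies you. Call 09058097218 to find out who.": "spam",
    "Aight, I'll hit you up when I get some cash": "ham",
    "LookAtMe!: Thanks for your purchase of a video clip from LookAtMe!, you've been charged 35p.": "spam",
}

for text, true_label in messages.items():
    predicted = "spam" if is_spam_by_rules(text) else "ham"
    print(f"true={true_label:<4} predicted={predicted:<4} {text[:40]}")
```

The rule catches the first message, wrongly flags the second because it mentions cash, and misses the third entirely. Patching each failure by hand quickly becomes unmanageable.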

Let’s train a model

There are several packages that help us perform machine learning.

X_train, y_train = train_df["sms"], train_df["target"]
X_test, y_test = test_df["sms"], test_df["target"]
clf = make_pipeline(CountVectorizer(max_features=5000), LogisticRegression(max_iter=5000))
clf.fit(X_train, y_train); # Training the model

A different way to solve problems

Machine learning uses computer programs to model data. It can be used to extract hidden patterns, make predictions in new situations, or generate novel content.

A field of study that gives computers the ability to learn without being explicitly programmed.
– Arthur Samuel (1959)

ML vs. traditional programming

With machine learning, you’re likely to
- Save time
- Customize and scale products

Prevalence of ML

Let’s look at some examples.

Activity: For what types of problems is ML appropriate? (~5 mins)

Discuss with your neighbour which of the following problems you would use machine learning for:

Finding a list of prime numbers up to a limit
Given an image, automatically identifying and labeling objects in the image
Finding the distance between two nodes in a graph

Types of machine learning

Here are some typical learning problems.

Supervised learning (Gmail spam filtering)
- Training a model from input data and its corresponding targets to predict targets for new examples.
Unsupervised learning (Google News)
- Training a model to find patterns in a dataset, typically an unlabeled dataset.
Reinforcement learning (AlphaGo)
- A family of algorithms for finding suitable actions to take in a given situation in order to maximize a reward.
Recommendation systems (Amazon item recommendation system)
- Predict the “rating” or “preference” a user would give to an item.

What is supervised learning?

Training data comprises a set of observations (\(X\)) and their corresponding targets (\(y\)).
We wish to find a model function \(f\) that relates \(X\) to \(y\).
We use the model function to predict targets of new examples.
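
As a rough sketch of these three steps in plain Python, the example below builds a model function `f` from training data and applies it to new examples. The nearest-neighbour rule is chosen only because it is short; it is not necessarily the model used elsewhere in this lecture. The features (number of exclamation marks, number of digits) and all values are hypothetical.

```python
import math

def fit_nearest_neighbour(X, y):
    """Return a model function f that predicts the target of the closest training example."""
    training = list(zip(X, y))

    def f(x_new):
        # Find the training observation with the smallest Euclidean distance to x_new.
        _, target = min(training, key=lambda pair: math.dist(pair[0], x_new))
        return target

    return f

# Training data: observations X and their corresponding targets y.
X = [(5, 11), (3, 8), (0, 0), (1, 0)]
y = ["spam", "spam", "ham", "ham"]

# "Learning" produces the model function f relating X to y.
f = fit_nearest_neighbour(X, y)

# Use f to predict targets of new examples.
new_examples = [(4, 10), (0, 1)]
predictions = [f(x) for x in new_examples]
print(predictions)
```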

🤔 Eva’s questions

At this point, Eva is wondering about many questions.

How exactly are we “learning” whether a message is spam or ham?
Are we expected to get correct predictions for all possible messages? How does it predict the label for a message it has not seen before?
What if the model mis-labels an unseen example? For instance, what if the model incorrectly predicts a non-spam as a spam? What would be the consequences?
How do we measure the success or failure of spam identification?
If you want to use this model in the wild, how do you know how reliable it is?
Would it be useful to know how confident the model is about the predictions rather than just a yes or a no?

It’s great to think about these questions right now. But Eva has to be patient. By the end of this course you’ll know the answers to many of these questions!

Looking ahead to next class

It is very important that you watch the assigned pre-lecture videos before class!

