CPSC 330 Lecture 13: Feature importances
Announcements
- HW4 grades are released
- HW5 is due next week Monday. Make use of office hours and tutorials this week.
Scenario 1: Which model would you pick?
Predicting whether a patient is likely to develop diabetes based on features such as age, blood pressure, glucose levels, and BMI. You have two models:
- LGBM which results in 0.9 F1 score
- Logistic regression which results in 0.84 F1 score
Which model would you pick? Why?
Scenario 2
Predicting whether a user will purchase a product next based on their browsing history, previous purchases, and click behavior. You have two models:
- LGBM which results in 0.9 F1 score
- Logistic regression which results in 0.84 F1 score
Which model would you pick? Why?
Transparency
- In many domains, understanding the relationship between features and predictions is critical for trust and regulatory compliance.
Feature importances
- How does the output depend upon the input?
- How do the predictions change as a function of a particular feature?
How to get feature importances?
Correlations
- What are some limitations of correlations?
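One limitation worth raising in the discussion is that Pearson correlation only captures linear association. The sketch below is an illustration on synthetic data (the arrays x and y are made up, not from the course demo): the target is fully determined by the feature, yet the correlation comes out close to zero.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y depends entirely on x,
# but the relationship is quadratic rather than linear.
x = np.linspace(-3, 3, 201)
y = x ** 2

# Pearson correlation measures linear association only.
r_nonlinear = np.corrcoef(x, y)[0, 1]

# For comparison, a noiseless linear relationship.
r_linear = np.corrcoef(x, 2 * x + 1)[0, 1]

print(f"corr(x, x^2)    is close to 0: {abs(r_nonlinear) < 1e-8}")
print(f"corr(x, 2x + 1) is close to 1: {abs(r_linear - 1) < 1e-8}")
```

A near-zero correlation therefore does not mean a feature is unimportant, and a high correlation says nothing about how a feature behaves once other features are in the model.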
Interpreting coefficients
- Linear models are interpretable because you get coefficients associated with different features.
- Each coefficient represents the estimated impact of a feature on the target variable, assuming all other features are held constant.
- In a Ridge model:
  - A positive coefficient indicates that as the feature’s value increases, the predicted value also increases.
  - A negative coefficient indicates that an increase in the feature’s value leads to a decrease in the predicted value.
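The sign interpretation can be checked with a minimal sketch, assuming scikit-learn's Ridge and synthetic data. The feature names "size" and "age" and the true weights 3 and -2 are hypothetical, chosen only for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data (hypothetical): the target rises with "size" and falls with "age".
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=n)

model = Ridge(alpha=1.0).fit(X, y)

feature_names = ["size", "age"]  # hypothetical names for the two columns
for name, coef in zip(feature_names, model.coef_):
    direction = "increases" if coef > 0 else "decreases"
    print(f"{name}: coef = {coef:.3f} (prediction {direction} as {name} increases)")

# Holding "age" fixed and raising "size" by one unit changes the
# prediction by exactly the "size" coefficient.
base = model.predict(np.array([[0.0, 0.0]]))[0]
bumped = model.predict(np.array([[1.0, 0.0]]))[0]
delta = bumped - base
print(f"change in prediction for +1 in size: {delta:.3f}")
```

The last step is the "all other features held constant" reading in action: only one input changes, and the prediction moves by that feature's coefficient.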
Interpreting coefficients
- When we have different types of preprocessed features, what challenges might you face in interpreting them?
- Ordinally encoded features
- One-hot encoded features
- Scaled numeric features
Group Work: Class Demo & Live Coding (if time permits)
For this demo, each student should click this link to create a new repo in their accounts, then clone that repo locally to follow along with the demo from today.
If you really don’t want to create a repo:
- Navigate to the cpsc330-2024W1 repo
- Run git pull to pull the latest files in the course repo
- Look for the demo file here