iClicker cloud join link: https://join.iclicker.com/YJHS
Select all of the following statements which are TRUE.
[...] ColumnTransformer object to cross_validate.
[...] fit_transform on a ColumnTransformer object, you get a numpy ndarray.
handle_unknown = "ignore" of OneHotEncoder
[...] handle_unknown='ignore' with OneHotEncoder to safely ignore unseen categories during transform.
Question for you to consider in your group
How would you determine whether it is reasonable or not to set
handle_unknown = "ignore"?
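One way to ground this question is to look at what the encoder actually outputs for a category it never saw during fit. The sketch below is only illustrative: the product_category values are made up, and the variable names (strict, lenient) are hypothetical. It contrasts the default behaviour, which raises an error, with handle_unknown="ignore", which encodes the unseen category as a row of all zeros.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical single-column data standing in for product_category.
X_train = np.array([["books"], ["toys"], ["books"], ["garden"]])
# New data containing a category ("electronics") that was never seen in fit.
X_new = np.array([["toys"], ["electronics"]])

# Default encoder: handle_unknown="error", so unseen categories raise ValueError.
strict = OneHotEncoder()
strict.fit(X_train)
try:
    strict.transform(X_new)
    raised = False
except ValueError:
    raised = True

# With handle_unknown="ignore", the unseen category becomes an all-zero row.
lenient = OneHotEncoder(handle_unknown="ignore")
lenient.fit(X_train)
encoded = lenient.transform(X_new).toarray()

print(raised)                  # whether the default encoder raised
print(lenient.categories_[0])  # categories learned during fit
print(encoded)                 # second row is all zeros
```

Because every unseen category maps to the same all-zero row, the model cannot tell new categories apart from each other. Whether that is acceptable depends on how the new categories are expected to behave in your application.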
handle_unknown = "ignore" of OneHotEncoder
[...] location, device_type, and product_category. During training, you have observed a set of categories for product_category, but in the future, new product categories might be added.
drop="if_binary" argument of OneHotEncoder
sklearn CountVectorizer
[...] scikit-learn’s CountVectorizer to encode text data
CountVectorizer: Transforms text into a matrix of token counts
max_features: Control the number of features used in the model
max_df, min_df: Control document frequency thresholds
ngram_range: Defines the range of n-grams to be extracted
stop_words: Enables the removal of common words that are typically uninformative in most applications, such as “and”, “the”, etc.
Select all of the following statements which are TRUE.
handle_unknown="ignore" would treat all unknown categories equally.
[...] max_features hyperparameter of CountVectorizer the training score is likely to go up.
[...] CountVectorizer. If you encounter a word in the validation or the test split that’s not available in the training data, we’ll get an error.
[...] cross_validate, each fold might have a slightly different number of features (columns).
[...] X and y is linear.
Ridge vs. LinearRegression
Ridge adds a parameter to control the complexity of a model. It finds a line that balances fit and prevents overly large coefficients.
LinearRegression
Ridge
[...] Ridge.
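The contrast between Ridge and LinearRegression can be seen on synthetic data. The sketch below is an assumption-laden illustration: the data are randomly generated, the true coefficients are made up, and alpha=10.0 is an arbitrary choice for scikit-learn's Ridge complexity parameter. It compares the size of the learned coefficients and the training scores of the two models.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic regression data; the "true" coefficients are invented.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
true_coef = np.array([3.0, -2.0, 0.0, 0.0, 1.0])
y = X @ true_coef + rng.normal(scale=0.5, size=30)

# Ordinary least squares: no penalty on coefficient size.
lr = LinearRegression().fit(X, y)

# Ridge: alpha controls how strongly large coefficients are penalized.
ridge = Ridge(alpha=10.0).fit(X, y)

lr_norm = np.linalg.norm(lr.coef_)
ridge_norm = np.linalg.norm(ridge.coef_)
print("LinearRegression coefficient norm:", lr_norm)
print("Ridge coefficient norm:", ridge_norm)

# Training R^2: the unpenalized model fits the training data at least as well.
print("LinearRegression train score:", lr.score(X, y))
print("Ridge train score:", ridge.score(X, y))
```

Larger alpha values shrink the coefficients further, trading some training fit for a simpler model; the right value is typically chosen with cross-validation.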
Select all of the following statements which are TRUE.
[...] C hyperparameter increases model complexity.
For this demo, each student should click this link to create a new repo in their account, then clone that repo locally to follow along with today’s demo.