Which of the following methods can be used to rebalance a dataset using the rebalance design pattern?
Which two techniques are used to build personas in the ML development lifecycle? (Select two.)
Which of the following text vectorization methods is appropriate and correctly defined for an English-to-Spanish machine translation system?
An organization sells home security cameras and has asked its data scientists to implement a model to detect human faces, as distinguished from animals, so that it can alert the customers only when a human gets close to their house.
Which of the following algorithms is an appropriate option with a correct reason?
A company is developing a merchandise sales application. The product team uses training data to teach an AI model to predict sales, and discovers emergent bias. What caused the biased results?
You have a dataset with thousands of features, all of which are categorical. Using these features as predictors, you are tasked with creating a prediction model to accurately predict the value of a continuous dependent variable. Which of the following would be appropriate algorithms to use? (Select two.)
Given a feature set with rows that contain missing continuous values, and assuming the data is normally distributed, what is the best way to fill in these missing features?
In which of the following scenarios is lasso regression preferable over ridge regression?
Which two of the following decrease technical debt in ML systems? (Select two.)
Which of the following principles supports building an ML system with a Privacy by Design methodology?
Which of the following scenarios is an example of entanglement in ML pipelines?
A big data architect needs to be cautious about personally identifiable information (PII) that may be captured with their new IoT system. What is the final stage of the Data Management Life Cycle, which the architect must complete in order to implement data privacy and security appropriately?
At a self-driving car company, ML engineers want to develop a model for dynamic pathing. Which of the following approaches would be optimal for this task?
A dataset can contain a range of values that depict a certain characteristic, such as grades on tests in a class during the semester. A specific student has so far received the following grades: 76, 81, 78, 87, 75, and 72. There is one final test in the semester. What minimum grade would the student need to achieve on the last test to get an 80% average?
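The arithmetic behind the grade question can be checked with a short sketch. It assumes all seven tests are weighted equally, which the question implies but does not state; the function name `min_final_grade` is hypothetical and used only for illustration.

```python
def min_final_grade(grades, target_average):
    """Score needed on one remaining test so the overall mean hits the target.

    Assumes every test, including the remaining one, carries equal weight.
    """
    total_tests = len(grades) + 1            # tests taken so far plus the final one
    required_total = target_average * total_tests
    return required_total - sum(grades)      # points still missing from the total


grades = [76, 81, 78, 87, 75, 72]            # the six grades given in the question
needed = min_final_grade(grades, 80)
print(needed)  # 91
```

The six grades sum to 469, and seven tests at an average of 80 require 560 points in total, so the last test must contribute the remaining 91.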
Word embedding describes a task in natural language processing (NLP) where:
A data scientist is tasked with extracting business intelligence from primary data captured from the public. Which of the following is the most important aspect that the scientist must not forget to include?
Which of the following is a Type I error in statistical hypothesis testing?