AIP-210 CertNexus Certified Artificial Intelligence Practitioner (CAIP) Questions and Answers

Question # 4

A healthcare company experiences a cyberattack in which the hackers are able to reverse-engineer a dataset to break confidentiality.

Which of the following is TRUE regarding the dataset parameters?

A.

The model is overfitted and trained on a high quantity of patient records.

B.

The model is overfitted and trained on a low quantity of patient records.

C.

The model is underfitted and trained on a high quantity of patient records.

D.

The model is underfitted and trained on a low quantity of patient records.

Question # 5

Which of the following models are text vectorization methods? (Select two.)

A.

Lemmatization

B.

PCA

C.

Skip-gram

D.

TF-IDF

E.

Tokenization

F.

t-SNE
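
As a rough illustration of what turning text into numeric vectors looks like, here is a minimal TF-IDF sketch. It assumes one common weighting variant (term frequency divided by document length, multiplied by log(N / document frequency)); libraries often use smoothed variants, so exact values will differ. The sample documents and the `tf_idf` helper are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical two-document corpus.
docs = ["the patient has a fever", "the patient has a cough"]
vectors = tf_idf(docs)
```

Under this variant, terms shared by every document get a weight of zero, while terms unique to one document get a positive weight.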

Question # 6

In which of the following scenarios is lasso regression preferable over ridge regression?

A.

The number of features is much larger than the sample size.

B.

There are many features with no association with the dependent variable.

C.

There is high collinearity among some of the features associated with the dependent variable.

D.

The sample size is much larger than the number of features.
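
The difference between the two penalties can be sketched in a simplified setting. Assuming an orthonormal design matrix, with the lasso objective written as (1/2)||y - Xb||^2 + lam * ||b||_1 and the ridge objective as (1/2)||y - Xb||^2 + (lam/2) * ||b||^2, each penalized coefficient has a closed form in terms of the ordinary least squares coefficient. The coefficient values and `lam` below are hypothetical.

```python
def lasso_shrink(coef, lam):
    """Soft-thresholding: the lasso solution under an orthonormal design."""
    magnitude = max(abs(coef) - lam, 0.0)
    return magnitude if coef >= 0 else -magnitude

def ridge_shrink(coef, lam):
    """Proportional shrinkage: the ridge solution under an orthonormal design."""
    return coef / (1.0 + lam)

# Hypothetical least squares coefficients: two strong, two near zero.
ols = [2.0, -1.5, 0.05, -0.02]
lam = 0.1

lasso = [lasso_shrink(c, lam) for c in ols]
ridge = [ridge_shrink(c, lam) for c in ols]
```

In this sketch the lasso sets the two near-zero coefficients exactly to zero, while ridge only scales every coefficient toward zero without removing any.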

Question # 7

Which of the following tests should be performed at the production level before deploying a newly retrained model?

A.

A/B test

B.

Performance test

C.

Security test

D.

Unit test

Question # 8

A change in the relationship between the target variable and input features is

A.

concept drift.

B.

covariate shift.

C.

data drift.

D.

model decay.

Question # 9

Which three security measures could be applied in different ML workflow stages to defend them against malicious activities? (Select three.)

A.

Disable logging for model access.

B.

Launch ML instances in a virtual private cloud (VPC).

C.

Monitor model degradation.

D.

Use data encryption.

E.

Use max privilege to control access to ML artifacts.

F.

Use Secrets Manager to protect credentials.

Question # 10

Which of the following is the primary purpose of hyperparameter optimization?

A.

Controls the learning process of a given algorithm

B.

Makes models easier to explain to business stakeholders

C.

Improves model interpretability

D.

Increases recall over precision

Question # 11

Which of the following equations best represents an L1 norm?

A.

|x| + |y|

B.

|x|+|y|^2

C.

|x|-|y|

D.

|x|^2+|y|^2
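
For reference, the standard definitions of the L1 and L2 norms of a vector can be sketched as follows. The function names are illustrative only.

```python
def l1_norm(vector):
    """L1 norm: the sum of the absolute values of the components."""
    return sum(abs(v) for v in vector)

def l2_norm(vector):
    """L2 (Euclidean) norm: square root of the sum of squared components."""
    return sum(v * v for v in vector) ** 0.5

point = [3, -4]
l1 = l1_norm(point)  # |3| + |-4|
l2 = l2_norm(point)  # sqrt(3^2 + (-4)^2)
```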

Question # 12

You and your team need to process large datasets of images as fast as possible for a machine learning task. The project will also use a modular framework with extensible code and an active developer community. Which of the following would BEST meet your needs?

A.

Caffe

B.

Keras

C.

Microsoft Cognitive Services

D.

TensorBoard

Question # 13

The following confusion matrix is produced when a classifier is used to predict labels on a test dataset. How precise is the classifier?

A.

48/(48+37)

B.

37/(37+8)

C.

37/(37+7)

D.

(48+37)/100
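
As a reminder of the definitions involved, precision is the number of true positives divided by the number of all predicted positives, and recall is the number of true positives divided by the number of all actual positives. The sketch below uses hypothetical counts, not the values from the confusion matrix in this question.

```python
def precision(true_positives, false_positives):
    """Fraction of predicted positives that are actually positive."""
    predicted_positives = true_positives + false_positives
    if predicted_positives == 0:
        raise ValueError("no positive predictions, precision is undefined")
    return true_positives / predicted_positives

def recall(true_positives, false_negatives):
    """Fraction of actual positives that were predicted positive."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        raise ValueError("no actual positives, recall is undefined")
    return true_positives / actual_positives

# Hypothetical counts for illustration only.
p = precision(true_positives=30, false_positives=10)
r = recall(true_positives=30, false_negatives=20)
```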

Question # 14

Which of the following principles supports building an ML system with a Privacy by Design methodology?

A.

Avoiding mechanisms to explain and justify automated decisions.

B.

Collecting and processing the largest amount of data possible.

C.

Understanding, documenting, and displaying data lineage.

D.

Utilizing quasi-identifiers and non-unique identifiers, alone or in combination.

Question # 15

Which of the following describes a neural network without an activation function?

A.

A form of a linear regression

B.

A form of a quantile regression

C.

An unsupervised learning technique

D.

A radial basis function kernel
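
The effect of leaving out the activation function can be checked numerically: stacking layers that only compute W x + b yields another function of the same form. The sketch below uses small hypothetical weights and plain Python lists rather than a deep learning framework.

```python
def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def matmul(a, b):
    """Multiply matrix a by matrix b (both lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(u, v):
    return [ui + vi for ui, vi in zip(u, v)]

# Hypothetical two-layer network with no activation function.
W1, b1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]], [0.5, -0.5, 1.0]
W2, b2 = [[2.0, 0.0, 1.0]], [0.25]

def two_layer(x):
    hidden = add(matvec(W1, x), b1)
    return add(matvec(W2, hidden), b2)

# The same function collapsed into a single linear layer.
W = matmul(W2, W1)
b = add(matvec(W2, b1), b2)

def collapsed(x):
    return add(matvec(W, x), b)

x = [1.0, -2.0]
stacked_out, single_out = two_layer(x), collapsed(x)
```

Both calls return the same output, which is why such a network cannot represent anything beyond a linear model.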

Question # 16

Which of the following is a type 1 error in statistical hypothesis testing?

A.

The null hypothesis is false, but fails to be rejected.

B.

The null hypothesis is false and is rejected.

C.

The null hypothesis is true and fails to be rejected.

D.

The null hypothesis is true, but is rejected.
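
The four possible outcomes of a hypothesis test can be laid out as a small lookup, using the standard definitions of type I and type II errors. The function name and return strings are illustrative.

```python
def classify_outcome(null_is_true, null_rejected):
    """Name the outcome of a hypothesis test given the (unknown) truth."""
    if null_is_true and null_rejected:
        return "type I error (false positive)"
    if not null_is_true and not null_rejected:
        return "type II error (false negative)"
    return "correct decision"

outcomes = {
    (truth, rejected): classify_outcome(truth, rejected)
    for truth in (True, False)
    for rejected in (True, False)
}
```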

Question # 17

You have a dataset with thousands of features, all of which are categorical. Using these features as predictors, you are tasked with creating a prediction model to accurately predict the value of a continuous dependent variable. Which of the following would be appropriate algorithms to use? (Select two.)

A.

K-means

B.

K-nearest neighbors

C.

Lasso regression

D.

Logistic regression

E.

Ridge regression

Question # 18

The graph is an elbow plot with the inertia (the within-cluster sum of squares) on the y-axis and the number of clusters (K) on the x-axis. It shows how the inertia changes as the number of clusters changes under the k-means algorithm.

What would be an optimal value of K to ensure a good number of clusters?

A.

2

B.

3

C.

5

D.

9
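
The quantity on the y-axis of an elbow plot can be computed directly. The sketch below evaluates inertia for hypothetical points and hand-picked centroids. It does not run k-means itself and does not reproduce the graph from this question.

```python
def inertia(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    total = 0.0
    for point in points:
        total += min(
            sum((a - b) ** 2 for a, b in zip(point, centroid))
            for centroid in centroids
        )
    return total

# Hypothetical data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

inertia_k1 = inertia(points, [(5.0, 1.0)])               # one cluster
inertia_k2 = inertia(points, [(0.0, 1.0), (10.0, 1.0)])  # two clusters
```

Here inertia drops sharply from K=1 to K=2. On an elbow plot, the K after which further drops become small is the usual candidate.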

Question # 19

Which two of the following decrease technical debt in ML systems? (Select two.)

A.

Boundary erosion

B.

Design anti-patterns

C.

Documentation readability

D.

Model complexity

E.

Refactoring

Question # 20

Which of the following occurs when a data segment is collected in such a way that some members of the intended statistical population are less likely to be included than others?

A.

Algorithmic bias

B.

Sampling bias

C.

Stereotype bias

D.

Systematic value distortion

Question # 21

Normalization is the transformation of features:

A.

By subtracting the mean and dividing by the standard deviation.

B.

Into the normal distribution.

C.

So that they are on a similar scale.

D.

To different scales from each other.
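
Terminology for feature scaling varies between sources. As a sketch, two common rescaling transforms are shown below: min-max scaling to the range [0, 1], and z-score standardization (subtract the mean, divide by the standard deviation). The function names and sample values are hypothetical, and the population standard deviation is an assumption of this sketch.

```python
import statistics

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal, cannot rescale")
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("zero variance, cannot standardize")
    return [(v - mean) / std for v in values]

feature = [2, 4, 6]
scaled = min_max_scale(feature)
z_scores = standardize(feature)
```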

Question # 22

You are implementing a support-vector machine on your data, and a colleague suggests you use a polynomial kernel. In what situation might this help improve the prediction of your model?

A.

When it is necessary to save computational time.

B.

When the categories of the dependent variable are not linearly separable.

C.

When the distribution of the dependent variable is Gaussian.

D.

When there is high correlation among the features.
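
A small sketch of what a polynomial kernel does, assuming the degree-2 homogeneous form K(x, z) = (x . z)^2 on two-dimensional inputs. The kernel value equals an ordinary dot product after mapping each point through an explicit feature map, here called `phi`. The XOR-style data points are hypothetical.

```python
import math

def poly_kernel(x, z, degree=2, coef0=0.0):
    """Polynomial kernel (x . z + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, z)) + coef0) ** degree

def phi(x):
    """Explicit feature map matching the degree-2 kernel with coef0 = 0."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical XOR-style data: no straight line in 2-D separates the labels.
data = [((1.0, 1.0), 1), ((-1.0, -1.0), 1), ((1.0, -1.0), -1), ((-1.0, 1.0), -1)]

# In the mapped space, the sign of the middle coordinate matches the label.
separable = all((phi(x)[1] > 0) == (label == 1) for x, label in data)
```

In the original two dimensions these classes cannot be split by a line, but after the mapping a single coordinate separates them.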

Question # 23

Which of the following is TRUE about SVM models?

A.

They can be used only for classification.

B.

They can be used only for regression.

C.

They can take the feature space into higher dimensions to solve the problem.

D.

They use the sigmoid function to classify the data points.

Question # 24

What is Word2vec?

A.

A bag of words.

B.

A matrix of how frequently words appear in a group of documents.

C.

A word embedding method that builds a one-hot encoded matrix from samples and the terms that appear in them.

D.

A word embedding method that finds characteristics of words in a very large number of documents.
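
Word2vec's skip-gram variant learns embeddings by training a model to predict the words surrounding a target word. The sketch below covers only the first step, generating (target, context) training pairs from a token list. It does not train any embeddings, and the function name and window size are illustrative.

```python
def skipgram_pairs(tokens, window=1):
    """Return (target, context) pairs for every token within the window."""
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((target, tokens[j]))
    return pairs

pairs = skipgram_pairs(["a", "b", "c"], window=1)
```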

Question # 25

Which two of the following criteria are essential for machine learning models to achieve before deployment? (Select two.)

A.

Complexity

B.

Data size

C.

Explainability

D.

Portability

E.

Scalability

Question # 26

Which of the following items should be included in a handover to the end user to enable them to use and run a trained model on their own system? (Select three.)

A.

Information on the folder structure on your local machine

B.

Intermediate data files

C.

Link to a GitHub repository of the codebase

D.

README document

E.

Sample input and output data files

Question # 27

Word embedding describes a task in natural language processing (NLP) where:

A.

Words are converted into numerical vectors.

B.

Words are featurized by taking a histogram of letter counts.

C.

Words are featurized by taking a matrix of bigram counts.

D.

Words are grouped together into clusters and then represented by word cluster membership.
