Pre-Summer Sale Special - Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: mxmas70

Home > Microsoft > Microsoft Azure > DP-100

DP-100 Designing and Implementing a Data Science Solution on Azure Question and Answers

Question # 4

You are performing feature engineering on a dataset.

You must add a feature named CityName and populate the column value with the text London.

You need to add the new feature to the dataset.

Which Azure Machine Learning Studio module should you use?

A.

Edit Metadata

B.

Preprocess Text

C.

Execute Python Script

D.

Latent Dirichlet Allocation

Full Access
Question # 5

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Machine Learning workspace. You connect to a terminal session from the Notebooks page in Azure Machine Learning studio.

You plan to add a new Jupyter kernel that will be accessible from the same terminal session.

You need to perform the task that must be completed before you can add the new kernel.

Solution: Delete the Python 3.6 - AzureML kernel.

Does the solution meet the goal?

A.

Yes

B.

No

Full Access
Question # 6

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are using Azure Machine Learning to run an experiment that trains a classification model.

You want to use Hyperdrive to find parameters that optimize the AUC metric for the model. You configure a HyperDriveConfig for the experiment by running the following code:

You plan to use this configuration to run a script that trains a random forest model and then tests it with validation data. The label values for the validation data are stored in a variable named y_test variable, and the predicted probabilities from the model are stored in a variable named y_predicted.

You need to add logging to the script to allow Hyperdrive to optimize hyperparameters for the AUC metric. Solution: Run the following code:

Does the solution meet the goal?

A.

Yes

B.

No

Full Access
Question # 7

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You create an Azure Machine Learning service datastore in a workspace. The datastore contains the following files:

• /data/2018/Q1 .csv

• /data/2018/Q2.csv

• /data/2018/Q3.csv

• /data/2018/Q4.csv

• /data/2019/Q1.csv

All files store data in the following format:

id,M,f2,l

1,1,2,0

2,1,1,1

32,10

You run the following code:

You need to create a dataset named training_data and load the data from all files into a single data frame by using the following code:

Solution: Run the following code:

Does the solution meet the goal?

A.

Yes

B.

No

Full Access
Question # 8

You create an MLflow model

You must deploy the model to Azure Machine Learning for batch inference.

You need to create the batch deployment.

Which two components should you use? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point

A.

Compute target

B.

Kubernetes online endpoint

C.

Model files

D.

Online endpoint

E.

Environment

Full Access
Question # 9

You have machine learning models produce unfair predictions across sensitive features.

You must use a post-processing technique to apply a constraint to the models to mitigate their unfairness.

You need to select a post-processing technique and model type.

What should you use? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Full Access
Question # 10

You manage an Azure Machine Learning workspace named workspace1 and a Data Science Virtual Machine (DSVM) named DSMV1.

You must an experiment in DSMV1 by using a Jupiter notebook and Python SDK v2 code. You must store metrics and artifacts in workspace 1 You start by creating Python SCK v2 code to import ail required packages.

You need to implement the Python SOK v2 code to store metrics and article in workspace1.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them the correctly order.

Full Access
Question # 11

You manage an Azure Machine Learning workspace named workspace!.

You plan to author custom pipeline components by using Azure Machine Learning Python SDK v2.

You must transform the Python code into a YAML specification that can be processed by the pipeline service.

You need to import the Python library that provides the transformation functionality.

Which Python library should you import?

A.

azure.ai ml.automl

B.

azure.ai.ml.entities

C.

sklearn

D.

mldesigner

Full Access
Question # 12

You are using the Azure Machine Learning designer to transform a dataset by using an Execute Python Script component and custom code.

You need to define the method signature for the Execute Python Script component and return value type.

What should you define? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Full Access
Question # 13

You manage an Azure Al Foundry project.

You plan to fine-tune a base model by using pre-uploaded training and validation data. You must specify a hyperparameter to ensure the job is reproducible.

You need to submit the fine-tuning training job.

How should you complete the Python code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Full Access
Question # 14

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have a Python script named train.py in a local folder named scripts. The script trains a regression model by using scikit-learn. The script includes code to load a training data file which is also located in the scripts folder.

You must run the script as an Azure ML experiment on a compute cluster named aml-compute.

You need to configure the run to ensure that the environment includes the required packages for model training. You have instantiated a variable named aml-compute that references the target compute cluster.

Solution: Run the following code:

Does the solution meet the goal?

A.

Yes

B.

No

Full Access
Question # 15

You have an Azure Machine Learning workspace.

You plan to use the workspace to set up automated machine learning training for an image classification model.

You need to choose the primary metric to optimize the model training.

Which primary metric should you choose?

A.

r2_score

B.

mean_absolute_error

C.

accuracy

D.

root_mean_squared_log_error

Full Access
Question # 16

For each of the following statements, select Yes if the statement is true. Otherwise, select No. NOTE: Each correct selection is worth one point.

Full Access
Question # 17

You are analyzing a dataset by using Azure Machine Learning Studio.

YOU need to generate a statistical summary that contains the p value and the unique value count for each feature column.

Which two modules can you users? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

A.

Execute Python Script

B.

Export Count Table

C.

Convert to Indicator Values

D.

Summarize Data

E.

Compute linear Correlation

Full Access
Question # 18

You are developing a machine learning model.

You must inference the machine learning model for testing.

You need to use a minimal cost compute target

Which two compute targets should you use? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point

A.

Local web service

B.

Remote VM

C.

Azure Databricks

D.

Azure Machine Learning Kubernetes

E.

Azure Container Instances

Full Access
Question # 19

You are evaluating a completed binary classification machine learning model.

You need to use the precision as the valuation metric.

Which visualization should you use?

A.

Binary classification confusion matrix

B.

box plot

C.

Gradient descent

D.

coefficient of determination

Full Access
Question # 20

You write five Python scripts that must be processed in the order specified in Exhibit A – which allows the same modules to run in parallel, but will wait for modules with dependencies.

You must create an Azure Machine Learning pipeline using the Python SDK, because you want to script to create the pipeline to be tracked in your version control system. You have created five PythonScriptSteps and have named the variables to match the module names.

You need to create the pipeline shown. Assume all relevant imports have been done.

Which Python code segment should you use?

A.

Option A

B.

Option B

C.

Option C

D.

Option D

Full Access
Question # 21

You manage an Azure Machine Learning won pace named workspace 1 by using the Python SDK v2. You create a Gene-al Purpose v2 Azure storage account named mlstorage1. The storage account includes a pulley accessible container name micOTtalnerl. The container stores 10 blobs with files in the CSV format.

You must develop Python SDK v2 code to create a data asset referencing all blobs in the container named mtcontamer1.

You need to complete the Python SDK v2 code.

How should you complete the code? To answer select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Full Access
Question # 22

You manage an Azure At Foundry project.

You are implementing a RAG solution. The documents contain tables and images that must be broken into semantically relevant chunks.

You need to generate textual representations of images and tables to be used as chunks.

Which two chunking approaches should you use? Each correct answer presents a complete solution. Choose two.

NOTE: Each correct selection is worth one point.

A.

Sentence-based parsing

B.

Document layout analysis

C.

Large language model augmentation

D.

Fixed-size parsing

Full Access
Question # 23

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have a Python script named train.py in a local folder named scripts. The script trains a regression model by using scikit-learn. The script includes code to load a training data file which is also located in the scripts folder.

You must run the script as an Azure ML experiment on a compute cluster named aml-compute.

You need to configure the run to ensure that the environment includes the required packages for model training. You have instantiated a variable named aml-compute that references the target compute cluster.

Solution: Run the following code:

Does the solution meet the goal?

A.

Yes

B.

No

Full Access
Question # 24

You have an Azure Machine Learning workspace.

You plan to run a job to tram a model as an MLflow model output.

You need to specify the output mode of the MLflow model.

Which three modes can you specify? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

A.

rw_mount

B.

ro mount

C.

upload

D.

download

E.

direct

Full Access
Question # 25

You have an Azure Machine Learning workspace named Workspaces

You plan to train an image object detection model by using Automated ML in Workspace1.

You need to complete the provided Azure Machine Learning Python SDK v2 code to start an image object detection job.

How should you complete the code? To answer, select the appropriate options in the answer area.

NOTE Each correct selection is worth one point.

Full Access
Question # 26

You train a machine learning model.

You must deploy the model as a real-time inference service for testing. The service requires low CPU utilization and less than 48 MB of RAM. The compute target for the deployed service must initialize automatically while minimizing cost and administrative overhead.

Which compute target should you use?

A.

Azure Kubernetes Service (AKS) inference cluster

B.

Azure Machine Learning compute cluster

C.

Azure Container Instance (ACI)

D.

attached Azure Databricks cluster

Full Access
Question # 27

You need to define an evaluation strategy for the crowd sentiment models.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Full Access
Question # 28

You need to use the Python language to build a sampling strategy for the global penalty detection models.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Full Access
Question # 29

You need to define a process for penalty event detection.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Full Access
Question # 30

You need to resolve the local machine learning pipeline performance issue. What should you do?

A.

Increase Graphic Processing Units (GPUs).

B.

Increase the learning rate.

C.

Increase the training iterations,

D.

Increase Central Processing Units (CPUs).

Full Access
Question # 31

You need to implement a new cost factor scenario for the ad response models as illustrated in the

performance curve exhibit.

Which technique should you use?

A.

Set the threshold to 0.5 and retrain if weighted Kappa deviates +/- 5% from 0.45.

B.

Set the threshold to 0.05 and retrain if weighted Kappa deviates +/- 5% from 0.5.

C.

Set the threshold to 0.2 and retrain if weighted Kappa deviates +/- 5% from 0.6.

D.

Set the threshold to 0.75 and retrain if weighted Kappa deviates +/- 5% from 0.15.

Full Access
Question # 32

You need to define a process for penalty event detection.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Full Access
Question # 33

You need to select an environment that will meet the business and data requirements.

Which environment should you use?

A.

Azure HDInsight with Spark MLlib

B.

Azure Cognitive Services

C.

Azure Machine Learning Studio

D.

Microsoft Machine Learning Server

Full Access
Question # 34

You need to implement a model development strategy to determine a user’s tendency to respond to an ad.

Which technique should you use?

A.

Use a Relative Expression Split module to partition the data based on centroid distance.

B.

Use a Relative Expression Split module to partition the data based on distance travelled to the event.

C.

Use a Split Rows module to partition the data based on distance travelled to the event.

D.

Use a Split Rows module to partition the data based on centroid distance.

Full Access
Question # 35

You need to implement a scaling strategy for the local penalty detection data.

Which normalization type should you use?

A.

Streaming

B.

Weight

C.

Batch

D.

Cosine

Full Access
Question # 36

You need to build a feature extraction strategy for the local models.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Full Access
Question # 37

You need to modify the inputs for the global penalty event model to address the bias and variance issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Full Access
Question # 38

You need to define an evaluation strategy for the crowd sentiment models.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Full Access
Question # 39

You need to define a modeling strategy for ad response.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Full Access
Question # 40

You need to implement a feature engineering strategy for the crowd sentiment local models.

What should you do?

A.

Apply an analysis of variance (ANOVA).

B.

Apply a Pearson correlation coefficient.

C.

Apply a Spearman correlation coefficient.

D.

Apply a linear discriminant analysis.

Full Access
Question # 41

You need to produce a visualization for the diagnostic test evaluation according to the data visualization requirements.

Which three modules should you recommend be used in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

Full Access
Question # 42

You need to identify the methods for dividing the data according, to the testing requirements.

Which properties should you select? To answer, select the appropriate option-, m the answer area. NOTE: Each correct selection is worth one point.

Full Access
Question # 43

You need to implement early stopping criteria as suited in the model training requirements.

Which three code segments should you use to develop the solution? To answer, move the appropriate code segments from the list of code segments to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

Full Access
Question # 44

You need to select a feature extraction method.

Which method should you use?

A.

Spearman correlation

B.

Mutual information

C.

Mann-Whitney test

D.

Pearson’s correlation

Full Access
Question # 45

You need to identify the methods for dividing the data according to the testing requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Full Access
Question # 46

You need to set up the Permutation Feature Importance module according to the model training requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Full Access
Question # 47

You need to configure the Permutation Feature Importance module for the model training requirements.

What should you do? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Full Access
Question # 48

You need to configure the Edit Metadata module so that the structure of the datasets match.

Which configuration options should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Full Access
Question # 49

You need to replace the missing data in the AccessibilityToHighway columns.

How should you configure the Clean Missing Data module? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Full Access
Question # 50

You need to visually identify whether outliers exist in the Age column and quantify the outliers before the outliers are removed.

Which three Azure Machine Learning Studio modules should you use in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

Full Access
Question # 51

You need to select a feature extraction method.

Which method should you use?

A.

Mutual information

B.

Mood’s median test

C.

Kendall correlation

D.

Permutation Feature Importance

Full Access
Question # 52

You need to correct the model fit issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Full Access
Question # 53

You need to configure the Feature Based Feature Selection module based on the experiment requirements and datasets.

How should you configure the module properties? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Full Access