DA0-001 CompTIA Data+ Certification Exam Questions and Answers

Question # 4

What R package makes it easy to work with dates?

A.

Lubridate.

B.

Datemath.

C.

Stringr.

D.

ggplot.

Question # 5

An analyst reviews the following data:

7

3

5

2

3

7

7

10

Which of the following is the value of the mode?

A.

3

B.

5

C.

7

D.

10
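
The mode is the value that occurs most often. As a minimal sketch (the variable names are illustrative, not part of the question), the eight values above can be tallied with Python's standard library:

```python
from collections import Counter
from statistics import multimode

# Values listed in the question
values = [7, 3, 5, 2, 3, 7, 7, 10]

# Count how often each value appears
counts = Counter(values)

# multimode returns every value tied for the highest count
modes = multimode(values)

print(counts.most_common())
print(modes)
```

Here 7 appears three times and 3 appears twice, so the tally has a single mode. `multimode` is used rather than `mode` so that a tie, if one existed, would be visible.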

Question # 6

A data analyst needs to calculate the mean for Q1 sales using the data set below:

Which of the following is the mean?

A.

$2,466.18

B.

$2,667.60

C.

$3,082.72

D.

$12,330.88

Question # 7

Daniel is using Structured Query Language (SQL) to work with data stored in a relational database.

He would like to add several new rows to a database table.

What command should he use?

A.

SELECT.

B.

ALTER.

C.

INSERT.

D.

UPDATE.
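
To illustrate how rows are added to a table, here is a small sketch using Python's built-in `sqlite3` module. The `customers` table and its contents are hypothetical and exist only for this example.

```python
import sqlite3

# In-memory database so the example leaves nothing behind
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

# Several new rows to add (hypothetical data)
new_rows = [(1, "Ada"), (2, "Grace"), (3, "Edgar")]

# INSERT adds rows; executemany runs the statement once per tuple
conn.executemany("INSERT INTO customers (id, name) VALUES (?, ?)", new_rows)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(row_count)
```

By contrast, `SELECT` only reads rows, `UPDATE` changes rows that already exist, and `ALTER` changes the table definition.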

Question # 8

A publishing group has requested a dashboard to track submissions before publication. A key requirement is that all changes are tracked, as multiple users will be checking out documents and editing them before submissions are considered final. Which of the following is the BEST way to meet this stakeholder requirement?

A.

Display the version number next to each submission on the dashboard.

B.

Present a data refresh date at the top of the dashboard.

C.

Confirm the dashboard is adhering to the corporate style guide.

D.

Use permissions to ensure users only see certain versions of the submissions.

Question # 9

Randy scored 76 on a math test, Katie scored 86 on a science test, Ralph scored 80 on a history test, and Jean scored 80 on an English test. The table below contains the mean and standard deviation of the scores for each of the courses:

Using this information, which of the following students had the BEST score?

A.

Randy

B.

Katie

C.

Ralph

D.

Jean

Question # 10

Which of the following data manipulation techniques should an analyst use to hide unnecessary data during analysis?

A.

Filtering

B.

Parametrization

C.

Sorting

D.

Indexing

Question # 11

Kelly wants to get feedback on the final draft of a strategic report that has taken her six months to develop.

What can she do to prevent confusion as she seeks feedback before publishing the report?

Choose the best answer.

A.

Distribute the report to the appropriate stakeholders via email.

B.

Use a watermark to identify the report as a draft.

C.

Show the report to her immediate supervisor.

D.

Publish the report on an internally facing website.

Question # 12

A data analyst is developing a data dictionary that aligns with a company's data management processes and policies. Which of the following best describes what should be included in the data dictionary?

A.

Information containing the links to business data

B.

Information explaining the business methodologies

C.

Information containing definitions of the business data

D.

Information describing the data analysis phases

Question # 13

Which of the following report types is most appropriate for a high-level, year-end report requested by a Chief Executive Officer?

A.

Dynamic

B.

Recurring

C.

Ad hoc

D.

Self-service

Question # 14

A data analyst is using a two-tailed, independent t-test to determine whether the type of stretching, dynamic or static, has any influence on a dancer's flexibility. Which of the following is the alternative hypothesis?

A.

A dancer's flexibility is improved through static stretching.

B.

The change in a dancer's flexibility is not equal to zero.

C.

There is a difference in a dancer's flexibility between static and dynamic stretching.

D.

The means of the static and dynamic stretching groups do not differ from each other.

Question # 15

Which of the following programming languages are best suited for analysis and machine-learning applications? (Select two).

A.

Ruby

B.

Rust

C.

PHP

D.

Python

E.

Kotlin

F.

R

Question # 16

Which of the following query statements would be used when filtering data in a relational database management system? (Select two).

A.

ORDER BY

B.

HAVING

C.

WHERE

D.

SELECT

E.

INSERT

F.

GROUP BY
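
The clauses listed above play different roles in a query. The sketch below, which assumes a hypothetical `orders` table in an in-memory SQLite database, shows row-level filtering with `WHERE` and group-level filtering with `HAVING` side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("East", 50), ("East", 200), ("West", 20), ("West", 30), ("North", 500)],
)

query = """
    SELECT region, SUM(amount) AS total
    FROM orders
    WHERE amount >= 25          -- filters individual rows before grouping
    GROUP BY region             -- forms one group per region
    HAVING SUM(amount) > 100    -- filters whole groups after aggregation
    ORDER BY region             -- sorts the result, does not filter
"""
result = conn.execute(query).fetchall()
print(result)
```

In this data the West region survives the `WHERE` step with one row but is dropped by `HAVING` because its total is too small.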

Question # 17

A sales analyst needs to report how the sales team is performing to target. Which of the following files will be important in determining 2019 performance attainment?

A.

2018 goal data

B.

2018 actual revenue

C.

2019 goal data

D.

2019 commission plan

Question # 18

Which of the following can be used to translate data into another form so it can only be read by a user who has a key or a password?

A.

Data encryption.

B.

Data transmission.

C.

Data protection.

D.

Data masking.

Question # 19

Which of the following is a domain-specific language used in programming that is designed for managing data that is held in a relational data stream management system?

A.

SAS

B.

SQL

C.

Python

D.

R

Question # 20

Which of the following types of analysis is used when comparing last week's sales to the previous week's sales?

A.

Trend analysis

B.

Exploratory analysis

C.

Prescriptive analysis

D.

Link analysis

Question # 21

An analyst collected data that includes primary account numbers, expiration dates, and service codes. Which of the following data governance classifications is used to describe this data?

A.

PII

B.

PCI

C.

PBI

D.

PHI

Question # 22

The current date is July 14, 2020. A data analyst has been asked to create a report that shows the company's year-over-year Q2 2020 sales. Which of the following reports should the analyst compare?

A.

Q2 2020 and Q4 2019

B.

YTD 2020 and YTD 2019

C.

Q2 2020 and Q2 2019

D.

Q2 2020 and Q2 2021

Question # 23

Which of the following should be accomplished NEXT after understanding a business requirement for a data analysis report?

A.

Rephrase the business requirement.

B.

Determine the data necessary for the analysis.

C.

Build a mock dashboard/presentation layout.

D.

Perform exploratory data analysis.

Question # 24

Which of the following best describes the law of large numbers?

A.

As a sample size decreases, its standard deviation gets closer to the average of the whole population.

B.

As a sample size grows, its mean gets closer to the average of the whole population.

C.

As a sample size decreases, its mean gets closer to the average of the whole population.

D.

When a sample size doubles, the sample is indicative of the whole population.
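
The behavior of a sample mean as the sample grows can be checked with a short simulation. This is only a sketch: it assumes a fair six-sided die as the population (expected value 3.5), and the helper `sample_mean` is a name made up for the example.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the run is repeatable

POPULATION_MEAN = 3.5  # expected value of a fair six-sided die


def sample_mean(n):
    """Mean of n simulated rolls of a fair six-sided die."""
    return mean(random.randint(1, 6) for _ in range(n))


# Draw samples of increasing size and record each sample mean
sample_means = {n: sample_mean(n) for n in (10, 1_000, 100_000)}

for n, m in sample_means.items():
    print(f"n={n:>7}  sample mean={m:.3f}  gap={abs(m - POPULATION_MEAN):.3f}")
```

Any single small sample can land close to 3.5 by chance, but the gap for the largest sample is reliably small, which is the pattern the law of large numbers describes.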

Question # 25

Consider this dataset showing the retirement age of 11 people, in whole years:

54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60

This table shows a simple frequency distribution of the retirement age data.

A.

56

B.

55

C.

57

D.

54
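
The frequency table and the question stem are not reproduced in this listing, so the sketch below does not pick an answer. It only rebuilds a simple frequency distribution from the 11 ages given and computes two common measures of central tendency; the variable names are illustrative.

```python
from collections import Counter
from statistics import median, mode

# Retirement ages from the question
ages = [54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60]

# Simple frequency distribution: age -> number of people
frequency = Counter(ages)
for age in sorted(frequency):
    print(age, frequency[age])

age_median = median(ages)  # middle value of the sorted list
age_mode = mode(ages)      # most frequent value

print(age_median, age_mode)
```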

Question # 26

What category of data stewardship work is focused on ensuring that the organization respects the wishes of data subjects?

A.

Data quality.

B.

Data privacy.

C.

Data security.

D.

Regulatory compliance.

Question # 27

What analytics suite is offered by Microsoft and directly integrates with SQL Server Databases?

A.

Qlik.

B.

Power BI.

C.

Domo.

D.

Dataroma.

Question # 28

A data analyst needs to write a SQL query measuring last month's website visits and distribute a summary report to the marketing team. Which of the following is the analyst creating?

A.

Date range

B.

Distribution list

C.

Data content

D.

Report view

Question # 29

Which of the following is an example of structured data?

A.

A credit card number

B.

An email

C.

A photo

D.

Social media correspondence

Question # 30

A marketing analytics team received customer transaction data from two different sources. The data is complete and accurate; however, the field names appear to be inconsistent. Given the following tables:

Which of the following is considered best practice if the team wants to consolidate the files and conduct further analysis?

A.

Standardize the field names.

B.

Recode the data values.

C.

Overwrite the field names in one of the tables.

D.

Edit the field names in the data dictionary.

Question # 31

An analyst wants to extract data from a variety of sources and store the data in a cloud-based environment prior to cleaning. Which of the following integration techniques should the analyst use?

A.

ETL

B.

API

C.

SQL

D.

ELT

Question # 32

Given the following grocery store orders:

If a query is made to the table with the following logic:

Order_Total > 132 OR (Order_Total >= 25 AND Order_Total < 74)

Which of the following is the number of orders that will be returned by the query?

A.

Four

B.

Five

C.

Six

D.

Seven
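
The orders table is not reproduced in this listing, so the count cannot be checked here. As a sketch of how the condition is evaluated, the predicate can be applied to a list of hypothetical order totals chosen to sit on and around the boundaries (the values and the `matches` helper are assumptions for illustration):

```python
# Hypothetical order totals; the table from the question is not reproduced here
order_totals = [12.50, 25.00, 73.99, 74.00, 100.00, 132.00, 132.01, 250.00]


def matches(order_total):
    """Order_Total > 132 OR (Order_Total >= 25 AND Order_Total < 74)"""
    return order_total > 132 or (25 <= order_total < 74)


returned = [total for total in order_totals if matches(total)]
print(returned)
print(len(returned))
```

Note the boundary behavior: 25 is included because of `>=`, while 74 and 132 are excluded because `<` and `>` are strict.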

Question # 33

Which of the following is a control measure for preventing a data breach?

A.

Data transmission

B.

Data attribution

C.

Data retention

D.

Data encryption

Question # 34

A data analyst is creating a report that will provide information about various regions, products, and time periods. Which of the following formats would be the MOST efficient way to deliver this report?

A.

A workbook with multiple tabs for each region

B.

A daily email with snapshots of regional summaries

C.

A static report with a different page for every filtered view

D.

A dashboard with filters at the top that the user can toggle

Question # 35

Encryption is a mechanism for protecting data.

When should encryption be applied to data?

Choose the best answer.

A.

When data is at rest.

B.

When data is at rest or in transit.

C.

When data is in transit.

D.

When data is at rest, unless you are using local storage.

Question # 36

A report is scheduled to run and be distributed at the end of business each day. On Mondays, one of the recipients opens the previous week's reports and combines them to calculate the weekly totals and projections for the coming week. This is a tedious process, and the recipient asks an analyst for help. Which of the following should the analyst recommend?

A.

Add calculation fields to the daily report so the totals are built in.

B.

Create a new report with weekly totals set to run at the end of business on Friday.

C.

Provide a daily summary to the report with totals to save the user the effort of manual calculations.

D.

Reduce the frequency of the report to once a week and change the date range.

Question # 37

A data analyst needs to create a data visualization that aids in understanding the cumulative impact of sequentially introduced values that are positive or negative. Which of the following data visualization methods should the analyst use?

A.

A bubble chart

B.

A waterfall chart

C.

A scatter plot

D.

A line chart

Question # 38

Standardized tests are given to students in the middle of each month, and the results are ready by the end of the month. The superintendent needs a quick view of test performance. Which of the following would be the best recommendation to meet the superintendent's requirements?

A.

A dashboard with a continuous data stream and saved searches

B.

A report of test scores by classroom, emailed to the superintendent at the end of the month

C.

A report of test scores with pie charts showing student performance

D.

A dashboard with a scheduled delivery, the ability to filter scores by school, and bar charts for comparison

Question # 39

Given the following data tables:

Which of the following MDM processes needs to take place FIRST?

A.

Creation of a data dictionary

B.

Compliance with regulations

C.

Standardization of data field names

D.

Consolidation of multiple data fields

Question # 40

An analyst is working with the income data of suburban families in the United States. The data set has a lot of outliers, and the analyst needs to provide a measure that represents the typical income. Which of the following would BEST fulfill the analyst’s goal?

A.

Median

B.

Mean

C.

Mode

D.

Standard deviation
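
The effect of outliers on measures of central tendency can be seen with a few numbers. The incomes below are hypothetical and chosen only to include one extreme value.

```python
from statistics import mean, median

# Hypothetical household incomes in dollars, with one extreme outlier
incomes = [48_000, 52_000, 55_000, 61_000, 64_000, 70_000, 2_500_000]

income_mean = mean(incomes)      # pulled upward by the outlier
income_median = median(incomes)  # middle value, unaffected by how large the outlier is

print(f"mean:   {income_mean:,.0f}")
print(f"median: {income_median:,.0f}")
```

With this data the mean is above 400,000 even though six of the seven households earn 70,000 or less, while the median stays at 61,000.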

Question # 41

A data analyst has a set with more than 40,000 rows in the sample schema below:

The analyst would like to create one column that contains the customers’ birth dates. Which of the following data quality dimensions would BEST explain the reason for compilation?

A.

Data accuracy

B.

Data completeness

C.

Data duplication

D.

Data integrity

Question # 42

An analyst has received the requirements for an internal user dashboard. The analyst confirms the data sources and then creates a wireframe. Which of the following is the NEXT step the analyst should take in the dashboard creation process?

A.

Optimize the dashboard.

B.

Create subscriptions.

C.

Get stakeholder approval.

D.

Deploy to production.

Question # 43

Given the table below:

Which of the following variables can be considered inconsistent, and how many distinct values should the variable have?

A.

Name, one

B.

Gender, two

C.

Level, three

D.

Code, four

E.

Region, five

Question # 44

A data set for sales per month includes the following data:

Which of the following cleaning and profiling methods should be applied to the data set?

A.

Data outliers

B.

Invalid data

C.

Duplicate data

D.

Data type validation

Question # 45

Which one of the following is NOT a common data integration tool?

A.

XSS

B.

ELT

C.

ETL

D.

APIs

Question # 46

An analyst is working on a project for a director. During this process, the analyst pulled the data, created summarized tables and graphs with descriptions, created a report summary, and inserted all items into a report. After writing the report, which of the following would be the most appropriate next step?

A.

Complete an audit on the data pulled for the report.

B.

Complete a check for quality in the report.

C.

Complete a review of the data and a check for consistency.

D.

Complete a trend analysis to be included in the report.

Question # 47

Joe, an analyst, tests the loading time on a dashboard he is preparing to go live and finds it is slower than he would like. Which of the following must occur to decrease the loading time?

A.

Deploy the dashboard to production.

B.

Change the field definitions.

C.

Update the dashboard subscribers.

D.

Optimize the dashboard.

Question # 48

Which of the following data types must be used when working with variables that require classification into two or more groups before analysis?

A.

Discrete

B.

Numerical

C.

Alphanumeric

D.

Categorical

Question # 49

An analyst has conducted a review of business questions. Which of the following should the analyst do next to conduct an analysis?

A.

Determine the data needs and review the observations.

B.

Determine the data needs and sources for analysis.

C.

Determine the data needs and schedule interviews.

D.

Determine the data needs and begin the analysis.

Question # 50

Which of the following is a non-parametric test?

A.

One-sample t-test

B.

Two-way ANOVA

C.

Correlation coefficient

D.

Spearman's rank correlation

Question # 51

Which of the following data sampling methods involves dividing a population into subgroups by similar characteristics?

A.

Systematic

B.

Simple random

C.

Convenience

D.

Stratified

Question # 52

An analyst is required to run a text analysis of data that is found in articles from a digital news outlet. Which of the following would be the BEST technique for the analyst to apply to acquire the data?

A.

Web scraping

B.

Sampling

C.

Data wrangling

D.

ETL

Question # 53

Which of the following is an example of a flat file?

A.

CSV file

B.

PDF file

C.

JSON file

D.

JPEG file

Question # 54

A military commander would like to see the health scorecards of the troops daily and filter them based on gender and rank. Considering this data is PHI, which of the following would be the best way for the commander to view the information?

A.

An emailed report

B.

A password-protected dashboard

C.

A daily printout of a report

D.

A cloud-hosted spreadsheet

Question # 55

A user imports a data file into the accounts payable system each day. On a regular basis, the field input is not what the system is expecting, so it results in an error for the row and a broken import process. To resolve the issue, the user opens the file, finds the error in the row, and manually corrects it before attempting the import again. The import sometimes breaks on subsequent attempts, though. Which of the following changes should be made to this process to reduce the number of errors?

A.

Delete all incorrect inputs and upload the corrected file.

B.

Have the user manually review the file for data completeness before loading it.

C.

Create a data field to data type validator to run the file through prior to import.

D.

Spot-check the file prior to import to catch and correct field errors.
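
To make the idea of checking each field against its expected data type concrete, here is a rough sketch of an automated pre-import check. The column names, the `SCHEMA` mapping, and the `validate` function are all hypothetical; a real accounts payable system would define its own layout.

```python
import csv
import io
from datetime import datetime

# Hypothetical schema: column name -> converter that raises ValueError on bad input
SCHEMA = {
    "invoice_id": int,
    "amount": float,
    "due_date": lambda s: datetime.strptime(s, "%Y-%m-%d"),
}


def validate(file_obj):
    """Return (row_number, column, value) for every field that fails its type check."""
    errors = []
    # start=2 because row 1 of the file is the header
    for row_number, row in enumerate(csv.DictReader(file_obj), start=2):
        for column, convert in SCHEMA.items():
            try:
                convert(row[column])
            except (ValueError, TypeError, KeyError):
                errors.append((row_number, column, row.get(column)))
    return errors


# Small in-memory file standing in for the daily import
sample = io.StringIO(
    "invoice_id,amount,due_date\n"
    "1001,250.00,2024-01-31\n"
    "1002,abc,2024-02-15\n"
    "10x3,99.90,15/02/2024\n"
)
errors = validate(sample)
for error in errors:
    print(error)
```

Because the check reports every failing field in one pass, all problems can be corrected before the first import attempt instead of being discovered one broken import at a time.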

Question # 56

An e-commerce company recently tested a new website layout. The website was tested by a test group of customers, and an old website was presented to a control group. The table below shows the percentage of users in each group who made purchases on the websites:

Which of the following conclusions is accurate at a 95% confidence interval?

A.

In Germany, the increase in conversion from the new layout was not significant.

B.

In France, the increase in conversion from the new layout was not significant.

C.

In general, users who visit the new website are more likely to make a purchase.

D.

The new layout has the lowest conversion rates in the United Kingdom.

Question # 57

An organization would like to add a secondary email field to its customer database in order to enrich the customer profiles. Which of the following data manipulation techniques should the analyst use to add this information?

A.

Blend

B.

Merge

C.

Append

D.

Aggregate

Question # 58

Given the below:

Which of the following numbers represents a Type I error?

A.

1

B.

2

C.

3

D.

4

Question # 59

An analyst modified a data set that had a number of issues. Given the original and modified versions:

Which of the following data manipulation techniques did the analyst use?

A.

Imputation

B.

Recoding

C.

Parsing

D.

Deriving

Question # 60

Which one of the following would not normally be considered a summary statistic?

A.

z-score.

B.

Mean.

C.

Variance.

D.

Standard deviation.

Question # 61

A data analyst has been asked to create a daily manufacturing report for the floor manager. Which of the following metrics should be included in the report?

A.

Tons of steel produced per hour

B.

Annual sales budget

C.

End-of-day stock price

D.

Daily corporate employee count

Question # 62

Given the information in the following tables:

Which of the following describes merging these tables to create a master file that includes all transactions for both online and in-store sales?

A.

Data audit

B.

Data completeness

C.

Data validation

D.

Data consolidation

Question # 63

The director of operations at a power company needs data to help identify where company resources should be allocated in order to monitor activity for outages and restoration of power in the entire state. Specifically, the director wants to see the following:

* County outages

* Status

* Overall trend of outages

INSTRUCTIONS:

Please select each visualization to fit the appropriate space on the dashboard and choose an appropriate color scheme. Once you have selected all visualizations, please select the appropriate titles and labels, if applicable. Titles and labels may be used more than once.

If at any time you would like to bring back the initial state of the simulation, please click the Reset All button.

Question # 64

Which of the following should be accomplished NEXT after understanding a business requirement for a data analysis report?

A.

Rephrase the business requirement.

B.

Determine the data necessary for the analysis.

C.

Build a mock dashboard/presentation layout.

D.

Perform exploratory data analysis.

Question # 65

Given the following:

Which of the following is the most important thing for an analyst to do when transforming the table for a trend analysis?

A.

Fill in the missing cost where it is null.

B.

Separate the table into two tables and create a primary key.

C.

Replace the extended cost field with a calculated field.

D.

Correct the dates so they have the same format.

Question # 66

A data analyst has received a data set that contains actual and projected sales for the fourth quarter of 2019. Which of the following statistical methods should the analyst use to find the measure of dispersion?

A.

Mean

B.

Variance

C.

Correlation

D.

Confidence interval

Question # 67

Which of the following is the best description of discrete data types?

A.

Non-numeric data used to describe attributes of a population sample

B.

The frequency of the number of times each value occurs by using whole numbers

C.

Numeric values that can be measured on a continuous scale

D.

Non-numeric data used to describe attributes of a population sample ranked in a specific order

Question # 68

You are working with a dataset and want to change the names of categories that you used for different types of books.

What term best describes this action?

A.

Recoding.

B.

Summarizing.

C.

Aggregating.

D.

Filtering.

Question # 69

Given the following data table:

Which of the following are appropriate reasons to undertake data cleansing? (Select two).

A.

Non-parametric data

B.

Missing data

C.

Duplicate data

D.

Invalid data

E.

Redundant data

F.

Normalized data

Question # 70

An analyst wants to check the progress and performance regarding the number of customers an organization served in the last six years. Which of the following represents the type of analysis the analyst should perform?

A.

Correlation analysis

B.

Trend analysis

C.

Regression analysis

D.

Descriptive analysis

Question # 71

Given the customer table below:

Which of the following chart types is the most appropriate to represent the average spending of active customers vs. inactive customers?

A.

Pie chart

B.

Heat graph

C.

Scatter plot

D.

Line chart

Question # 72

Which of the following differentiates a flat text file from other data types?

A.

Data is separated by a delimiter.

B.

Data is stored in defined rows.

C.

Data is defined with key-value pairs.

D.

Data is housed in a markup language.

Question # 73

Which one of the following is not considered an aggregate function?

A.

SUM

B.

MIN

C.

SELECT

D.

MAX
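
An aggregate function collapses many rows into a single value. The following sketch runs three of them against a hypothetical `sales` table in an in-memory SQLite database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?)", [(10,), (25,), (40,)])

# SUM, MIN, and MAX each reduce the three rows to one value.
# SELECT is the statement that retrieves them, not a function itself.
total, smallest, largest = conn.execute(
    "SELECT SUM(amount), MIN(amount), MAX(amount) FROM sales"
).fetchone()

print(total, smallest, largest)
```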

Question # 74

A data analyst must separate the column shown below into multiple columns for each component of the name:

Which of the following data manipulation techniques should the analyst perform?

A.

Imputing

B.

Transposing

C.

Parsing

D.

Concatenating
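
The column from the question is not reproduced in this listing. As a sketch of splitting one text field into components, the example below assumes names written as "First [Middle] Last"; both the sample values and the `parse_name` helper are hypothetical.

```python
# Hypothetical values; the column from the question is not reproduced here
full_names = ["Maria Elena Lopez", "John Smith"]


def parse_name(full_name):
    """Split a 'First [Middle ...] Last' string into its components."""
    parts = full_name.split()
    first, last = parts[0], parts[-1]
    middle = " ".join(parts[1:-1])  # empty string when there is no middle name
    return {"first": first, "middle": middle, "last": last}


parsed = [parse_name(name) for name in full_names]
for record in parsed:
    print(record)
```

Concatenating would be the reverse operation: joining the separate components back into one string.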

Question # 75

Which of the following reports can be used when insight into operational performance is needed each Wednesday?

A.

Static report

B.

Tactical report

C.

Recurring report

D.

Ad hoc report

Question # 76

An analyst has been tracking company intranet usage and has been asked to create a chart to show the most-used/most-clicked portions of a homepage that contains more than 30 links. Which of the following visualizations would BEST illustrate this information?

A.

Scatter plot

B.

Heat map

C.

Pie chart

D.

Infographic

Question # 77

A development company is constructing a new unit in its apartment complex. The complex has the following floor plans:

Using the average cost per square foot of the original floor plans, which of the following should be the price of the Rose unit?

A.

$640,900

B.

$690,000

C.

$705,200

D.

$702,500

Question # 78

Which of the following is a best practice when updating a legacy data source?

A.

Placing old data in new fields

B.

Keeping only the most recent data

C.

Creating a codebook to document field changes

D.

Removing the data source from production

Question # 79

A data engineer is creating a database field to capture whether a customer likes vanilla ice cream. Which of the following data types is the best to capture this information?

A.

Integer

B.

Boolean

C.

Categorical

D.

Numeric

Question # 80

A data analyst wants to create "Income Categories" that would be calculated based on the existing variable "Income". The "Income Categories" would be as follows:

Income category 1: less than $1.

Income category 2: more than $1 and less than $20,000.

Income category 3: more than $20,001 and less than $40,000.

Income category 4: more than $40,001.

Which of the following data manipulation techniques should the data analyst use to create "Income Categories"?

A.

Data merge

B.

Derived variables

C.

Data blending

D.

Data append
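
Computing a new field from an existing one can be sketched in a few lines. The ranges as written leave small gaps (for example, exactly $1, or values between $20,000 and $20,001), so this example assumes contiguous boundaries at $1, $20,000, and $40,000. That assumption, the `income_category` function, and the sample records are illustrative only.

```python
def income_category(income):
    """Map an income to category 1-4, assuming contiguous boundaries."""
    if income < 1:
        return 1
    if income <= 20_000:
        return 2
    if income <= 40_000:
        return 3
    return 4


# Hypothetical records with only the existing "income" variable
records = [{"income": 0}, {"income": 15_000}, {"income": 35_500}, {"income": 82_000}]

# Add the new variable, calculated from the existing one
for record in records:
    record["income_category"] = income_category(record["income"])

categories = [record["income_category"] for record in records]
print(categories)
```

No second data source is involved: the new column comes entirely from values already present in each record.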

Question # 81

An analyst is designing a dashboard to determine which site has the highest percentage of new customers. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered to BEST display the data?

A.

Include a bar chart using the site and the percentage of new customers data.

B.

Include a line chart using the site and the percentage of new customers data.

C.

Include a pie chart using the site and percentage of new customers data.

D.

Include a scatter chart using the site and the percent of new customers data.

Question # 82

Which of the following data manipulation techniques is an example of a logical function?

A.

WHERE

B.

AGGREGATE

C.

BOOLEAN

D.

IF

Question # 83

Given the table below:

Which of the following boxes indicates that a Type II error has occurred?

A.

1

B.

2

C.

3

D.

4

Question # 84

Which of the following would be used to store unstructured data from different sources?

A.

A data lake

B.

A database management system

C.

A database

D.

A data warehouse

Question # 85

An analyst is building a monthly report for production and wants to ensure the audience is aware of its once-a-month cadence. Which of the following is the MOST important to convey that information?

A.

The date of the dashboard build

B.

The data refresh date

C.

A report summary

D.

Frequently asked questions

Question # 86

Which of the following is a process that is used during data integration to collect, blend, and load data?

A.

MDM

B.

ETL

C.

OLTP

D.

BI

Question # 87

Which of the following statistical methods requires two or more categorical variables?

A.

Simple linear regression

B.

Chi-squared test

C.

Z-test

D.

Two-sample t-test

Question # 88

A customer's telephone number is in the format 123-456-7890. Which of the following data types is used for the phone number?

A.

Boolean

B.

Date

C.

Text

D.

Number

Question # 89

Which of the following statements would be used to append two tables that have the same number of columns?

A.

UNION ALL

B.

MERGE

C.

GROUP BY

D.

JOIN
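
Stacking the rows of one table under another can be tried out in SQLite. The two tables below are hypothetical and share the same two columns; one row appears in both so the difference between `UNION ALL` and `UNION` is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE online_sales (order_id INTEGER, amount INTEGER)")
conn.execute("CREATE TABLE store_sales (order_id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO online_sales VALUES (?, ?)", [(1, 30), (2, 45)])
conn.executemany("INSERT INTO store_sales VALUES (?, ?)", [(2, 45), (3, 60)])

# UNION ALL appends every row from the second query, duplicates included
union_all_rows = conn.execute(
    "SELECT order_id, amount FROM online_sales "
    "UNION ALL "
    "SELECT order_id, amount FROM store_sales"
).fetchall()

# UNION appends rows too, but removes duplicate rows from the result
union_rows = conn.execute(
    "SELECT order_id, amount FROM online_sales "
    "UNION "
    "SELECT order_id, amount FROM store_sales"
).fetchall()

print(len(union_all_rows), len(union_rows))
```

A `JOIN`, by comparison, combines columns from two tables side by side based on a matching key rather than stacking rows.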

Question # 90

A sales manager wants quarterly sales reports broken down by unit and week. Which of the following data output lists includes the most necessary information?

A.

Order number, salesperson, date shipped, recipient address, and price

B.

Item name, salesperson, recipient address, shipping cost, and date shipped

C.

Item number, item name, salesperson, date sold, and price

D.

Item name, salesperson, price, shipping cost, and date shipped

Question # 91

Which of the following will MOST likely be streamed live?

A.

Machine data

B.

Key-value pairs

C.

Delimited rows

D.

Flat files

Question # 92

Jenny wants to study the academic performance of undergraduate sophomores and wants to determine the average grade point average at different points during an academic year.

What best describes the data set she needs?

A.

Sample.

B.

Observation.

C.

Variable.

D.

Population.

Question # 93

A gambler thinks that a coin is fair and is equally likely to turn up heads or tails when the coin is flipped. Which of the following tests should the gambler use to test this hypothesis?

A.

t-test

B.

Chi-squared test

C.

Rank sum test

D.

Ratio test
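
One way to compare observed coin-flip counts against the counts a fair coin would produce is a chi-squared goodness-of-fit statistic, the sum of (observed - expected)^2 / expected over the outcomes. The sketch below computes it by hand for a hypothetical result of 60 heads and 40 tails in 100 flips; the counts are made up for illustration.

```python
# Hypothetical result: 100 flips with 60 heads and 40 tails
observed = {"heads": 60, "tails": 40}

# A fair coin is expected to split the flips evenly
total_flips = sum(observed.values())
expected = {side: total_flips / 2 for side in observed}

# Chi-squared statistic: sum of (observed - expected)^2 / expected
chi_squared = sum(
    (observed[side] - expected[side]) ** 2 / expected[side] for side in observed
)

# Critical value for 1 degree of freedom at a 0.05 significance level
CRITICAL_VALUE = 3.841
reject_fairness = chi_squared > CRITICAL_VALUE

print(chi_squared, reject_fairness)
```

With two outcomes there is one degree of freedom. For these made-up counts the statistic is 4.0, which is just above the critical value, so the fair-coin hypothesis would be rejected at the 0.05 level.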

Question # 94

Different people manually type a series of handwritten surveys into an online database. Which of the following issues will MOST likely arise with this data? (Choose two.)

A.

Data accuracy

B.

Data constraints

C.

Data attribute limitations

D.

Data bias

E.

Data consistency

F.

Data manipulation
