

DA0-001 CompTIA Data+ Certification Exam Questions and Answers

Question # 4

A web developer wants to ensure that malicious users can't type SQL statements when they are asked for input, such as their username or user ID.

Which of the following query optimization techniques would effectively prevent SQL Injection attacks?

A.

Indexing.

B.

Subset of records.

C.

Temporary table in the query set.

D.

Parametrization.
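
Parameterization means passing user input to the database as a bound value, so it is never parsed as SQL. The sketch below illustrates the idea with Python's built-in sqlite3 module; the `users` table, its rows, and the `find_user` helper are hypothetical and exist only for this illustration.

```python
import sqlite3

def find_user(conn, username):
    """Look up a user with a parameterized query (the ? placeholder)."""
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cur.fetchall()

# Hypothetical in-memory table used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])

normal_rows = find_user(conn, "alice")
# A classic injection attempt is treated as a literal string, so it matches no row.
injected_rows = find_user(conn, "alice' OR '1'='1")
```

Because the input is bound rather than concatenated into the statement, the injection string is compared as plain text and returns nothing.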

Question # 5

A data analyst has been asked to create an ad-hoc sales report for the Chief Executive Officer (CEO).

Which of the following should be included in the report?

A.

The sales representatives' home addresses.

B.

Line-item SKU numbers.

C.

YTD total sales.

D.

The customers' first and last names.

Question # 6

A marketing analytics team received customer transaction data from two different sources. The data is complete and accurate; however, the field names appear to be inconsistent. Given the following tables:

Which of the following is considered best practice if the team wants to consolidate the files and conduct further analysis?

A.

Standardize the field names.

B.

Recode the data values.

C.

Overwrite the field names in one of the tables.

D.

Edit the field names in the data dictionary.

Question # 7

After the daily ETL jobs are completed, the data in the reports does not appear complete, and a lot of data seems to be missing. Which of the following concepts should be used to assess and investigate further?

A.

Cross-validation

B.

Data profiling

C.

Data integrity

D.

Data consistency

Question # 8

Which of the following is an example of discrete data?

A.

The number of employees at a company

B.

The amount of rain that falls in a storm

C.

The temperature at a weather station

D.

The power consumption in a building

Question # 9

Which of the following types of dashboards should a business intelligence engineer develop in order to provide information about failed data pipelines?

A.

Referencing

B.

Strategic

C.

Operational

D.

Technical

Question # 10

An e-commerce company recently tested a new website layout. The website was tested by a test group of customers, and an old website was presented to a control group. The table below shows the percentage of users in each group who made purchases on the websites:

Which of the following conclusions is accurate at a 95% confidence interval?

A.

In Germany, the increase in conversion from the new layout was not significant.

B.

In France, the increase in conversion from the new layout was not significant.

C.

In general, users who visit the new website are more likely to make a purchase.

D.

The new layout has the lowest conversion rates in the United Kingdom.

Question # 11

Mario works with a group of R programmers tasked with copying data from an accounting system into a data warehouse.

In what phase are the group's R skills most relevant?

A.

Extract.

B.

Load.

C.

Transform.

D.

Purge.

Question # 12

An analyst has been asked to validate data quality. Which of the following are the BEST reasons to validate data for quality control purposes? (Choose two.)

A.

Retention

B.

Integrity

C.

Transmission

D.

Consistency

E.

Encryption

F.

Deletion

Question # 13

An analyst is working on a project for a director. During this process, the analyst pulled the data, created summarized tables and graphs with descriptions, created a report summary, and inserted all items into a report. After writing the report, which of the following would be the most appropriate next step?

A.

Complete an audit on the data pulled for the report.

B.

Complete a check for quality in the report.

C.

Complete a review of the data and a check for consistency.

D.

Complete a trend analysis to be included in the report.

Question # 14

A data analyst is designing a dashboard that will provide a story of sales and determine which site is providing the highest sales volume per customer. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered?

A.

Include a line chart using the site and average sales per customer.

B.

Include a pie chart using the site and sales to average sales per customer.

C.

Include a scatter chart using sales volume and average sales per customer.

D.

Include a column chart using the site and sales to average sales per customer.

Question # 15

Which one of the following is a common data warehouse schema?

A.

Snowflake.

B.

Square.

C.

Spiral.

D.

Sphere.

Question # 16

Which of the following statistical methods requires two or more categorical variables?

A.

Simple linear regression

B.

Chi-squared test

C.

Z-test

D.

Two-sample t-test

Question # 17

A data analyst has been asked to organize the table below in the following ways:

By sales from high to low -

By state in alphabetic order -

Which of the following functions will allow the data analyst to organize the table in this manner?

A.

Conditional formatting

B.

Grouping

C.

Filtering

D.

Sorting
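
The two orderings in the question (sales from high to low, state in alphabetical order) can be sketched with Python's built-in `sorted`. The rows below are hypothetical, since the original table is not reproduced here.

```python
# Hypothetical rows standing in for the missing table.
rows = [
    {"state": "Ohio", "sales": 1200},
    {"state": "Texas", "sales": 3400},
    {"state": "Alaska", "sales": 800},
]

# Sales from high to low: sort on the numeric field in descending order.
by_sales_desc = sorted(rows, key=lambda r: r["sales"], reverse=True)

# State in alphabetical order: sort on the text field in ascending order.
by_state = sorted(rows, key=lambda r: r["state"])
```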

Question # 18

A company's human resources department has asked a data analyst to categorize the income of all employees into five salary bands:

Which of the following types of functions would be the most appropriate to use?

A.

Statistical

B.

Aggregate

C.

Logical

D.

Mathematical

Question # 19

Which of the following describes the method of sampling in which elements of data are selected randomly from each of the small subgroups within a population?

A.

Simple random

B.

Cluster

C.

Systematic

D.

Stratified
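
Drawing a random sample from each subgroup of a population is what stratified sampling does. The following is a minimal sketch of that idea; the `stratified_sample` helper, the region labels, and the fixed seed are assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, per_group, seed=0):
    """Randomly pick up to `per_group` records from every subgroup (stratum)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec)
    sample = []
    for name in sorted(groups):
        members = groups[name]
        sample.extend(rng.sample(members, min(per_group, len(members))))
    return sample

# Hypothetical population: 10 records in "north", 5 in "south".
population = [("north", i) for i in range(10)] + [("south", i) for i in range(5)]
sample = stratified_sample(population, key=lambda rec: rec[0], per_group=2)
```

Every stratum contributes to the sample, which is the property that separates this method from a simple random draw over the whole population.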

Question # 20

Encryption is a mechanism for protecting data.

When should encryption be applied to data?

Choose the best answer.

A.

When data is at rest.

B.

When data is at rest or in transit.

C.

When data is in transit.

D.

When data is at rest, unless you are using local storage.

Question # 21

A cereal manufacturer wants to determine whether the sugar content of its cereal has increased over the years. Which of the following is the appropriate descriptive statistic to use?

A.

Frequency

B.

Percent change

C.

Variance

D.

Mean

Question # 22

An analyst needs to provide a chart to identify the composition between the categories of the survey response data set:

Which of the following charts would be BEST to use?

A.

Histogram

B.

Pie

C.

Line

D.

Scatter plot

E.

Waterfall

Question # 23

Which of the following data protection methods provides confidentiality for data in transit?

A.

De-identification

B.

Encryption

C.

Masking

D.

Anonymization

Question # 24

Which of the following data cleansing issues will be fixed when a DISTINCT function is applied?

A.

Missing data

B.

Duplicate data

C.

Redundant data

D.

Invalid data
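
In SQL, DISTINCT collapses rows that are identical across the selected columns. The sketch below shows that behavior with Python's sqlite3 module; the `orders` table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, item TEXT)")
# Hypothetical data: the first two rows are exact duplicates.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", "mug"), ("ann", "mug"), ("raj", "lamp")],
)

all_rows = conn.execute("SELECT customer, item FROM orders").fetchall()
distinct_rows = conn.execute(
    "SELECT DISTINCT customer, item FROM orders ORDER BY customer"
).fetchall()
```

The plain SELECT returns three rows, while the DISTINCT version returns two. Missing or invalid values are left untouched.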

Question # 25

A data analyst needs to apply quality control concepts to a data set for accuracy. Which of the following is the best way to do this?

A.

Standardization

B.

Parameterization

C.

Encryption

D.

Cross-validation

Question # 26

A customer list from a financial services company is shown below:

A data analyst wants to create a likely-to-buy score on a scale from 0 to 100, based on an average of the three numerical variables: number of credit cards, age, and income. Which of the following should the analyst do to the variables to ensure they all have the same weight in the score calculation?

A.

Recode the variables.

B.

Calculate the percentiles of the variables.

C.

Calculate the standard deviations of the variables.

D.

Normalize the variables.
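
Normalization rescales variables with very different ranges (a handful of credit cards versus tens of thousands in income) onto a common scale before they are averaged. One common form is min-max scaling, sketched below. The three customers and the `min_max_scale` helper are hypothetical, and min-max is only one of several normalization methods.

```python
def min_max_scale(values, new_min=0.0, new_max=100.0):
    """Rescale values linearly so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:  # avoid division by zero when every value is identical
        return [new_min for _ in values]
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo) for v in values]

# Hypothetical customers: credit cards, age, income.
cards = [1, 3, 5]
ages = [25, 40, 55]
incomes = [30000, 60000, 90000]

scaled = [min_max_scale(column) for column in (cards, ages, incomes)]
# Average the three scaled variables per customer to get a 0-100 score.
scores = [sum(values) / len(values) for values in zip(*scaled)]
```

Without rescaling, income would dominate the average simply because its raw numbers are larger.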

Question # 27

An analyst has been tracking company intranet usage and has been asked to create a chart to show the most-used/most-clicked portions of a homepage that contains more than 30 links. Which of the following visualizations would BEST illustrate this information?

A.

Scatter plot

B.

Heat map

C.

Pie chart

D.

Infographic

Question # 28

A sales manager requested a report that contains the first name, last name, and phone number of all the company’s customers and employees. The data engineer needs to return all the records from several tables, even duplicates. Which of the following is the best way to join the two tables?

A.

FULL OUTER JOIN

B.

INNER JOIN

C.

LEFT OUTER JOIN

D.

CROSS JOIN

Question # 29

Which of the following BEST describes standard deviation?

A.

A measure that is used to establish a relationship between two variables

B.

A measure of how data is distributed

C.

A measure of the amount of dispersion of a set of values

D.

A measure that is used to find the significant difference between variables
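
Standard deviation quantifies how far values spread around their mean. The sketch below uses Python's statistics module on two hypothetical data sets with the same mean but different spread.

```python
import statistics

# Both hypothetical data sets have a mean of 5.
spread_out = [2, 4, 4, 4, 5, 5, 7, 9]
identical = [5, 5, 5, 5]

# pstdev is the population standard deviation.
spread_sd = statistics.pstdev(spread_out)
identical_sd = statistics.pstdev(identical)
```

The first set has a population standard deviation of 2, while the second has 0 because its values do not vary at all.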

Question # 30

A data analyst needs to create a data visualization that aids in understanding the cumulative impact of sequentially introduced values that are positive or negative. Which of the following data visualization methods should the analyst use?

A.

A bubble chart

B.

A waterfall chart

C.

A scatter plot

D.

A line chart

Question # 31

An analyst for a small business with multiple locations is using each location’s quarterly sales reports from last year to create a single revenue report for the year. Which of the following data mining techniques should the analyst use to complete this task?

A.

Data merge

B.

Data append

C.

Data blending

D.

Data imputation

Question # 32

A data analyst must fulfill a request for information that is needed weekly and should be automatically emailed to a specific set of users. Which of the following types of reports should the analyst recommend?

A.

A self-service report

B.

A research report

C.

An ad hoc report

D.

An operational report

Question # 33

Which one of the following programming languages is specifically designed for use in analytics applications?

A.

Python.

B.

R

C.

C++

D.

Java.

Question # 34

An e-commerce company recently tested a new website layout. The website was tested by a test group of customers, and an old website was presented to a control group. The table below shows the percentage of users in each group who made purchases on the websites:

Which of the following conclusions is accurate at a 95% confidence interval?

A.

In Germany, the increase in conversion from the new layout was not significant.

B.

In France, the increase in conversion from the new layout was not significant.

C.

In general, users who visit the new website are more likely to make a purchase.

D.

The new layout has the lowest conversion rates in the United Kingdom.

Question # 35

An analyst is currently working on a ticket to revamp a company-wide dashboard that has been in use for five years. Which of the following should be the first step in the development process?

A.

Talk to the group that made the request to determine the desired goal.

B.

Make changes to a frequently used report that is already in production.

C.

Build an additional dashboard with fewer views tailored toward each specific team.

D.

Develop a more streamlined dashboard to roll out by the next delivery date.

Question # 36

Which of the following is the best description of the term "data governance"?

A.

Data governance governs the development of a data visualization dashboard in an organization.

B.

Data governance is the policy that protects against data breaches by cybercriminals.

C.

Data governance is the process of analyzing, manipulating, and reporting data in an organization.

D.

Data governance is the availability, usability, integrity, and security of data in an enterprise.

Question # 37

Given the following data table:

Which of the following are appropriate reasons to undertake data cleansing? (Select two).

A.

Non-parametric data

B.

Missing data

C.

Duplicate data

D.

Invalid data

E.

Redundant data

F.

Normalized data

Question # 38

Which of the following best describes an exploratory analysis?

A.

Involves the use of descriptive statistics to understand observations

B.

Involves analysis of exploring data sets for performance tracking

C.

Involves the testing of specific hypotheses

D.

Involves the use of arithmetic algebra to determine the distribution

Question # 39

Which of the following is a non-parametric test?

A.

One-sample t-test

B.

Two-way ANOVA

C.

Correlation coefficient

D.

Spearman's rank correlation

Question # 40

Given the diagram below:

Which of the following types of sampling is depicted in the image?

A.

Stratified

B.

Random

C.

Cluster

D.

Systematic

Question # 41

An organization would like to add a secondary email field to its customer database in order to enrich the customer profiles. Which of the following data manipulation techniques should the analyst use to add this information?

A.

Blend

B.

Merge

C.

Append

D.

Aggregate

Question # 42

Joe, an analyst, tests the loading time on a dashboard he is preparing to go live and finds it is slower than he would like. Which of the following must occur to decrease the loading time?

A.

Deploy the dashboard to production.

B.

Change the field definitions.

C.

Update the dashboard subscribers.

D.

Optimize the dashboard.

Question # 43

Which one of the following would not normally be considered a summary statistic?

A.

z-score.

B.

Mean.

C.

Variance.

D.

Standard deviation.

Question # 44

An analyst computed a new variable of income per day in the household by multiplying the number of days worked by the number of people working in the household and the income earned per day. Which of the following is the correct name for this new variable?

A.

Derived

B.

Categorical

C.

Continuous

D.

Control

Question # 45

You are working with a professional statistician to perform an analysis and would like to use a statistics package.

Which one of the following would be the most appropriate?

A.

Rapid Miner.

B.

QLIK.

C.

Power BI.

D.

Minitab.

Question # 46

Which of the following is a domain-specific language used in programming that is designed for managing data that is held in a relational data stream management system?

A.

SAS

B.

SQL

C.

Python

D.

R

Question # 47

Standardized tests are given to students in the middle of each month, and the results are ready by the end of the month. The superintendent needs a quick view of test performance. Which of the following would be the best recommendation to meet the superintendent's requirements?

A.

A dashboard with a continuous data stream and saved searches

B.

A report of test scores by classroom, emailed to the superintendent at the end of the month

C.

A report of test scores with pie charts showing student performance

D.

A dashboard with a scheduled delivery, the ability to filter scores by school, and bar charts for comparison

Question # 48

An analyst is working with a data set that lists individuals' first and last names in separate columns. Which of the following processes should the analyst use to combine the first and last names into a single spreadsheet cell?

A.

Transpose

B.

Blend

C.

Concatenate

D.

Merge

Question # 49

Which of the following data types best describe 4Ac1? (Select two).

A.

Alphanumeric

B.

Symbolic

C.

Numeric

D.

Float

E.

Boolean

F.

String

Question # 50

An analyst is working with the income data of suburban families in the United States. The data set has a lot of outliers, and the analyst needs to provide a measure that represents the typical income. Which of the following would BEST fulfill the analyst’s goal?

A.

Median

B.

Mean

C.

Mode

D.

Standard deviation
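
When a data set contains extreme outliers, the mean is pulled toward them while the median stays near the typical value. The incomes below are hypothetical and were chosen only to make that effect visible.

```python
import statistics

# Hypothetical household incomes with one extreme outlier.
incomes = [48_000, 52_000, 55_000, 61_000, 2_500_000]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)
```

Here the mean is 543,200, far above what four of the five households earn, while the median is 55,000.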

Question # 51

An analyst wants to test the association between the number of doors in a car and the number of gears in the car. Which of the following is the best test to use?

A.

F-test

B.

Acceptance test

C.

Chi-squared test

D.

Z-test
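
If door count and gear count are each treated as categories, their association can be examined with a chi-squared test on a contingency table. The sketch below computes only the Pearson chi-squared statistic. The counts are hypothetical, and turning the statistic into a p-value (using the chi-squared distribution and its degrees of freedom) is left out.

```python
def chi_squared_statistic(observed):
    """Pearson chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic

# Hypothetical counts: rows are 2-door / 4-door cars, columns are 5-gear / 6-gear cars.
observed = [[10, 20], [30, 40]]
chi2 = chi_squared_statistic(observed)
```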

Question # 52

Given the following data:

CustomerID   ItemBought   Date
Tre_234      Sofa         2022-09-08
216_Tre      Shoes        08/02/2021
215/Tre      Blanket      2021/06/20
045/Tre      Mug          12-26-2021
Tre-345      Lamp         31/08/2022
TREJD19      Bucket       2022'08/01

Which of the following best describes the main issue in the data set?

A.

Inconsistent data

B.

Data mismatch

C.

Invalid data

D.

Redundant data

Question # 53

Which of the following is an example of structured data?

A.

A credit card number

B.

An email

C.

A photo

D.

Social media correspondence

Question # 54

Given the following data tables:

Which of the following MDM processes needs to take place FIRST?

A.

Creation of a data dictionary

B.

Compliance with regulations

C.

Standardization of data field names

D.

Consolidation of multiple data fields

Question # 55

Which of the following is most likely to be used as a data-mining ETL tool?

A.

SSIS

B.

Stata

C.

SPSS

D.

Cognos

Question # 56

A data analyst is developing a data dictionary that aligns with a company's data management processes and policies. Which of the following best describes what should be included in the data dictionary?

A.

Information containing the links to business data

B.

Information explaining the business methodologies

C.

Information containing definitions of the business data

D.

Information describing the data analysis phases

Question # 57

Five dogs have the following heights in millimeters:

300, 430, 170, 470, 600

Which of the following is the mean height for the five dogs?

A.

394mm

B.

405mm

C.

493mm

D.

504mm
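
The mean is the sum of the values divided by how many there are. The arithmetic for the five heights given in the question can be checked as follows.

```python
import statistics

heights_mm = [300, 430, 170, 470, 600]

total = sum(heights_mm)                 # 1970
mean_height = total / len(heights_mm)   # 1970 / 5

# The standard library gives the same result.
library_mean = statistics.mean(heights_mm)
```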

Question # 58

A site reliability team wants to monitor the stability of their website so they can proactively diagnose issues when they occur. Which of the following deliverables would best suit their needs?

A.

A self-serve dashboard of website performance that updates in real time

B.

A weekly log report of site visits and user actions

C.

A portal that is refreshed daily and reports errors classified by type

D.

A daily summary email indicating website outages for the previous day

Question # 59

An analyst must obtain the average daily sales for the following week:

Which of the following must the analyst perform to obtain this value?

A.

Data normalization

B.

Data append

C.

Data aggregation

D.

Data blending

Question # 60

Which of the following report types is most appropriate for a high-level, year-end report requested by a Chief Executive Officer?

A.

Dynamic

B.

Recurring

C.

Ad hoc

D.

Self-service

Question # 61

Which of the following is concatenate typically used to combine?

A.

Rows

B.

Columns

C.

Tables

D.

Databases

Question # 62

An analyst runs a report on a daily basis, and the number of datapoints must be validated before the data can be analyzed. The number of datapoints increases each day by approximately 20% of the total number from the day before. On a given day, the number of datapoints was 8,798. Which of the following should be the total number of datapoints on the next day?

A.

7,038

B.

9,600

C.

10,600

D.

10,800
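
A 20% daily increase means the next day's count is about 1.2 times the current count. Because the question says "approximately", the sketch below computes the projection and then picks the nearest of the listed options.

```python
current_count = 8798
growth_rate = 0.20

# Next day's total: the current total plus 20% of it.
projected = current_count * (1 + growth_rate)

# The answer options from the question, as plain integers.
options = [7038, 9600, 10600, 10800]
closest = min(options, key=lambda option: abs(option - projected))
```

The projection is roughly 10,557.6, which is closest to 10,600 among the listed values.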

Question # 63

Jhon is working on an ELT process that sources data from six different source systems.

Looking at the source data, he finds that data about the same people exists in two of the six systems.

What does he have to make sure he checks for in his ELT process?

Choose the best answer.

A.

Duplicate Data.

B.

Redundant Data.

C.

Invalid Data.

D.

Missing Data.

Question # 64

Kelly wants to get feedback on the final draft of a strategic report that has taken her six months to develop.

What can she do to prevent confusion as she seeks feedback before publishing the report?

Choose the best answer.

A.

Distribute the report to the appropriate stakeholders via email.

B.

Use a watermark to identify the report as a draft.

C.

Show the report to her immediate supervisor.

D.

Publish the report on an internally facing website.

Question # 65

A data analyst is attempting to understand how ice cream consumption is affected by different attributes, such as cost, temperature, and income level. Which of the following regression analyses should the data analyst perform to understand this relationship?

A.

Logistic

B.

Ordinary least squares

C.

Cox

D.

Polynomial

Question # 66

An analyst needs to summarize the number of people in Chicago in 2022 using the following set of data:

Which of the following steps should the analyst use to provide results? (Select two).

A.

Aggregation

B.

Sorting

C.

Filtering

D.

Indexing

E.

Cleaning

F.

Replacing
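
Summarizing one city and one year from a larger data set usually takes two steps: narrow the rows to the ones of interest, then total them. The records below are hypothetical because the original data set is not reproduced here.

```python
# Hypothetical records standing in for the missing data set.
records = [
    {"city": "Chicago", "year": 2022, "people": 120},
    {"city": "Chicago", "year": 2021, "people": 95},
    {"city": "Boston", "year": 2022, "people": 80},
    {"city": "Chicago", "year": 2022, "people": 60},
]

# Step 1: keep only the rows for Chicago in 2022 (filtering).
chicago_2022 = [r for r in records if r["city"] == "Chicago" and r["year"] == 2022]

# Step 2: add up the remaining rows (aggregation).
total_people = sum(r["people"] for r in chicago_2022)
```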

Question # 67

An analyst is creating a resource to improve users' experience when they select specific records based on particular dates. Which of the following should the analyst use to create a resource that best meets user needs?

A.

Drop-down menu

B.

Date range

C.

Text field

D.

Frequency

Question # 68

The duration of a phone call in milliseconds is an example of:

A.

ordinal data.

B.

nominal data.

C.

boolean data.

D.

continuous data.

Question # 69

An analyst collected data that includes primary account numbers, expiration dates, and service codes. Which of the following data governance classifications is used to describe this data?

A.

PII

B.

PCI

C.

PBI

D.

PHI

Question # 70

During data cleansing, an analyst conducts measures of central tendency on a data set. Which of the following data is the analyst attempting to identify?

A.

Duplicate

B.

Missing

C.

Outlying

D.

Invalid

Question # 71

A data analyst needs to create a dashboard using the company's yearly revenue data sets. Which of the following would be the best way to plot the information to show the top-performing region?

A.

A line chart

B.

A waterfall chart

C.

A heat map

D.

A stacked bar chart

Question # 72

Which of the following is an example of a flat file?

A.

CSV file

B.

PDF file

C.

JSON file

D.

JPEG file

Question # 73

Which of the following tools would be best to use to calculate the interquartile range, median, mean, and standard deviation of a column in a table that has 5,000,000 rows?

A.

Microsoft Excel

B.

R

C.

Snowflake

D.

SQL

Question # 74

Which of the following data types must be used when working with variables that require classification into two or more groups before analysis?

A.

Discrete

B.

Numerical

C.

Alphanumeric

D.

Categorical

Question # 75

Which of the following is the best description of discrete data types?

A.

Non-numeric data used to describe attributes of a population sample

B.

The frequency of the number of times each value occurs by using whole numbers

C.

Numeric values that can be measured on a continuous scale

D.

Non-numeric data used to describe attributes of a population sample ranked in a specific order

Question # 76

A data analyst needs to create a master file that includes customer information from the tables below:

Given the three tables above, the analyst wants to filter down the information prior to joining it together. In which of the following orders should this data manipulation be approached for the most efficient result?

A.

Merge, append, deduplicate

B.

Merge, deduplicate, append

C.

Deduplicate, append, merge

D.

Append, deduplicate, merge

Question # 77

A report is scheduled to run and be distributed at the end of business each day. On Mondays, one of the recipients opens the previous week's reports and combines them to calculate the weekly totals and projections for the coming week. This is a tedious process, and the recipient asks an analyst for help. Which of the following should the analyst recommend?

A.

Add calculation fields to the daily report so the totals are built in.

B.

Create a new report with weekly totals set to run at the end of business on Friday.

C.

Provide a daily summary to the report with totals to save the user the effort of manual calculations.

D.

Reduce the frequency of the report to once a week and change the date range.

Question # 78

A research analyst wants to determine whether the data being analyzed is connected to other datapoints. Which of the following is the BEST type of analysis to conduct?

A.

Trend analysis

B.

Performance analysis

C.

Link analysis

D.

Exploratory analysis

Question # 79

Which of the following analysis techniques is an unsupervised data mining process?

A.

Clustering

B.

Descriptive

C.

Regression

D.

Predictive

Question # 80

Which of the following actions should be taken when transmitting data to mitigate the chance of a data leak occurring? (Choose two.)

A.

Data identification

B.

Data processing

C.

Data Reporting

D.

Data encryption

E.

Data masking

F.

Data removal

Question # 81

Which of the following is an object associated with a table that sorts and stores table row data in a key-value pair?

A.

Foreign key

B.

Function

C.

Stored procedure

D.

Clustered index

Question # 82

What analytics suite is offered by Microsoft and directly integrates with SQL Server Databases?

A.

Qlik.

B.

Power BI.

C.

Domo.

D.

Dataroma.

Question # 83

An organization wants to evaluate whether project activities are within the set projections and in line to meet the desired project targets. Which of the following types of analysis is best suited for this situation?

A.

Trend analysis

B.

Performance analysis

C.

Descriptive analysis

D.

Exploratory analysis

Question # 84

You have two database tables that you would like to join together using a foreign key relationship.

What term best describes this action?

A.

Blending.

B.

Appending.

C.

Mixing.

D.

Merging.

Question # 85

Emma is working in a data warehouse and finds that a finance fact table links to an organization dimension, which in turn links to a currency dimension that is not linked to the fact table.

What type of design pattern is the data warehouse using?

A.

Star.

B.

Sun.

C.

Snowflake.

D.

Comet.

Question # 86

Joseph is interpreting a left-skewed distribution of test scores. Joe scored at the mean, Alfonso scored at the median, and Gaby scored at the end of the tail.

Who had the highest score?

A.

Joseph

B.

Joe

C.

Alfonso

D.

Gaby
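
In a left-skewed distribution the long tail sits on the low side, so the mean is pulled below the median. The scores below are hypothetical and only illustrate that ordering.

```python
import statistics

# Hypothetical test scores with a long tail toward low values (left skew).
scores = [35, 70, 80, 85, 88, 90, 92, 95]

tail_score = min(scores)                    # the end of the tail
mean_score = statistics.mean(scores)        # dragged down by the low tail
median_score = statistics.median(scores)    # middle of the ordered scores
```

For these values the tail is at 35, the mean is 79.375, and the median is 86.5.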

Question # 87

A financial analyst is creating a daily billing report for a company. One night, the company's data warehouse did not update the data, which caused the data to be reported incorrectly the next day. Which of the following documentation elements should the analyst add to catch this error?

A.

Version number

B.

Data refresh

C.

Frequently asked questions tab

D.

Summary

Question # 88

A database administrator is required to mask certain table columns containing PII in order to comply with the company privacy policy. Which of the following are the most likely types of information the administrator should mask? (Select two).

A.

Government-issued ID

B.

Address

C.

Order ID

D.

Order date

E.

Customer ID

F.

Referral number

Question # 89

A database administrator needs to increase performance on a large dimension table. Which of the following is the best way to accomplish this task?

A.

Sampling

B.

Partitioning

C.

Windowing

D.

Sorting

Question # 90

Given the following report:

Which of the following components need to be added to ensure the report is point-in-time and static? (Select two).

A.

A control group for the phrases

B.

A summary of the KPIs

C.

Filter buttons for the status

D.

The date when the report was last accessed

E.

The time period the report covers

F.

The date on which the report was run

Question # 91

A data analyst needs to calculate the mean for Q1 sales using the data set below:

Which of the following is the mean?

A.

$2,466.18

B.

$2,667.60

C.

$3,082.72

D.

$12,330.88

Question # 92

A data analyst wants to create "Income Categories" that would be calculated based on the existing variable "Income". The "Income Categories" would be as follows:

Income category 1: less than $1.

Income category 2: more than $1 and less than $20,000.

Income category 3: more than $20,001 and less than $40,000.

Income category 4: more than $40,001.

Which of the following data manipulation techniques should the data analyst use to create "Income Categories"?

A.

Data merge

B.

Derived variables

C.

Data blending

D.

Data append
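
A variable computed from an existing one, such as a category calculated from "Income", is a derived variable. The sketch below shows one way to compute it. The `income_category` helper is hypothetical, and because the category definitions in the question leave the exact boundary values (for example, exactly $1 or exactly $20,000) unassigned, this sketch assumes each boundary belongs to the lower-numbered band that ends there.

```python
def income_category(income):
    """Map an income value to category 1-4 (boundary handling is an assumption)."""
    if income < 1:
        return 1
    if income <= 20_000:
        return 2
    if income <= 40_000:
        return 3
    return 4

# Hypothetical incomes, one per category.
incomes = [0, 15_000, 30_000, 75_000]
categories = [income_category(value) for value in incomes]
```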

Question # 93

Consider the following dataset which contains information about houses that are for sale:

Which of the following string manipulation commands will combine the address and region name columns to create a full address?

full_address
-------------------------
85 Turner St, Northern Metropolitan
25 Bloomburg St, Northern Metropolitan
5 Charles St, Northern Metropolitan
40 Federation La, Northern Metropolitan
55a Park St, Northern Metropolitan

A.

SELECT CONCAT(address, ' , ' , regionname) AS full_address FROM melb LIMIT 5;

B.

SELECT CONCAT(address, '-' , regionname) AS full_address FROM melb LIMIT 5;

C.

SELECT CONCAT(regionname, ' , ' , address) AS full_address FROM melb LIMIT 5;

D.

SELECT CONCAT(regionname, '-' , address) AS full_address FROM melb LIMIT 5;
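
Producing the `full_address` output shown in the question means joining the address first, then a comma and a space, then the region name. The sketch below reproduces that with Python's sqlite3 module. It uses SQLite's `||` concatenation operator as a stand-in for `CONCAT`, since not every SQLite version provides a `CONCAT` function. The two rows are taken from the expected output, and the table layout is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE melb (address TEXT, regionname TEXT)")
conn.executemany(
    "INSERT INTO melb VALUES (?, ?)",
    [
        ("85 Turner St", "Northern Metropolitan"),
        ("25 Bloomburg St", "Northern Metropolitan"),
    ],
)

# Address first, then ", ", then region name; || stands in for CONCAT here.
rows = conn.execute(
    "SELECT address || ', ' || regionname AS full_address "
    "FROM melb ORDER BY rowid LIMIT 5"
).fetchall()
full_addresses = [row[0] for row in rows]
```

Swapping the argument order or using a different separator changes the resulting string, which is what distinguishes the four options.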

Question # 94

Given the below:

Which of the following numbers represents a Type I error?

A.

1

B.

2

C.

3

D.

4

Question # 95

A data analyst needs to perform a full outer join of a customer's orders using the tables below:

Which of the following is the mean of the order quantity?

A.

73.5

B.

76.5

C.

78.8

D.

81.5

Question # 96

A data analyst is helping a retail store categorize its customers into five different groups based on the following information:

• How recently the customers made purchases

• How frequently the customers made purchases

• How much the customers spent

Given the following information:

Which of the following would be most important for the analysis?

A.

Customer_ID, Channel, Order_Date

B.

Customer_ID, Territory, Amount

C.

Customer_ID, Order_Date, Amount

D.

Customer_ID, Quantity, Amount

Question # 97

An analyst needs to provide a chart to identify the composition between the categories of the survey response data set:

Which of the following charts would be BEST to use?

A.

Histogram

B.

Pie

C.

Line

D.

Scatter plot

E.

Waterfall

Question # 98

A business intelligence engineer needs to reduce the size of a data model for reporting purposes. The data set contains more than one million rows, and the table has a date-time column named Date. Which of the following should the analyst do to complete this task?

A.

Change the data type of the Date column to text.

B.

Trim the date.

C.

Round the hour of the Date column to the start of the hour.

D.

Split the Date column into two columns—time and date.

Question # 99

Which of the following is a relational database?

A.

SQL

B.

Excel

C.

JSON

D.

NoSQL

Question # 100

A sales team wants visibility of current sales numbers, pipeline, and team performance. The team would also like to see calculations of individuals’ earned commissions and projected commissions based on sales, but they want that information to be kept confidential. Which of the following would be the BEST way to provide this visibility?

A.

Create a dashboard displaying a data refresh date so users know the current sales numbers and configure permissions to control access.

B.

Create a dashboard for sales numbers, pipeline, and team and individual performance for the management team.

C.

Create a dashboard with filters for the overall team, individuals, and management. Users can filter to see the data they want.

D.

Create a dashboard with views for team, individuals, and management. Configure permissions to control access.

Question # 101

Which of the following data types should an analyst use to provide the most flexibility when recording emails on a form?

A.

Alphanumeric

B.

Text

C.

Discrete

D.

Continuous

Question # 102

A data analyst has removed the outliers from a data set due to large variances. Which of the following central tendencies would be the best measure to use?

A.

Range

B.

Mean

C.

Mode

D.

Median

Question # 103

Which of the following best describes a 95% confidence interval?

A.

There is a 95% probability that a sample is within one standard deviation of the mean.

B.

A stated range may contain 95% of the population mean, 95% of the time.

C.

A set of ranges contains the population mean with 95% certainty.

D.

A range contains 95% of the population mean.

Question # 104

A publishing group has requested a dashboard to track submissions before publication. A key requirement is that all changes are tracked, as multiple users will be checking out documents and editing them before submissions are considered final. Which of the following is the BEST way to meet this stakeholder requirement?

A.

Display the version number next to each submission on the dashboard.

B.

Present a data refresh date at the top of the dashboard.

C.

Confirm the dashboard is adhering to the corporate style guide.

D.

Use permissions to ensure users only see certain versions of the submissions.

Question # 105

A data analyst is asked to create a sales report for the second-quarter 2020 board meeting, which will include a review of the business’s performance through the second quarter. The board meeting will be held on July 15, 2020, after the numbers are finalized. Which of the following report types should the data analyst create?

A.

Static

B.

Real-time

C.

Self-service

D.

Dynamic

Question # 106

While reviewing survey data, a research analyst notices data is missing from all the responses to a single question. Which of the following methods would BEST address this issue?

A.

Replace missing data.

B.

Remove duplicate data.

C.

Replace redundant data.

D.

Remove invalid data.

Question # 107

A business unit made the following modification to the values in a table:

Which of the following data quality dimensions was applied in this scenario?

A.

Integrity

B.

Consistency

C.

Completeness

D.

Accuracy

Question # 108

Amanda needs to create a dashboard that will draw information from many other data sources and present it to business leaders.

Which one of the following tools is least likely to meet her needs?

A.

QuickSight.

B.

Tableau.

C.

Power BI.

D.

SPSS Modeler.
