
QUESTION 6

A data analyst has recently joined a new team that uses Databricks SQL, but the analyst has never used Databricks before. The analyst wants to know where in Databricks SQL they can write and execute SQL queries.
On which of the following pages can the analyst write and execute SQL queries?

Correct Answer: E
The SQL Editor page is where the analyst can write and execute SQL queries in Databricks SQL. It has a query pane where the analyst can type or paste SQL statements, and a results pane where the query results are displayed as a table or a chart. From the SQL Editor page the analyst can also browse data objects, work on multiple queries in tabs, run one or more queries, terminate a running query, save a query, download query results, and more. References: Create a query in SQL editor
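As a simple illustration of what an analyst might type into the SQL Editor's query pane (the catalog, schema, table, and column names here are hypothetical placeholders, not part of the question):

```sql
-- Hypothetical first query in the SQL Editor: results appear in the
-- results pane below as a table, and can be rendered as a chart.
SELECT order_id,
       order_date,
       total_amount
FROM samples.retail.orders
WHERE order_date >= '2023-01-01'
ORDER BY order_date DESC
LIMIT 100;
```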

QUESTION 7

A business analyst has been asked to create a data entity/object called sales_by_employee. It should always stay up-to-date when new data are added to the sales table. The new entity should have the columns sales_person, which will be the name of the employee from the employees table, and sales, which will be all sales for that particular sales person. Both the sales table and the employees table have an employee_id column that is used to identify the sales person.
Which of the following code blocks will accomplish this task?
A)
[exhibit: code block not shown]
B)
[exhibit: code block not shown]
C)
[exhibit: code block not shown]
D)
[exhibit: code block not shown]

Correct Answer: D
The SQL code provided in Option D is the correct way to create a view named sales_by_employee that will always stay up-to-date with the sales and employees tables. The code uses the CREATE OR REPLACE VIEW statement to define a new view that joins the sales and employees tables on the employee_id column. It selects the employee_name as sales_person and all sales for each employee, ensuring that the data entity/object is always up-to-date when new data are added to these tables.
References: The answer can be verified from Databricks SQL documentation which provides insights on creating views using SQL queries, joining tables, and selecting specific columns to be included in the view. Reference link: Databricks SQL
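The exhibit for Option D is not shown, but based on the explanation above it can be sketched roughly as follows. The column names employee_name and sales and the use of SUM() as the aggregation are assumptions drawn from the question text, not from the actual exhibit:

```sql
-- Sketch of Option D as described: a view joining sales to employees
-- on employee_id, which always reflects the current table contents.
-- Column names and the SUM() aggregation are assumed, not confirmed.
CREATE OR REPLACE VIEW sales_by_employee AS
SELECT e.employee_name AS sales_person,
       SUM(s.sales)    AS sales
FROM sales s
JOIN employees e
  ON s.employee_id = e.employee_id
GROUP BY e.employee_name;
```

Because a view is a stored query rather than a materialized copy of the data, it stays up-to-date automatically as new rows are added to the underlying tables.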

QUESTION 8

A data analyst has been asked to use the table sales_table below to get the percentage rank of products within region by sales:
[exhibit: sales_table not shown]
The result of the query should look like this:
[exhibit: expected result not shown]
Which of the following queries will accomplish this task?
A)
[exhibit: query not shown]
B)
[exhibit: query not shown]
C)
[exhibit: query not shown]
D)
[exhibit: query not shown]

Correct Answer: B
The correct query to get the percentage rank of products within region by the sales is option B. This query uses the PERCENT_RANK() window function to calculate the relative rank of each product within each region based on the sales amount. The window function is partitioned by region and ordered by sales in descending order. The result is aliased as rank and displayed along with the region and product columns. The other options are incorrect because:
✑ A. Option A uses the RANK() window function instead of PERCENT_RANK(). RANK() returns the ordinal rank of each row within the partition (with gaps after ties), not the relative percentage rank.
✑ C. Option C uses the DENSE_RANK() window function instead of PERCENT_RANK(). DENSE_RANK() returns the rank of each row within the partition without gaps after ties, but not the percentage rank.
✑ D. Option D uses the ROW_NUMBER() window function instead of PERCENT_RANK(). ROW_NUMBER() returns the sequential number of each row within the partition, but not the percentage rank. References:
✑ 1: PERCENT_RANK (Transact-SQL)
✑ 2: Window functions in Databricks SQL
✑ 3: Databricks Certified Data Analyst Associate Exam Guide
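The exhibit for Option B is not shown, so the following is only a sketch of the query the explanation describes; the column names region, product, and sales are assumptions taken from the question text:

```sql
-- Sketch of Option B as described: relative (percentage) rank of each
-- product within its region, ordered by sales in descending order.
-- Column names are assumed, not confirmed by the exhibit.
SELECT region,
       product,
       PERCENT_RANK() OVER (
           PARTITION BY region
           ORDER BY sales DESC
       ) AS rank
FROM sales_table;
```

PERCENT_RANK() returns values between 0 and 1, computed as (rank - 1) / (rows in partition - 1), which is what distinguishes it from RANK(), DENSE_RANK(), and ROW_NUMBER().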

QUESTION 9

A data analyst has been asked to provide a list of options on how to share a dashboard with a client. It is a security requirement that the client does not gain access to any other information, resources, or artifacts in the database.
Which of the following approaches cannot be used to share the dashboard and meet the security requirement?

Correct Answer: D
The approach that cannot be used to share the dashboard and meet the security requirement is D: generating a Personal Access Token that is good for 1 day and sharing it with the client. This approach would give the client access to the Databricks workspace using the token owner's identity and permissions, which could expose other information, resources, or artifacts in the database [1]. The other approaches can be used to share the dashboard and meet the security requirement because:
✑ A. Downloading the dashboard as a PDF and sharing it with the client would only provide a static snapshot of the dashboard, without any interactive features or access to the underlying data [2].
✑ B. Setting a refresh schedule for the dashboard and entering the client's email address in the "Subscribers" box would send the client an email with the latest dashboard results as an attachment or a link to a secure web page [3]. The client would not be able to access the Databricks workspace or the dashboard itself.
✑ C. Taking a screenshot of the dashboard and sharing it with the client would likewise only provide a static snapshot of the dashboard, without any interactive features or access to the underlying data [4].
✑ E. Downloading a PNG file of the visualizations in the dashboard and sharing them with the client would also only provide a static snapshot of the visualizations, without any interactive features or access to the underlying data [5]. References:
✑ 1: Personal access tokens
✑ 2: Download as PDF
✑ 3: Automatically refresh a dashboard
✑ 4: Take a screenshot
✑ 5: Download a PNG file

QUESTION 10

Which of the following is an advantage of using a Delta Lake-based data lakehouse over common data lake solutions?

Correct Answer: A
A Delta Lake-based data lakehouse is a data platform architecture that combines the scalability and flexibility of a data lake with the reliability and performance of a data warehouse. One of the key advantages of using a Delta Lake-based data lakehouse over common data lake solutions is that it supports ACID transactions, which ensure data integrity and consistency. ACID transactions enable concurrent reads and writes, schema enforcement and evolution, data versioning and rollback, and data quality checks. These features are not available in traditional data lakes, which rely on file-based storage systems that do not support transactions. References:
✑ Delta Lake: Lakehouse, warehouse, advantages | Definition
✑ Synapse – Data Lake vs. Delta Lake vs. Data Lakehouse
✑ Data Lake vs. Delta Lake - A Detailed Comparison
✑ Building a Data Lakehouse with Delta Lake Architecture: A Comprehensive Guide
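The ACID and versioning features described above can be illustrated with a short Databricks SQL sketch; the table name and data are hypothetical:

```sql
-- Hypothetical example: Delta is the table format that enables
-- ACID transactions in a Databricks lakehouse.
CREATE TABLE events (event_id BIGINT, event_type STRING) USING DELTA;

-- Writes are ACID transactions: concurrent readers always see a
-- consistent snapshot, and schema enforcement rejects bad writes.
INSERT INTO events VALUES (1, 'click'), (2, 'view');
UPDATE events SET event_type = 'page_view' WHERE event_type = 'view';

-- Data versioning and rollback ("time travel"): query an earlier
-- version of the table by its commit version number.
SELECT * FROM events VERSION AS OF 0;
```

Transactional UPDATE/DELETE and time travel like this are not available on plain file-based data lakes, which is the advantage the explanation describes.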