Which of the following approaches can be used to ingest data directly from cloud-based object storage?
Correct Answer:
E
External tables are tables registered in the Databricks metastore whose data lives in a cloud object storage location. Databricks manages only the metadata (the schema and the table name used to query the data), not the underlying files, so dropping an external table does not delete the data. To create an external table, you can use the CREATE EXTERNAL TABLE statement and specify the object storage path in the LOCATION clause. For example, to create an external table named ext_table on a Parquet file stored in S3, you can use the following statement:
SQL
CREATE EXTERNAL TABLE ext_table (
  col1 INT,
  col2 STRING
)
STORED AS PARQUET
LOCATION 's3://bucket/path/file.parquet';
References: External tables
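As a hedged complement to the statement above, the sketch below shows two other common ways to reach the same files from Databricks SQL: a CREATE TABLE with USING and LOCATION, and a direct query against the path. The table name ext_table_alt and the directory path s3://bucket/path/ are hypothetical, and the sketch assumes the workspace already has read access to that location.

```sql
-- Assumption: 's3://bucket/path/' is a hypothetical directory of Parquet files
-- that the workspace is allowed to read.

-- Register the files as an external table (metadata only; no data is copied).
CREATE TABLE ext_table_alt (
  col1 INT,
  col2 STRING
)
USING PARQUET
LOCATION 's3://bucket/path/';

-- Query the registered table like any other table.
SELECT * FROM ext_table_alt LIMIT 10;

-- Alternatively, query the files directly without registering a table.
SELECT * FROM parquet.`s3://bucket/path/` LIMIT 10;
```

In both cases the data stays in object storage; only the table definition, if any, is stored in the metastore.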
A data analyst has created a Query in Databricks SQL, and now they want to create two data visualizations from that Query and add both of those data visualizations to the same Databricks SQL Dashboard.
Which of the following steps will they need to take when creating and adding both data visualizations to the Databricks SQL Dashboard?
Correct Answer:
B
A data analyst can create multiple visualizations from the same query in Databricks SQL by clicking the + button next to the Results tab and selecting Visualization. Each visualization can have a different type, name, and configuration. To add a visualization to a dashboard, the data analyst can click the vertical ellipsis button beneath the visualization, select + Add to Dashboard, and choose an existing or new dashboard. The data analyst can repeat this process for each visualization they want to add to the same dashboard.
References: Visualization in Databricks SQL, Visualize queries and create a dashboard in Databricks SQL
A data analyst is processing a complex aggregation on a table with zero null values and their query returns the following result:
Which of the following queries did the analyst run to obtain the above result?
A)
B)
C)
D)
E)
Correct Answer:
B
The result set provided shows a combination of grouping by two columns (group_1 and group_2) with subtotals for each level of grouping and a grand total. This pattern is typical of a GROUP BY ... WITH ROLLUP operation in SQL, which provides subtotal rows and a grand total row in the result set.
Considering the query options:
A) GROUP BY group_1, group_2 INCLUDING NULL - This is not a standard SQL clause and would not result in subtotals and a grand total.
B) GROUP BY group_1, group_2 WITH ROLLUP - This would return one row for each combination of group_1 and group_2, a subtotal row for each unique group_1, and a grand total, which matches the result set provided.
C) GROUP BY group_1, group_2 - This is a simple GROUP BY and would not include subtotals or a grand total.
D) GROUP BY group_1, group_2, (group_1, group_2) - This syntax is not standard and would likely result in an error or be interpreted as a simple GROUP BY, not providing the subtotals and grand total.
E) GROUP BY group_1, group_2 WITH CUBE - The WITH CUBE operation produces subtotals for all combinations of the selected columns, including a subtotal for each group_2 value on its own, plus a grand total, which is more than what is shown in the result set.
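The correct answer is Option B, which uses WITH ROLLUP to generate the subtotals for each level of grouping as well as a grand total. This matches the result set, which has a row for each combination of group_1 and group_2, a subtotal for each group_1, and a grand total where both group_1 and group_2 are NULL. Because the table itself contains zero null values, the NULLs in the result can only come from these subtotal and grand total rows.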
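The ROLLUP behavior described above can be tried on a small inline data set. This is a minimal sketch in Databricks SQL: the column names group_1 and group_2 follow the question, while the value column and the three rows are made up for illustration and are not the data from the original result.

```sql
-- Hypothetical inline data; only the column names group_1 and group_2
-- come from the question.
SELECT group_1, group_2, SUM(value) AS total
FROM VALUES
  ('a', 'x', 1),
  ('a', 'y', 2),
  ('b', 'x', 3)
  AS t(group_1, group_2, value)
GROUP BY group_1, group_2 WITH ROLLUP
ORDER BY group_1 NULLS LAST, group_2 NULLS LAST;

-- Expected rows:
--   a     x     1
--   a     y     2
--   a     NULL  3   (subtotal for group_1 = 'a')
--   b     x     3
--   b     NULL  3   (subtotal for group_1 = 'b')
--   NULL  NULL  6   (grand total)
```

Replacing WITH ROLLUP with WITH CUBE on the same data would add two more rows, (NULL, x, 4) and (NULL, y, 2), which are the per-group_2 subtotals that the result in the question does not contain.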
Which of the following statements describes descriptive statistics?
Correct Answer:
A
Descriptive statistics is a branch of statistics that uses summary statistics, such as mean, median, mode, standard deviation, range, frequency, or correlation, to quantitatively describe and summarize data. Descriptive statistics can help data analysts understand the main features of a data set, such as its central tendency, variability, or distribution. Descriptive statistics can also help data analysts visualize data using charts, graphs, or tables. Descriptive statistics do not make any inferences or predictions about the data, unlike inferential statistics, which use data analysis techniques to infer properties of an underlying population or probability distribution from a sample of data.
References: Databricks - Descriptive Statistics, Databricks - Data Analysis with Databricks SQL
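As an illustration of the summary statistics mentioned above, the following sketch computes several of them with built-in aggregate functions in Databricks SQL. The five values and the column name x are hypothetical and chosen only so the results are easy to check by hand.

```sql
-- Hypothetical sample: 2, 4, 4, 5, 10
SELECT
  COUNT(x)           AS n,          -- number of observations
  AVG(x)             AS mean,       -- central tendency
  PERCENTILE(x, 0.5) AS median,     -- middle value
  STDDEV_SAMP(x)     AS std_dev,    -- sample standard deviation
  MIN(x)             AS min_value,
  MAX(x)             AS max_value   -- range = max_value - min_value
FROM VALUES (2), (4), (4), (5), (10) AS t(x);

-- Expected: n = 5, mean = 5.0, median = 4.0, std_dev = 3.0,
--           min_value = 2, max_value = 10
```

These figures only summarize the five values given; drawing conclusions about a wider population from them would be a task for inferential statistics.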
Which of the following statements about adding visual appeal to visualizations in the Visualization Editor is incorrect?
Correct Answer:
D
The Visualization Editor in Databricks SQL allows users to create and customize various types of charts and visualizations from the query results. Users can change the visualization type, select the data fields, adjust the colors, format the data labels, and modify the tooltips. However, there is no option to add borders to the visualizations in the Visualization Editor. Borders are not a supported feature of the new chart visualizations in Databricks. Therefore, the statement that borders can be added is incorrect.
References: New chart visualizations in Databricks | Databricks on AWS