Delta Lake stores table data as a series of data files, but it also stores additional table metadata.
Which of the following is stored alongside data files when using Delta Lake?
Correct Answer:
C
Delta Lake stores table data as a series of data files in a specified location, but it also stores table metadata in a transaction log. This metadata includes the schema, partitioning information, table properties, and other configuration details; it is stored alongside the data files and is updated atomically with every write operation. The metadata can be inspected with the DESCRIBE DETAIL command or through the DeltaTable class in Scala, Python, or Java, and it can be enriched with custom tags or user-defined commit messages using the TBLPROPERTIES or userMetadata options. References:
✑ Enrich Delta Lake tables with custom metadata
✑ Delta Lake Table metadata - Stack Overflow
✑ Metadata - The Internals of Delta Lake
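The metadata commands mentioned above can be sketched in Databricks SQL; the table name `sales` and the property and message values are made-up placeholders:

```sql
-- Inspect table metadata: format, location, schema, partitioning, properties
DESCRIBE DETAIL sales;

-- Attach a custom tag to the table metadata
ALTER TABLE sales SET TBLPROPERTIES ('department' = 'finance');

-- Attach a user-defined commit message to subsequent writes in this session
SET spark.databricks.delta.commitInfo.userMetadata = 'backfill-2024-01';
```

The userMetadata value then appears in the commit information recorded in the transaction log for those writes.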
A data analyst creates a Databricks SQL Query where the result set has the following schema:
region STRING
number_of_customer INT
When the analyst clicks on the "Add visualization" button on the SQL Editor page, which of the following types of visualizations will be selected by default?
Correct Answer:
C
According to the Databricks SQL documentation, when a data analyst clicks the "Add visualization" button on the SQL Editor page, the default visualization type is Bar Chart. Because the result set has two columns, one of type STRING and one of type INT, the Bar Chart visualization automatically assigns the STRING column to the X-axis and the INT column to the Y-axis. Bar charts are suitable for showing the distribution of a numeric variable across different categories. References: Visualization in Databricks SQL, Visualization types
A data analyst created and is the owner of the managed table my_table. They now want to change ownership of the table to a single other user using Data Explorer.
Which of the following approaches can the analyst use to complete the task?
Correct Answer:
C
The Owner field on the table page shows the current owner of the table and allows the owner to change it to another user or group. To transfer ownership, the owner can click the Owner field and select the new owner from the drop-down list. This transfers ownership of the table to the selected user or group and removes the previous owner from the list of table access control entries [1]. The other options are incorrect because:
✑ A. Removing the owner's account from the Owner field will not change the ownership of the table, but will make the table ownerless [2].
✑ B. Selecting All Users from the Owner field will not change the ownership of the table, but will grant all users access to the table [3].
✑ D. Selecting the Admins group from the Owner field will not change the ownership of the table, but will grant the Admins group access to the table [3].
✑ E. Removing all access from the Owner field will not change the ownership of the table, but will revoke all access to the table [4]. References:
✑ [1] Change table ownership
✑ [2] Ownerless tables
✑ [3] Table access control
✑ [4] Revoke access to a table
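As an alternative to the Data Explorer UI, ownership of a Unity Catalog table can also be transferred in SQL; the table name and principal below are placeholders:

```sql
ALTER TABLE my_table OWNER TO `new.owner@example.com`;
```

Both routes require the current owner (or a metastore admin) to run the change.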
A data team has been given a series of projects by a consultant that need to be implemented in the Databricks Lakehouse Platform.
Which of the following projects should be completed in Databricks SQL?
Correct Answer:
C
Databricks SQL is a service that allows users to query data in the lakehouse using SQL and create visualizations and dashboards1. One of the common use cases for Databricks SQL is combining data from different sources and formats into a single, comprehensive dataset that can be used for further analysis or reporting2. For example, a data analyst can use Databricks SQL to join data from a CSV file and a Parquet file, or from a Delta table and a JDBC table, and create a new table or view that contains the combined data3. This simplifies data management and governance and improves data quality and consistency. References:
✑ Databricks SQL overview
✑ Databricks SQL use cases
✑ Joining data sources
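The combine-and-expose-as-a-view pattern described above can be sketched in plain Python with SQLite standing in for Databricks SQL; the table names (pretending one source came from CSV and one from Parquet) and the sample rows are invented for illustration:

```python
import sqlite3

# In-memory database standing in for the lakehouse
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Pretend these two tables were loaded from a CSV file and a Parquet file
cur.execute("CREATE TABLE csv_orders (order_id INTEGER, region TEXT)")
cur.execute("CREATE TABLE parquet_amounts (order_id INTEGER, amount REAL)")
cur.executemany("INSERT INTO csv_orders VALUES (?, ?)",
                [(1, "east"), (2, "west")])
cur.executemany("INSERT INTO parquet_amounts VALUES (?, ?)",
                [(1, 100.0), (2, 250.0)])

# Join the sources into one combined view, as one would with
# CREATE VIEW in Databricks SQL
cur.execute("""
    CREATE VIEW combined AS
    SELECT o.order_id, o.region, a.amount
    FROM csv_orders o
    JOIN parquet_amounts a ON o.order_id = a.order_id
""")

rows = cur.execute(
    "SELECT region, amount FROM combined ORDER BY order_id").fetchall()
print(rows)  # [('east', 100.0), ('west', 250.0)]
```

Downstream queries then read only from the combined view, which is the governance benefit the explanation refers to.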
A data analyst has been asked to configure an alert for a query that returns the income in the accounts_receivable table for a date range. The date range is configurable using a Date query parameter.
The Alert does not work.
Which of the following describes why the Alert does not work?
Correct Answer:
D
According to the Databricks documentation [1], Alerts do not support user input or dynamic values. When an alert runs a parameterized query, it always substitutes the default value specified in the SQL editor for each parameter. Therefore, if the query uses a Date query parameter, the alert always evaluates the same date range as the default value, regardless of the date range the analyst intends. This can cause the alert to behave unexpectedly or to never trigger. References:
✑ [1] Databricks SQL alerts: the official documentation for Databricks SQL alerts, covering how to create, configure, and monitor alerts, as well as their limitations and best practices.
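For illustration, a parameterized query like the one described might look as follows; `{{ start_date }}` and `{{ end_date }}` use Databricks SQL's query-parameter syntax, and the column name is a made-up placeholder. An alert attached to this query would always run it with the parameters' default values:

```sql
SELECT sum(income) AS total_income
FROM accounts_receivable
WHERE invoice_date BETWEEN '{{ start_date }}' AND '{{ end_date }}';
```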