Latest Databricks-Certified-Data-Engineer-Associate free dumps - Databricks Certified Data Engineer Associate
A data engineer needs to apply custom logic to identify employees with more than 5 years of experience in the array column employees of the table stores. The custom logic should create a new column exp_employees that holds, for each row, an array of all employees with more than 5 years of experience. In order to apply this custom logic at scale, the data engineer wants to use the FILTER higher-order function.
Which of the following code blocks successfully completes this task?
Answer: E
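For reference, here is a minimal sketch of how FILTER could be applied to this task. The answer options are not reproduced on this page, so this is not necessarily the wording of any option. The struct field `years_exp` is an assumption (the question does not say how experience is stored), and `filter_experienced` and `add_exp_employees` are hypothetical helper names. The plain-Python function only mirrors what FILTER does to one row's array.

```python
# Assumption: each element of `employees` is a struct with a numeric
# field holding years of experience. The field name is hypothetical.
EXPERIENCE_FIELD = "years_exp"

# Spark SQL form: FILTER keeps the array elements for which the lambda is true.
FILTER_QUERY = f"""
SELECT
  *,
  FILTER(employees, e -> e.{EXPERIENCE_FIELD} > 5) AS exp_employees
FROM stores
"""


def filter_experienced(employees, min_years=5):
    """Plain-Python mirror of FILTER applied to one row's array."""
    return [e for e in employees if e[EXPERIENCE_FIELD] > min_years]


def add_exp_employees(spark):
    """Run the query on a SparkSession supplied by the caller."""
    return spark.sql(FILTER_QUERY)
```

On Databricks, `add_exp_employees(spark)` would return the rows of stores with the extra exp_employees column, provided the field name matches the real schema.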
Which of the following describes a scenario in which a data engineer will want to use a single-node cluster?
Answer: E
A data engineer and a data analyst are working together on a data pipeline. The data engineer is working on the raw, bronze, and silver layers of the pipeline using Python, and the data analyst is working on the gold layer of the pipeline using SQL. The raw source of the pipeline is a streaming input. They now want to migrate their pipeline to use Delta Live Tables.
Which change will need to be made to the pipeline when migrating to Delta Live Tables?
Answer: B
In order for Structured Streaming to reliably track the exact progress of the processing so that it can handle any kind of failure by restarting and/or reprocessing, which of the following two approaches is used by Spark to record the offset range of the data being processed in each trigger?
Answer: E
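The Structured Streaming documentation describes checkpointing and write-ahead logs as the mechanism that records the offset range for each trigger. As a rough sketch, the checkpoint location is set on the stream writer. `checkpointed_writer` and its parameter names are hypothetical, and `stream_df` is assumed to be a streaming DataFrame created elsewhere.

```python
def checkpointed_writer(stream_df, checkpoint_path):
    """Return a stream writer that persists progress under checkpoint_path.

    Spark records the offsets of each micro-batch there, so a restarted
    query can resume from the last recorded position.
    """
    return stream_df.writeStream.option("checkpointLocation", checkpoint_path)
```

Each streaming query should get its own checkpoint path; sharing one path between queries mixes up their recorded progress.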
A data engineer wants to create a relational object by pulling data from two tables. The relational object does not need to be used by other data engineers in other sessions. In order to save on storage costs, the data engineer wants to avoid copying and storing physical data.
Which of the following relational objects should the data engineer create?
Answer: D
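A temporary view is one object that fits this description: it is scoped to the current session and stores only the query, not a copy of the data. The sketch below assumes two tables joined on a single key; the function name and all table, key, and view names are hypothetical placeholders.

```python
def register_joined_temp_view(spark, left_table, right_table, key, view_name):
    """Join two tables and expose the result as a session-scoped temp view.

    No data is copied: the view only records the query plan, and it
    disappears when the Spark session ends.
    """
    joined = spark.table(left_table).join(spark.table(right_table), on=key)
    joined.createOrReplaceTempView(view_name)
    return view_name
```

After the call, the view can be queried by name with `spark.sql` for the rest of the session.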
A data engineer is attempting to drop a Spark SQL table my_table. The data engineer wants to delete all table metadata and data.
They run the following command:
DROP TABLE IF EXISTS my_table
While the object no longer appears when they run SHOW TABLES, the data files still exist.
Which of the following describes why the data files still exist and the metadata files were deleted?
Answer: A
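In Spark SQL, dropping an external (unmanaged) table removes only the metastore entry and leaves the files at the table's location, whereas dropping a managed table removes both. Below is a sketch of how one might check the table type before dropping, assuming `DESCRIBE TABLE EXTENDED` reports a `Type` row as it does in current Spark versions. `table_type` is a hypothetical helper, and `spark` is an existing SparkSession.

```python
def table_type(spark, table_name):
    """Return the table type (for example MANAGED or EXTERNAL), or None."""
    rows = spark.sql(f"DESCRIBE TABLE EXTENDED {table_name}").collect()
    for row in rows:
        # DESCRIBE output has the columns col_name, data_type, comment.
        if row["col_name"] == "Type":
            return row["data_type"]
    return None
```

If this returns EXTERNAL for my_table, the data files have to be deleted separately from the storage location after the DROP TABLE.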
A data engineer has configured a Structured Streaming job to read from a table, manipulate the data, and then perform a streaming write into a new table.
The code block used by the data engineer is below:
If the data engineer only wants the query to execute a micro-batch to process data every 5 seconds, which of the following lines of code should the data engineer use to fill in the blank?
Answer: C
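The code block referenced in the question is not reproduced on this page. As an illustration only, a fixed-interval micro-batch trigger in PySpark looks like the sketch below. `writer` is assumed to be a DataStreamWriter that already has its checkpoint location and output mode configured, and the function and parameter names are hypothetical.

```python
def start_every_five_seconds(writer, target_table):
    """Start the stream, running one micro-batch every 5 seconds."""
    return writer.trigger(processingTime="5 seconds").toTable(target_table)
```

By contrast, `trigger(once=True)` and `trigger(availableNow=True)` process the data that is currently available and then stop, so they do not give a recurring 5-second cadence.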
Which of the following SQL keywords can be used to convert a table from a long format to a wide format?
Answer: A
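To illustrate what a long-to-wide conversion means, the sketch below pairs a Spark SQL PIVOT query with a plain-Python mirror of the same reshaping. The table `readings` and its columns `id`, `metric`, and `value` are hypothetical, as is the helper `pivot_long_to_wide`.

```python
# Spark SQL form: one output column per listed value of `metric`,
# grouped implicitly by the remaining column `id`.
PIVOT_QUERY = """
SELECT *
FROM readings
PIVOT (SUM(value) FOR metric IN ('temp', 'humidity'))
"""


def pivot_long_to_wide(rows):
    """Plain-Python mirror: sum `value` per (id, metric), one dict per id.

    A metric that never occurs for an id is simply absent from its dict,
    whereas Spark would return NULL in that column.
    """
    wide = {}
    for row in rows:
        record = wide.setdefault(row["id"], {"id": row["id"]})
        record[row["metric"]] = record.get(row["metric"], 0) + row["value"]
    return list(wide.values())
```

Each distinct value of `metric` in the long input becomes its own column (here, its own dict key) in the wide output.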
A data analyst has created a Delta table sales that is used by the entire data analysis team. They want help from the data engineering team to implement a series of tests to ensure the data is clean. However, the data engineering team uses Python for its tests rather than SQL.
Which of the following commands could the data engineering team use to access sales in PySpark?
Answer: E
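A table registered in the metastore can be loaded as a DataFrame with `spark.table`. Below is a small sketch of how a Python test might use it. `load_sales` and `count_nulls` are hypothetical helpers, the column passed to `count_nulls` is a placeholder, and `spark` is assumed to be an existing SparkSession.

```python
def load_sales(spark):
    """Load the Delta table `sales` as a DataFrame."""
    return spark.table("sales")


def count_nulls(df, column):
    """Count rows where `column` is NULL, for use in a data-quality test."""
    return df.filter(df[column].isNull()).count()
```

A test could then assert, for example, that `count_nulls(load_sales(spark), some_column)` equals zero for every column that must be populated.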
A data engineer has left the organization. The data team needs to transfer ownership of the data engineer's Delta tables to a new data engineer. The new data engineer is the lead engineer on the data team.
Assuming the original data engineer no longer has access, which of the following individuals must be the one to transfer ownership of the Delta tables in Data Explorer?
Answer: B