Latest Associate-Developer-Apache-Spark Free Dumps - Databricks Certified Associate Developer for Apache Spark 3.0
Which of the following code blocks reads in parquet file /FileStore/imports.parquet as a DataFrame?
Answer: E
Explanation: (available to DumpTOP members only)
Which of the following statements about data skew is incorrect?
Answer: D
Explanation: (available to DumpTOP members only)
Which of the following statements about Spark's configuration properties is incorrect?
Answer: B
Explanation: (available to DumpTOP members only)
Which of the following code blocks reads in the parquet file stored at location filePath, given that all columns in the parquet file contain only whole numbers and are stored in the most appropriate format for this kind of data?
Answer: C
Explanation: (available to DumpTOP members only)
Which of the following code blocks stores DataFrame itemsDf in executor memory and, if insufficient memory is available, serializes it and saves it to disk?
Answer: B
Explanation: (available to DumpTOP members only)
The code block displayed below contains an error. The code block should use Python method find_most_freq_letter to find the letter present most in column itemName of DataFrame itemsDf and return it in a new column most_frequent_letter. Find the error.
Code block:
find_most_freq_letter_udf = udf(find_most_freq_letter)
itemsDf.withColumn("most_frequent_letter", find_most_freq_letter("itemName"))
Answer: C
Explanation: (available to DumpTOP members only)
Which of the following code blocks returns a one-column DataFrame for which every row contains an array of all integer numbers from 0 up to and including the number given in column predError of DataFrame transactionsDf, and null if predError is null?
Sample of DataFrame transactionsDf:
+-------------+---------+-----+-------+---------+----+
|transactionId|predError|value|storeId|productId|   f|
+-------------+---------+-----+-------+---------+----+
|            1|        3|    4|     25|        1|null|
|            2|        6|    7|      2|        2|null|
|            3|        3| null|     25|        3|null|
|            4|     null| null|      3|        2|null|
|            5|     null| null|   null|        2|null|
|            6|        3|    2|     25|        2|null|
+-------------+---------+-----+-------+---------+----+
Answer: C
Explanation: (available to DumpTOP members only)
The code block shown below should set the number of partitions that Spark uses when shuffling data for joins or aggregations to 100. Choose the answer that correctly fills the blanks in the code block to accomplish this.
spark.sql.shuffle.partitions
__1__.__2__.__3__(__4__, 100)
Answer: E
Explanation: (available to DumpTOP members only)
Which of the following describes the difference between client and cluster execution modes?
Answer: C
Explanation: (available to DumpTOP members only)
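The distinction is where the driver process runs: in client mode it runs on the machine that submitted the application, while in cluster mode it runs on a node inside the cluster. The mode is selected with spark-submit's `--deploy-mode` flag; a sketch (the cluster manager and script name are placeholders):

```shell
# Client mode (the default): driver runs on the submitting machine.
spark-submit --master yarn --deploy-mode client app.py

# Cluster mode: driver runs on a worker node inside the cluster.
spark-submit --master yarn --deploy-mode cluster app.py
```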
The code block shown below should show information about the data type that column storeId of DataFrame transactionsDf contains. Choose the answer that correctly fills the blanks in the code block to accomplish this.
Code block:
transactionsDf.__1__(__2__).__3__
Answer: C
Explanation: (available to DumpTOP members only)