Latest DP-203 Korean Free Dumps - Microsoft Data Engineering on Microsoft Azure (DP-203 Korean Version)

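Question 1

You have an Azure Stream Analytics query. The query returns a result set that contains 10,000 distinct values for a column named clusterID.
You monitor the Stream Analytics job and discover high latency.
You need to reduce the latency.
Which two actions should you perform? Each correct answer presents a complete solution.
NOTE: Each correct selection is worth one point.

A. Add a temporal analytic function.

B. Increase the number of streaming units.

C. Scale out the query by using PARTITION BY.

D. Convert the query to a reference query.

E. Add a pass-through query.

Answer: B, C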


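Question 2

You need to ensure that the Twitter feed data can be analyzed in the dedicated SQL pool. The solution must meet the customer sentiment analytics requirements.
Which three Transact-SQL DDL commands should you run in sequence? To answer, move the appropriate commands from the list of commands to the answer area and arrange them in the correct order.
NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

Answer: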

Explanation:

Scenario: Allow Contoso users to use PolyBase in an Azure Synapse Analytics dedicated SQL pool to query the content of the data records that host the Twitter feeds. Data must be protected by using row-level security (RLS). The users must be authenticated by using their own Azure AD credentials.
Box 1: CREATE EXTERNAL DATA SOURCE
External data sources are used to connect to storage accounts.
Box 2: CREATE EXTERNAL FILE FORMAT
CREATE EXTERNAL FILE FORMAT creates an external file format object that defines external data stored in Azure Blob Storage or Azure Data Lake Storage. Creating an external file format is a prerequisite for creating an external table.
Box 3: CREATE EXTERNAL TABLE AS SELECT
When used in conjunction with the CREATE TABLE AS SELECT statement, selecting from an external table imports data into a table within the SQL pool. In addition to the COPY statement, external tables are useful for loading data.
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql/develop-tables-external-tables

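Question 3

You have an Azure Data Lake Storage Gen2 container.
Data is ingested into the container, and then transformed by a data integration application. The data is not modified after that. Users can read files in the container but cannot modify the files.
You need to design a data archiving solution that meets the following requirements:
* New data is accessed frequently and must be available as quickly as possible.
* Data that is older than five years is accessed infrequently but must be available within one second when requested.
* Data that is older than seven years is not accessed. After seven years, the data must be persisted at the lowest cost possible.
* Costs must be minimized while maintaining the required availability.
How should you manage the data? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Answer: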

Explanation:

Box 1: Move to cool storage
Box 2: Move to archive storage
Archive - Optimized for storing data that is rarely accessed and stored for at least 180 days with flexible latency requirements, on the order of hours.
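The two age thresholds in this question map naturally onto a blob lifecycle management policy. The following is a minimal sketch that builds such a policy document in Python, assuming a year is counted as 365 days and using a hypothetical rule name (`age-based-tiering`); neither value comes from the question itself, and whether the archive tier is available depends on the account's redundancy settings.

```python
import json

DAYS_PER_YEAR = 365  # assumption: leap days are ignored


def build_lifecycle_policy(cool_after_years: int, archive_after_years: int) -> dict:
    """Build a lifecycle policy that tiers block blobs by age since last modification."""
    return {
        "rules": [
            {
                "enabled": True,
                "name": "age-based-tiering",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "filters": {"blobTypes": ["blockBlob"]},
                    "actions": {
                        "baseBlob": {
                            # Cool keeps data online, so reads stay immediate.
                            "tierToCool": {
                                "daysAfterModificationGreaterThan": cool_after_years * DAYS_PER_YEAR
                            },
                            # Archive is offline and the cheapest tier to keep data in.
                            "tierToArchive": {
                                "daysAfterModificationGreaterThan": archive_after_years * DAYS_PER_YEAR
                            },
                        }
                    },
                },
            }
        ]
    }


policy = build_lifecycle_policy(cool_after_years=5, archive_after_years=7)
print(json.dumps(policy, indent=2))
```

Because the data is never modified after the transformation step, age since last modification is a reasonable proxy for data age in this scenario.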
The following table shows a comparison of premium performance block blob storage, and the hot, cool, and archive access tiers.

Reference:
https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-storage-tiers

Explanation:
Box 1: Replicated
Replicated tables are ideal for small star-schema dimension tables, because the fact table is often distributed on a column that is not compatible with the connected dimension tables. If this case applies to your schema, consider changing small dimension tables currently implemented as round-robin to replicated.
Box 2: Replicated
Box 3: Replicated
Box 4: Hash-distributed
For Fact tables use hash-distribution with clustered columnstore index. Performance improves when two hash tables are joined on the same distribution column.
Reference:
https://azure.microsoft.com/en-us/updates/reduce-data-movement-and-make-your-queries-more-efficient-with-the-general-availability-of-replicated-tables/
https://azure.microsoft.com/en-us/blog/replicated-tables-now-generally-available-in-azure-sql-data-warehouse/

Question 4

You have an Azure Synapse Analytics workspace that contains three pipelines and three triggers named Trigger1, Trigger2, and Trigger3.
Trigger3 has the following definition.

Answer:

Explanation:

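Question 5

You manage an enterprise data warehouse in Azure Synapse Analytics.
Users report slow performance when they run commonly used queries. Users do not report performance changes for infrequently used queries.
You need to monitor resource utilization to determine the source of the performance issues.
Which metric should you monitor?

A. DWU percentage

B. Local tempdb percentage

C. Data IO percentage

D. Cache hit percentage

Answer: D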


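Question 6

You are designing an application that will use an Azure Data Lake Storage Gen2 account to store petabytes of license plate photos from toll booths. The account will use zone-redundant storage (ZRS).
You identify the following usage patterns:
* The data will be accessed several times a day during the first 30 days after the data is created. The data must meet an availability SLA of 99.9%.
* After 90 days, the data will be accessed infrequently but must be available within 30 seconds.
* After 365 days, the data will be accessed infrequently but must be available within five minutes.

Answer: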

Explanation:
Box 1: Hot
The data will be accessed several times a day during the first 30 days after the data is created. The data must meet an availability SLA of 99.9%.
Box 2: Cool
After 90 days, the data will be accessed infrequently but must be available within 30 seconds.
Data in the Cool tier should be stored for a minimum of 30 days.
When your data is stored in an online access tier (either Hot or Cool), users can access it immediately. The Hot tier is the best choice for data that is in active use, while the Cool tier is ideal for data that is accessed less frequently, but that still must be available for reading and writing.
Box 3: Cool
After 365 days, the data will be accessed infrequently but must be available within five minutes.
Reference: https://docs.microsoft.com/en-us/azure/storage/blobs/access-tiers-overview
https://docs.microsoft.com/en-us/azure/storage/blobs/archive-rehydrate-overview
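The reasoning behind the three boxes can be written as a small decision rule. This is only a sketch: the one-hour cutoff (`ARCHIVE_MIN_LATENCY_SECONDS`) is an illustrative assumption standing in for "archive retrieval takes on the order of hours", and the function name is hypothetical.

```python
# Assumption for illustration: rehydrating from Archive takes hours, so any
# latency requirement under one hour is treated as needing an online tier.
ARCHIVE_MIN_LATENCY_SECONDS = 3600


def choose_tier(accessed_frequently: bool, max_latency_seconds: float) -> str:
    """Pick the cheapest access tier that still meets the latency requirement."""
    if accessed_frequently:
        return "Hot"  # frequent reads make Hot the economical online tier
    if max_latency_seconds < ARCHIVE_MIN_LATENCY_SECONDS:
        return "Cool"  # infrequent access, but data must stay online
    return "Archive"  # infrequent access and hours of latency are acceptable


usage_patterns = {
    "first 30 days": choose_tier(True, 1),
    "after 90 days": choose_tier(False, 30),
    "after 365 days": choose_tier(False, 5 * 60),
}
print(usage_patterns)
```

The five-minute requirement is what rules out Archive for the last period: it is far shorter than any rehydration time, so the data has to stay in an online tier.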

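Question 7

You have an Azure subscription that contains a logical Microsoft SQL server named Server1. Server1 hosts an Azure Synapse Analytics SQL dedicated pool named Pool1.
You need to recommend a Transparent Data Encryption (TDE) solution for Server1. The solution must meet the following requirements:
* Track the usage of encryption keys.
* Maintain the access of client apps to Pool1 in the event of an Azure datacenter outage that affects the availability of the encryption keys.
What should you include in the recommendation? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Answer: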

Explanation:

Box 1: TDE with customer-managed keys
Customer-managed keys are stored in the Azure Key Vault. You can monitor how and when your key vaults are accessed, and by whom. You can do this by enabling logging for Azure Key Vault, which saves information in an Azure storage account that you provide.
Box 2: Create and configure Azure key vaults in two Azure regions
The contents of your key vault are replicated within the region and to a secondary region at least 150 miles away, but within the same geography to maintain high durability of your keys and secrets.
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/security/workspaces-encryption
https://docs.microsoft.com/en-us/azure/key-vault/general/logging

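Question 8

You plan to create a dimension table in Azure Synapse Analytics that will be less than 1 GB.
You need to create the table to meet the following requirements:
* Provide the fastest query time.
* Minimize data movement during queries.
Which type of table should you use?

A. Round-robin

B. Replicated

C. Heap

D. Hash-distributed

Answer: B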


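Question 9

You are designing an application that will store petabytes of medical imaging data. When the data is first created, it will be accessed frequently during the first week. After one month, the data must be accessible within 30 seconds, but the files will be accessed infrequently. After one year, the data will be accessed infrequently but must be accessible within five minutes.
You need to select a storage strategy for the data. The solution must minimize costs.
Which storage tier should you use for each time frame? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Answer: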

Explanation:

First week: Hot
Hot - Optimized for storing data that is accessed frequently.
After one month: Cool
Cool - Optimized for storing data that is infrequently accessed and stored for at least 30 days.
After one year: Cool

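Question 10

You have an Azure Data Factory schedule trigger that is scheduled in Pacific Time.
Pacific Time observes daylight saving time.
The trigger has the following JSON file.

Use the drop-down menus to select the answer choice that completes each statement based on the information presented.
NOTE: Each correct selection is worth one point.

Answer: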

Explanation:

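Question 11

You have a Microsoft Purview account. The Lineage view of a CSV file is shown in the following exhibit.

How is the data for the lineage populated?

A. By scanning data stores

B. Manually

C. By executing a Data Factory pipeline

Answer: A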


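Question 12

What should you recommend to prevent users outside the Litware on-premises network from accessing the analytical data store?

A. A server-level virtual network rule

B. A database-level firewall IP rule

C. A server-level firewall IP rule

D. A database-level virtual network rule

Answer: A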


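Question 13

You have an enterprise data warehouse in Azure Synapse Analytics that contains a table named FactOnlineSales. The table contains data from the start of 2009 to the end of 2012.
You need to improve the performance of queries against FactOnlineSales by using table partitions. The solution must meet the following requirements:
* Create four partitions based on the order date.
* Ensure that each partition contains all the orders placed during a given calendar year.
How should you complete the T-SQL command? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Answer: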

Explanation:

RANGE LEFT and RANGE RIGHT create similar partitions, but they differ in which partition each boundary value belongs to.
For example, in this scenario, with RANGE LEFT and the boundary values 20100101, 20110101, 20120101, the partitions are:
* datecol <= 20100101
* datecol > 20100101 and datecol <= 20110101
* datecol > 20110101 and datecol <= 20120101
* datecol > 20120101
With RANGE RIGHT and the same boundary values, the partitions are:
* datecol < 20100101
* datecol >= 20100101 and datecol < 20110101
* datecol >= 20110101 and datecol < 20120101
* datecol >= 20120101
In this example, RANGE RIGHT is suitable because each partition then covers a calendar year from January 1 to December 31.
Reference:
https://docs.microsoft.com/en-us/sql/t-sql/statements/create-partition-function-transact-sql?view=sql-server-ver15
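The boundary behaviour described in this explanation can be checked with a short sketch. It uses Python's `bisect` module as a stand-in for the partition function; `partition_number` is a hypothetical helper written for illustration, not part of any SQL or Azure API.

```python
from bisect import bisect_left, bisect_right

# Boundary values from the explanation (integer date keys, yyyymmdd).
BOUNDARIES = [20100101, 20110101, 20120101]


def partition_number(value: int, boundaries: list[int], range_right: bool) -> int:
    """Return the 1-based partition that a value falls into.

    RANGE RIGHT: a boundary value belongs to the partition on its right.
    RANGE LEFT:  a boundary value belongs to the partition on its left.
    """
    if range_right:
        return bisect_right(boundaries, value) + 1
    return bisect_left(boundaries, value) + 1


for key in (20091231, 20100101, 20101231, 20120101):
    left = partition_number(key, BOUNDARIES, range_right=False)
    right = partition_number(key, BOUNDARIES, range_right=True)
    print(key, "RANGE LEFT ->", left, "| RANGE RIGHT ->", right)
```

With RANGE LEFT, January 1, 2010 lands in the same partition as the 2009 rows, which is why RANGE RIGHT is the variant that keeps each calendar year in a single partition.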

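Question 14

You are designing a fact table named FactPurchase in an Azure Synapse Analytics dedicated SQL pool. The table contains purchases from suppliers for a retail store. FactPurchase will contain the following columns.

FactPurchase will have 1 million rows of data added daily and will contain three years of data.
Transact-SQL queries similar to the following query will be executed daily.
SELECT
SupplierKey, StockItemKey, COUNT(*)
FROM FactPurchase
WHERE DateKey >= 20210101
AND DateKey <= 20210131
GROUP BY SupplierKey, StockItemKey
Which table distribution will minimize query times?

A. Hash-distributed on DateKey

B. Round-robin

C. Hash-distributed on PurchaseKey

D. Replicated

Answer: C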


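Question 15

You have an Azure Blob storage account that contains a folder. The folder contains 120,000 files. Each file contains 62 columns.
Each day, 1,500 new files are added to the folder.
You plan to incrementally load five data columns from each new file into an Azure Synapse Analytics workspace.
You need to minimize how long it takes to perform the incremental loads.
What should you use to store the files and in which format?

Answer: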

Explanation:
Box 1: Timeslice partitioning in the folders
This means that you should organize your files into folders based on a time attribute, such as year, month, day, or hour. For example, you can have a folder structure like /yyyy/mm/dd/file.csv. This way, you can easily identify and load only the new files that are added each day by using a time filter in your Azure Synapse pipeline. Timeslice partitioning can also improve the performance of data loading and querying by reducing the number of files that need to be scanned.
Box 2: Apache Parquet
Parquet is a columnar file format that can efficiently store and compress data with many columns, so a load that needs only five of the 62 columns reads just those columns. Parquet files can also be partitioned by a time attribute, which can improve the performance of incremental loading and querying by reducing the number of files that need to be scanned. Parquet files are supported by both dedicated SQL pool and serverless SQL pool in Azure Synapse Analytics.
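The timeslice layout can be sketched as follows. This is a minimal illustration, assuming a hypothetical root folder (`/sales`) and made-up file names; it only shows how a date-based prefix narrows an incremental load to one day's files.

```python
from datetime import date


def day_folder(root: str, day: date) -> str:
    """Build the /yyyy/mm/dd folder path for a given day."""
    return f"{root}/{day:%Y/%m/%d}"


def files_for_day(all_paths: list[str], root: str, day: date) -> list[str]:
    """Keep only the files stored under that day's folder."""
    prefix = day_folder(root, day) + "/"
    return [path for path in all_paths if path.startswith(prefix)]


# Hypothetical file listing for illustration.
paths = [
    "/sales/2024/03/04/part-0001.parquet",
    "/sales/2024/03/05/part-0001.parquet",
    "/sales/2024/03/05/part-0002.parquet",
]

todays_files = files_for_day(paths, "/sales", date(2024, 3, 5))
print(todays_files)
```

In a real pipeline the filtering would normally be done by listing only the day's folder rather than scanning every path, which is the point of the layout: the daily load touches 1,500 files instead of the full folder.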

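Question 16

You have an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 contains a partitioned fact table named dbo.Sales and a staging table named stg.Sales that has the matching table and partition definitions.
You need to overwrite the content of the first partition in dbo.Sales with the content of the same partition in stg.Sales. The solution must minimize load times.
What should you do?

A. Update dbo.Sales from stg.Sales.

B. Switch the first partition from dbo.Sales to stg.Sales.

C. Insert the data from stg.Sales into dbo.Sales.

D. Switch the first partition from stg.Sales to dbo.Sales.

Answer: D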
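The direction of the switch matters: to overwrite dbo.Sales with staged data, the partition moves from stg.Sales into dbo.Sales (in a dedicated SQL pool this is done with ALTER TABLE ... SWITCH PARTITION, where the TRUNCATE_TARGET option allows a non-empty target partition to be replaced). The sketch below is a toy in-memory model, not the engine's implementation; tables are plain dictionaries and `switch_partition` is a hypothetical helper that mimics the metadata-only move.

```python
def switch_partition(source: dict, target: dict, partition: int) -> None:
    """Move one partition from source to target, replacing the target's content.

    The row list is handed over by reference, so no rows are copied.
    """
    target[partition] = source[partition]
    source[partition] = []  # the source partition is empty after the switch


# Toy tables: partition number -> rows (made-up values for illustration).
stg_sales = {1: ["new-row-a", "new-row-b"], 2: []}
dbo_sales = {1: ["old-row-a"], 2: ["row-c"]}

staged_rows = stg_sales[1]
switch_partition(source=stg_sales, target=dbo_sales, partition=1)
print(dbo_sales)
```

Because only ownership of the partition changes, the cost does not grow with the number of rows, unlike an UPDATE or INSERT that has to rewrite every row.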

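Question 17

You need to schedule an Azure Data Factory pipeline to execute when a new file arrives in an Azure Data Lake Storage Gen2 container.
Which type of trigger should you use?

A. Schedule

B. On-demand

C. Storage event

D. Tumbling window

Answer: C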

