Question 1

You have an Azure Databricks workspace that contains a job in Lakeflow Jobs named Job1.

Job1 processes raw data files stored in Azure Storage.

New files arrive at unpredictable intervals.

You need to ensure that Job1 starts automatically when new files arrive and does NOT consume compute resources when no data is available.

Which type of job trigger should you use?

A. scheduled

B. continuous

C. file arrival

D. manual

Answer: C

CORRECT ANSWER: C - File arrival trigger.

According to Microsoft Learn on Lakeflow Jobs triggers, the File Arrival trigger monitors a specified Azure Storage path and automatically starts a job run when new files are detected. This satisfies: 'Job1 starts automatically when new files arrive' (event-driven triggering) and 'does NOT consume compute resources when no data is available' (the job cluster only starts when triggered by a file arrival event, not on a fixed schedule). Option A (scheduled) runs at fixed intervals regardless of whether new files are present, wasting compute when no data arrives. Option B (continuous) keeps the job running permanently, consuming resources even when no data is available. Option D (manual) requires human intervention and cannot automate the response to unpredictable file arrivals.
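As a rough illustration of the file arrival trigger described above, the following Python sketch builds the trigger portion of a job definition in the shape used by the Databricks Jobs API (a "trigger" object containing "file_arrival"). The storage URL and the 60-second gap are made-up values for illustration, and the helper name file_arrival_trigger is hypothetical. In practice the monitored path would need to be a Unity Catalog volume or external location.

```python
import json


def file_arrival_trigger(url: str, min_gap_seconds: int = 60) -> dict:
    """Build the `trigger` block for a job that starts when new files land at `url`."""
    return {
        "pause_status": "UNPAUSED",
        "file_arrival": {
            # Storage path to monitor for new files.
            "url": url,
            # Minimum wait between two triggered runs.
            "min_time_between_triggers_seconds": min_gap_seconds,
        },
    }


# Hypothetical storage account and container, for illustration only.
job_settings = {
    "name": "Job1",
    "trigger": file_arrival_trigger(
        "abfss://raw@examplestorage.dfs.core.windows.net/landing/"
    ),
}

# No "schedule" or "continuous" block is present, so runs start only on file arrival.
print(json.dumps(job_settings, indent=2))
```

A payload like this would be sent when creating or updating the job. Nothing runs, and no job compute is started, until the trigger detects a new file.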

Question 2

You have an Azure Databricks workspace that is enabled for Unity Catalog and contains a managed Delta table named Table1. Table1 stores customer data.

You need to implement a data retention solution that meets the following requirements:

* Deleted data must be retained for 30 days to support audits.

* Deleted data that is older than 30 days must be removed permanently.

* The solution must minimize administrative effort.

Which two properties should you configure? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

A. delta.timeUntilArchived

B. delta.deletedFileRetentionDuration

C. delta.autoOptimize.autoCompact

D. delta.logRetentionDuration

E. delta.enableDeletionVectors

Answer: B, D

CORRECT ANSWER: B - delta.deletedFileRetentionDuration; D - delta.logRetentionDuration.

According to Microsoft Learn on Delta Lake data retention, two table properties control how long data is retained after deletion. The delta.deletedFileRetentionDuration property controls how long logically deleted data files (files that the current table version no longer references) are kept before the VACUUM command can physically remove them. Setting it to 30 days keeps deleted data available for 30 days to support audits and allows VACUUM to remove it permanently afterward. The delta.logRetentionDuration property controls how long the Delta transaction log history is kept, which is what makes time-travel queries possible across the 30-day audit window. Together, both properties must be configured to 30 days to meet the full requirement. Option A (delta.timeUntilArchived) relates to archival support, which tells Azure Databricks when files are expected to move to an archive storage tier; it does not control how long deleted data is retained. Option C (delta.autoOptimize.autoCompact) controls file compaction, not retention. Option E (delta.enableDeletionVectors) enables deletion vectors for faster deletes but does not control data retention duration.
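One way to apply the two properties is a single ALTER TABLE ... SET TBLPROPERTIES statement. The Python sketch below only builds that SQL string. The three-level table name main.sales.table1 and the helper name retention_ddl are assumptions made for illustration, not names from the question.

```python
def retention_ddl(table: str, days: int = 30) -> str:
    """Return an ALTER TABLE statement that sets both retention properties to `days`."""
    interval = f"interval {days} days"
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES (\n"
        f"  'delta.deletedFileRetentionDuration' = '{interval}',\n"
        f"  'delta.logRetentionDuration' = '{interval}'\n"
        ")"
    )


# Hypothetical catalog.schema.table name, for illustration only.
statement = retention_ddl("main.sales.table1")
print(statement)
```

In a Databricks notebook the resulting string could be passed to spark.sql(statement). Files older than the retention window are physically deleted only when VACUUM runs against the table.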

Question 3

You have an Azure Databricks workspace.

You have an Apache Spark Structured Streaming job named Job1 that processes data continuously and fails periodically due to transient errors.

You need to ensure that Job1 meets the following requirements:

* Resumes processing from the point that Job1 failed

* Minimizes how long it takes to restart Job1

* Minimizes the costs to restart Job1

What should you do?

A. Decrease the retry interval.

B. Implement checkpointing.

C. Add an alert and manually restart Job1.

D. Increase the minimum number of nodes in the cluster.

Answer: B

CORRECT ANSWER: B - Implement checkpointing.

According to Microsoft Learn on Apache Spark Structured Streaming fault tolerance, checkpointing stores the streaming query's progress in durable storage so that when a failure occurs and the job restarts, it resumes processing from exactly the point of failure without reprocessing previous data. This satisfies all three requirements: 'Resumes processing from the point that Job1 failed' (checkpoint stores last committed offset), 'Minimizes how long it takes to restart' (no data replay needed), and 'Minimizes costs' (no reprocessing of already-processed data). Option A (decrease retry interval) reduces the wait before retry but does not ensure resuming from the failure point. Option C (alert and manual restart) adds human latency. Option D (increase minimum nodes) increases cluster cost without addressing the root cause of transient errors.
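The idea behind checkpointing can be shown with a small pure-Python toy. This is not the Spark API: it is a simplified model in which the last committed offset is persisted to a file, a transient error interrupts processing, and the restarted run picks up from the saved offset instead of from the beginning. All function names and the record values are hypothetical. In Structured Streaming itself, the equivalent setting is the checkpointLocation option on writeStream.

```python
import json
import os
import tempfile


def load_offset(path):
    """Return the last committed offset, or 0 when no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["offset"]


def commit_offset(path, offset):
    """Persist the offset atomically: write a temp file, then rename over the checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"offset": offset}, f)
    os.replace(tmp, path)


def run(records, checkpoint_path, sink, fail_at=None):
    """Process records starting from the checkpointed offset; optionally fail at one index."""
    start = load_offset(checkpoint_path)
    for i in range(start, len(records)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated transient error")
        sink.append(records[i])
        commit_offset(checkpoint_path, i + 1)


records = ["r0", "r1", "r2", "r3", "r4"]
sink = []
checkpoint = os.path.join(tempfile.mkdtemp(), "checkpoint.json")

try:
    run(records, checkpoint, sink, fail_at=3)  # first run stops before record 3
except RuntimeError:
    pass

run(records, checkpoint, sink)  # restart resumes at offset 3, no replay of r0..r2
print(sink)
```

Because the restart reads the saved offset, the records handled before the failure are not processed a second time, which is the source of both the faster restart and the lower cost.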

Question 4

You have an Azure Databricks workspace that contains an all-purpose cluster named Cluster1. You need to configure Cluster1 to meet the following requirements:

* The cluster must scale up automatically when workloads increase.

* The cluster must scale down automatically when workloads decrease.

The solution must minimize costs.

Which two actions should you perform? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

A. Disable Photon acceleration.

B. Apply a compute policy that enables users to manage the cluster settings.

C. Configure Cluster1 to terminate after 30 minutes of inactivity.

D. Enable autoscaling for Cluster1.

E. Specify a fixed number of workers.

Answer: C, D

CORRECT ANSWER: C - Configure Cluster1 to terminate after 30 minutes of inactivity; D - Enable autoscaling for Cluster1.

According to Microsoft Learn on Azure Databricks compute configuration, enabling autoscaling allows the cluster to automatically add workers when demand increases and remove workers when demand decreases, directly satisfying both the scale-up and scale-down requirements. Configuring auto-termination ensures the cluster shuts down after 30 minutes of inactivity, minimizing costs when no workloads are running. Together, these two settings address both the scaling and the cost requirements. Option A (disable Photon) would reduce performance without reducing costs meaningfully. Option B (compute policy that enables users to manage settings) adds administrative complexity and does not directly reduce costs. Option E (fixed number of workers) contradicts the autoscaling requirement: a fixed worker count prevents dynamic scaling and results in over-provisioning or under-provisioning.
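The two settings can be sketched as a fragment of a cluster specification in the shape used by the Databricks Clusters API, where "autoscale" holds the worker range and "autotermination_minutes" holds the idle timeout. The worker counts below are illustrative assumptions, the helper name cluster_spec is hypothetical, and required fields such as the runtime version and node type are left out to keep the fragment short.

```python
def cluster_spec(min_workers: int, max_workers: int, idle_minutes: int) -> dict:
    """Build a partial cluster spec with autoscaling and auto-termination enabled."""
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    return {
        "cluster_name": "Cluster1",
        # A worker range instead of a fixed `num_workers` value enables autoscaling.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        # Terminate the cluster after this many idle minutes.
        "autotermination_minutes": idle_minutes,
    }


# Illustrative worker range; the 30-minute timeout comes from the question.
spec = cluster_spec(min_workers=2, max_workers=8, idle_minutes=30)
print(spec)
```

Note that the fragment uses "autoscale" in place of a fixed "num_workers" value, which mirrors why option E conflicts with the requirement.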

Question 5

You have an Azure Databricks workspace that contains a job in Lakeflow Jobs named Job1. Job1 contains multiple tasks.

Failures of non-critical tasks must be logged but must NOT trigger notifications. Notifications must be triggered only when critical tasks have failed and Job1 has completed.

You need to configure the job alerting behavior.

What should trigger a notification?

A. a task failure

B. a job failure

C. job success

D. task success

Answer: B

CORRECT ANSWER: B - A job failure.

According to Microsoft Learn on Lakeflow Jobs alerting, notifications can be configured at both the job level and the task level. The requirement states: 'Failures of non-critical tasks must be logged but must NOT trigger notifications. Notifications must be triggered only when critical tasks have failed, and Job1 has completed.' The correct behavior is to trigger a notification on 'Job Failure', which fires when the overall job run finishes in a failed state (meaning at least one critical task has failed and the job cannot complete successfully). Option A (task failure) would trigger a notification for every failed task, including non-critical ones, violating the requirement. Option C (job success) would not alert on failures at all. Option D (task success) is irrelevant to failure alerting.
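A minimal sketch of this configuration, in the shape used by the Databricks Jobs API, places an "on_failure" notification at the job level and leaves the tasks without notification settings of their own. The task keys and the email address are hypothetical, and the tasks are reduced to their keys for brevity.

```python
# Hypothetical recipient and task names, for illustration only.
job_settings = {
    "name": "Job1",
    # Job-level notification: sent when the job run as a whole fails.
    "email_notifications": {"on_failure": ["oncall@example.com"]},
    "tasks": [
        {"task_key": "load_critical"},
        # No task-level notifications, so this task's failure alone sends nothing.
        {"task_key": "refresh_optional"},
    ],
}

tasks_with_alerts = [
    t["task_key"] for t in job_settings["tasks"] if "email_notifications" in t
]
print(tasks_with_alerts)
```

With no task-level entries, the only notification path is the job-level failure event, and individual task failures remain visible in the run history.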

Free Practice Questions for Microsoft DP-750 Exam
