Latest Databricks-Machine-Learning-Associate Free Dumps - Databricks Certified Machine Learning Associate
Which of the following tools can be used to parallelize the hyperparameter tuning process for single-node machine learning models using a Spark cluster?
Answer: B
A data scientist has written a feature engineering notebook that utilizes the pandas library. As the size of the data processed by the notebook increases, the notebook's runtime increases drastically.
Which of the following tools can the data scientist use to spend the least amount of time refactoring their notebook to scale with big data?
Answer: B
Which of the following machine learning algorithms typically uses bagging?
Answer: B
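Bagging (bootstrap aggregating) trains each base learner on a bootstrap resample of the training data and then averages or votes over the learners' predictions; random forests are the best-known algorithm built this way. The following is a minimal standard-library sketch of the idea, not any library's implementation. The one-split "stump" learner, the function names, and the toy data are all hypothetical.

```python
import random
from statistics import mean


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (the 'bootstrap' in bagging)."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows):
    """Fit a one-split regression stump on (x, y) pairs.

    Tries each observed x as a threshold and keeps the split with the
    lowest squared error. Returns (threshold, left_mean, right_mean).
    """
    best = None
    for threshold in sorted({x for x, _ in rows}):
        left = [y for x, y in rows if x <= threshold]
        right = [y for x, y in rows if x > threshold]
        if not left or not right:
            continue
        left_mean, right_mean = mean(left), mean(right)
        error = sum((y - left_mean) ** 2 for y in left)
        error += sum((y - right_mean) ** 2 for y in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # Every x in the sample is identical: predict a constant.
        overall = mean([y for _, y in rows])
        return (float("inf"), overall, overall)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_bagged(rows, n_estimators=25, seed=0):
    """Fit one stump per bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng)) for _ in range(n_estimators)]


def predict_bagged(stumps, x):
    """Aggregate by averaging the individual predictions."""
    return mean([predict_stump(s, x) for s in stumps])


# Toy step function: y jumps from 0 to 10 at x = 5.
data = [(x, 0.0 if x < 5 else 10.0) for x in range(10)]
ensemble = fit_bagged(data)
low_prediction = predict_bagged(ensemble, 0)
high_prediction = predict_bagged(ensemble, 9)
```

Each stump sees a slightly different resample, so individual split points vary; averaging smooths out that variance, which is the motivation for bagging high-variance learners such as decision trees.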
A data scientist wants to efficiently tune the hyperparameters of a scikit-learn model. They elect to use the Hyperopt library's fmin operation to facilitate this process. Unfortunately, the final model is not very accurate. The data scientist suspects that there is an issue with the objective_function being passed as an argument to fmin.
They use the following code block to create the objective_function:
Which of the following changes does the data scientist need to make to their objective_function in order to produce a more accurate model?
Answer: D
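The code block for this question is not reproduced here, so the specific bug cannot be confirmed. One frequent cause of this symptom is that Hyperopt's fmin minimizes whatever the objective returns, so returning a score where higher is better (accuracy, R²) steers the search toward the worst models; the usual fix is to return the negated score. The sketch below illustrates that sign issue with a hypothetical stand-in minimizer (`minimize_random`) and a made-up accuracy curve (`toy_accuracy`); it does not use Hyperopt itself.

```python
import random


def toy_accuracy(c):
    """Hypothetical validation accuracy that peaks at c = 1.0."""
    return max(0.0, 1.0 - abs(c - 1.0) / 2.0)


def minimize_random(objective, low, high, max_evals=200, seed=0):
    """Stand-in for a minimizing tuner.

    Samples candidates uniformly and returns the one with the LOWEST
    objective value, mirroring the 'smaller is better' convention.
    """
    rng = random.Random(seed)
    candidates = [rng.uniform(low, high) for _ in range(max_evals)]
    return min(candidates, key=objective)


def objective_wrong(c):
    # Higher accuracy is better, but the tuner minimizes this value.
    return toy_accuracy(c)


def objective_fixed(c):
    # Negate so that minimizing the loss maximizes accuracy.
    return -toy_accuracy(c)


best_wrong = minimize_random(objective_wrong, 0.0, 3.0)
best_fixed = minimize_random(objective_fixed, 0.0, 3.0)
```

With the un-negated objective the search settles on the candidate with the lowest accuracy, while the negated version lands near the peak at `c = 1.0`.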
A data scientist is attempting to tune a logistic regression model logistic using scikit-learn. They want to specify a search space for two hyperparameters and let the tuning process randomly select values for each evaluation.
They attempt to run the following code block, but it does not accomplish the desired task:
Which of the following changes can the data scientist make to accomplish the task?
Answer: A
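The code block for this question is not reproduced here. The distinction it turns on is that a grid search evaluates every combination of the listed values, whereas a randomized search draws a fixed number of settings from the search space; in scikit-learn these correspond to GridSearchCV and RandomizedSearchCV. The following standard-library sketch contrasts the two candidate-generation strategies. The function names and the hyperparameters `C` and `max_iter` are assumptions chosen for illustration, not taken from the question.

```python
import itertools
import random


def grid_candidates(grid):
    """Enumerate every combination of the listed values (exhaustive grid)."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, values)) for values in itertools.product(*value_lists)]


def random_candidates(space, n_iter, seed=0):
    """Sample n_iter settings, drawing each hyperparameter independently."""
    rng = random.Random(seed)
    return [
        {name: sampler(rng) for name, sampler in space.items()}
        for _ in range(n_iter)
    ]


# Grid: a fixed list of values per hyperparameter.
grid = {"C": [0.01, 0.1, 1.0, 10.0], "max_iter": [100, 200, 300]}

# Random search: one sampling function per hyperparameter.
space = {
    "C": lambda rng: 10 ** rng.uniform(-2, 1),      # log-uniform over [0.01, 10]
    "max_iter": lambda rng: rng.randint(100, 300),  # uniform integer, inclusive
}

all_settings = grid_candidates(grid)                   # 4 * 3 combinations
sampled_settings = random_candidates(space, n_iter=5)  # exactly n_iter draws
```

The grid's cost grows multiplicatively with each added value, while the random search's cost is fixed by `n_iter` regardless of how wide the search space is.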
A data scientist learned during their training to always use 5-fold cross-validation in their model development workflow. A colleague suggests that there are cases where a train-validation split could be preferred over k-fold cross-validation when k > 2.
Which of the following describes a potential benefit of using a train-validation split over k-fold cross-validation in this scenario?
Answer: B
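The trade-off in this scenario is largely computational: k-fold cross-validation fits k models per hyperparameter setting, while a single train-validation split fits one. The sketch below makes the count concrete. The helper names, the row count, and the number of settings are hypothetical, and the folds are contiguous and unshuffled for simplicity.

```python
def kfold_indices(n_rows, k):
    """Yield (train_idx, valid_idx) for each of k contiguous folds."""
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        yield train, valid
        start += size


def train_validation_indices(n_rows, train_ratio=0.8):
    """Return a single (train_idx, valid_idx) split."""
    cut = int(n_rows * train_ratio)
    return list(range(cut)), list(range(cut, n_rows))


n_rows = 10
n_settings = 12  # hypothetical number of hyperparameter settings to evaluate

folds = list(kfold_indices(n_rows, k=5))
single_split = train_validation_indices(n_rows)

# One model fit per (setting, split) pair.
kfold_fits = n_settings * len(folds)
split_fits = n_settings * 1
```

The cheaper option gives a noisier performance estimate, since each setting is scored on only one validation set rather than averaged over k of them.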
A data scientist wants to explore the Spark DataFrame spark_df. The data scientist wants visual histograms displaying the distribution of numeric features to be included in the exploration.
Which of the following lines of code can the data scientist run to accomplish the task?
Answer: B
A data scientist has developed a random forest regressor rfr and included it as the final stage in a Spark ML Pipeline pipeline. They then set up a cross-validation process with pipeline as the estimator in the following code block:
Which of the following is a negative consequence of including pipeline as the estimator in the cross-validation process rather than rfr as the estimator?
Answer: A
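When the whole pipeline is the estimator, every stage, not only the final model, is refit for each fold and each parameter setting. This keeps validation rows out of the fitted feature stages but costs extra computation. The sketch below counts fit calls under both arrangements. `CountingStage` and `cross_validate` are hypothetical stand-ins rather than the Spark ML API, the `imputer` and `scaler` stages are assumed for illustration, and the final refit on the full dataset is ignored.

```python
class CountingStage:
    """Stand-in for a pipeline stage that only records how often it is fit."""

    def __init__(self, name):
        self.name = name
        self.fit_calls = 0

    def fit(self, rows):
        self.fit_calls += 1
        return self


def cross_validate(stages, rows, n_folds, n_param_settings):
    """Refit every stage in `stages` once per (parameter setting, fold) pair."""
    for _ in range(n_param_settings):
        for fold in range(n_folds):
            train = [r for i, r in enumerate(rows) if i % n_folds != fold]
            for stage in stages:
                stage.fit(train)


rows = list(range(30))

# Arrangement 1: the whole pipeline is the cross-validation estimator.
imputer = CountingStage("imputer")
scaler = CountingStage("scaler")
rfr = CountingStage("rfr")
cross_validate([imputer, scaler, rfr], rows, n_folds=3, n_param_settings=4)
pipeline_cv_fits = imputer.fit_calls + scaler.fit_calls + rfr.fit_calls

# Arrangement 2: feature stages are fit once up front and only the
# model is cross-validated.
imputer_once = CountingStage("imputer")
scaler_once = CountingStage("scaler")
rfr_only = CountingStage("rfr")
imputer_once.fit(rows)
scaler_once.fit(rows)
cross_validate([rfr_only], rows, n_folds=3, n_param_settings=4)
model_only_cv_fits = (
    imputer_once.fit_calls + scaler_once.fit_calls + rfr_only.fit_calls
)
```

The second arrangement is cheaper because the feature stages are fit once, but those stages then see the validation rows, which can leak information into the evaluation.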