Google Professional Machine Learning Engineer Updated Dumps
Exam Name: Google Professional Machine Learning Engineer
3. You work for a gaming company that develops massively multiplayer online (MMO) games.
You built a TensorFlow model that predicts whether players will make in-app purchases of more
than $10 in the next two weeks. The model’s predictions will be used to adapt each user’s
game experience. User data is stored in BigQuery.
How should you serve your model while optimizing cost, user experience, and ease of
management?
A. Import the model into BigQuery ML. Make predictions using batch reading data from
BigQuery, and push the data to Cloud SQL
B. Deploy the model to Vertex AI Prediction. Make predictions using batch reading data from
Cloud Bigtable, and push the data to Cloud SQL.
C. Embed the model in the mobile application. Make predictions after every in-app purchase
event is published in Pub/Sub, and push the data to Cloud SQL.
D. Embed the model in the streaming Dataflow pipeline. Make predictions after every in-app
purchase event is published in Pub/Sub, and push the data to Cloud SQL.
Answer: B
Explanation:
The best option to serve the model while optimizing cost, user experience, and ease of
management is to deploy the model to Vertex AI Prediction, which is a managed service that
can scale up or down according to the demand and provide low latency and high availability.
Vertex AI Prediction can also handle TensorFlow models natively, without requiring any
additional steps or conversions. By using batch prediction, the model can process large
volumes of data efficiently and periodically, without affecting the user experience. The data can
be read from Cloud Bigtable, which is a scalable and performant NoSQL database that can
store user data in a flexible schema. The predictions can then be pushed to Cloud SQL, which
is a fully managed relational database that can store the predictions in a structured format and
enable easy querying and analysis. This option also simplifies the management of the model
and the data, as it leverages the existing Google Cloud services and does not require any
additional infrastructure or code.
The other options are not optimal for the following reasons:
A. Importing the model into BigQuery ML is treated as less suitable here. BigQuery ML can
import TensorFlow SavedModels, but imported models are subject to BigQuery ML's limits, and
predictions run as SQL batch jobs rather than through a dedicated serving endpoint. Moving
the results from BigQuery into Cloud SQL also adds an export step with its own cost and
latency.
C. Embedding the model in the mobile application is not a good option, as it increases the size
and complexity of the application, and requires updating the application every time the model
changes. Moreover, it exposes the model to the users, which can pose security and privacy
risks, as well as potential misuse or abuse. Additionally, it does not leverage the benefits of the
cloud, such as scalability, reliability, and performance.
D. Embedding the model in the streaming Dataflow pipeline is not a good option, as it requires
building and maintaining a custom pipeline that can handle the model inference and data
processing. This can increase the development and operational costs and complexity, as well
as the potential for errors and failures. Moreover, it does not take advantage of the batch
prediction feature of Vertex AI Prediction, which can optimize the resource utilization and cost
efficiency.
Reference: Professional ML Engineer Exam Guide
Preparing for Google Cloud Certification: Machine Learning Engineer Professional Certificate
Google Cloud launches machine learning engineer certification
Vertex AI Prediction documentation
Cloud Bigtable documentation
Cloud SQL documentation
4. You are training a TensorFlow model on a structured data set with 100 billion records stored
in several CSV files. You need to improve the input/output execution performance.
What should you do?
A. Load the data into BigQuery and read the data from BigQuery.
B. Load the data into Cloud Bigtable, and read the data from Bigtable
C. Convert the CSV files into shards of TFRecords, and store the data in Cloud Storage
D. Convert the CSV files into shards of TFRecords, and store the data in the Hadoop Distributed
File System (HDFS)
Answer: C
Explanation:
The input/output execution performance of a TensorFlow model depends on how efficiently the
model can read and process the data from the data source. Reading and processing data from
CSV files can be slow and inefficient, especially if the data is large and distributed. Therefore, to
improve the input/output execution performance, one should use a more suitable data format
and storage system.
One of the best options for improving the input/output execution performance is to convert the
CSV files into shards of TFRecords, and store the data in Cloud Storage. TFRecord is a binary
data format that can store a sequence of serialized TensorFlow examples.
TFRecord has several advantages over CSV, such as:
Faster data loading: TFRecord can be read and processed faster than CSV, as it avoids the
overhead of parsing and decoding the text data. TFRecord also supports compression and
checksums, which can reduce the data size and ensure data integrity1
Better performance: TFRecord can improve the performance of the model, as it allows the
model to access the data in a sequential and streaming manner, and leverage the tf.data API to
build efficient data pipelines. TFRecord also supports sharding and interleaving, which can
increase the parallelism and throughput of the data processing2
Easier integration: TFRecord can integrate seamlessly with TensorFlow, as it is the native data
format for TensorFlow. TFRecord also supports various types of data, such as images, text,
audio, and video, and can store the data schema and metadata along with the data3
Cloud Storage is a scalable and reliable object storage service that can store any amount of
data.
Cloud Storage has several advantages over other storage systems, such as:
High availability: Cloud Storage can provide high availability and durability for the data, as it
replicates the data across multiple regions and zones, and supports versioning and lifecycle
management. Cloud Storage also offers various storage classes, such as Standard, Nearline,
Coldline, and Archive, to meet different performance and cost requirements4
Low latency: Cloud Storage can provide low latency and high bandwidth for the data, as it
supports HTTP and HTTPS protocols, and integrates with other Google Cloud services, such as
AI Platform, Dataflow, and BigQuery. Cloud Storage also supports resumable uploads and
downloads, and parallel composite uploads, which can improve the data transfer speed and
reliability5
Easy access: Cloud Storage can provide easy access and management for the data, as it
supports various tools and libraries, such as gsutil, Cloud Console, and Cloud Storage Client
Libraries. Cloud Storage also supports fine-grained access control and encryption, which can
ensure the data security and privacy.
The other options are not as effective or feasible. Loading the data into BigQuery and reading
the data from BigQuery is not recommended, as BigQuery is mainly designed for analytical
queries on large-scale data, and does not support streaming or real-time data processing.
Loading the data into Cloud Bigtable and reading the data from Bigtable is not ideal, as Cloud
Bigtable is mainly designed for low-latency and high-throughput key-value operations on sparse
and wide tables, and does not support complex data types or schemas. Converting the CSV
files into shards of TFRecords and storing the data in the Hadoop Distributed File System
(HDFS) is not optimal, as HDFS is not natively supported by TensorFlow, and requires
additional configuration and dependencies, such as Hadoop, Spark, or Beam.
Reference: 1: TFRecord and tf.Example
2: Better performance with the tf.data API
3: TensorFlow Data Validation
4: Cloud Storage overview
5: Performance: [How-to guides]
5. You have created multiple versions of an ML model and have imported them to Vertex AI
Model Registry. You want to perform A/B testing to identify the best-performing model using the
simplest approach.
What should you do?
A. Split incoming traffic among separate Cloud Run instances of deployed models. Monitor the
performance of each version using Cloud Monitoring.
B. Split incoming traffic to distribute prediction requests among the versions. Monitor the
performance of each version using Looker Studio dashboards that compare logged data for
each version.
C. Split incoming traffic among Google Kubernetes Engine (GKE) clusters and use Traffic
Director to distribute prediction requests to different versions. Monitor the performance of each
version using Cloud Monitoring.
D. Split incoming traffic to distribute prediction requests among the versions. Monitor the
performance of each version using Vertex AI’s built-in monitoring tools.
Answer: D
Explanation:
Vertex AI lets you deploy multiple model versions from Model Registry to a single endpoint and
split incoming traffic among them, and it provides built-in monitoring, which makes A/B testing
straightforward. This approach avoids additional monitoring tools and infrastructure overhead.
The Cloud Run and GKE solutions (Options A and C) add unnecessary complexity, while
Looker Studio (Option B) requires additional configuration for monitoring.
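To make the idea of a traffic split concrete, here is a small simulation in plain Python. It does not call any Vertex AI API. The version names and the 80/20 split are assumptions chosen for illustration, and the router only mimics what a managed endpoint would do.

```python
import random
from collections import Counter


def route(traffic_split, rng):
    """Pick a model version according to percentage weights."""
    versions = list(traffic_split)
    weights = [traffic_split[v] for v in versions]
    return rng.choices(versions, weights=weights, k=1)[0]


# Hypothetical version IDs and split percentages.
traffic_split = {"model-v1": 80, "model-v2": 20}

rng = random.Random(0)  # fixed seed so the simulation is repeatable
NUM_REQUESTS = 10_000
counts = Counter(route(traffic_split, rng) for _ in range(NUM_REQUESTS))
share_v2 = counts["model-v2"] / NUM_REQUESTS

print(counts)
print(f"share routed to model-v2: {share_v2:.3f}")
```

Comparing a quality metric per version over the routed requests is then the A/B test itself.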
8. You are building an ML model to detect anomalies in real-time sensor data. You will use
Pub/Sub to handle incoming requests. You want to store the results for analytics and
visualization.
How should you configure the pipeline?
9. You recently designed and built a custom neural network that uses critical dependencies
specific to your organization's framework. You need to train the model using a managed training
service on Google Cloud. However, the ML framework and related dependencies are not
supported by AI Platform Training. Also, both your model and your data are too large to fit in
memory on a single machine. Your ML framework of choice uses the scheduler, workers, and
servers distribution structure.
What should you do?
A. Use a built-in model available on AI Platform Training.
B. Build your custom container to run jobs on AI Platform Training.
C. Build your custom containers to run distributed training jobs on AI Platform Training.
D. Reconfigure your code to an ML framework with dependencies that are supported by AI
Platform Training.
Answer: C
Explanation:
AI Platform Training is a service that allows you to run your machine learning training jobs on
Google Cloud using various features, model architectures, and hyperparameters. You can use
AI Platform Training to scale up your training jobs, leverage distributed training, and access
specialized hardware such as GPUs and TPUs1. AI Platform Training supports several pre-built
containers that provide different ML frameworks and dependencies, such as TensorFlow,
PyTorch, scikit-learn, and XGBoost2. However, if the ML framework and related dependencies
that you need are not supported by the pre-built containers, you can build your own custom
containers and use them to run your training jobs on AI Platform Training3.
Custom containers are Docker images that you create to run your training application. By using
custom containers, you can specify and pre-install all the dependencies needed for your
application, and have full control over the code, serving, and deployment of your model4.
Custom containers also enable you to run distributed training jobs on AI Platform Training,
which can help you train large-scale and complex models faster and more efficiently5.
Distributed training is a technique that splits the training data and computation across multiple
machines, and coordinates them to update the model parameters. AI Platform Training supports
two types of distributed training: parameter server and collective all-reduce. The parameter
server architecture consists of a set of workers that perform the computation, and a set of
servers that store and update the model parameters. The collective all-reduce architecture
consists of a set of workers that perform the computation and synchronize the model
parameters among themselves. Both architectures also have a scheduler that coordinates the
workers and servers.
For the use case of training a custom neural network that uses critical dependencies specific to
your organization’s framework, the best option is to build your custom containers to run
distributed training jobs on AI Platform Training. This option allows you to use the ML framework
and dependencies of your choice, and train your model on multiple machines without having to
manage the infrastructure. Since your ML framework of choice uses the scheduler, workers, and
servers distribution structure, you can use the parameter server architecture to run your
distributed training job on AI Platform Training. You can specify the number and type of
machines, the custom container image, and the training application arguments when you submit
your training job. Therefore, building your custom containers to run distributed training jobs on
AI Platform Training is the best option for this use case.
Reference: AI Platform Training documentation
Pre-built containers for training
Custom containers for training
Custom containers overview | Vertex AI | Google Cloud
Distributed training overview
[Types of distributed training]
[Distributed training architectures]
[Using custom containers for training with the parameter server architecture]
10. Grant the Notebook Viewer role to the other team members.
11. You have trained a DNN regressor with TensorFlow to predict housing prices using a set of
predictive features. Your default precision is tf.float64, and you use a standard TensorFlow
estimator:
    estimator = tf.estimator.DNNRegressor(
        feature_columns=[YOUR_LIST_OF_FEATURES],
        hidden_units=[1024, 512, 256],
        dropout=None)
Your model performs well, but just before deploying it to production, you discover that your
current serving latency is 10 ms at the 90th percentile and you currently serve on CPUs. Your
production requirements expect a model latency of 8 ms at the 90th percentile. You are willing to
accept a small decrease in performance in order to reach the latency requirement. Therefore,
your plan is to improve latency while evaluating how much the model's prediction quality
decreases.
What should you first try to quickly lower the serving latency?
A. Increase the dropout rate to 0.8 in PREDICT mode by adjusting the TensorFlow Serving
parameters.
B. Increase the dropout rate to 0.8 and retrain your model.
C. Switch from CPU to GPU serving
D. Apply quantization to your SavedModel by reducing the floating point precision to tf.float16.
Answer: D
Explanation:
Quantization is a technique that reduces the numerical precision of the weights and activations
of a neural network, which can improve the inference speed and reduce the memory footprint of
the model1.
Reducing the floating point precision from tf.float64 to tf.float16 shrinks each stored value from
8 bytes to 2, which cuts the memory footprint of the model and can lower latency, usually with
only a small impact on accuracy2.
Increasing the dropout rate to 0.8 in either mode would not affect the latency, but would likely
degrade the performance of the model significantly, as dropout is a regularization technique that
randomly drops out units during training to prevent overfitting3.
Switching from CPU to GPU serving may or may not improve the latency, depending on the
hardware specifications and the model complexity, but it would also incur additional costs and
complexity for deployment4
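The storage and precision trade-off behind reduced floating point precision can be checked with Python's `struct` module, which supports IEEE 754 half precision through the `e` format. This is only a sketch of the numeric effect on a single value, not of TensorFlow's conversion tooling. The sample weight is an arbitrary assumption.

```python
import struct


def roundtrip(value, fmt):
    """Pack a Python float into the given IEEE 754 format and unpack it again."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


# Bytes needed to store one value at each precision.
BYTES = {
    "float64": struct.calcsize("<d"),
    "float32": struct.calcsize("<f"),
    "float16": struct.calcsize("<e"),
}

weight = 0.123456789012345  # arbitrary example weight
as_f16 = roundtrip(weight, "<e")
rel_error = abs(as_f16 - weight) / abs(weight)

print(BYTES)
print(f"float16 value: {as_f16!r}, relative error: {rel_error:.2e}")
```

The rounding error per value is small, but it does accumulate through the layers, which is why the accuracy impact should be measured after conversion.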
12. Associate the pipeline with your experiment when you submit the job.
D. 1. Create a Vertex AI pipeline. Use the Dataset and Model artifact types from the Kubeflow
Pipelines DSL as the inputs and outputs of the components in your pipeline.
13. You are creating a deep neural network classification model using a dataset with categorical
input values. Certain columns have a cardinality greater than 10,000 unique values.
How should you encode these categorical values as input into the model?
A. Convert each categorical value into an integer value.
B. Convert the categorical string data to one-hot hash buckets.
C. Map the categorical variables into a vector of boolean values.
D. Convert each categorical value into a run-length encoded string.
Answer: B
Explanation:
Option A is incorrect because converting each categorical value into an integer value is not a
good way to encode categorical values with high cardinality. This method implies an ordinal
relationship between the categories, which may not be true. For example, assigning the values
1, 2, and 3 to the categories “red”, “green”, and “blue” does not make sense, as there is no
inherent order among these colors1.
Option B is correct because converting the categorical string data to one-hot hash buckets is a
suitable way to encode categorical values with high cardinality. This method uses a hash
function to map each category to a fixed-length vector of binary values, where only one element
is 1 and the rest are 0. This method preserves the sparsity and independence of the categories,
and reduces the dimensionality of the input space2.
Option C is incorrect because mapping the categorical variables into a vector of boolean values
is not a practical way to encode categorical values with high cardinality. With one boolean
element per category, this is plain one-hot encoding: a column with more than 10,000 unique
values becomes a vector with more than 10,000 elements, almost all of them false, which is
expensive to store and process3.
Option D is incorrect because converting each categorical value into a run-length encoded
string is not a useful way to encode categorical values with high cardinality. This method
compresses a string by replacing consecutive repeated characters with the character and the
number of repetitions. For example, “AAAABBBCC” becomes “A4B3C2”. This method does
not reduce the dimensionality of the input space, and does not preserve the semantic meaning
of the categories4.
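The hashing step in Option B can be sketched in plain Python. This is a minimal illustration under stated assumptions: SHA-256 stands in for whatever hash function a framework uses internally, and the bucket count of 1,000 and the category string are hypothetical.

```python
import hashlib


def hash_bucket(value, num_buckets):
    """Map a category string to a stable bucket index in [0, num_buckets)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def one_hot(index, size):
    """Return a list of zeros with a single 1 at the given index."""
    vec = [0] * size
    vec[index] = 1
    return vec


NUM_BUCKETS = 1000  # far fewer than the 10,000+ distinct categories
idx = hash_bucket("product_48213", NUM_BUCKETS)  # hypothetical category value
vec = one_hot(idx, NUM_BUCKETS)

print(idx, len(vec), sum(vec))
```

Because the number of buckets is smaller than the number of categories, some categories share a bucket. That collision rate is the price paid for the fixed, smaller input width.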
Reference: Encoding categorical features
One-hot hash buckets
Boolean vector
Run-length encoding
14. You work for a company that captures live video footage of checkout areas in their retail
stores. You need to use the live video footage to build a model to detect the number of
customers waiting for service in near real time. You want to implement a solution quickly and
with minimal effort.
How should you build the model?
A. Use the Vertex AI Vision Occupancy Analytics model.
B. Use the Vertex AI Vision Person/vehicle detector model.
C. Train an AutoML object detection model on an annotated dataset by using Vertex AutoML.
D. Train a Seq2Seq+ object detection model on an annotated dataset by using Vertex AutoML.
Answer: A
Explanation:
According to the official exam guide1, one of the skills assessed in the exam is to “design,
build, and productionalize ML models to solve business challenges using Google Cloud
technologies”. The Vertex AI Vision Occupancy Analytics model2 is a specialized pre-built
vision model that lets you count people or vehicles given specific inputs you add in video
frames. It provides advanced features such as active zones counting, line crossing counting,
and dwelling detection. This model is suitable for the use case of detecting the number of
customers waiting for service in near real time. You can easily create and deploy an occupancy
analytics application using Vertex AI Vision3. The other options are not relevant or optimal for
this scenario.
Reference: Professional ML Engineer Exam Guide
Occupancy analytics guide
Create an occupancy analytics app with BigQuery forecasting
Google Professional Machine Learning Certification Exam 2023
Latest Google Professional Machine Learning Engineer Actual Free Exam Questions
15. Your task is to classify whether a company logo is present in an image. You find that 96% of
the data does not include a logo, so you are dealing with a data imbalance problem.
Which metric do you use to evaluate the model?
A. F1 Score
B. RMSE
C. F Score with higher precision weighting than recall
D. F Score with higher recall weighted than precision
Answer: A
Explanation:
The F1 score is a metric that combines both precision and recall, and is suitable for evaluating
imbalanced classification problems. Precision measures the fraction of true positives among the
predicted positives, and recall measures the fraction of true positives among the actual
positives. The F1 score is the harmonic mean of precision and recall, and it ranges from 0 to 1,
with higher values indicating better performance. The F1 score is a good metric for imbalanced
data because it balances both the false positives and the false negatives, and does not favor
the majority class over the minority class.
The other options are not good metrics for imbalanced data. RMSE (root mean squared error) is
a metric for regression problems, not classification problems. It is the square root of the average
squared difference between the predicted and the actual values, and is not suitable for binary
outcomes.
F score with higher precision weighting than recall, or F0.5 score, is a metric that gives more
importance to precision than recall. This means that it penalizes false positives more than false
negatives, which is not desirable for imbalanced data where the minority class is more
important. F score with higher recall weighting than precision, or F2 score, is a metric that gives
more importance to recall than precision. This means that it penalizes false negatives more than
false positives, which might be suitable for some imbalanced data problems, but not for the logo
detection problem. In this problem, both false positives and false negatives are equally
important, as we want to accurately identify the presence or absence of a logo in an image.
Therefore, the F1 score is a better metric than the F2 score.
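The F-score variants compared above differ only in the weight beta. The sketch below computes F0.5, F1, and F2 from the same confusion counts. The counts are invented for illustration (1,000 images, 40 with a logo) and are not taken from the question.

```python
def f_beta(precision, recall, beta):
    """F-beta score: beta > 1 favours recall, beta < 1 favours precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical results on 1,000 images, 40 of which contain a logo.
tp, fp, fn = 30, 30, 10
precision = tp / (tp + fp)  # 0.5
recall = tp / (tp + fn)     # 0.75

f_half = f_beta(precision, recall, 0.5)
f1 = f_beta(precision, recall, 1.0)
f2 = f_beta(precision, recall, 2.0)

# A model that always predicts "no logo" is 96% accurate but has F1 = 0.
always_negative_accuracy = 960 / 1000
always_negative_f1 = f_beta(0.0, 0.0, 1.0)

print(f"F0.5={f_half:.3f} F1={f1:.3f} F2={f2:.3f}")
print(always_negative_accuracy, always_negative_f1)
```

The last two lines show why accuracy is misleading on this dataset while the F1 score is not.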
Reference: Tour of Evaluation Metrics for Imbalanced Classification
Metrics for imbalanced data (simply explained)
16. You are training an ML model using data stored in BigQuery that contains several values
that are considered Personally Identifiable Information (PII). You need to reduce the sensitivity
of the dataset before training your model. Every column is critical to your model.
How should you proceed?
A. Using Dataflow, ingest the columns with sensitive data from BigQuery, and then randomize
the values in each sensitive column.
B. Use the Cloud Data Loss Prevention (DLP) API to scan for sensitive data, and use Dataflow
with the DLP API to encrypt sensitive values with Format Preserving Encryption
C. Use the Cloud Data Loss Prevention (DLP) API to scan for sensitive data, and use Dataflow
to replace all sensitive data by using the encryption algorithm AES-256 with a salt.
D. Before training, use BigQuery to select only the columns that do not contain sensitive data.
Create an authorized view of the data so that sensitive values cannot be accessed by
unauthorized individuals.
Answer: B
Explanation:
The best option for reducing the sensitivity of the dataset before training the model is to use the
Cloud Data Loss Prevention (DLP) API to scan for sensitive data, and use Dataflow with the
DLP API to encrypt sensitive values with Format Preserving Encryption. This option allows you
to keep every column in the dataset, while protecting the sensitive data from unauthorized
access or exposure. The Cloud DLP API can detect and classify various types of sensitive data,
such as names, email addresses, phone numbers, credit card numbers, and more1. Dataflow
can create scalable and reliable pipelines to process large volumes of data from BigQuery and
other sources2. Format Preserving Encryption (FPE) is a technique that encrypts sensitive data
while preserving its original format and length, which can help maintain the utility and validity of
the data3. By using Dataflow with the DLP API, you can apply FPE to the sensitive values in the
dataset, and store the encrypted data in BigQuery or another destination. You can also use the
same pipeline to decrypt the data when needed, by using the same encryption key and
method4.
The other options are not as suitable as option B, for the following reasons:
Option A: Using Dataflow to ingest the columns with sensitive data from BigQuery, and then
randomize the values in each sensitive column, would reduce the sensitivity of the data, but
also the utility and accuracy of the data. Randomization is a technique that replaces sensitive
data with random values, which can prevent re-identification of the data, but also distort the
distribution and relationships of the data3. This can affect the performance and quality of the ML
model, especially if every column is critical to the model.
Option C: Using the Cloud DLP API to scan for sensitive data, and use Dataflow to replace all
sensitive data by using the encryption algorithm AES-256 with a salt, would reduce the
sensitivity of the data, but also the utility and validity of the data. AES-256 is a symmetric
encryption algorithm that uses a 256-bit key to encrypt and decrypt data. A salt is a random
value that is added to the data before encryption, to increase the randomness and security of
the encrypted data. However, AES-256 does not preserve the format or length of the original
data, which can cause problems when storing or processing the data. For example, if the
original data is a 10-digit phone number, AES-256 would produce a much longer and different
string, which can break the schema or logic of the dataset3.
Option D: Before training, using BigQuery to select only the columns that do not contain
sensitive data, and creating an authorized view of the data so that sensitive values cannot be
accessed by unauthorized individuals, would reduce the exposure of the sensitive data, but also
the completeness and relevance of the data. An authorized view is a BigQuery view that allows
you to share query results with particular users or groups, without giving them access to the
underlying tables. However, this option assumes that you can identify the columns that do not
contain sensitive data, which may not be easy or accurate. Moreover, this option would remove
some columns from the dataset, which can affect the performance and quality of the ML model,
especially if every column is critical to the model.
Reference: Preparing for Google Cloud Certification: Machine Learning Engineer, Course 5:
Responsible AI, Week 2: Privacy
Google Cloud Professional Machine Learning Engineer Exam Guide, Section 5: Developing
responsible AI solutions, 5.2 Implementing privacy techniques
Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter 9:
Responsible AI, Section 9.4: Privacy
De-identification techniques
Cloud Data Loss Prevention (DLP) API
Dataflow
Using Dataflow and Sensitive Data Protection to securely tokenize and import data from a
relational database to BigQuery
[AES encryption]
[Salt (cryptography)]
[Authorized views]
19. Generate predictions by using a Vertex AI Pipelines model batch predict component
C. 1. Upload your dataset to BigQuery
20. You work for a social media company. You need to detect whether posted images contain
cars. Each training example is a member of exactly one class. You have trained an object
detection neural network and deployed the model version to AI Platform Prediction for
evaluation. Before deployment, you created an evaluation job and attached it to the AI Platform
Prediction model version. You notice that the precision is lower than your business
requirements allow.
How should you adjust the model's final layer softmax threshold to increase precision?
A. Increase the recall
B. Decrease the recall.
C. Increase the number of false positives
D. Decrease the number of false negatives
Answer: B
Explanation:
Precision and recall are two common metrics for evaluating the performance of a classification
model. Precision measures the proportion of positive predictions that are correct, while recall
measures the proportion of positive examples that are correctly predicted. Precision and recall
typically trade off against each other: for a given model, moving the decision threshold to raise
one usually lowers the other. The right balance depends on the goal and the cost of errors in the
classification problem1.
For the use case of detecting whether posted images contain cars, precision is more important
than recall, as the social media company wants to minimize the number of false positives, or
images that are incorrectly labeled as containing cars. A high precision means that the model is
confident and accurate in its positive predictions, while a low recall means that the model may
miss some positive examples, or images that actually contain cars. The cost of missing some
positive examples is lower than the cost of making wrong positive predictions, as the latter may
affect the user experience and the reputation of the social media company.
The softmax function is a function that transforms a vector of real numbers into a probability
distribution over the possible classes. The softmax function is often used as the final layer of a
neural network for multi-class classification problems, as it assigns a probability to each class,
and the class with the highest probability is chosen as the prediction. The softmax function is
defined as: $\mathrm{softmax}(x_i) = \exp(x_i) / \sum_j \exp(x_j)$
where $x_i$ is the input value for class $i$, and $\mathrm{softmax}(x_i)$ is the output probability for class $i$.
The softmax threshold is a parameter that determines the minimum probability that a class must
have to be chosen as the prediction. For example, if the softmax threshold is 0.5, then the class
with the highest probability must have at least 0.5 to be selected, otherwise the prediction is
none. The softmax threshold can be used to adjust the trade-off between precision and recall,
as a higher threshold will increase the precision and decrease the recall, while a lower threshold
will decrease the precision and increase the recall2.
For the use case of detecting whether posted images contain cars, the best way to adjust the
model’s final layer softmax threshold to increase precision is to decrease the recall. This means
that the softmax threshold should be increased, so that the model will only make positive
predictions when it is highly confident, and avoid making false positives. By increasing the
softmax threshold, the model will become more selective and accurate in its positive predictions,
and improve the precision metric. Therefore, decreasing the recall is the best option for this use
case.
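The effect of raising the threshold can be checked on a toy example. The six logit pairs and their labels below are invented for illustration. Class index 1 is assumed to mean "car", and an image is predicted as a car only when its softmax probability reaches the threshold.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def precision_recall(examples, threshold):
    """Predict 'car' when p(car) >= threshold, then score against the labels."""
    tp = fp = fn = 0
    for logits, is_car in examples:
        predicted_car = softmax(logits)[1] >= threshold
        if predicted_car and is_car:
            tp += 1
        elif predicted_car and not is_car:
            fp += 1
        elif not predicted_car and is_car:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical ([not_car_logit, car_logit], true_label) pairs.
examples = [
    ([0.0, 3.0], True),   # p(car) about 0.95
    ([0.0, 2.0], True),   # p(car) about 0.88
    ([0.0, 0.5], True),   # p(car) about 0.62
    ([0.0, 1.0], False),  # p(car) about 0.73
    ([0.0, 0.2], False),  # p(car) about 0.55
    ([2.0, 0.0], False),  # p(car) about 0.12
]

low = precision_recall(examples, threshold=0.5)
high = precision_recall(examples, threshold=0.8)
print("threshold 0.5:", low)
print("threshold 0.8:", high)
```

On this data the higher threshold removes both false positives but also drops one true car image, so precision rises while recall falls.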
Reference: Precision and recall - Wikipedia
How to add a threshold in softmax scores - Stack Overflow
21. During batch training of a neural network, you notice that there is an oscillation in the loss.
How should you adjust your model
Oscillation in the loss during batch to ensure that it converges?
A. Increase the size of the training batch
B. Decrease the size of the training batch
C. Increase the learning rate hyperparameter
D. Decrease the learning rate hyperparameter
Answer: D
Explanation:
training of a neural network means
that the model is overshooting the optimal point of the loss function and
bouncing back and forth. This can prevent the model from converging to the
minimum loss value. One of the main reasons for this phenomenon is that the
learning rate hyperparameter, which controls the size of the steps that the
model takes along the gradient, is too high. Therefore, decreasing the learning
rate hyperparameter can help the model take smaller and more precise steps and
avoid oscillation. This is a common technique to improve the stability and
performance of neural network training [1, 2].
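The overshooting effect can be reproduced on a toy problem. The sketch below assumes a one-dimensional loss f(w) = w**2 rather than a real neural network, and the learning rates 0.9 and 0.1 are arbitrary values chosen for illustration.

```python
def gradient_descent(learning_rate, start=1.0, steps=6):
    """Run plain gradient descent on f(w) = w**2 and return every visited weight."""
    w = start
    path = [w]
    for _ in range(steps):
        gradient = 2 * w                  # derivative of w**2
        w = w - learning_rate * gradient  # standard update rule
        path.append(w)
    return path

too_high = gradient_descent(learning_rate=0.9)  # each step multiplies w by -0.8
lower = gradient_descent(learning_rate=0.1)     # each step multiplies w by 0.8

print([round(w, 3) for w in too_high])
print([round(w, 3) for w in lower])
```

With the higher learning rate the weight jumps across the minimum at w = 0 on every step and changes sign each time, while the lower learning rate approaches the minimum from one side without bouncing.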
Reference: Interpreting Loss Curves
Is learning rate the only reason for training loss oscillation after few epochs?
22. You work for a biotech startup that is experimenting with deep learning ML models based on
properties of biological organisms. Your team frequently works on early-stage experiments with
new architectures of ML models, and writes custom TensorFlow ops in C++. You train your
models on large datasets and large batch sizes. Your typical batch size has 1024 examples,
and each example is about 1 MB in size. The average size of a network with all weights and
embeddings is 20 GB.
What hardware should you choose for your models?
A. A cluster with 2 n1-highcpu-64 machines, each with 8 NVIDIA Tesla V100 GPUs (128 GB
GPU memory in total), and a n1-highcpu-64 machine with 64 vCPUs and 58 GB RAM
B. A cluster with 2 a2-megagpu-16g machines, each with 16 NVIDIA Tesla A100 GPUs (640 GB
GPU memory in total), 96 vCPUs, and 1.4 TB RAM
C. A cluster with an n1-highcpu-64 machine with a v2-8 TPU and 64 GB RAM
D. A cluster with 4 n1-highcpu-96 machines, each with 96 vCPUs and 86 GB RAM
Answer: B
Explanation:
The best hardware to choose for your models is a cluster with 2 a2-megagpu-16g machines,
each with 16 NVIDIA Tesla A100 GPUs (640 GB GPU memory in total), 96 vCPUs, and 1.4 TB
RAM. This hardware configuration can provide you with enough compute power, memory, and
bandwidth to handle your large and complex deep learning models, as well as your custom
TensorFlow ops in C++. The NVIDIA Tesla A100 GPUs are the latest and most advanced GPUs
from NVIDIA, which offer high performance, scalability, and efficiency for various ML workloads.
They also support multi-instance GPU (MIG) technology, which allows you to partition each
GPU into up to seven smaller instances, each with its own memory, cache, and compute cores.
This can enable you to run multiple experiments in parallel, or to optimize the resource
utilization and cost efficiency of your models. The a2-megagpu-16g machines are part of the
Google Cloud Accelerator-Optimized VM (A2) family, which are designed to provide the best
performance and flexibility for GPU-intensive applications. They also offer high-speed NVLink
interconnects between the GPUs, which can improve the data transfer and communication
between the GPUs. Moreover, the a2-megagpu-16g machines have 96 vCPUs and 1.4 TB
RAM, which can support the CPU and memory requirements of your models, as well as the data
preprocessing and postprocessing tasks. The other options are not optimal for the following
reasons:
A. A cluster with 2 n1-highcpu-64 machines, each with 8 NVIDIA Tesla V100 GPUs (128 GB
GPU memory in total), and a n1-highcpu-64 machine with 64 vCPUs and 58 GB RAM is not a
good option, as it has less GPU memory, compute power, and bandwidth than the
a2-megagpu-16g machines. The NVIDIA Tesla V100 GPUs are the previous generation of
GPUs from NVIDIA, which have lower performance, scalability, and efficiency than the NVIDIA
Tesla A100 GPUs. They also do not support the MIG technology, which can limit the flexibility
and optimization of your models. Moreover, the n1-highcpu-64 machines are part of the Google
Cloud N1 VM family, which are general-purpose VMs that do not offer the best performance and
features for GPU-intensive applications. They also have fewer vCPUs and less RAM than the
a2-megagpu-16g machines, which makes it harder to meet the CPU and memory requirements of
your models, as well as the data preprocessing and postprocessing tasks.
C. A cluster with an n1-highcpu-64 machine with a v2-8 TPU and 64 GB RAM is not a good
option, mainly because Cloud TPUs are not suited to workloads that contain custom TensorFlow
ops written in C++, which your team relies on. The v2-8 TPU is a Cloud Tensor Processing Unit
(TPU) device, which is a custom ASIC chip designed by Google to accelerate ML workloads.
However, the v2-8 TPU belongs to the second generation of TPUs, which has lower performance
than later generations such as the v3-8. It also has less accelerator memory and bandwidth
than the NVIDIA Tesla A100 GPUs, which can limit the size and complexity of your models, as
well as the data transfer and communication between the devices. Moreover, the n1-highcpu-64
machine has fewer vCPUs and less RAM than the a2-megagpu-16g machines, which makes it
harder to meet the CPU and memory requirements of your models, as well as the data
preprocessing and postprocessing tasks.
D. A cluster with 4 n1-highcpu-96 machines, each with 96 vCPUs and 86 GB RAM is not a good
option, as it does not have any GPUs, which are essential for accelerating deep learning
models. The n1-highcpu-96 machines are part of the Google Cloud N1 VM family, which are
general-purpose VMs that do not offer the best performance and features for GPU-intensive
applications. They also have less RAM than the a2-megagpu-16g machines, which makes it
harder to meet the memory requirements of your models, as well as the data preprocessing and
postprocessing tasks.
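The memory figures in the question can be checked with some back-of-the-envelope arithmetic. The sketch below is a rough estimate only: it assumes the GPU memory totals quoted in options A and B are per machine, treats 1 GB as 1024 MB, and ignores activations, gradients, and optimizer state, all of which need additional memory in practice.

```python
# Figures taken from the question text.
batch_examples = 1024
example_size_mb = 1
model_size_gb = 20

# One training batch: 1024 examples x 1 MB each.
batch_size_gb = batch_examples * example_size_mb / 1024

# GPU counts and memory totals as quoted in options A and B
# (assumed here to be per machine).
options = {
    "A (V100)": {"gpus": 8, "total_gpu_memory_gb": 128},
    "B (A100)": {"gpus": 16, "total_gpu_memory_gb": 640},
}

def per_gpu_memory_gb(option):
    """Memory available on a single GPU of the given option."""
    return option["total_gpu_memory_gb"] / option["gpus"]

def fits_on_one_gpu(option):
    """True if the model weights plus one full batch fit on a single GPU."""
    return per_gpu_memory_gb(option) >= model_size_gb + batch_size_gb

for name, option in options.items():
    print(name, per_gpu_memory_gb(option), fits_on_one_gpu(option))
```

Under these assumptions a 20 GB network does not fit in the memory of a single GPU in option A without splitting the model across devices, whereas a single GPU in option B can hold it with room to spare.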
Reference: Professional ML Engineer Exam Guide
Preparing for Google Cloud Certification: Machine Learning Engineer Professional Certificate
Google Cloud launches machine learning engineer certification
NVIDIA Tesla A100 GPU
Google Cloud Accelerator-Optimized VM (A2) family
Google Cloud N1 VM family
Cloud TPU
23. You work for a hotel and have a dataset that contains customers' written comments
scanned from paper-based customer feedback forms, which are stored as PDF files. Every form
has the same layout. You need to quickly predict an overall satisfaction score from the customer
comments on each form.
How should you accomplish this task?
A. Use the Vision API to parse the text from each PDF file. Use the Natural Language API
analyzeSentiment feature to infer overall satisfaction scores.
B. Use the Vision API to parse the text from each PDF file. Use the Natural Language API
analyzeEntitySentiment feature to infer overall satisfaction scores.
C. Uptrain a Document AI custom extractor to parse the text in the comments section of each
PDF file. Use the Natural Language API analyzeSentiment feature to infer overall satisfaction
scores.
D. Uptrain a Document AI custom extractor to parse the text in the comments section of each
PDF file. Use the Natural Language API analyzeEntitySentiment feature to infer overall
satisfaction scores.
Answer: C
Explanation:
According to the official exam guide [1], one of the skills assessed in the exam is to “design,
build, and productionalize ML models to solve business challenges using Google Cloud
technologies”. Document AI [2] is a document understanding platform that takes unstructured
data from documents and transforms it into structured data, making it easier to understand,
analyze, and consume. Document AI Workbench [3] allows you to create custom extractors to
parse the text in specific sections of your documents. Natural Language API [4] is a service that
provides natural language understanding technologies, such as sentiment analysis, entity
analysis, and other text annotations. The analyzeSentiment feature [5] inspects the given text and
identifies the prevailing emotional opinion within the text, especially to determine a writer’s
attitude as positive, negative, or neutral. Therefore, option C is the best way to accomplish the
task of predicting an overall satisfaction score from the customer comments on each form. The
other options are not relevant or optimal for this scenario.
Reference: Professional ML Engineer Exam Guide
Document AI
Document AI Workbench
Natural Language API
Sentiment analysis
Google Professional Machine Learning Certification Exam 2023
Latest Google Professional Machine Learning Engineer Actual Free Exam Questions
24. Dispatch an appropriately sized shuttle and provide the map with the required stops based
on the prediction.
B. 1. Build a tree-based classification model that predicts whether the shuttle should pick up
passengers at each shuttle station.
25. You work on a growing team of more than 50 data scientists who all use AI Platform. You
are designing a strategy to organize your jobs, models, and versions in a clean and scalable
way.
Which strategy should you choose?
A. Set up restrictive IAM permissions on the AI Platform notebooks so that only a single user or
group can access a given instance.
B. Separate each data scientist's work into a different project to ensure that the jobs, models,
and versions created by each data scientist are accessible only to that user.
C. Use labels to organize resources into descriptive categories. Apply a label to each created
resource so that users can filter the results by label when viewing or monitoring the resources.
D. Set up a BigQuery sink for Cloud Logging logs that is appropriately filtered to capture
information about AI Platform resource usage. In BigQuery, create a SQL view that maps users
to the resources they are using.
Answer: C
Explanation:
Labels are key-value pairs that can be attached to any AI Platform resource, such as jobs,
models, versions, or endpoints [1]. Labels can help you organize your resources into descriptive
categories, such as project, team, environment, or purpose. You can use labels to filter the
results when you list or monitor your resources, or to group them for billing or quota purposes [2].
Using labels is a simple and scalable way to manage your AI Platform resources without
creating unnecessary complexity or overhead. Therefore, using labels to organize resources is
the best strategy for this use case.
Reference: Using labels
Filtering and grouping by labels
26. Submit a Vertex AI batch prediction job that uses your trained model in Cloud Storage to
perform scoring on the preprocessed data.
27. You recently joined a machine learning team that will soon release a new project. As a lead
on the project, you are asked to determine the production readiness of the ML components. The
team has already tested features and data, model development, and infrastructure.
Which additional readiness check should you recommend to the team?
A. Ensure that training is reproducible
B. Ensure that all hyperparameters are tuned
C. Ensure that model performance is monitored
D. Ensure that feature expectations are captured in the schema
Answer: C
Explanation:
Monitoring model performance is an essential part of production readiness, as it allows the team
to detect and address any issues that may arise after deployment, such as data drift, model
degradation, or errors.
Other Options:
A. Ensuring that training is reproducible is important for model development, but not necessarily
for production readiness. Reproducibility helps the team to track and compare different
experiments, but it does not guarantee that the model will perform well in production.
B. Ensuring that all hyperparameters are tuned is also important for model development, but not
sufficient for production readiness. Hyperparameter tuning helps the team to find the optimal
configuration for the model, but it does not account for the dynamic and changing nature of the
production environment.
D. Ensuring that feature expectations are captured in the schema is a part of testing features
and data, which the team has already done. The schema defines the expected format, type, and
range of the features, and helps the team to validate and preprocess the data.
28. You work at a gaming startup that has several terabytes of structured data in Cloud Storage.
This data includes gameplay time data, user metadata, and game metadata. You want to build
a model that recommends new games to users that requires the least amount of coding.
What should you do?
A. Load the data in BigQuery. Use BigQuery ML to train an Autoencoder model.
B. Load the data in BigQuery. Use BigQuery ML to train a matrix factorization model.
C. Read data to a Vertex AI Workbench notebook. Use TensorFlow to train a two-tower model.
D. Read data to a Vertex AI Workbench notebook. Use TensorFlow to train a matrix
factorization model.
Answer: B
Explanation:
The best option to build a game recommendation model with the least amount of coding is to
use BigQuery ML, which allows you to create and execute machine learning models using
standard SQL queries. BigQuery ML supports several types of models, including matrix
factorization, which is a common technique for collaborative filtering-based recommendation
systems. Matrix factorization models learn latent factors for users and items from the observed
ratings, and then use them to predict the ratings for new user-item pairs. BigQuery ML provides
a built-in function called ML.RECOMMEND that can generate recommendations for a given user
based on a trained matrix factorization model. To use BigQuery ML, you need to load the data
in BigQuery, which is a serverless, scalable, and cost-effective data warehouse. You can use
the bq command-line tool, the BigQuery API, or the Cloud Console to load data from Cloud
Storage to BigQuery. Alternatively, you can use external tables to query data directly from
Cloud Storage without loading it to BigQuery, but this may incur additional costs and
performance overhead.
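To make the idea of latent factors concrete, the following is a minimal sketch of matrix factorization in plain Python. It uses stochastic gradient descent purely for brevity and is not a description of how BigQuery ML trains its models. The ratings, the number of factors, and all hyperparameters are hypothetical values chosen for illustration.

```python
import random

# Hypothetical (user, game, rating) triples, e.g. derived from gameplay time.
ratings = [
    (0, 0, 5.0), (0, 1, 3.0),
    (1, 0, 4.0), (1, 2, 1.0),
    (2, 1, 2.0), (2, 2, 5.0),
    (3, 0, 5.0), (3, 2, 1.0),
]
NUM_USERS, NUM_GAMES, NUM_FACTORS = 4, 3, 2

def init_factors(seed=0):
    """Create small random latent factors for every user and every game."""
    rng = random.Random(seed)
    users = [[rng.uniform(0.1, 0.5) for _ in range(NUM_FACTORS)] for _ in range(NUM_USERS)]
    games = [[rng.uniform(0.1, 0.5) for _ in range(NUM_FACTORS)] for _ in range(NUM_GAMES)]
    return users, games

def predict(users, games, user, game):
    """Predicted rating is the dot product of the two latent vectors."""
    return sum(u * g for u, g in zip(users[user], games[game]))

def rmse(users, games):
    """Root mean squared error over the observed ratings."""
    squared = [(r - predict(users, games, u, g)) ** 2 for u, g, r in ratings]
    return (sum(squared) / len(squared)) ** 0.5

def train(users, games, epochs=500, learning_rate=0.05, regularization=0.01):
    """Update the factors in place with stochastic gradient descent."""
    for _ in range(epochs):
        for user, game, rating in ratings:
            error = rating - predict(users, games, user, game)
            for k in range(NUM_FACTORS):
                u, g = users[user][k], games[game][k]
                users[user][k] += learning_rate * (error * g - regularization * u)
                games[game][k] += learning_rate * (error * u - regularization * g)

users, games = init_factors()
print("RMSE before training:", round(rmse(users, games), 3))
train(users, games)
print("RMSE after training:", round(rmse(users, games), 3))
# Score a user-game pair that was never observed (user 0, game 2).
print("Predicted rating:", round(predict(users, games, 0, 2), 3))
```

In BigQuery ML the same steps are expressed in SQL, so none of this training code has to be written or maintained, which is why option B needs the least coding.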
Option A is incorrect because Autoencoder models, which are a type of neural network that
learns compressed representations of the input data, are mainly used for tasks such as anomaly
detection and dimensionality reduction. Although BigQuery ML supports an autoencoder model
type, it is not the natural choice for recommendation systems, as it does not directly model
the interactions between users and items.
Option C is incorrect because using TensorFlow to train a two-tower model requires more
coding than using BigQuery ML. A two-tower model is a type of neural network that learns
embeddings for users and items separately, and then combines them with a dot product or a
cosine similarity to compute the rating. TensorFlow is a low-level framework that requires you to
define the model architecture, the loss function, the optimizer, the training loop, and the
evaluation metrics. Moreover, you need to read the data from Cloud Storage to a Vertex AI
Workbench notebook, which is an instance of JupyterLab that runs on a Google Cloud virtual
machine. This may involve additional steps such as authentication, authorization, and data
preprocessing.
Option D is incorrect because using TensorFlow to train a matrix factorization model also
requires more coding than using BigQuery ML. Although TensorFlow provides some high-level
APIs such as Keras and TensorFlow Recommenders that can simplify the model development,
you still need to handle the data loading and the model training and evaluation yourself.
Furthermore, you need to read the data from Cloud Storage to a Vertex AI Workbench
notebook, which may incur additional complexity and costs.
Reference: BigQuery ML documentation
Using matrix factorization with BigQuery ML
Recommendations AI documentation
Loading data into BigQuery
Querying data in Cloud Storage from BigQuery
Vertex AI Workbench documentation
TensorFlow documentation
TensorFlow Recommenders documentation
31. You are an ML engineer responsible for designing and implementing training pipelines for
ML models. You need to create an end-to-end training pipeline for a TensorFlow model. The
TensorFlow model will be trained on several terabytes of structured data. You need the pipeline
to include data quality checks before training and model quality checks after training but prior to
deployment. You want to minimize development time and the need for infrastructure
maintenance.
How should you build and orchestrate your training pipeline?
A. Create the pipeline using Kubeflow Pipelines domain-specific language (DSL) and predefined
Google Cloud components. Orchestrate the pipeline using Vertex AI Pipelines.
B. Create the pipeline using TensorFlow Extended (TFX) and standard TFX components.
Orchestrate the pipeline using Vertex AI Pipelines.
C. Create the pipeline using Kubeflow Pipelines domain-specific language (DSL) and predefined
Google Cloud components. Orchestrate the pipeline using Kubeflow Pipelines deployed on
Google Kubernetes Engine.
D. Create the pipeline using TensorFlow Extended (TFX) and standard TFX components.
Orchestrate the pipeline using Kubeflow Pipelines deployed on Google Kubernetes Engine.
Answer: B
Explanation:
The best option for creating and orchestrating an end-to-end training pipeline for a TensorFlow
model is to use TensorFlow Extended (TFX) and standard TFX components, and deploy the
pipeline to Vertex AI Pipelines. TFX is an end-to-end platform for deploying production ML
pipelines, which consists of several built-in components that cover the entire ML lifecycle, from
data ingestion and validation, to model training and evaluation, to model deployment and
monitoring. TFX also supports custom components and integrations with other Google Cloud
services, such as BigQuery, Dataflow, and Cloud Storage. Vertex AI Pipelines is a fully
managed service that allows you to run TFX pipelines on Google Cloud, without having to worry
about infrastructure provisioning, scaling, or maintenance. Vertex AI Pipelines also provides a
user-friendly interface to monitor and manage your pipelines, as well as tools to track and
compare experiments. The other options are not as suitable for creating and orchestrating an
end-to-end training pipeline for a TensorFlow model, because:
Creating the pipeline using Kubeflow Pipelines domain-specific language (DSL) and predefined
Google Cloud components would require more development time and effort, as you would have
to assemble the data quality and model quality checks yourself, whereas standard TFX
components such as ExampleValidator and Evaluator already provide them. Predefined Google
Cloud components might not cover all the stages of the ML lifecycle, and might not be
optimized for TensorFlow models.
Orchestrating the pipeline using Kubeflow Pipelines deployed on Google Kubernetes Engine
would require more infrastructure maintenance, as Kubeflow Pipelines is not a fully managed
service, and you would have to provision and manage your own Kubernetes cluster. This would
also incur more costs, as you would have to pay for the cluster resources, regardless of the
pipeline usage.
Reference: TFX | ML Production Pipelines | TensorFlow
Vertex AI Pipelines | Google Cloud
Kubeflow Pipelines | Google Cloud
Google Cloud launches machine learning engineer certification
Google Professional Machine Learning Engineer Certification
Professional ML Engineer Exam Guide
32. The website will send a Pub/Sub message with the relevant data and then receive a
message with the prediction from Pub/Sub.
33. Reference tf .data.TFRecordDataset in the training script.
34. You work for a retailer that sells clothes to customers around the world. You have been
tasked with ensuring that ML models are built in a secure manner. Specifically, you need to
protect sensitive customer data that might be used in the models. You have identified four fields
containing sensitive data that are being used by your data science team: AGE,
IS_EXISTING_CUSTOMER, LATITUDE_LONGITUDE, and SHIRT_SIZE.
What should you do with the data before it is made available to the data science team for
training purposes?
A. Tokenize all of the fields using hashed dummy values to replace the real values.
B. Use principal component analysis (PCA) to reduce the four sensitive fields to one PCA
vector.
C. Coarsen the data by putting AGE into quantiles and rounding LATITUDE_LONGITUDE into
single precision. The other two fields are already as coarse as possible.
D. Remove all sensitive data fields, and ask the data science team to build their models using
non-sensitive data.
Answer: C
Explanation:
The best option for protecting sensitive customer data that might be used in the ML models is to
coarsen the data by putting AGE into quantiles and rounding LATITUDE_LONGITUDE into
single precision.
This option has the following advantages:
It preserves the utility and relevance of the data for the ML models, as the coarsened data still
captures the essential information and patterns that the models need to learn. For example,
putting AGE into quantiles can group the customers into different age ranges, which can be
useful for predicting their preferences or behavior. Rounding LATITUDE_LONGITUDE into
single precision can reduce the precision of the location data, but still retain the general
geographic region of the customers, which can be useful for personalizing the recommendations
or offers.
It reduces the risk of exposing the personal or private information of the customers, as the
coarsened data makes it harder to identify or re-identify the individual customers from the data.
For example, putting AGE into quantiles can hide the exact age of the customers, which can be
considered sensitive or confidential. Rounding LATITUDE_LONGITUDE into single precision
can obscure the exact location of the customers, which can be considered sensitive or
confidential.
The other options are less optimal for the following reasons:
Option A: Tokenizing all of the fields using hashed dummy values to replace the real values
eliminates the utility and relevance of the data for the ML models, as the tokenized data loses
the information and patterns that the models need to learn. For example, hashed tokens for
AGE or LATITUDE_LONGITUDE no longer preserve order or distance, so the models cannot
learn that two customers are close in age or location.
Option B: Using principal component analysis (PCA) to reduce the four sensitive fields to one
PCA vector reduces the utility and relevance of the data for the ML models, as the PCA vector
may not capture all the information and patterns that the models need to learn. For example,
using PCA to reduce AGE, IS_EXISTING_CUSTOMER, LATITUDE_LONGITUDE, and
SHIRT_SIZE to one PCA vector can lose some information or introduce noise in the data, as
the PCA vector is a linear combination of the original features, which may not reflect their true
relationship or importance. Moreover, using PCA to reduce the four sensitive fields to one PCA
vector may not reduce the risk of exposing the personal or private information of the customers,
as the PCA vector may still be reversible or linkable to the original data, depending on the
amount of variance explained by the PCA vector and the availability of the PCA transformation
matrix.
Option D: Removing all sensitive data fields, and asking the data science team to build their
models using non-sensitive data reduces the utility and relevance of the data for the ML models,
as the non-sensitive data may not contain enough information and patterns that the models
need to learn.
For example, removing AGE, IS_EXISTING_CUSTOMER, LATITUDE_LONGITUDE, and
SHIRT_SIZE from the data can make the data insufficient and unrepresentative, as the models
may not be able to learn the factors that influence the customers’ preferences or behavior.
Moreover, removing all sensitive data fields from the data may not be necessary or feasible, as
the data protection legislation may allow the use of sensitive data for the ML models, as long as
the data is processed in a secure and ethical manner, and the customers’ consent and rights
are respected.
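The coarsening described in option C can be sketched as follows. This is an illustration under stated assumptions, not a prescribed method: the helper names and sample values are hypothetical, the number of quantile buckets is arbitrary, and "rounding" is interpreted here as keeping one decimal place of the coordinates (roughly 11 km of latitude).

```python
def quantile_bucket(values, num_buckets=4):
    """Replace each value with the index (0..num_buckets-1) of its quantile bucket."""
    ranked = sorted(values)
    n = len(ranked)
    # Bucket boundaries are the values found at 1/k, 2/k, ... of the sorted data.
    cuts = [ranked[(n * i) // num_buckets] for i in range(1, num_buckets)]
    return [sum(v >= c for c in cuts) for v in values]

def coarsen_location(lat_lon, decimals=1):
    """Round a (latitude, longitude) pair to fewer decimal places."""
    lat, lon = lat_lon
    return (round(lat, decimals), round(lon, decimals))

# Hypothetical customer records.
ages = [23, 35, 41, 52, 29, 67, 18, 44]
locations = [(37.4219983, -122.084), (51.5073509, -0.1277583)]

print(quantile_bucket(ages))
print([coarsen_location(loc) for loc in locations])
```

The data science team then sees only the bucket index and the rounded coordinates, which still carry the age group and the general region of each customer.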
Reference: Protecting Sensitive Data and AI Models with Confidential Computing | NVIDIA
Technical Blog
Training machine learning models from sensitive data | Fast Data Science
Securing ML applications. Model security and protection - Medium
Security of AI/ML systems, ML model security | Cossack Labs
Vulnerabilities, security and privacy for machine learning models
35. You are developing an ML model that uses sliced frames from video feed and creates
bounding boxes around specific objects. You want to automate the following steps in your
training pipeline: ingestion and preprocessing of data in Cloud Storage, followed by training and
hyperparameter tuning of the object model using Vertex AI jobs, and finally deploying the model
to an endpoint. You want to orchestrate the entire pipeline with minimal cluster management.
What approach should you use?
A. Use Kubeflow Pipelines on Google Kubernetes Engine.
B. Use Vertex AI Pipelines with TensorFlow Extended (TFX) SDK.
C. Use Vertex AI Pipelines with Kubeflow Pipelines SDK.
D. Use Cloud Composer for the orchestration.
Answer: B
Explanation:
Option A is incorrect because using Kubeflow Pipelines on Google Kubernetes Engine is not the
most convenient way to orchestrate the entire pipeline with minimal cluster management.
Kubeflow Pipelines is an open-source platform that allows you to build, run, and manage ML
pipelines using containers [1]. Google Kubernetes Engine is a service that allows you to create
and manage clusters of virtual machines that run Kubernetes, an open-source system for
orchestrating containerized applications [2]. However, this option requires more effort and
resources than option B, as it involves creating and configuring the clusters, installing and
maintaining Kubeflow Pipelines, and writing and running the pipeline code.
Option B is correct because using Vertex AI Pipelines with TensorFlow Extended (TFX) SDK is
the best way to orchestrate the entire pipeline with minimal cluster management. Vertex AI
Pipelines is a service that allows you to create and run scalable and portable ML pipelines on
Google Cloud [3]. TensorFlow Extended (TFX) is a framework that provides a set of components
and libraries for building production-ready ML pipelines using TensorFlow [4]. You can use Vertex
AI Pipelines with TFX SDK to ingest and preprocess the data in Cloud Storage, train and tune
the object model using Vertex AI jobs, and deploy the model to an endpoint, using predefined or
custom components. Vertex AI Pipelines handles the underlying infrastructure and orchestration
for you, so you don’t need to worry about cluster management or scalability.
Option C is incorrect because using Vertex AI Pipelines with Kubeflow Pipelines SDK is not the
most suitable way to orchestrate the entire pipeline with minimal cluster management. Kubeflow
Pipelines SDK is a library that allows you to build and run ML pipelines using Kubeflow
Pipelines [5]. You can use Vertex AI Pipelines with Kubeflow Pipelines SDK to create and run ML
pipelines on Google Cloud, using containers. However, this option is less convenient and
consistent than option B, as it requires you to use different APIs and tools for different steps of
the pipeline, such as Vertex AI SDK for training and deployment, and Kubeflow Pipelines SDK
for ingestion and preprocessing. Moreover, this option does not leverage the benefits of TFX,
such as the standard components, the metadata store, or the ML Metadata library.
Option D is incorrect because using Cloud Composer for the orchestration is not the most
efficient way to orchestrate the entire pipeline with minimal cluster management. Cloud
Composer is a service that allows you to create and run workflows using Apache Airflow, an
open-source platform for orchestrating complex tasks. You can use Cloud Composer to
orchestrate the entire pipeline, by creating and managing DAGs (directed acyclic graphs) that
define the dependencies and order of the tasks. However, this option is more complex and
costly than option B, as it involves creating and configuring the environments, installing and
maintaining Airflow, and writing and running the DAGs.
Reference: Kubeflow Pipelines documentation
Google Kubernetes Engine documentation
Vertex AI Pipelines documentation
TensorFlow Extended documentation
Kubeflow Pipelines SDK documentation
Cloud Composer documentation
Vertex AI documentation
Cloud Storage documentation
TensorFlow documentation
40. You are an ML engineer at a global shoe store. You manage the ML models for the
company's website. You are asked to build a model that will recommend new products to the
user based on their purchase behavior and similarity with other users.
What should you do?
A. Build a classification model
B. Build a knowledge-based filtering model
C. Build a collaborative-based filtering model
D. Build a regression model using the features as predictors
Answer: C
Explanation:
A recommender system is a type of machine learning system that suggests relevant items to
users based on their preferences and behavior. Recommender systems are widely used in
e-commerce, media, and entertainment industries to enhance user experience and increase
revenue [1].
There are different types of recommender systems that use different filtering methods to
generate recommendations.
The most common types are:
Content-based filtering: This method uses the features of the items and the users to find the
similarity between them. For example, a content-based recommender system for movies may
use the genre, director, cast, and ratings of the movies, and the preferences, demographics,
and history of the users, to recommend movies that are similar to the ones the user liked
before [2].
Collaborative filtering: This method uses the feedback and ratings of the users to find
the similarity between them and the items. For example, a collaborative filtering recommender
system for books may use the ratings of the users for different books, and recommend books
that are liked by other users who have similar ratings to the target user [3].
Hybrid method: This method combines content-based and collaborative filtering methods to
overcome the limitations of each method and improve the accuracy and diversity of the
recommendations. For example, a hybrid recommender system for music may use both the
features of the songs and the artists, and the ratings and listening habits of the users, to
recommend songs that match the user’s taste and preferences [4].
Deep learning-based: This method uses deep neural networks to learn complex and non-linear
patterns from the data and generate recommendations. Deep learning-based recommender
systems can handle large-scale and high-dimensional data, and incorporate various types of
information, such as text, images, audio, and video. For example, a deep learning-based
recommender system for fashion may use the images and descriptions of the products, and the
profiles and feedback of the users, to recommend products that suit the user’s style and
preferences.
For the use case of building a model that will recommend new products to the user based on
their purchase behavior and similarity with other users, the best option is to build a
collaborative-based filtering model. This is because collaborative filtering can leverage the
implicit feedback and ratings of the users to find the items that are most likely to interest them.
Collaborative filtering can also help discover new products that the user may not be aware of,
and increase the diversity and serendipity of the recommendations [3].
The other options are not as suitable for this use case. Building a classification model or a
regression model using the features as predictors is not a good idea, as these models are not
designed for recommendation tasks, and may not capture the preferences and behavior of the
users. Building a knowledge-based filtering model is not relevant, as this method uses the
explicit knowledge and requirements of the users to find the items that meet their criteria, and
does not rely on the purchase behavior or similarity with other users.
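A user-based variant of collaborative filtering can be sketched in a few lines. The example below is a simplified illustration, not a production recommender: the purchase histories are invented, and Jaccard similarity over purchased items is just one of several similarity measures that could be used.

```python
# Hypothetical purchase histories: user -> set of purchased shoe models.
purchases = {
    "ana": {"runner", "sandal", "boot"},
    "ben": {"runner", "sandal", "loafer"},
    "cho": {"boot", "heel"},
    "target": {"runner", "sandal"},
}

def jaccard(a, b):
    """Similarity of two purchase sets: shared items divided by all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(user, purchases, top_n=2):
    """Rank products the user has not bought by the similarity of the users who did."""
    own = purchases[user]
    scores = {}
    for other, items in purchases.items():
        if other == user:
            continue
        similarity = jaccard(own, items)
        for item in items - own:
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]

print(recommend("target", purchases))
```

The target user is recommended products bought by the users whose purchase history overlaps most with their own, which is exactly the "similarity with other users" signal named in the question.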
Reference: 1: Recommender system
2: Content-based filtering
3: Collaborative filtering
4: Hybrid recommender system
Deep learning for recommender systems
Knowledge-based recommender system
45. You have a demand forecasting pipeline in production that uses Dataflow to preprocess raw
data prior to model training and prediction. During preprocessing, you employ Z-score
normalization on data stored in BigQuery and write it back to BigQuery. New training data is
added every week. You want to make the process more efficient by minimizing computation
time and manual intervention.
What should you do?
A. Normalize the data using Google Kubernetes Engine
B. Translate the normalization algorithm into SQL for use with BigQuery
C. Use the normalizer_fn argument in TensorFlow's Feature Column API
D. Normalize the data with Apache Spark using the Dataproc connector for BigQuery
Answer: B
Explanation:
Z-score normalization is a technique that transforms the values of a numeric variable into
standardized units, such that the mean is zero and the standard deviation is one. Z-score
normalization makes variables with different scales and ranges comparable. It is a linear
rescaling, so it does not remove outliers or skewness: an extreme value stays extreme.
The formula for z-score normalization is:
z = (x - mu) / sigma
where x is the original value, mu is the mean of the variable, and sigma is the standard
deviation of the variable.
Dataflow is a service that allows you to create and run data processing pipelines on Google
Cloud. You can use Dataflow to preprocess raw data prior to model training and prediction, such
as applying z-score normalization on data stored in BigQuery. However, using Dataflow for this
task may not be the most efficient option, as it involves reading the data out of BigQuery and
writing it back, which can be time-consuming and costly. Moreover, a separate pipeline has to be
triggered and maintained each time new training data is added.
A more efficient way to perform z-score normalization on data stored in BigQuery is to translate
the normalization algorithm into SQL and use it with BigQuery. BigQuery is a service that allows
you to analyze large-scale and complex data using SQL queries. You can use BigQuery to
perform z-score normalization on your data using the aggregate functions AVG() and
STDDEV_POP(), each applied as a window function with an OVER () clause. For example, the
following SQL query can normalize the values of a column called temperature in a table called
weather:
SELECT (temperature - AVG(temperature) OVER ()) / STDDEV_POP(temperature) OVER ()
AS normalized_temperature FROM weather;
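By using SQL to perform z-score normalization in BigQuery, you keep the data in place: there is
no export to a separate processing service and no write-back step, which minimizes computation
time. The query can also be run on a schedule (for example, as a BigQuery scheduled query), so
each week's new training data is normalized without manual intervention. You can also leverage
the scalability and performance of BigQuery to handle large and complex datasets. Therefore,
translating the normalization algorithm into SQL for use with BigQuery is the best option for this
use case.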
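As a cross-check of the formula, the same computation can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not part of the exam answer: the function
name z_score_normalize and the sample temperatures are made up, and the population standard
deviation is used to match STDDEV_POP() in the query.

```python
from statistics import fmean, pstdev


def z_score_normalize(values):
    """Return z-scores using the population standard deviation (like STDDEV_POP)."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so the z-score is undefined; return zeros.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


# Hypothetical sample data standing in for the temperature column.
temperatures = [10.0, 20.0, 30.0]
normalized = z_score_normalize(temperatures)
print(normalized)
```

The normalized values have mean 0 and population standard deviation 1. The zero-variance guard
is a design choice of this sketch; the SQL version would need its own guard for a constant
column.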
46. You have been tasked with deploying prototype code to production. The feature engineering
code is in PySpark and runs on Dataproc Serverless. The model training is executed by using a
Vertex AI custom training job. The two steps are not connected, and the model training must
currently be run manually after the feature engineering step finishes. You need to create a
scalable and maintainable production process that runs end-to-end and tracks the connections
between steps.
What should you do?
A. Create a Vertex AI Workbench notebook. Use the notebook to submit the Dataproc
Serverless feature engineering job. Use the same notebook to submit the custom model training
job. Run the notebook cells sequentially to tie the steps together end-to-end.
B. Create a Vertex AI Workbench notebook. Initiate an Apache Spark context in the notebook,
and run the PySpark feature engineering code. Use the same notebook to run the custom model
training job in TensorFlow. Run the notebook cells sequentially to tie the steps together
end-to-end.
C. Use the Kubeflow pipelines SDK to write code that specifies two components:
- The first is a Dataproc Serverless component that launches the feature engineering job.
- The second is a custom component wrapped in the create_custom_training_job_from_component
utility that launches the custom model training job.
Create a Vertex AI Pipelines job to link and run both components.
D. Use the Kubeflow pipelines SDK to write code that specifies two components:
- The first component initiates an Apache Spark context that runs the PySpark feature
engineering code.
- The second component runs the TensorFlow custom model training code.
Create a Vertex AI Pipelines job to link and run both components.
Answer: C
Explanation:
The best option for creating a scalable and maintainable production process that runs
end-to-end and tracks the connections between steps is to use the Kubeflow pipelines SDK to
write code that specifies two components. The first is a Dataproc Serverless component that
launches the feature engineering job. The second is a custom component wrapped in the
create_custom_training_job_from_component utility that launches the custom model training
job. The pipeline is then submitted to Vertex AI Pipelines for execution.
Kubeflow pipelines is a platform that can build, deploy, and manage machine learning pipelines
on Kubernetes. It can help you create reusable and scalable pipelines, experiment with
different pipeline versions and parameters, and monitor and debug your pipelines. The Kubeflow
pipelines SDK is a set of Python packages for defining pipeline components, specifying pipeline
parameters and inputs, and creating pipeline steps and tasks.
A component is a self-contained set of code that performs one step in a pipeline, such as data
preprocessing, model training, or model evaluation. A component can be created from a Python
function, a container image, or a prebuilt component. A custom component is a component that
is not provided by Kubeflow pipelines, but is created by the user to perform a specific task.
Wrapping it in the create_custom_training_job_from_component utility makes it run as a Vertex
AI custom training job, which is a resource that runs your custom training code on Vertex AI
and can train many types of models.
With this design, you write code that defines the two components, their inputs and outputs, and
their dependencies. You then create a pipeline that runs the two components in sequence, and
submit the pipeline to Vertex AI Pipelines for execution. The Dataproc Serverless component
runs your PySpark feature engineering code on Dataproc Serverless, which is a service that can
run Spark batch workloads without provisioning and managing your own cluster. The wrapped
custom component runs your custom model training code on Vertex AI, which is a unified
platform for building and deploying machine learning solutions on Google Cloud [1].
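To make "tracks the connections between steps" concrete, here is a toy sketch in plain Python.
It is not the Kubeflow pipelines SDK and uses none of its APIs: the Step class, the
run_pipeline function, and the example bucket path are all hypothetical names invented for
illustration. It only shows the idea an orchestrator implements, namely that each step declares
its upstream steps, receives their outputs, and leaves a lineage record.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    name: str
    run: Callable
    inputs: List[str] = field(default_factory=list)  # names of upstream steps


def run_pipeline(steps):
    """Run steps in order, pass upstream outputs downstream, and record lineage."""
    outputs, lineage = {}, []
    for step in steps:
        missing = [dep for dep in step.inputs if dep not in outputs]
        if missing:
            raise ValueError(f"{step.name} is missing upstream outputs: {missing}")
        args = [outputs[dep] for dep in step.inputs]
        outputs[step.name] = step.run(*args)
        # One (upstream, downstream) edge per dependency.
        lineage.extend((dep, step.name) for dep in step.inputs)
    return outputs, lineage


def feature_engineering():
    # Stand-in for the Dataproc Serverless PySpark job; the path is hypothetical.
    return "gs://example-bucket/features/"


def train_model(features_uri):
    # Stand-in for the Vertex AI custom training job.
    return f"model trained on {features_uri}"


steps = [
    Step("feature_engineering", feature_engineering),
    Step("train_model", train_model, inputs=["feature_engineering"]),
]
outputs, lineage = run_pipeline(steps)
print(lineage)
```

In the real solution, the two stand-in functions would be replaced by the Dataproc Serverless
component and the component wrapped in create_custom_training_job_from_component, and Vertex AI
Pipelines would do the scheduling and lineage tracking.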
The other options are not as good as option C, for the following reasons:
Option A: Creating a Vertex AI Workbench notebook, using the notebook to submit the
Dataproc Serverless feature engineering job, using the same notebook to submit the custom
model training job, and running the notebook cells sequentially does not give you an automated,
maintainable production process. Vertex AI Workbench is a service that provides managed
notebooks for machine learning development and experimentation. It can help you create and run
JupyterLab notebooks, and access various tools and frameworks, such as TensorFlow,
PyTorch, and JAX. You can write code in the notebook that submits the Dataproc Serverless
feature engineering job and the custom model training job, so the two steps do run one after
the other. However, someone still has to create and configure the notebook and run the cells,
and a notebook does not track the connections between steps the way a pipeline does. Moreover,
this option would not use the Kubeflow pipelines SDK, which can simplify the pipeline creation
and execution process, and provide various features, such as pipeline parameters, pipeline
metrics, and pipeline visualization [2].
Option B: Creating a Vertex AI Workbench notebook, initiating an Apache Spark context in the
notebook, and running the PySpark feature engineering code, using the same notebook to run
the custom model training job in TensorFlow, and running the notebook cells sequentially to tie
the steps together end-to-end would not allow you to use Dataproc Serverless to run the feature
engineering job, and could increase the complexity and cost of the production process. Apache
Spark is a framework that can perform large-scale data processing and machine learning, and
PySpark is its Python API. An Apache Spark context initializes and configures the Spark
environment. With this option, you would need to create and configure the Vertex AI Workbench
notebook, initiate and configure the Apache Spark context yourself, run the PySpark feature
engineering code, and run the custom model training job in TensorFlow, all from cells that are
run by hand. Moreover, this option would not use Dataproc Serverless, which is a service that
can run Spark batch workloads without provisioning and managing your own cluster, and
provide various benefits, such as autoscaling, dynamic resource allocation, and serverless
billing [2].
Option D: Using the Kubeflow pipelines SDK to write code that specifies two components, where
the first component initiates an Apache Spark context that runs the PySpark feature engineering
code and the second component runs the TensorFlow custom model training code, and then
creating a Vertex AI Pipelines job to link and run both components, would not allow you to
use Dataproc Serverless to run the feature engineering job, and could increase the complexity
and cost of the production process. Vertex AI Pipelines is a service that can run Kubeflow
pipelines on Vertex AI. It can help you create and manage machine learning pipelines, and
integrate with various Vertex AI services, such as Vertex AI Workbench, Vertex AI Training,
and Vertex AI Prediction. A Vertex AI Pipelines job is a resource that executes a pipeline on
Vertex AI Pipelines and lets you monitor and debug the pipeline execution. This option does
give you a pipeline that runs end-to-end and tracks the connections between steps. You can
write code that defines the two components, their inputs and outputs, and their dependencies.
You can then use the Kubeflow pipelines SDK to create a pipeline that runs the two components
in sequence, and submit the
pipeline to Vertex AI Pipelines for execution. However, creating a Vertex AI Pipelines job to link
and run both components, using the Kubeflow pipelines SDK to write code that specifies two
components, the first component initiates an Apache Spark context that runs the PySpark
feature engineering code,
47. You work for a bank with strict data governance requirements. You recently implemented a
custom model to detect fraudulent transactions. You want your training code to download
internal data by using an API endpoint hosted in your project’s network. You need the data to be
accessed in the most secure way, while mitigating the risk of data exfiltration.
What should you do?
A. Enable VPC Service Controls for peerings, and add Vertex AI to a service perimeter.
B. Create a Cloud Run endpoint as a proxy to the data. Use Identity and Access Management
(IAM) authentication to secure access to the endpoint from the training job.
C. Configure VPC Peering with Vertex AI and specify the network of the training job.
D. Download the data to a Cloud Storage bucket before calling the training job.
Answer: A
Explanation:
The best option for accessing internal data in the most secure way, while mitigating the risk of
data exfiltration, is to enable VPC Service Controls for peerings, and add Vertex AI to a service
perimeter. VPC Service Controls is a service that can create a secure perimeter around your
Google Cloud resources, such as BigQuery, Cloud Storage, and Vertex AI. It can help you prevent
unauthorized access and data exfiltration from your perimeter, and enforce fine-grained access
policies based on context and identity. Peerings are connections that allow traffic to flow
between different networks, so that resources and services in those networks can communicate.
By enabling VPC Service Controls for peerings, you can allow your training code to download
internal data by using an API endpoint hosted in your project’s network, and restrict the data
transfer to only authorized networks and services. Vertex AI is a unified platform for building
and deploying machine learning solutions on Google Cloud. By adding Vertex AI to a service
perimeter, you can isolate and protect your Vertex AI resources, such as models, endpoints,
pipelines, and feature store, and prevent data exfiltration from your perimeter [1].
The other options are not as good as option A, for the following reasons:
Option B: Creating a Cloud Run endpoint as a proxy to the data, and using Identity and Access
Management (IAM) authentication to secure access to the endpoint from the training job would
require more work than option A and would not address exfiltration. Cloud Run is a service that
can run your stateless containers on a fully managed environment or on your own Google
Kubernetes Engine cluster. A Cloud Run endpoint is a URL that can expose your containerized
application to the internet or to other Google Cloud services. A proxy is a server that acts as
an intermediary between a client and a target server. IAM is a service that manages access
control for Google Cloud resources by defining who (identity) has what access (role) to which
resource. With this option, the training job could reach the internal data through the proxy,
and access would be restricted to authorized identities and roles. However, you would need to
write code, create and configure the Cloud Run endpoint, implement the proxy logic, deploy and
monitor the Cloud Run endpoint, and set up the IAM policies. Moreover, this option would not
prevent data exfiltration from your network, as the Cloud Run endpoint can be accessed from
outside your network [2].
Option C: Configuring VPC Peering with Vertex AI and specifying the network of the training job
provides private connectivity, but on its own it does not mitigate the risk of data
exfiltration. VPC Peering is a service that can create a peering connection between two VPC
networks and enable communication between the resources and services in them. You would need
to create and configure the VPC Peering connection, and specify the network of the training
job. Moreover, this option would not isolate and protect your data and services on Google
Cloud, as the VPC Peering connection can expose your network to other networks and
services [3].
Option D: Downloading the data to a Cloud Storage bucket before calling the training job would
mean the training code no longer downloads the data through the API endpoint hosted in your
project’s network, and could increase the complexity and cost of the data access. Cloud Storage
is a service that can store and manage your data on Google Cloud. A Cloud Storage bucket is a
container that holds your data on Cloud Storage and offers various storage classes and options.
With this option, the training job would read its input from the bucket. However, you would
need to write code, create and configure the Cloud Storage bucket, download the data to the
Cloud Storage bucket, and call the training job. Moreover, this option would create an
intermediate data source on Cloud Storage, which can increase the storage and transfer costs,
and expose the data to unauthorized access or data exfiltration [4].
Reference: Preparing for Google Cloud Certification: Machine Learning Engineer, Course 3:
Production ML Systems, Week 1: Data Engineering
Google Cloud Professional Machine Learning Engineer Exam Guide, Section 1: Framing ML
problems, 1.2 Defining data needs
Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter
2: Data Engineering, Section 2.2: Defining Data Needs
VPC Service Controls
Cloud Run
VPC Peering
Cloud Storage
you to leverage the power and simplicity of Vertex AI to update your model version and serve
online predictions with low latency. Vertex AI is a unified platform for building and deploying
machine learning solutions on Google Cloud. Vertex AI can deploy a trained model to an online
prediction endpoint, which can provide low-latency predictions for individual instances. A model
is a resource that represents a machine learning model that you can use for prediction. A model
can have one or more versions, which may differ in parameters, code, or data, and which let you
experiment and iterate on your model and improve its performance and accuracy. The parentModel
parameter specifies the model ID of the existing model that the new model version is based on.
Setting it registers the upload as a new version of that model, so you inherit the settings and
metadata of the existing model and avoid duplicating the model configuration. Vertex AI Model
Registry is a service that can store and manage your machine learning models on Google Cloud,
and track the model versions and metadata. An endpoint is a resource that provides the service
endpoint (URL) you use to request the prediction. An endpoint can have one or more deployed
models, which are instances of model versions that are associated with physical resources and
that scale up or down based on the traffic. By creating a new model, setting the
parentModel parameter to the model ID of the currently deployed model, uploading the model to
Vertex AI Model Registry, deploying the new model to the existing endpoint, and setting the new
model to 100% of the traffic, you can deploy a new version of a model to a production Vertex AI
endpoint that is serving traffic, direct all user traffic to the new model, and deploy the model
with minimal disruption to your application [1].
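The "100% of the traffic" step can be pictured as a small map from deployed model IDs to
percentages. The sketch below is plain Python and calls no Vertex AI API. It assumes the
convention used in Vertex AI deploy requests that the key "0" stands for the model being
deployed and that the percentages must add up to 100; the function name and the deployed model
ID are hypothetical.

```python
def shift_all_traffic_to_new_model(current_split):
    """Return a traffic split that sends 100% of requests to the model being deployed.

    Assumption: the key "0" is the placeholder for the model being deployed.
    """
    # Existing deployed models stay on the endpoint but receive no traffic.
    new_split = {deployed_id: 0 for deployed_id in current_split}
    new_split["0"] = 100
    if sum(new_split.values()) != 100:
        raise ValueError("traffic percentages must add up to 100")
    return new_split


# Hypothetical endpoint state: one deployed model currently serving all traffic.
current_split = {"1234567890": 100}
new_split = shift_all_traffic_to_new_model(current_split)
print(new_split)
```

Keeping the old deployed model on the endpoint at 0% is one possible choice: it leaves a quick
rollback path, at the cost of paying for its resources until it is undeployed.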
The other options are not as good as option C, for the following reasons:
Option A: Creating a new endpoint, creating a new model, setting it as the default version,
uploading the model to Vertex AI Model Registry, deploying the new model to the new endpoint,
and updating Cloud DNS to point to the new endpoint would require more steps than option C.
Cloud DNS is a service that can provide reliable and scalable Domain Name System (DNS) services
on Google Cloud. It can help you manage your DNS records, and resolve domain names to IP
addresses. By updating Cloud DNS to point to the new endpoint, you can redirect the user
traffic to the new endpoint. However, you would need to write code, create and
configure the new endpoint, create and configure the new model, upload the model to Vertex AI
Model Registry, deploy the model to the new endpoint, and update Cloud DNS to point to the
new endpoint. Moreover, this option would create a new endpoint, which can increase the
maintenance and management costs [2].
Option B: Creating a new endpoint, creating a new model, setting the parentModel parameter to
the model ID of the currently deployed model and setting it as the default version, uploading the
model to Vertex AI Model Registry, and deploying the new model to the new endpoint and
setting the new model to 100% of the traffic would also require more steps than option C. A
default version is a model version that is used for prediction when no other version is
specified, so requests do not need to name the model version every time. By setting the
parentModel parameter to the model ID of the currently deployed model and setting it as the
default version, you can create a new model that is based on the existing model, and use it for
prediction without specifying the model version. However, you would still need to write code,
create and configure the new endpoint, create and configure the new model, upload the model to
Vertex AI Model Registry, and deploy the model to the new endpoint. Moreover, this option would
create a new endpoint, which can increase the maintenance and management costs [2].
Option D: Creating a new model, setting it as the default version, uploading the model to Vertex
AI Model Registry, and deploying the new model to the existing endpoint would not allow you to
inherit the settings and metadata of the existing model, and could cause errors or poor
performance. A default version is a model version that is used for prediction when no other
version is specified. By setting the new model as the default version, you can use the new model
for prediction without specifying the model version. However, you would need to write code,
create and configure the new model, upload the model to Vertex AI Model Registry, and deploy the
model to the existing endpoint. Moreover, this option would not set the
parentModel parameter to the model ID of the currently deployed model, which could prevent
you from inheriting the settings and metadata of the existing model, and cause inconsistencies
or conflicts between the model versions [2].
Reference: Preparing for Google Cloud Certification: Machine Learning Engineer, Course 3:
Production ML Systems, Week 2: Serving ML Predictions
Google Cloud Professional Machine Learning Engineer Exam Guide, Section 3: Scaling ML
models in production, 3.1 Deploying ML models to production
Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter
6: Production ML Systems, Section 6.2: Serving ML Predictions
Vertex AI
Cloud DNS
51. You built a custom ML model using scikit-learn. Training time is taking longer than expected.
You decide to migrate your model to Vertex AI Training, and you want to improve the model’s
training time.
What should you try out first?
A. Migrate your model to TensorFlow, and train it using Vertex AI Training.
B. Train your model in a distributed mode using multiple Compute Engine VMs.
C. Train your model with DLVM images on Vertex AI, and ensure that your code utilizes NumPy
and SciPy internal methods whenever possible.
D. Train your model using Vertex AI Training with GPUs.
Answer: D
Explanation:
Option A is incorrect because migrating your model to TensorFlow and training it using Vertex
AI Training is not the easiest way to improve the model’s training time. TensorFlow is a
framework that allows you to create and train ML models using Python or other languages.
Vertex AI Training is a service that allows you to train and optimize ML models using built-in
algorithms or custom containers. However, this option requires significant code changes, as
TensorFlow and scikit-learn have different APIs and functionalities. Moreover, this option does
not leverage the parallelism or the scalability of the cloud, as it only uses a single instance.
Option B is incorrect because training your model in a distributed mode using multiple Compute
Engine VMs is not the most convenient way to improve the model’s training time. Compute
Engine is a service that allows you to create and manage virtual machines that run on Google
Cloud. You can use Compute Engine to run your scikit-learn model in a distributed mode by
using libraries such as Dask or Joblib. However, this option requires more effort and resources
than option D, as it involves creating and configuring the VMs, installing and maintaining the
libraries, and writing and running the distributed code.
Option C is incorrect because training your model with DLVM images on Vertex AI and
ensuring that your code utilizes NumPy and SciPy internal methods whenever possible is not
the most effective way to improve the model’s training time. DLVM (Deep Learning Virtual
Machine) images are preconfigured VM images that include popular ML frameworks and tools,
such as TensorFlow, PyTorch, or scikit-learn [1]. You can use DLVM images on Vertex AI to train
your scikit-learn model by using a custom container. NumPy and SciPy are libraries that
provide numerical and scientific computing functionalities for Python. You can use NumPy and
SciPy internal methods to optimize your scikit-learn code, as they are faster and more efficient
than pure Python code [2]. However, this option does not leverage the parallelism or the
scalability of the cloud, as it only uses a single instance. Moreover, this option may not have a
significant impact on the training time, as scikit-learn already relies on NumPy and SciPy for
most of its operations [3].
Option D is correct because training your model using Vertex AI Training with GPUs is the best
way to improve the model’s training time. A GPU (Graphics Processing Unit) is a hardware
accelerator that can perform parallel computations faster than a CPU (Central Processing
Unit) [4]. Vertex AI Training is a service that allows you to train and optimize ML models using
built-in algorithms or custom containers. You can use Vertex AI Training with GPUs to train your
scikit-learn model by using a custom container and specifying the accelerator type and count [5].
By using Vertex AI Training with GPUs, you can leverage the parallelism and the scalability of
the cloud and speed up the training process significantly with little or no change to your code,
provided the libraries your training code uses can run on the accelerator.
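As a rough illustration of "specifying the accelerator type and count", the sketch below builds the worker pool specification that a Vertex AI custom training job takes. The helper name build_worker_pool_specs, the image URI, and the default machine and accelerator choices are assumptions made for this example, not values from the question. Whether the GPU actually shortens training depends on the training code being able to use it.

```python
# Sketch: build a worker pool spec for a Vertex AI custom training job
# that requests a GPU. All concrete values below are illustrative.

def build_worker_pool_specs(image_uri,
                            machine_type="n1-standard-8",
                            accelerator_type="NVIDIA_TESLA_T4",
                            accelerator_count=1):
    """Return a single-replica worker pool spec with an attached accelerator."""
    if accelerator_count < 1:
        raise ValueError("accelerator_count must be at least 1")
    return [{
        "machine_spec": {
            "machine_type": machine_type,
            "accelerator_type": accelerator_type,
            "accelerator_count": accelerator_count,
        },
        "replica_count": 1,
        # Hypothetical custom container holding the scikit-learn training code.
        "container_spec": {"image_uri": image_uri},
    }]


specs = build_worker_pool_specs(
    "us-docker.pkg.dev/my-project/my-repo/trainer:latest"  # hypothetical URI
)
print(specs[0]["machine_spec"])
```

With the google-cloud-aiplatform SDK installed and a project configured, a list like this is typically passed as the worker_pool_specs argument of aiplatform.CustomJob; that call is left out here so the sketch runs without cloud credentials.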
Reference: DLVM images
NumPy and SciPy
scikit-learn dependencies
GPU overview
Vertex AI Training with GPUs
[scikit-learn overview]
[TensorFlow overview]
[Compute Engine overview]
[Dask overview]
[Joblib overview]
[Vertex AI Training overview]
52. You work as an ML researcher at an investment bank and are experimenting with the
Gemini large language model (LLM). You plan to deploy the model for an internal use case and
need full control of the model’s underlying infrastructure while minimizing inference time.
Which serving configuration should you use for this task?
A. Deploy the model on a Vertex AI endpoint using one-click deployment in Model Garden.
B. Deploy the model on a Google Kubernetes Engine (GKE) cluster manually by creating a
custom YAML manifest.
C. Deploy the model on a Vertex AI endpoint manually by creating a custom inference
container.
D. Deploy the model on a Google Kubernetes Engine (GKE) cluster using the deployment
options in Model Garden.
Answer: B
Explanation:
Deploying the model on GKE with a custom YAML manifest allows maximum control over
infrastructure and latency, aligning with the need for low inference time and internal model use.
Vertex AI's one-click deployment (Option A) limits control, and deploying on Vertex AI (Option
C) doesn’t allow for as much customization as a GKE setup.
53. Your team needs to build a model that predicts whether images contain a driver's license,
passport, or credit card. The data engineering team already built the pipeline and generated a
dataset composed of 10,000 images with driver's licenses, 1,000 images with passports, and
1,000 images with credit cards. You now have to train a model with the following label map:
['drivers_license', 'passport', 'credit_card'].
Which loss function should you use?
A. Categorical hinge
B. Binary cross-entropy
C. Categorical cross-entropy
D. Sparse categorical cross-entropy
Answer: C
Explanation:
Categorical cross-entropy is a loss function that is suitable for multi-class classification
problems, where the target variable has more than two possible values. Categorical cross-
entropy measures the difference between the true probability distribution of the target classes
and the predicted probability distribution of the model. It is defined as $L = -\sum_i y_i \log(p_i)$,
where y_i is the true probability of class i, and p_i is the predicted probability of class i.
Categorical cross-entropy penalizes the model for making incorrect predictions, and encourages
the model to assign high probabilities to the correct classes and low probabilities to the incorrect
classes. For the use case of building a model that predicts whether images contain a driver’s
license, passport, or credit card, categorical cross-entropy is the appropriate loss function to
use. This is because the problem is a multi-class classification problem, where the target
variable has three possible values: [‘drivers_license’, ‘passport’, ‘credit_card’]. The label
map is a list that maps the class names to the class indices, such that ‘drivers_license’
corresponds to index 0, ‘passport’ corresponds to index 1, and ‘credit_card’ corresponds to
index 2. The model should output a probability distribution over the three classes for each
image, and the categorical cross-entropy loss function should compare the output with the true
labels. Therefore, categorical cross-entropy is the best loss function for this use case.
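The formula above can be checked by hand with a small sketch. It assumes the true label is one-hot encoded over the three classes in the label map; the function names and the predicted probabilities are made up for illustration.

```python
import math

LABELS = ["drivers_license", "passport", "credit_card"]


def one_hot(label):
    """Encode a class name as a one-hot vector over LABELS."""
    return [1.0 if name == label else 0.0 for name in LABELS]


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Compute L = -sum_i y_i * log(p_i) for a single example."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Clamp probabilities away from zero so log() stays finite.
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_true, y_pred))


y_true = one_hot("passport")   # [0.0, 1.0, 0.0]
y_pred = [0.2, 0.7, 0.1]       # illustrative model output
loss = categorical_cross_entropy(y_true, y_pred)
print(round(loss, 4))          # -log(0.7), prints 0.3567
```

Only the probability assigned to the true class contributes to the loss, so a more confident correct prediction (for example 0.9 instead of 0.7) gives a smaller value.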
A. Option A
B. Option B
C. Option C
D. Option D
Answer: C
Explanation:
Vertex Explainable AI is a set of tools and frameworks to help you understand and interpret
predictions made by your machine learning models, natively integrated with a number of
Google’s products and services [1]. With Vertex Explainable AI, you can generate feature-based
explanations that show how much each input feature contributed to the model’s prediction [2].
This can help you debug and improve your model performance, and build confidence in your
model’s behavior. Feature-based explanations are supported for custom image classification
models deployed on Vertex AI Prediction [3].
Reference: Explainable AI | Google Cloud
Introduction to Vertex Explainable AI | Vertex AI | Google Cloud
Supported model types for feature-based explanations | Vertex AI | Google Cloud
56. You have a large corpus of written support cases that can be classified into 3 separate
categories: Technical Support, Billing Support, or Other Issues. You need to quickly build, test,
and deploy a service that will automatically classify future written requests into one of the
categories.
How should you configure the pipeline?
A. Use the Cloud Natural Language API to obtain metadata to classify the incoming cases.
B. Use AutoML Natural Language to build and test a classifier. Deploy the model as a REST
API.
C. Use BigQuery ML to build and test a logistic regression model to classify incoming requests.
Use BigQuery ML to perform inference.
D. Create a TensorFlow model using Google’s BERT pre-trained model. Build and test a
classifier, and deploy the model using Vertex AI.
Answer: B
Explanation:
AutoML Natural Language is a service that allows you to quickly build, test and deploy natural
language processing (NLP) models without needing to have expertise in NLP or machine
learning. You can use it to train a classifier on your corpus of written support cases, and then
use the AutoML API to perform classification on new requests. Once the model is trained, it can
be deployed as a REST API. This allows the classifier to be integrated into your pipeline and be
easily consumed by other systems.
61. Pub/Sub messages containing predictions are then consumed by a downstream system for
monitoring.
C. 1. Export your data to Cloud Storage using Dataflow.