Professional-Cloud-DevOps-Engineer Google Cloud Certified - Professional Cloud DevOps Engineer Exam sample Question + Exam 2025 Practice Exam Dumps

Question # 4

You work for a company that manages highly sensitive user data. You are designing the Google Kubernetes Engine (GKE) infrastructure for your company, including several applications that will be deployed in development and production environments. Your design must protect data from unauthorized access from other applications while minimizing the amount of management overhead required. What should you do?

Create one cluster for the organization with separate namespaces for each application and environment combination.

Create one cluster for each environment (development and production) with each application in its own namespace within each cluster.

Create one cluster for the organization with separate namespaces for each application.

Create one cluster for each application with separate namespaces for production and development environments.

Question # 5

Your organization stores all application logs from multiple Google Cloud projects in a central Cloud Logging project. Your security team wants to enforce a rule that each project team can only view their respective logs, and only the operations team can view all the logs. You need to design a solution that meets the security team's requirements, while minimizing costs. What should you do?

Export logs to BigQuery tables for each project team. Grant project teams access to their tables. Grant logs writer access to the operations team in the central logging project.

Create log views for each project team, and only show each project team their application logs. Grant the operations team access to the _AllLogs view in the central logging project.

Grant each project team access to the project _Default view in the central logging project. Grant logging viewer access to the operations team in the central logging project.

Create Identity and Access Management (IAM) roles for each project team and restrict access to the _Default log view in their individual Google Cloud project. Grant viewer access to the operations team in the central logging project.

Question # 6

You are performing a semi-annual capacity planning exercise for your flagship service. You expect a service user growth rate of 10% month-over-month for the next six months. Your service is fully containerized and runs on a Google Kubernetes Engine (GKE) standard cluster across three zones with cluster autoscaling enabled. You currently consume about 30% of your total deployed CPU capacity, and you require resilience against the failure of a zone. You want to ensure that your users experience minimal negative impact as a result of this growth or as a result of zone failure, while you avoid unnecessary costs. How should you prepare to handle the predicted growth?

Verify the maximum node pool size, enable a Horizontal Pod Autoscaler, and then perform a load test to verify your expected resource needs.

Because you deployed the service on GKE and are using a cluster autoscaler, your GKE cluster will scale automatically regardless of growth rate.

Because you are only using 30% of deployed CPU capacity, there is significant headroom and you do not need to add any additional capacity for this rate of growth.

Proactively add 80% more node capacity to account for six months of 10% growth rate, and then perform a load test to ensure that you have enough capacity.
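
The growth and zone-failure figures in this scenario can be checked with a little arithmetic. The sketch below is only an illustration: it assumes CPU demand scales linearly with users, deployed capacity stays fixed, and load is spread evenly across the three zones. None of these assumptions is stated in the question, and the function names are made up for the example.

```javascript
// Utilization after `months` of compounding growth at `monthlyGrowth` per month.
function projectUtilization(current, monthlyGrowth, months) {
  return current * Math.pow(1 + monthlyGrowth, months);
}

// Utilization of the surviving capacity if one of `zones` equally sized zones fails.
function utilizationAfterZoneLoss(utilization, zones) {
  return (utilization * zones) / (zones - 1);
}

const projected = projectUtilization(0.30, 0.10, 6);
const degraded = utilizationAfterZoneLoss(projected, 3);

console.log(`growth factor over six months: ${Math.pow(1.1, 6).toFixed(2)}`); // 1.77
console.log(`projected utilization: ${(projected * 100).toFixed(1)}%`);       // 53.1%
console.log(`utilization with one zone down: ${(degraded * 100).toFixed(1)}%`); // 79.7%
```

Under these assumptions, six months of 10% growth multiplies demand by about 1.77, not 1.6, because the growth compounds. A load test is still needed to confirm how the real service behaves.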

Question # 7

You use Terraform to manage an application deployed to a Google Cloud environment. The application runs on instances deployed by a managed instance group. The Terraform code is deployed by using a CI/CD pipeline. When you change the machine type on the instance template used by the managed instance group, the pipeline fails at the terraform apply stage with the following error message:

You need to update the instance template and minimize disruption to the application and the number of pipeline runs. What should you do?

Delete the managed instance group and recreate it after updating the instance template.

Add a new instance template, update the managed instance group to use the new instance template, and delete the old instance template.

Remove the managed instance group from the Terraform state file, update the instance template, and reimport the managed instance group.

Set the create_before_destroy meta-argument to true in the lifecycle block on the instance template.

Question # 8

You need to define SLOs for a high-traffic web application. Customers are currently happy with the application performance and availability. Based on current measurement, the 90th percentile of latency is 160 ms and the 95th percentile of latency is 300 ms over a 28-day window. What latency SLO should you publish?

90th percentile - 150 ms; 95th percentile - 290 ms

90th percentile - 160 ms; 95th percentile - 300 ms

90th percentile - 190 ms; 95th percentile - 330 ms

90th percentile - 300 ms; 95th percentile - 450 ms

Question # 9

Your company runs services by using multiple globally distributed Google Kubernetes Engine (GKE) clusters. Your operations team has set up workload monitoring that uses Prometheus-based tooling for metrics, alerts, and generating dashboards. This setup does not provide a method to view metrics globally across all clusters. You need to implement a scalable solution to support global Prometheus querying and minimize management overhead. What should you do?

Configure Prometheus cross-service federation for centralized data access.

Configure workload metrics within Cloud Operations for GKE.

Configure Prometheus hierarchical federation for centralized data access.

Configure Google Cloud Managed Service for Prometheus.

Question # 10

Your team has an application built by using a Dockerfile. The build is executed from Cloud Build, and the resulting artifacts are stored in Artifact Registry. Your team is reporting that builds are slow. You need to increase build speed, while following Google-recommended practices. What should you do?

Use the --cache-from parameter, and point to Artifact Registry. Add the most frequently modified files to the later stages of the build process.

Use the --cache-from parameter, and point to Artifact Registry. Add the most frequently modified files to the earlier stages of the build process.

Cache the container layers of the build process to Cloud Storage. Add the most frequently modified files to the earlier stages of the build process.

Cache the container layers of the build process to Cloud Storage. Add the most frequently modified files to the later stages of the build process.

Question # 11

Your company recently migrated to Google Cloud. You need to design a fast, reliable, and repeatable solution for your company to provision new projects and basic resources in Google Cloud. What should you do?

Use the Google Cloud console to create projects.

Write a script by using the gcloud CLI that passes the appropriate parameters from the request. Save the script in a Git repository.

Write a Terraform module and save it in your source control repository. Copy and run the apply command to create the new project.

Use the Terraform repositories from the Cloud Foundation Toolkit. Apply the code with appropriate parameters to create the Google Cloud project and related resources.

Question # 12

You are running an application on Compute Engine and collecting logs through Stackdriver. You discover that some personally identifiable information (PII) is leaking into certain log entry fields. You want to prevent these fields from being written in new log entries as quickly as possible. What should you do?

Use the filter-record-transformer Fluentd filter plugin to remove the fields from the log entries in flight.

Use the fluent-plugin-record-reformer Fluentd output plugin to remove the fields from the log entries in flight.

Wait for the application developers to patch the application, and then verify that the log entries are no longer exposing PII.

Stage log entries to Cloud Storage, and then trigger a Cloud Function to remove the fields and write the entries to Stackdriver via the Stackdriver Logging API.

Question # 13

Your company experiences bugs, outages, and slowness in its production systems. Developers use the production environment for new feature development and bug fixes. Configuration and experiments are done in the production environment, causing outages for users. Testers use the production environment for load testing, which often slows the production systems. You need to redesign the environment to reduce the number of bugs and outages in production and to enable testers to load test new features. What should you do?

Create an automated testing script in production to detect failures as soon as they occur.

Create a development environment with smaller server capacity and give access only to developers and testers.

Secure the production environment to ensure that developers can't change it and set up one controlled update per year.

Create a development environment for writing code and a test environment for configurations, experiments, and load testing.

Question # 14

You manage several production systems that run on Compute Engine in the same Google Cloud Platform (GCP) project. Each system has its own set of dedicated Compute Engine instances. You want to know how much it costs to run each of the systems. What should you do?

In the Google Cloud Platform Console, use the Cost Breakdown section to visualize the costs per system.

Assign all instances a label specific to the system they run. Configure BigQuery billing export and query costs per label.

Enrich all instances with metadata specific to the system they run. Configure Stackdriver Logging to export to BigQuery, and query costs based on the metadata.

Name each virtual machine (VM) after the system it runs. Set up a usage report export to a Cloud Storage bucket. Configure the bucket as a source in BigQuery to query costs based on VM name.

Question # 15

You support a stateless web-based API that is deployed on a single Compute Engine instance in the europe-west2-a zone. The Service Level Indicator (SLI) for service availability is below the specified Service Level Objective (SLO). A postmortem has revealed that requests to the API regularly time out. The timeouts are due to the API having a high number of requests and running out of memory. You want to improve service availability. What should you do?

Change the specified SLO to match the measured SLI.

Move the service to higher-specification compute instances with more memory.

Set up additional service instances in other zones and load balance the traffic between all instances.

Set up additional service instances in other zones and use them as a failover in case the primary instance is unavailable.

Question # 16

Your company stores a large volume of infrequently used data in Cloud Storage. The projects in your company's CustomerService folder access Cloud Storage frequently, but store very little data. You want to enable Data Access audit logging across the company to identify data usage patterns. You need to exclude the CustomerService folder projects from Data Access audit logging. What should you do?

Enable Data Access audit logging for Cloud Storage for all projects and folders, and configure exempted principals to include users of the CustomerService folder.

Enable Data Access audit logging for Cloud Storage at the organization level, with no additional configuration.

Enable Data Access audit logging for Cloud Storage at the organization level, and configure exempted principals to include users of the CustomerService folder.

Enable Data Access audit logging for Cloud Storage for all projects and folders other than the CustomerService folder.

Question # 17

Your team uses Cloud Build for all CI/CD pipelines. You want to use the kubectl builder for Cloud Build to deploy new images to Google Kubernetes Engine (GKE). You need to authenticate to GKE while minimizing development effort. What should you do?

Assign the Container Developer role to the Cloud Build service account.

Specify the Container Developer role for Cloud Build in the cloudbuild.yaml file.

Create a new service account with the Container Developer role and use it to run Cloud Build.

Create a separate step in Cloud Build to retrieve service account credentials and pass these to kubectl.

Question # 18

You are creating a CI/CD pipeline to perform Terraform deployments of Google Cloud resources. Your CI/CD tooling is running in Google Kubernetes Engine (GKE) and uses an ephemeral Pod for each pipeline run. You must ensure that the pipelines that run in the Pods have the appropriate Identity and Access Management (IAM) permissions to perform the Terraform deployments. You want to follow Google-recommended practices for identity management. What should you do?

Choose 2 answers

Create a new Kubernetes service account, and assign the service account to the Pods. Use Workload Identity to authenticate as the Google service account.

Create a new JSON service account key for the Google service account, store the key as a Kubernetes secret, inject the key into the Pods, and set the GOOGLE_APPLICATION_CREDENTIALS environment variable.

Create a new Google service account, and assign the appropriate IAM permissions.

Create a new JSON service account key for the Google service account, store the key in the secret management store for the CI/CD tool, and configure Terraform to use this key for authentication.

Assign the appropriate IAM permissions to the Google service account associated with the Compute Engine VM instances that run the Pods.

Question # 19

Your Cloud Run application writes unstructured logs as text strings to Cloud Logging. You want to convert the unstructured logs to JSON-based structured logs. What should you do?

Install a Fluent Bit sidecar container, and use a JSON parser.

Install the log agent in the Cloud Run container image, and use the log agent to forward logs to Cloud Logging.

Configure the log agent to convert log text payload to JSON payload.

Modify the application to use Cloud Logging software development kit (SDK), and send log entries with a jsonPayload field.

Answer: D

Explanation:

The correct answer is D. Modify the application to use Cloud Logging software development kit (SDK), and send log entries with a jsonPayload field.

Cloud Logging SDKs are libraries that allow you to write structured logs from your Cloud Run application. You can use the SDKs to create log entries with a jsonPayload field, which contains a JSON object with the properties of your log entry. The jsonPayload field allows you to use advanced features of Cloud Logging, such as filtering, querying, and exporting logs based on the properties of your log entry [1].

To use Cloud Logging SDKs, you need to install the SDK for your programming language, and then use the SDK methods to create and send log entries to Cloud Logging. For example, if you are using Node.js, you can use the following code to write a structured log entry with a jsonPayload field [2]:

// Imports the Google Cloud client library
const {Logging} = require('@google-cloud/logging');

async function writeStructuredLog() {
  // Creates a client
  const logging = new Logging();

  // Selects the log to write to
  const log = logging.log('my-log');

  // The data to write to the log
  const text = 'Hello, world!';

  const metadata = {
    // Set the Cloud Run service name and revision as labels
    labels: {
      service_name: process.env.K_SERVICE || 'unknown',
      revision_name: process.env.K_REVISION || 'unknown',
    },
  };

  // An object passed as the entry data is sent as the jsonPayload of the entry
  const payload = {
    message: text,
    timestamp: new Date().toISOString(),
  };

  // Prepares a log entry
  const entry = log.entry(metadata, payload);

  // Writes the log entry
  await log.write(entry);
  console.log(`Logged: ${text}`);
}

// await is only valid inside an async function here, so the call is wrapped
writeStructuredLog().catch(console.error);

Using Cloud Logging SDKs is the best way to convert unstructured logs to structured logs, as it provides more flexibility and control over the format and content of your log entries.
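
For comparison, the Cloud Run documentation on structured logging also describes a lighter-weight path: a single-line JSON object written to stdout or stderr is parsed into jsonPayload without the client library. The sketch below illustrates that approach only; the formatStructuredLog helper and the component field are hypothetical names chosen for the example, not part of any Google API.

```javascript
// Builds one single-line JSON string. Cloud Run treats a JSON line on stdout
// as a structured entry; `severity` and `message` are recognized special fields.
function formatStructuredLog(severity, message, fields = {}) {
  return JSON.stringify({severity, message, ...fields});
}

// `component` is an arbitrary illustrative field, not a reserved name.
const line = formatStructuredLog('INFO', 'Hello, world!', {component: 'checkout'});
console.log(line);
```

Either way, the application itself has to change so that it emits structured data instead of free-form text strings.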

Using a Fluent Bit sidecar container is not a good option, as it adds complexity and overhead to your Cloud Run application. Fluent Bit is a lightweight log processor and forwarder that can be used to collect and parse logs from various sources and send them to different destinations [3]. However, Cloud Run does not support sidecar containers, so you would need to run Fluent Bit as part of your main container image. This would require modifying your Dockerfile and configuring Fluent Bit to read logs from supported locations and parse them as JSON. This is more cumbersome and less reliable than using Cloud Logging SDKs.

Using the log agent in the Cloud Run container image is not possible, as the log agent is not supported on Cloud Run. The log agent is a service that runs on Compute Engine or Google Kubernetes Engine instances and collects logs from various applications and system components. However, Cloud Run does not allow you to install or run any agents on its underlying infrastructure, as it is a fully managed service that abstracts away the details of the underlying platform.

Configuring the log agent to convert the log text payload to a JSON payload is not possible for the same reason: the log agent cannot be installed or run on Cloud Run, so there is no agent to configure.

References:
[1] Writing structured logs | Cloud Run Documentation | Google Cloud
[2] Write structured logs | Cloud Run Documentation | Google Cloud
[3] Fluent Bit - Fast and Lightweight Log Processor & Forwarder
Logging Best Practices for Serverless Applications - Google Codelabs
About the logging agent | Cloud Logging Documentation | Google Cloud
Cloud Run FAQ | Google Cloud

Question # 20

You are using Terraform to manage infrastructure as code within a CI/CD pipeline. You notice that multiple copies of the entire infrastructure stack exist in your Google Cloud project, and a new copy is created each time a change to the existing infrastructure is made. You need to optimize your cloud spend by ensuring that only a single instance of your infrastructure stack exists at a time. You want to follow Google-recommended practices. What should you do?

Create a new pipeline to delete old infrastructure stacks when they are no longer needed.

Confirm that the pipeline is storing and retrieving the terraform.tfstate file from Cloud Storage with the Terraform gcs backend.

Verify that the pipeline is storing and retrieving the terraform.tfstate file from source control.

Update the pipeline to remove any existing infrastructure before you apply the latest configuration.

Question # 21

Your company runs an e-commerce business. The application responsible for payment processing has structured JSON logging with the following schema:

Capture and access of logs from the payment processing application is mandatory for operations, but the jsonPayload.user_email field contains personally identifiable information (PII). Your security team does not want the entire engineering team to have access to PII. You need to stop exposing PII to the engineering team and restrict access to security team members only. What should you do?

Apply a jsonPayload.user_email exclusion filter to the _Default bucket.

Apply the conditional role binding resource.name.extract("locations/global/buckets/(bucket)/") == "_Default" to the _Default bucket.

Apply a jsonPayload.user_email restricted field to the _Default bucket. Grant the Log Field Accessor role to the security team members.

Modify the application to toggle inclusion of user_email when the log_user_email environment variable is set to true. Restrict the engineering team members who can change the production environment variable by using the CODEOWNERS file.

Question # 22

You work for a global organization and are running a monolithic application on Compute Engine. You need to select the machine type for the application to use that optimizes CPU utilization by using the fewest number of steps. You want to use historical system metrics to identify the machine type for the application to use. You want to follow Google-recommended practices. What should you do?

Use the Recommender API and apply the suggested recommendations.

Create an Agent Policy to automatically install Ops Agent in all VMs.

Install the Ops Agent in a fleet of VMs by using the gcloud CLI.

Review the Cloud Monitoring dashboard for the VM and choose the machine type with the lowest CPU utilization.

Answer: A

Explanation:

The best option for selecting the machine type for the application to use that optimizes CPU utilization by using the fewest number of steps is to use the Recommender API and apply the suggested recommendations. The Recommender API is a service that provides recommendations for optimizing your Google Cloud resources, such as Compute Engine instances, disks, and firewalls. You can use the Recommender API to get recommendations for changing the machine type of your Compute Engine instances based on historical system metrics, such as CPU utilization. You can also apply the suggested recommendations by using the Recommender API or Cloud Console. This way, you can optimize CPU utilization by using the most suitable machine type for your application with minimal effort.

Your CTO has asked you to implement a postmortem policy on every incident for internal use. You want to define what a good postmortem is to ensure that the policy is successful at your company. What should you do?

Choose 2 answers

Ensure that all postmortems include what caused the incident, identify the person or team responsible for causing the incident, and how to prevent a future occurrence of the incident.

Ensure that all postmortems include what caused the incident, how the incident could have been worse, and how to prevent a future occurrence of the incident.

Ensure that all postmortems include the severity of the incident, how to prevent a future occurrence of the incident, and what caused the incident without naming internal system components.

Ensure that all postmortems include how the incident was resolved and what caused the incident without naming customer information.

Ensure that all postmortems include all incident participants in postmortem authoring and share postmortems as widely as possible.

Answer: BE

The correct answers are B and E.

A good postmortem should include what caused the incident, how the incident could have been worse, and how to prevent a future occurrence of the incident [1]. This helps to identify the root cause of the problem, the impact of the incident, and the actions to take to mitigate or eliminate the risk of recurrence.

A good postmortem should also include all incident participants in postmortem authoring and share postmortems as widely as possible [2]. This helps to foster a culture of learning and collaboration, as well as to increase the visibility and accountability of the incident response process.

Answer A is incorrect because it assigns blame to a person or team, which goes against the principle of blameless postmortems [2]. Blameless postmortems focus on finding solutions rather than pointing fingers, and encourage honest and constructive feedback without fear of punishment.

Answer C is incorrect because it omits how the incident could have been worse, which is an important factor to consider when evaluating the severity and impact of the incident [1]. It also avoids naming internal system components, which makes it harder to understand the technical details and root cause of the problem.

Answer D is incorrect because it omits how to prevent a future occurrence of the incident, which is the main goal of a postmortem [1]. It also avoids naming customer information, which may be relevant for understanding the impact and scope of the incident.

Your company uses Jenkins running on Google Cloud VM instances for CI/CD. You need to extend the functionality to use infrastructure as code automation by using Terraform. You must ensure that the Terraform Jenkins instance is authorized to create Google Cloud resources. You want to follow Google-recommended practices. What should you do?

Add the auth application-default command as a step in Jenkins before running the Terraform commands.

Create a dedicated service account for the Terraform instance. Download and copy the secret key value to the GOOGLE environment variable on the Jenkins server.

Confirm that the Jenkins VM instance has an attached service account with the appropriate Identity and Access Management (IAM) permissions.

Use the Terraform module so that Secret Manager can retrieve credentials.

Answer: C

The correct answer is C.

Confirming that the Jenkins VM instance has an attached service account with the appropriate Identity and Access Management (IAM) permissions is the best way to ensure that the Terraform Jenkins instance is authorized to create Google Cloud resources. This follows the Google-recommended practice of using service accounts to authenticate and authorize applications running on Google Cloud [1]. Service accounts are associated with private keys that can be used to generate access tokens for Google Cloud APIs [2]. By attaching a service account to the Jenkins VM instance, Terraform can use the Application Default Credentials (ADC) strategy to automatically find and use the service account credentials [3].

Answer A is incorrect because the auth application-default command is used to obtain user credentials, not service account credentials. User credentials are not recommended for applications running on Google Cloud, as they are less secure and less scalable than service account credentials [1].

Answer B is incorrect because it involves downloading and copying the secret key value of the service account, which is not a secure or reliable way of managing credentials. The secret key value should be kept private and not exposed to any other system or user [2]. Moreover, setting the GOOGLE environment variable on the Jenkins server is not a valid way of providing credentials to Terraform. Terraform expects the credentials to be either in a file pointed to by the GOOGLE_APPLICATION_CREDENTIALS environment variable, or in a provider block with the credentials argument [3].

Answer D is incorrect because it involves using the Terraform module for Secret Manager, which is a service that stores and manages sensitive data such as API keys, passwords, and certificates. While Secret Manager can be used to store and retrieve credentials, it is not necessary or sufficient for authorizing the Terraform Jenkins instance. The Terraform Jenkins instance still needs a service account with the appropriate IAM permissions to access Secret Manager and other Google Cloud resources.
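
The ADC lookup mentioned above can be pictured as an ordered search. The sketch below is a simplified model for illustration only, not the code of any Google library: resolveAdcSource and its hasGcloudAdcFile and onGoogleCloud parameters are hypothetical names, and the real lookup involves more detail than this.

```javascript
// Simplified model of the order in which Application Default Credentials are searched.
function resolveAdcSource({env = {}, hasGcloudAdcFile = false, onGoogleCloud = false} = {}) {
  // 1. An explicit credentials file named by the environment variable wins.
  if (env.GOOGLE_APPLICATION_CREDENTIALS) {
    return 'file from GOOGLE_APPLICATION_CREDENTIALS';
  }
  // 2. Next, a local ADC file created with the gcloud CLI (user credentials).
  if (hasGcloudAdcFile) {
    return 'local ADC file created by gcloud';
  }
  // 3. Finally, the service account attached to the resource, via the metadata server.
  if (onGoogleCloud) {
    return 'attached service account (metadata server)';
  }
  return null; // no credentials found
}

// A Jenkins VM with an attached service account and nothing else configured:
console.log(resolveAdcSource({onGoogleCloud: true}));
```

In this model, option C needs no key files or extra pipeline steps: with nothing else configured, the lookup falls through to the attached service account.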

You are analyzing Java applications in production. All applications have Cloud Profiler and Cloud Trace installed and configured by default. You want to determine which applications need performance tuning. What should you do?

Choose 2 answers

A. Examine the wall-clock time and the CPU time of the application. If the difference is substantial, increase the CPU resource allocation.

B. Examine the wall-clock time and the CPU time of the application. If the difference is substantial, increase the memory resource allocation.

C. Examine the wall-clock time and the CPU time of the application. If the difference is substantial, increase the local disk storage allocation.

D. Examine the latency time, the wall-clock time, and the CPU time of the application. If the latency time is slowly burning down the error budget, and the difference between wall-clock time and CPU time is minimal, mark the application for optimization.

E. Examine the heap usage of the application. If the usage is low, mark the application for optimization.

Answer: AD

The correct answers are A and D.

Examine the wall-clock time and the CPU time of the application. If the difference is substantial, increase the CPU resource allocation. This is a good way to determine if the application is CPU-bound, meaning that it spends more time waiting for the CPU than performing actual computation. Increasing the CPU resource allocation can improve the performance of CPU-bound applications [1].

Examine the latency time, the wall-clock time, and the CPU time of the application. If the latency time is slowly burning down the error budget, and the difference between wall-clock time and CPU time is minimal, mark the application for optimization. This is a good way to determine if the application is I/O-bound, meaning that it spends more time waiting for input/output operations than performing actual computation. Increasing the CPU resource allocation will not help I/O-bound applications, and they may need optimization to reduce the number or duration of I/O operations [2].

Answer B is incorrect because increasing the memory resource allocation will not help if the application is CPU-bound or I/O-bound. Memory allocation affects how much data the application can store and access in memory, but it does not affect how fast the application can process that data.

Answer C is incorrect because increasing the local disk storage allocation will not help if the application is CPU-bound or I/O-bound. Disk storage affects how much data the application can store and access on disk, but it does not affect how fast the application can process that data.

Answer E is incorrect because examining the heap usage of the application will not help to determine if the application needs performance tuning. Heap usage affects how much memory the application allocates for dynamic objects, but it does not affect how fast the application can process those objects. Moreover, low heap usage does not necessarily mean that the application is inefficient or unoptimized.
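
To make the two measurements concrete, the sketch below compares wall-clock time and CPU time for a block of work. It is written in Node.js only to match the other sample in this document (the question concerns Java applications, where Cloud Profiler collects these figures). The measure, busyLoop, and blockedWait helpers are hypothetical names for this illustration.

```javascript
// Runs fn and reports elapsed wall-clock time and process CPU time, in milliseconds.
function measure(fn) {
  const wallStart = process.hrtime.bigint();
  const cpuStart = process.cpuUsage();
  fn();
  const cpu = process.cpuUsage(cpuStart); // user/system microseconds since cpuStart
  const wallMs = Number(process.hrtime.bigint() - wallStart) / 1e6;
  const cpuMs = (cpu.user + cpu.system) / 1000;
  return {wallMs, cpuMs};
}

// Keeps the CPU busy for roughly `ms` milliseconds.
function busyLoop(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

// Blocks the thread for `ms` milliseconds without doing work (stands in for waiting).
function blockedWait(ms) {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

const busy = measure(() => busyLoop(200));
const waiting = measure(() => blockedWait(200));
console.log('busy   ', busy);    // CPU time is a large share of wall-clock time
console.log('waiting', waiting); // CPU time stays far below wall-clock time
```

The gap between the two numbers is the time the code spent not running on a CPU, which is the quantity the answer options ask you to examine.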

You deployed an application into a large Standard Google Kubernetes Engine (GKE) cluster. The application is stateless and multiple pods run at the same time. Your application receives inconsistent traffic. You need to ensure that the user experience remains consistent regardless of changes in traffic, and that the resource usage of the cluster is optimized.

What should you do?

Configure a cron job to scale the deployment on a schedule.

Configure a Horizontal Pod Autoscaler.

Configure a Vertical Pod Autoscaler.

Configure cluster autoscaling on the node pool.

Answer: B
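
As a rough illustration of how a Horizontal Pod Autoscaler follows changing traffic, the Kubernetes documentation gives its core rule as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The sketch below applies that rule with a simple min/max clamp. It ignores the tolerance band, stabilization windows, and pod readiness handling of the real controller, and the function name is made up for this example.

```javascript
// Core HPA scaling rule, clamped to the configured replica bounds.
function desiredReplicas(currentReplicas, currentMetric, targetMetric, min = 1, max = Infinity) {
  const raw = Math.ceil(currentReplicas * (currentMetric / targetMetric));
  return Math.min(max, Math.max(min, raw));
}

// 4 pods averaging 90% CPU against a 60% target: scale out.
console.log(desiredReplicas(4, 90, 60));    // 6
// Traffic drops to 30% average CPU, with a floor of 2 replicas: scale in.
console.log(desiredReplicas(4, 30, 60, 2)); // 2
```

Because the replica count tracks the observed metric in both directions, pods are added under load and removed when traffic falls.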

Question # 23

You are running a web application that connects to an AlloyDB cluster by using a private IP address in your default VPC. You need to run a database schema migration in your CI/CD pipeline by using Cloud Build before deploying a new version of your application. You want to follow Google-recommended security practices. What should you do?

Set up a Cloud Build private pool to access the database through a static external IP address. Configure the database to only allow connections from this IP address. Execute the schema migration script in the private pool.

Create a service account that has permission to access the database. Configure Cloud Build to use this service account and execute the schema migration script in a private pool.

Add the database username and encrypted password to the application configuration file. Use these credentials in Cloud Build to execute the schema migration script.

Add the database username and password to Secret Manager. When running the schema migration script, retrieve the username and password from Secret Manager.

Answer: B

Explanation:

To securely connect Cloud Build to an AlloyDB cluster using a private IP address and adhere to Google-recommended security practices, you need to address two main aspects:

Network Connectivity: Ensuring Cloud Build can reach the private IP of the AlloyDB cluster.

Authentication/Credential Management: Securely authenticating Cloud Build to the AlloyDB cluster.

Let's break down why Option B is the most suitable:

Cloud Build Private Pool: AlloyDB is accessed via a private IP in your VPC. Cloud Build's default build environment runs on Google-managed infrastructure outside your VPC and cannot directly access private IP addresses. To enable this, you must use a Cloud Build private pool. A private pool can be configured with VPC peering to your default VPC, allowing build steps running within that pool to access resources like your AlloyDB cluster via their private IPs. Option B correctly includes "execute the schema migration script in a private pool."

Service Account with Permissions (IAM Database Authentication): AlloyDB supports IAM database authentication. This is a Google-recommended security practice because it allows you to manage database access using Google Cloud's Identity and Access Management (IAM) rather than relying on traditional database passwords.

You would create a dedicated service account for Cloud Build (or use the private pool's service account).

This service account would be granted the IAM roles needed to connect to the AlloyDB instance (e.g., roles/alloydb.client) and to log in through IAM database authentication (e.g., roles/alloydb.databaseUser). The database privileges needed for the schema migration are then granted to the matching database user.

Cloud Build would then be configured to use this service account. The "permission to access the database" in Option B refers to these IAM permissions. This method avoids managing and distributing database passwords.
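
The two pieces described above (a private pool for network access and a dedicated service account for identity) can be sketched as a Cloud Build request body. This is a minimal sketch, not a definitive configuration: the project, region, pool name, service account, builder image, and migration script are all hypothetical placeholders, and the dictionary only mirrors the `serviceAccount`, `options.pool.name`, `options.logging`, and `steps` fields of the Cloud Build `Build` resource.

```python
import json

# Hypothetical identifiers; replace with real project, region, pool, and account names.
PROJECT = "my-project"
REGION = "us-central1"
POOL = "migration-pool"
SERVICE_ACCOUNT = "schema-migrator@my-project.iam.gserviceaccount.com"

def migration_build_request():
    """Build a request body that runs in a private pool as a dedicated service account."""
    return {
        # Identity: the build runs as this service account, not with a stored password.
        "serviceAccount": f"projects/{PROJECT}/serviceAccounts/{SERVICE_ACCOUNT}",
        "options": {
            # Network: the private pool is peered with the VPC that hosts AlloyDB.
            "pool": {"name": f"projects/{PROJECT}/locations/{REGION}/workerPools/{POOL}"},
            "logging": "CLOUD_LOGGING_ONLY",
        },
        "steps": [
            {
                "name": "gcr.io/my-project/migrator",  # hypothetical builder image
                "entrypoint": "bash",
                "args": ["-c", "./run_migrations.sh"],  # hypothetical migration script
            }
        ],
    }

build = migration_build_request()
print(json.dumps(build, indent=2))
```

Note that no database password appears anywhere in the request; access is decided by the IAM roles held by the service account.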

Analyzing the options:

A. Set up a Cloud Build private pool to access the database through a static external IP address...

While using a private pool is correct for network access, routing this through a static external IP for a resource that has a private IP is generally not the first-choice secure pattern if direct private access is feasible. It adds complexity and a potential external exposure point, even if firewalled. The aim is to keep traffic within the private network as much as possible.

B. Create a service account that has permission to access the database. Configure Cloud Build to use this service account and execute the schema migration script in a private pool.

This option correctly combines the use of a private pool (for private IP network access) with a service account having permissions (strongly implying IAM database authentication for AlloyDB, which is a best practice). This is a secure and robust approach.

C. Add the database username and encrypted password to the application configuration file...

Storing credentials, even if "encrypted" (the method and key management for encryption are unspecified and problematic), in application configuration files checked into source control or packaged with the application is a significant security risk and not a recommended practice.

D. Add the database username and password to Secret Manager. When running the schema migration script, retrieve the username and password from Secret Manager.

Using Secret Manager to store database usernames and passwords is a Google-recommended practice if you are using password-based authentication. However, this option alone does not solve the network connectivity issue for Cloud Build to reach the private IP of AlloyDB. You would still need a private pool. While D is good for secret management, B offers a more comprehensive solution that includes both the network aspect and implies a more modern authentication method (IAM database auth). If the question forced a choice between only doing secure credential storage (D) or doing IAM auth + private networking (B), B is more complete for the overall task.

Conclusion: Option B is the most aligned with Google-recommended security practices as it addresses both the necessary private network connectivity via a Cloud Build private pool and promotes the use of IAM-based database authentication for AlloyDB, which is generally preferred over managing passwords.

References (General Concepts):

Cloud Build Private Pools for VPC Access: Google Cloud documentation for Cloud Build explicitly details using private pools to connect to resources in a VPC network.

See: https://cloud.google.com/build/docs/private-pools/accessing-private-resources-with-private-pools

AlloyDB IAM Database Authentication: Google Cloud documentation for AlloyDB highlights IAM database authentication as a secure method.

See: https://cloud.google.com/alloydb/docs/iam-authentication

Secret Manager: If password authentication were the only option, Secret Manager would be the recommended way to store those credentials.

See: https://cloud.google.com/secret-manager

Option B synergizes the benefits of private networking and modern IAM-based authentication for a comprehensive secure solution.

Question # 24

You are developing a Node.js utility on a workstation in Cloud Workstations by using Code OSS. The utility is a simple web page, and you have already confirmed that all necessary firewall rules are in place. You tested the application by starting it on port 3000 on your workstation in Cloud Workstations, but you need to be able to access the web page from your local machine. You need to follow Google-recommended security practices. What should you do?

Allow public IP addresses in the Cloud Workstations configuration.

Use a browser running on a bastion host VM.

Run the gcloud compute start-iap-tunnel command to the Cloud Workstations VM.

Click the preview link in the Code OSS panel.

Question # 25

You built a serverless application by using Cloud Run and deployed the application to your production environment. You want to identify the resource utilization of the application for cost optimization. What should you do?

Use Cloud Trace with distributed tracing to monitor the resource utilization of the application.

Use Cloud Profiler with Ops Agent to monitor the CPU and memory utilization of the application.

Use Cloud Monitoring to monitor the container CPU and memory utilization of the application.

Use Cloud Ops to create logs-based metrics to monitor the resource utilization of the application.

Question # 26

You are responsible for the reliability of a high-volume enterprise application. A large number of users report that an important subset of the application's functionality – a data intensive reporting feature – is consistently failing with an HTTP 500 error. When you investigate your application's dashboards, you notice a strong correlation between the failures and a metric that represents the size of an internal queue used for generating reports. You trace the failures to a reporting backend that is experiencing high I/O wait times. You quickly fix the issue by resizing the backend's persistent disk (PD). Now you need to create an availability Service Level Indicator (SLI) for the report generation feature. How would you define it?

As the I/O wait times aggregated across all report generation backends

As the proportion of report generation requests that result in a successful response

As the application's report generation queue size compared to a known-good threshold

As the reporting backend PD throughput capacity compared to a known-good threshold
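
For orientation, an availability SLI is commonly expressed as the ratio of good events to valid events, measured from the user's point of view rather than from internal signals such as queue size or disk throughput. The sketch below is a minimal illustration of that ratio; the function name, request counts, and the 98% target are hypothetical and not taken from the question.

```python
def availability_sli(successful_requests, total_requests):
    """Proportion of valid requests that were served successfully, in [0, 1]."""
    if total_requests == 0:
        return 1.0  # assumption: with no demand, treat the window as fully available
    return successful_requests / total_requests

# Hypothetical counts for report generation requests over one measurement window.
sli = availability_sli(successful_requests=9_850, total_requests=10_000)
meets_slo = sli >= 0.98  # hypothetical 98% availability target
print(f"{sli:.2%}", meets_slo)
```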

Question # 27

Your team is designing a new application for deployment into Google Kubernetes Engine (GKE). You need to set up monitoring to collect and aggregate various application-level metrics in a centralized location. You want to use Google Cloud Platform services while minimizing the amount of work required to set up monitoring. What should you do?

Publish various metrics from the application directly to the Stackdriver Monitoring API, and then observe these custom metrics in Stackdriver.

Install the Cloud Pub/Sub client libraries, push various metrics from the application to various topics, and then observe the aggregated metrics in Stackdriver.

Install the OpenTelemetry client libraries in the application, configure Stackdriver as the export destination for the metrics, and then observe the application's metrics in Stackdriver.

Emit all metrics in the form of application-specific log messages, pass these messages from the containers to the Stackdriver logging collector, and then observe metrics in Stackdriver.

Question # 28

You have migrated an e-commerce application to Google Cloud Platform (GCP). You want to prepare the application for the upcoming busy season. What should you do first to prepare for the busy season?

Load test the application to profile its performance for scaling.

Enable AutoScaling on the production clusters, in case there is growth.

Pre-provision double the compute power used last season, expecting growth.

Create a runbook on inflating the disaster recovery (DR) environment if there is growth.

Question # 29

You are monitoring a service that uses n2-standard-2 Compute Engine instances that serve large files. Users have reported that downloads are slow. Your Cloud Monitoring dashboard shows that your VMs are running at peak network throughput. You want to improve the network throughput performance. What should you do?

Deploy a Cloud NAT gateway and attach the gateway to the subnet of the VMs.

Add additional network interface controllers (NICs) to your VMs.

Change the machine type for your VMs to n2-standard-8.

Deploy the Ops Agent to export additional monitoring metrics.

Question # 30

Your company is creating a new cloud-native Google Cloud organization. You expect this Google Cloud organization to first be used by a small number of departments and then expand to be used by a large number of departments. Each department has a large number of applications varying in size. You need to design the VPC network architecture. Your solution must minimize the amount of management required while remaining flexible enough for development teams to quickly adapt to their evolving needs. What should you do?

Create a separate VPC for each department and connect the VPCs with VPC Network Peering.

Create a separate VPC for each department and use Private Service Connect to connect the VPCs.

Create a separate VPC for each application and use Private Service Connect to connect the VPCs.

Create a separate VPC for each department and connect the VPCs with Cloud VPN.

Question # 31

You support a production service that runs on a single Compute Engine instance. You regularly need to spend time on recreating the service by deleting the crashing instance and creating a new instance based on the relevant image. You want to reduce the time spent performing manual operations while following Site Reliability Engineering principles. What should you do?

File a bug with the development team so they can find the root cause of the crashing instance.

Create a Managed Instance Group with a single instance and use health checks to determine the system status.

Add a Load Balancer in front of the Compute Engine instance and use health checks to determine the system status.

Create a Stackdriver Monitoring dashboard with SMS alerts to be able to start recreating the crashed instance promptly after it has crashed.

Question # 32

You need to run a business-critical workload on a fixed set of Compute Engine instances for several months. The workload is stable with the exact amount of resources allocated to it. You want to lower the costs for this workload without any performance implications. What should you do?

Purchase Committed Use Discounts.

Migrate the instances to a Managed Instance Group.

Convert the instances to preemptible virtual machines.

Create an Unmanaged Instance Group for the instances used to run the workload.

Question # 33

Your company's security team needs to have read-only access to Data Access audit logs in the _Required bucket. You want to provide your security team with the necessary permissions following the principle of least privilege and Google-recommended practices. What should you do?

Assign the roles/logging.viewer role to each member of the security team.

Assign the roles/logging.viewer role to a group with all the security team members.

Assign the roles/logging.privateLogViewer role to each member of the security team.

Assign the roles/logging.privateLogViewer role to a group with all the security team members.

Question # 34

Your organization wants to increase the availability target of an application from 99.9% to 99.99% for an investment of $2,000. The application's current revenue is $1,000,000. You need to determine whether the increase in availability is worth the investment for a single year of usage. What should you do?

Calculate the value of improved availability to be $900, and determine that the increase in availability is not worth the investment.

Calculate the value of improved availability to be $1,000, and determine that the increase in availability is not worth the investment.

Calculate the value of improved availability to be $1,000, and determine that the increase in availability is worth the investment.

Calculate the value of improved availability to be $9,000, and determine that the increase in availability is worth the investment.
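
The arithmetic behind this kind of decision can be worked through with the cost/benefit approach commonly used in SRE: value of improved availability = revenue × (target availability − current availability). The sketch below is a minimal illustration using the figures from the question; the function and variable names are illustrative, and `Decimal` is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal

def value_of_improved_availability(revenue, current, target):
    """Revenue attributable to the additional availability gained."""
    return revenue * (target - current)

revenue = Decimal("1000000")  # annual revenue from the question
value = value_of_improved_availability(
    revenue, current=Decimal("0.999"), target=Decimal("0.9999")
)
investment = Decimal("2000")
worth_it = value > investment
print(value, worth_it)
```

The improvement is 0.09 percentage points, so the attributable value is $1,000,000 × 0.0009 = $900, which is less than the $2,000 investment.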

Question # 35

You are on-call for an infrastructure service that has a large number of dependent systems. You receive an alert indicating that the service is failing to serve most of its requests and all of its dependent systems with hundreds of thousands of users are affected. As part of your Site Reliability Engineering (SRE) incident management protocol, you declare yourself Incident Commander (IC) and pull in two experienced people from your team as Operations Lead (OL) and Communications Lead (CL). What should you do next?

Look for ways to mitigate user impact and deploy the mitigations to production.

Contact the affected service owners and update them on the status of the incident.

Establish a communication channel where incident responders and leads can communicate with each other.

Start a postmortem, add incident information, circulate the draft internally, and ask internal stakeholders for input.

Question # 36

You recently configured an App Hub application. You are able to see the managed instance group, backend service, and URL map listed in App Hub, but you do not see the forwarding rule. You must ensure that the forwarding rule is listed. What should you do?

Attach the project containing the forwarding rule as an App Hub service project.

Enable the App Hub API in the project containing the forwarding rule.

Configure the forwarding rule to forward to the correct target proxy.

Register the forwarding rule as a service in the application configuration.

Answer:

Explanation:

Comprehensive and Detailed Explanation From General Google Cloud Knowledge:

App Hub allows you to organize and discover services and applications within your Google Cloud environment. For App Hub to recognize and display resources as components of an "application," these resources often need to be explicitly registered or discovered as "services" within that application's configuration. While App Hub can automatically discover some resources (like GKE workloads, Cloud Run services), for other resources, or to establish specific relationships, manual registration or more detailed configuration is sometimes required.

Option A (Attach the project containing the forwarding rule as an App Hub service project): While App Hub works across projects (host project for the application, service projects for services and workloads), simply attaching the project might not be sufficient for App Hub to automatically pick up and categorize every resource like a forwarding rule specifically for a defined application without further context. The forwarding rule needs to be associated with a service within the App Hub application.

Option B (Enable the App Hub API in the project containing the forwarding rule): The App Hub API needs to be enabled in projects where you want to manage App Hub resources (applications, services, workloads). If it wasn't enabled, you likely wouldn't be able to see any resources from that project. Since other resources are visible, this is less likely the root cause for a single missing resource, though it's a prerequisite for App Hub to function at all with that project.

Option C (Configure the forwarding rule to forward to the correct target proxy): While correct configuration of the forwarding rule is essential for its operational functionality, App Hub's ability to list the forwarding rule is more about its discovery and registration within App Hub's model rather than its traffic-directing correctness. An incorrectly configured forwarding rule that is properly registered might still appear in App Hub, perhaps with an error status.

Option D (Register the forwarding rule as a service in the application configuration): App Hub applications are composed of "services," and these services are in turn composed of "workloads" or other discovered/registered resources. A forwarding rule is typically an entry point or part of the infrastructure for a service. Explicitly registering it or the resource it points to (which then allows App Hub to trace back to the forwarding rule) as a service or part of a service within the application configuration would make it visible and properly cataloged by App Hub. App Hub discovers resources by looking for specific labels or by manual registration. If it's not automatically discovered as part of a recognized workload (like a GCE instance group service exposed via a load balancer), explicit registration is often the way to make it appear.

Reference (Based on general App Hub functionality):

App Hub discovers resources that are part of registered applications and their services. Services in App Hub can be based on various Google Cloud resources. If a resource like a forwarding rule isn't automatically linked to a displayed workload, it might need to be explicitly defined as a service or part of a service.

From the Google Cloud documentation on App Hub concepts:

"Applications are the core organizational unit in App Hub. An application represents a logical system that delivers business value... Services represent the logical components of an application... Workloads are instances of your services running on Google Cloud infrastructure. App Hub automatically discovers workloads for supported resource types or you can manually register them."

Forwarding rules are associated with load balancing, which exposes services. If the service that the forwarding rule points to is correctly registered and identified by App Hub, associated infrastructure like the forwarding rule should typically be discoverable. If it's not, ensuring the service it fronts is correctly registered and that App Hub understands this link is key. Option D aligns with this concept of ensuring the relevant component (which the forwarding rule is part of) is registered within the application structure.

You can find more information in the official Google Cloud documentation regarding App Hub:

App Hub overview: https://cloud.google.com/app-hub/docs/overview

Registering services and workloads: Documentation would detail how different resources are discovered or need to be registered.

Question # 37

You are responsible for creating and modifying the Terraform templates that define your Infrastructure. Because two new engineers will also be working on the same code, you need to define a process and adopt a tool that will prevent you from overwriting each other's code. You also want to ensure that you capture all updates in the latest version. What should you do?

• Store your code in a Git-based version control system.
• Establish a process that allows developers to merge their own changes at the end of each day.
• Package and upload code to a versioned Cloud Storage bucket as the latest master version.

• Store your code in a Git-based version control system.
• Establish a process that includes code reviews by peers and unit testing to ensure integrity and functionality before integration of code.
• Establish a process where the fully integrated code in the repository becomes the latest master version.

• Store your code as text files in Google Drive in a defined folder structure that organizes the files.
• At the end of each day, confirm that all changes have been captured in the files within the folder structure.
• Rename the folder structure with a predefined naming convention that increments the version.

• Store your code as text files in Google Drive in a defined folder structure that organizes the files.
• At the end of each day, confirm that all changes have been captured in the files within the folder structure and create a new .zip archive with a predefined naming convention.
• Upload the .zip archive to a versioned Cloud Storage bucket and accept it as the latest version.

Question # 38

Your company is migrating its production systems to Google Cloud. You need to implement site reliability engineering (SRE) practices during the migration to minimize customer impact from potential future incidents. Which two SRE practices should you implement?

Choose 2 answers

Ensure that full autonomy and permissions are only granted to the on-call team.

Automate common tasks to analyze key impact information and intelligently suggest mitigating actions for the on-call team.

Ensure that all teams can modify the production environment to resolve issues.

Create an alerting mechanism for your SRE team based on your system's internal behavior.

Create up-to-date playbooks with instructions for debugging and mitigating issues.

Answer:

B, E

Explanation:

Comprehensive and Detailed Explanation From General SRE Principles and Google Cloud Knowledge:

Site Reliability Engineering (SRE) emphasizes reliability, automation, and a data-driven approach to operations. The goal is to minimize the "time to detect" (TTD) and "time to resolve" (TTR) for incidents.

Option A (Ensure that full autonomy and permissions are only granted to the on-call team): While the on-call team needs appropriate permissions to act decisively during an incident, granting full autonomy and only to them can be a bottleneck and goes against the principle of least privilege if not carefully scoped. Broader teams might need specific, controlled access for their responsibilities. SRE encourages empowering teams but within a structured framework.

Option B (Automate common tasks to analyze key impact information and intelligently suggest mitigating actions for the on-call team): This is a core SRE practice. Automation reduces toil, speeds up response, and ensures consistency. Analyzing impact and suggesting mitigations helps the on-call team resolve issues faster and more effectively.

Option C (Ensure that all teams can modify the production environment to resolve issues): This is generally a bad practice and against SRE principles of controlled changes and reducing the blast radius of errors. Production changes should be managed, audited, and ideally automated, not open to modification by all teams, as this increases the risk of unintended incidents.

Option D (Create an alerting mechanism for your SRE team based on your system's internal behavior): While alerting is crucial, SRE emphasizes alerting on symptoms that affect users (Service Level Objectives - SLOs) rather than just internal behavior or causes. Alerting solely on internal behavior can lead to alert fatigue and may not correlate directly with user impact. Good alerting focuses on user-facing impact first.

Option E (Create up-to-date playbooks with instructions for debugging and mitigating issues): Playbooks (or runbooks) are essential in SRE. They document known issues, troubleshooting steps, and mitigation procedures. Keeping them up-to-date ensures that on-call engineers can respond to incidents quickly and consistently, even for less common issues, thereby minimizing customer impact.

Therefore, automating incident response tasks (B) and maintaining clear, actionable playbooks (E) are two key SRE practices to implement for minimizing customer impact.

Reference (Based on SRE principles):

The SRE books by Google (e.g., "Site Reliability Engineering: How Google Runs Production Systems") heavily emphasize automation to reduce toil and the importance of playbooks for incident management.

Google Cloud SRE solutions: https://cloud.google.com/sre

Specifically, regarding playbooks and automation: "Playbooks should be living documents, updated regularly as systems change and new incidents provide new lessons."

"SREs aim to automate repetitive tasks (toil) to free up time for engineering projects that improve reliability."

Question # 39

Your team deploys applications to three Google Kubernetes Engine (GKE) environments: development, staging, and production. You use GitHub repositories as your source of truth. You need to ensure that the three environments are consistent. You want to follow Google-recommended practices to enforce and install network policies and a logging DaemonSet on all the GKE clusters in those environments. What should you do?

Use Google Cloud Deploy to deploy the network policies and the DaemonSet. Use Cloud Monitoring to trigger an alert if the network policies and DaemonSet drift from your source in the repository.

Use Google Cloud Deploy to deploy the DaemonSet and use Policy Controller to configure the network policies. Use Cloud Monitoring to detect drifts from the source in the repository and Cloud Functions to correct the drifts.

Use Cloud Build to render and deploy the network policies and the DaemonSet. Set up Config Sync to sync the configurations for the three environments.

Use Cloud Build to render and deploy the network policies and the DaemonSet. Set up a Policy Controller to enforce the configurations for the three environments.

Question # 40

Some of your production services are running in Google Kubernetes Engine (GKE) in the eu-west-1 region. Your build system runs in the us-west-1 region. You want to push the container images from your build system to a scalable registry to maximize the bandwidth for transferring the images to the cluster. What should you do?

Push the images to Google Container Registry (GCR) using the gcr.io hostname.

Push the images to Google Container Registry (GCR) using the us.gcr.io hostname.

Push the images to Google Container Registry (GCR) using the eu.gcr.io hostname.

Push the images to a private image registry running on a Compute Engine instance in the eu-west-1 region.

Question # 41

You are the on-call Site Reliability Engineer for a microservice that is deployed to a Google Kubernetes Engine (GKE) Autopilot cluster. Your company runs an online store that publishes order messages to Pub/Sub, and a microservice receives these messages and updates stock information in the warehousing system. A sales event caused an increase in orders, and the stock information is not being updated quickly enough. This is causing a large number of orders to be accepted for products that are out of stock. You check the metrics for the microservice and compare them to typical levels.

You need to ensure that the warehouse system accurately reflects product inventory at the time orders are placed and minimize the impact on customers. What should you do?

Decrease the acknowledgment deadline on the subscription.

Add a virtual queue to the online store that allows typical traffic levels.

Increase the number of Pod replicas.

Increase the Pod CPU and memory limits.

Question # 42

You have deployed a fleet of Compute Engine instances in Google Cloud. You need to ensure that monitoring metrics and logs for the instances are visible in Cloud Logging and Cloud Monitoring by your company's operations and cybersecurity teams. You need to grant the required roles for the Compute Engine service account by using Identity and Access Management (IAM) while following the principle of least privilege. What should you do?

Grant the logging.editor and monitoring.metricWriter roles to the Compute Engine service accounts.

Grant the logging.admin and monitoring.editor roles to the Compute Engine service accounts.

Grant the logging.logWriter and monitoring.editor roles to the Compute Engine service accounts.

Grant the logging.logWriter and monitoring.metricWriter roles to the Compute Engine service accounts.

Answer:

D

Explanation:

The correct answer is D. Grant the logging.logWriter and monitoring.metricWriter roles to the Compute Engine service accounts.

According to the Google Cloud documentation, the Compute Engine default service account is created automatically when you enable the Compute Engine API [1]. This service account is used by default to run your Compute Engine instances and access other Google Cloud services on your behalf [1]. To ensure that monitoring metrics and logs for the instances are visible in Cloud Logging and Cloud Monitoring, you need to grant the following IAM roles to the Compute Engine service account [2][3]:

The logging.logWriter role allows the service account to write log entries to Cloud Logging [4].

The monitoring.metricWriter role allows the service account to write metric data to Cloud Monitoring [5].

These roles grant the minimum permissions that are needed for logging and monitoring, following the principle of least privilege. The other roles are either unnecessary or too broad for this purpose. For example, the logging.editor role grants permissions to create and update logs, log sinks, and log exclusions, which are not required for writing log entries [6]. The logging.admin role grants permissions to delete logs, log sinks, and log exclusions, which are not required for writing log entries and may pose a security risk if misused. The monitoring.editor role grants permissions to create and update alerting policies, uptime checks, notification channels, dashboards, and groups, which are not required for writing metric data.
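
As a rough illustration of what the least-privilege grant looks like, the sketch below adds the two write-only roles to an IAM policy represented as a plain dictionary in the standard `bindings` shape (`role` plus `members`). The service account email is a hypothetical placeholder, and the helper only manipulates local data; it does not call any Google Cloud API.

```python
import json

# Hypothetical default Compute Engine service account (PROJECT_NUMBER-compute@...).
MEMBER = "serviceAccount:123456789012-compute@developer.gserviceaccount.com"

# Write-only roles: enough to send logs and metrics, nothing more.
LEAST_PRIVILEGE_ROLES = ["roles/logging.logWriter", "roles/monitoring.metricWriter"]

def add_bindings(policy, member, roles):
    """Return a copy of an IAM policy dict with `member` added to each role."""
    bindings = [dict(b, members=list(b["members"])) for b in policy.get("bindings", [])]
    for role in roles:
        binding = next((b for b in bindings if b["role"] == role), None)
        if binding is None:
            bindings.append({"role": role, "members": [member]})
        elif member not in binding["members"]:
            binding["members"].append(member)
    return {**policy, "bindings": bindings}

policy = add_bindings({"bindings": []}, MEMBER, LEAST_PRIVILEGE_ROLES)
print(json.dumps(policy, indent=2))
```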

References: [1] Service accounts. [2] Setting up Stackdriver Logging for Compute Engine. [3] Setting up Stackdriver Monitoring for Compute Engine. [4], [5], [6] Predefined roles.

Question # 43

You are writing a postmortem for an incident that severely affected users. You want to prevent similar incidents in the future. Which two of the following sections should you include in the postmortem? (Choose two.)

An explanation of the root cause of the incident

A list of employees responsible for causing the incident

A list of action items to prevent a recurrence of the incident

Your opinion of the incident's severity compared to past incidents

Copies of the design documents for all the services impacted by the incident

Question # 44

You are the Operations Lead for an ongoing incident with one of your services. The service usually runs at around 70% capacity. You notice that one node is returning 5xx errors for all requests. There has also been a noticeable increase in support cases from customers. You need to remove the offending node from the load balancer pool so that you can isolate and investigate the node. You want to follow Google-recommended practices to manage the incident and reduce the impact on users. What should you do?

1. Communicate your intent to the incident team.
2. Perform a load analysis to determine if the remaining nodes can handle the increase in traffic offloaded from the removed node, and scale appropriately.
3. When any new nodes report healthy, drain traffic from the unhealthy node, and remove the unhealthy node from service.

1. Communicate your intent to the incident team.
2. Add a new node to the pool, and wait for the new node to report as healthy.
3. When traffic is being served on the new node, drain traffic from the unhealthy node, and remove the old node from service.

1. Drain traffic from the unhealthy node and remove the node from service.
2. Monitor traffic to ensure that the error is resolved and that the other nodes in the pool are handling the traffic appropriately.
3. Scale the pool as necessary to handle the new load.
4. Communicate your actions to the incident team.

1. Drain traffic from the unhealthy node and remove the old node from service.
2. Add a new node to the pool, wait for the new node to report as healthy, and then serve traffic to the new node.
3. Monitor traffic to ensure that the pool is healthy and is handling traffic appropriately.
4. Communicate your actions to the incident team.

Answer:

Explanation:

The correct answer is A. Communicate your intent to the incident team. Perform a load analysis to determine if the remaining nodes can handle the increase in traffic offloaded from the removed node, and scale appropriately. When any new nodes report healthy, drain traffic from the unhealthy node, and remove the unhealthy node from service.

This answer follows the Google-recommended practices for incident management, as described in Chapter 9, "Incident Response," of the Google SRE Book [1]. According to this source, some of the best practices are:

Maintain a clear line of command. Designate clearly defined roles. Keep a working record of debugging and mitigation as you go. Declare incidents early and often.

Communicate your intent before taking any action that might affect the service or the incident response. This helps to avoid confusion, duplication of work, or unintended consequences.

Perform a load analysis before removing a node from the load balancer pool, as this might affect the capacity and performance of the service. Scale the pool as necessary to handle the expected load.

Drain traffic from the unhealthy node before removing it from service, as this helps to avoid dropping requests or causing errors for users.

Answer A follows these best practices by communicating the intent to the incident team, performing a load analysis and scaling the pool, and draining traffic from the unhealthy node before removing it.

Answer B does not follow the best practice of performing a load analysis before adding or removing nodes, as this might cause overloading or underutilization of resources.

Answer C does not follow the best practice of communicating the intent before taking any action, as this might cause confusion or conflict with other responders.

Answer D does not follow the best practice of draining traffic from the unhealthy node before removing it, as this might cause errors for users.
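The load analysis step in answer A can be illustrated with a short calculation. This is a minimal sketch that assumes traffic is spread evenly across identical nodes. The 70% utilization comes from the question, while the pool sizes and the 80% target ceiling are hypothetical values chosen for the example.

```python
import math


def utilization_after_removal(nodes, utilization, removed=1):
    """Per-node utilization once `removed` nodes stop serving traffic.

    Assumes identical nodes and an even spread of load across the pool.
    """
    remaining = nodes - removed
    if remaining <= 0:
        raise ValueError("at least one node must remain in the pool")
    return utilization * nodes / remaining


def nodes_to_add(nodes, utilization, target, removed=1):
    """Healthy nodes to add so the pool stays at or below `target` utilization."""
    total_load = utilization * nodes  # load measured in whole-node capacities
    # The small epsilon guards against float noise pushing an exact
    # integer result up to the next whole node.
    required = math.ceil(total_load / target - 1e-9)
    return max(0, required - (nodes - removed))


if __name__ == "__main__":
    # Hypothetical pools running at the 70% utilization from the question.
    for pool_size in (4, 10):
        after = utilization_after_removal(pool_size, 0.70)
        extra = nodes_to_add(pool_size, 0.70, target=0.80)
        print(f"{pool_size} nodes: {after:.0%} after removal, add {extra} node(s)")
```

In a small pool, removing one node pushes the remaining nodes well past the target, so capacity has to be added before draining the unhealthy node. In a larger pool, the remaining nodes may absorb the traffic without scaling.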

[References: 1. Chapter 9 - Incident Response, Google SRE Book]

Question # 45

Your organization wants to collect system logs that will be used to generate dashboards in Cloud Operations for their Google Cloud project. You need to configure all current and future Compute Engine instances to collect the system logs and you must ensure that the Ops Agent remains up to date. What should you do?

Use the gcloud CLI to install the Ops Agent on each VM listed in the Cloud Asset Inventory.

Select all VMs with an Agent status of Not detected on the Cloud Operations VMs dashboard. Then select Install agents.

Use the gcloud CLI to create an Agent Policy.

Install the Ops Agent on the Compute Engine image by using a startup script.

Question # 46

You are leading a DevOps project for your organization. The DevOps team is responsible for managing the service infrastructure and being on-call for incidents. The Software Development team is responsible for writing, submitting, and reviewing code. Neither team has any published SLOs. You want to design a new joint-ownership model for a service between the DevOps team and the Software Development team. Which responsibilities should be assigned to each team in the new joint-ownership model?

Option A

Option B

Option C

Option D

Answer:

Explanation:

The correct answer is D. Option D.

According to DevOps best practices, a joint-ownership model for a service between the DevOps team and the Software Development team should follow these principles [1][2]:

The DevOps team and the Software Development team should share the responsibility and collaboration for managing the service infrastructure, performing code reviews, and adopting and sharing SLOs for the service.

The DevOps team and the Software Development team should have end-to-end ownership of the service, from design to development to deployment to operation to maintenance.

The DevOps team and the Software Development team should use common tools and processes to facilitate communication, coordination, and feedback.

The DevOps team and the Software Development team should align their goals and incentives with the business outcomes and customer satisfaction.

Option D is the only option that reflects these principles. Option D assigns both teams the responsibilities of managing the service infrastructure, performing code reviews, and adopting and sharing SLOs for the service. Option D also implies that both teams have end-to-end ownership of the service, as they are involved in every stage of the service lifecycle. Option D also encourages both teams to use common tools and processes, such as GitLab [3], to collaborate and communicate effectively. Option D also aligns both teams with the business outcomes and customer satisfaction, as they use SLOs to measure and improve the service quality.

The other options are incorrect because they do not follow the DevOps best practices. Option A is incorrect because it assigns only the DevOps team the responsibility of managing the service infrastructure, which creates a silo between the two teams and reduces their collaboration. Option A also does not assign any responsibility for adopting and sharing SLOs for the service, which means that both teams lack a common metric for measuring and improving the service quality. Option B is incorrect because it assigns only the Software Development team the responsibility of performing code reviews, which creates a gap between the two teams and reduces their feedback. Option B also does not assign any responsibility for adopting and sharing SLOs for the service, which means that both teams lack a common metric for measuring and improving the service quality. Option C is incorrect because it assigns both teams the same responsibilities as option A and option B, which combines their drawbacks.

[References: 1. 5 key organizational models for DevOps teams | GitLab. 2. Building a Culture of Full-Service Ownership - DevOps.com. 3. GitLab.]

Question # 47

You have a pool of application servers running on Compute Engine. You need to provide a secure solution that requires the least amount of configuration and allows developers to easily access application logs for troubleshooting. How would you implement the solution on GCP?

• Deploy the Stackdriver logging agent to the application servers. • Give the developers the IAM Logs Viewer role to access Stackdriver and view logs.

• Deploy the Stackdriver logging agent to the application servers. • Give the developers the IAM Logs Private Logs Viewer role to access Stackdriver and view logs.

• Deploy the Stackdriver monitoring agent to the application servers. • Give the developers the IAM Monitoring Viewer role to access Stackdriver and view metrics.

• Install the gsutil command line tool on your application servers. • Write a script using gsutil to upload your application log to a Cloud Storage bucket, and then schedule it to run via cron every 5 minutes. • Give the developers IAM Object Viewer access to view the logs in the specified bucket.

Question # 48

You use Spinnaker to deploy your application and have created a canary deployment stage in the pipeline. Your application has an in-memory cache that loads objects at start time. You want to automate the comparison of the canary version against the production version. How should you configure the canary analysis?

Compare the canary with a new deployment of the current production version.

Compare the canary with a new deployment of the previous production version.

Compare the canary with the existing deployment of the current production version.

Compare the canary with the average performance of a sliding window of previous production versions.

Question # 49

Your application's performance in Google Cloud has degraded since the last release. You suspect that downstream dependencies might be causing some requests to take longer to complete. You need to investigate the issue with your application to determine the cause. What should you do?

Configure Cloud Trace in your application.

Configure Error Reporting in your application.

Configure Cloud Profiler in your application.

Configure Google Cloud Managed Service for Prometheus in your application.

Question # 50

You are developing a strategy for monitoring your Google Cloud Platform (GCP) projects in production using Stackdriver Workspaces. One of the requirements is to be able to quickly identify and react to production environment issues without false alerts from development and staging projects. You want to ensure that you adhere to the principle of least privilege when providing relevant team members with access to Stackdriver Workspaces. What should you do?

Grant relevant team members read access to all GCP production projects. Create Stackdriver workspaces inside each project.

Grant relevant team members the Project Viewer IAM role on all GCP production projects. Create Stackdriver workspaces inside each project.

Choose an existing GCP production project to host the monitoring workspace. Attach the production projects to this workspace. Grant relevant team members read access to the Stackdriver Workspace.

Create a new GCP monitoring project, and create a Stackdriver Workspace inside it. Attach the production projects to this workspace. Grant relevant team members read access to the Stackdriver Workspace.

Question # 51

Your company runs applications in Google Kubernetes Engine (GKE). Several applications rely on ephemeral volumes. You noticed some applications were unstable due to the DiskPressure node condition on the worker nodes. You need to identify which Pods are causing the issue, but you do not have execute access to workloads and nodes. What should you do?

Check the node/ephemeral_storage/used_bytes metric by using Metrics Explorer.

Check the metric by using Metrics Explorer.

Locate all the Pods with emptyDir volumes. Use the df -h command to measure volume disk usage.

Locate all the Pods with emptyDir volumes. Use the du -sh * command to measure volume disk usage.

Question # 52

You are using Stackdriver to monitor applications hosted on Google Cloud Platform (GCP). You recently deployed a new application, but its logs are not appearing on the Stackdriver dashboard.

You need to troubleshoot the issue. What should you do?

Confirm that the Stackdriver agent has been installed in the hosting virtual machine.

Confirm that your account has the proper permissions to use the Stackdriver dashboard.

Confirm that port 25 has been opened in the firewall to allow messages through to Stackdriver.

Confirm that the application is using the required client library and the service account key has proper permissions.

Question # 53

You are responsible for the reliability of a custom-built, distributed file storage service that your company uses internally. This service handles thousands of file uploads and downloads daily. You need to define a service level indicator (SLI) to measure the reliability of your service usage and configure alerts to be notified of potential issues. Which SLI should you use to measure the reliability of the service?

Average request latency of API calls (e.g. get, put, list)

Average size of objects stored in your service

Ratio of successful API calls to the total number of attempted API calls

Number of successful file uploads and downloads per minute

Question # 54

The new version of your containerized application has been tested and is ready to be deployed to production on Google Kubernetes Engine (GKE). You could not fully load-test the new version in your pre-production environment, and you need to ensure that the application does not have performance problems after deployment. Your deployment must be automated. What should you do?

Deploy the application through a continuous delivery pipeline by using canary deployments. Use Cloud Monitoring to look for performance issues, and ramp up traffic as supported by the metrics.

Deploy the application through a continuous delivery pipeline by using blue/green deployments. Migrate traffic to the new version of the application and use Cloud Monitoring to look for performance issues.

Deploy the application by using kubectl and use Config Connector to slowly ramp up traffic between versions. Use Cloud Monitoring to look for performance issues.

Deploy the application by using kubectl and set the spec.updateStrategy.type field to RollingUpdate. Use Cloud Monitoring to look for performance issues, and run the kubectl rollback command if there are any issues.

Question # 55

You are configuring your CI/CD pipeline natively on Google Cloud. You want builds in a pre-production Google Kubernetes Engine (GKE) environment to be automatically load-tested before being promoted to the production GKE environment. You need to ensure that only builds that have passed this test are deployed to production. You want to follow Google-recommended practices. How should you configure this pipeline with Binary Authorization?

Create an attestation for the builds that pass the load test by requiring the lead quality assurance engineer to sign the attestation by using a key stored in Cloud Key Management Service (Cloud KMS).

Create an attestation for the builds that pass the load test by using a private key stored in Cloud Key Management Service (Cloud KMS) authenticated through Workload Identity.

Create an attestation for the builds that pass the load test by using a private key stored in Cloud Key Management Service (Cloud KMS) with a service account JSON key stored as a Kubernetes Secret.

Create an attestation for the builds that pass the load test by requiring the lead quality assurance engineer to sign the attestation by using their personal private key.

Question # 56

You support a user-facing web application. When analyzing the application's error budget over the previous six months, you notice that the application never consumed more than 5% of its error budget. You hold an SLO review with business stakeholders and confirm that the SLO is set appropriately. You want your application's reliability to more closely reflect its SLO. What steps can you take to further that goal while balancing velocity, reliability, and business needs?

Choose 2 answers

Add more serving capacity to all of your application's zones

Implement and measure all other available SLIs for the application

Announce planned downtime to consume more error budget and ensure that users are not depending on a tighter SLO

Have more frequent or potentially risky application releases

Tighten the SLO to match the application's observed reliability

Question # 57

DevOps team responsibilities: Manage the service infrastructure; Be on-call for incidents; Perform code reviews. Software Development team responsibilities: Submit code to be reviewed by the DevOps team; Publish the SLOs that the DevOps team must meet.

DevOps team responsibilities: Manage the service infrastructure; Perform code reviews. Software Development team responsibilities: Submit code to be reviewed by the DevOps team; Be on-call for incidents; Publish the SLOs that the DevOps team must meet.

DevOps team responsibilities: Shared responsibilities for code reviews. Software Development team responsibilities: Manage the service infrastructure; Be on-call for incidents on a rotation basis; Adopt and publish SLOs for the service; Submit code to be reviewed.

DevOps team responsibilities: Manage the service infrastructure; Be on-call for incidents. Software Development team responsibilities: Adopt and publish SLOs for the service; Submit code to be reviewed; Shared responsibilities for code reviews.

Question # 58

Your company runs services by using Google Kubernetes Engine (GKE). The GKE clusters in the development environment run applications with verbose logging enabled. Developers view logs by using the kubectl logs command and do not use Cloud Logging. Applications do not have a uniform logging structure defined. You need to minimize the costs associated with application logging while still collecting GKE operational logs. What should you do?

Run the gcloud container clusters update --logging=SYSTEM command for the development cluster.

Run the gcloud container clusters update --logging=WORKLOAD command for the development cluster.

Run the gcloud logging sinks update _Default --disabled command in the project associated with the development environment.

Add the severity >= DEBUG resource.type = "k8s_container" exclusion filter to the _Default logging sink in the project associated with the development environment.
