GCP GCP-PDE Free Practice Questions — Page 3

Professional Data Engineer • 5 questions • Answers & explanations included

Question 11

You are designing a basket abandonment system for an ecommerce company. The system will send a message to a user based on these rules:

- No interaction by the user on the site for 1 hour
- Has added more than $30 worth of products to the basket
- Has not completed a transaction

You use Google Cloud Dataflow to process the data and decide if a message should be sent. How should you design the pipeline?

A. Use a fixed-time window with a duration of 60 minutes.
B. Use a sliding time window with a duration of 60 minutes.
C. Use a session window with a gap time duration of 60 minutes.
D. Use a global window with a time-based trigger with a delay of 60 minutes.

Correct Answer: C. Use a session window with a gap time duration of 60 minutes.

A session window groups events by user activity and closes after a defined period of inactivity — exactly matching the rule "no interaction for 1 hour." When the session closes after 60 minutes of inactivity, Dataflow can evaluate the basket state and trigger a message. Fixed windows (A) split time into equal chunks regardless of user activity. Sliding windows (B) overlap and don't represent inactivity gaps. Global windows (D) collect all data and don't naturally model per-user inactivity.
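The gap-based grouping behind session windows can be illustrated with a small pure-Python sketch. This is not Beam itself (in a Beam pipeline you would apply `beam.WindowInto(window.Sessions(3600))` to a keyed collection); it only shows how a 60-minute inactivity gap splits one user's event timestamps into sessions:

```python
# Sketch of session-window grouping: events closer together than the
# gap belong to one session; a gap of >= 1 hour closes the session.
GAP_SECONDS = 60 * 60  # 60-minute inactivity gap

def sessionize(timestamps, gap=GAP_SECONDS):
    """Group one user's event timestamps (in seconds) into sessions."""
    sessions = []
    current = []
    for ts in sorted(timestamps):
        if current and ts - current[-1] >= gap:
            sessions.append(current)  # gap reached: close the session
            current = []
        current.append(ts)
    if current:
        sessions.append(current)
    return sessions

# Events at 0s and 10min form one session; an event 2 hours after the
# last one starts a new session.
print(sessionize([0, 600, 7800]))  # [[0, 600], [7800]]
```

When a session closes, the pipeline can inspect the basket state for that user and decide whether to send the abandonment message.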

Question 12

Your company handles data processing for a number of different clients. Each client prefers to use their own suite of analytics tools, with some allowing direct query access via Google BigQuery. You need to secure the data so that clients cannot see each other's data. You want to ensure appropriate access to the data. Which three steps should you take? (Choose three.)

A. Load data into different partitions.
B. Load data into a different dataset for each client.
C. Put each client's BigQuery dataset into a different table.
D. Restrict a client's dataset to approved users.
E. Only allow a service account to access the datasets.
F. Use the appropriate identity and access management (IAM) roles for each client's users.

Correct Answers: B. Load data into a different dataset for each client.; D. Restrict a client's dataset to approved users.; F. Use the appropriate identity and access management (IAM) roles for each client's users.

Loading each client's data into a separate dataset (B) provides natural isolation boundaries in BigQuery. Restricting each dataset to approved users (D) ensures clients can't cross-access. Using proper IAM roles (F) enforces permissions at the dataset or project level. Option A (partitions) doesn't isolate clients — partitions exist within one table. Option C is backwards — datasets contain tables, not the other way. Option E (service account only) would block clients from querying their own data directly.
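The per-client isolation pattern can be sketched as a helper that builds one dataset definition per client with read access restricted to that client's approved users. The dictionary shape below loosely mirrors a BigQuery dataset's `access` entry list, but the function name and client identifiers are illustrative, not a real API:

```python
# Hypothetical sketch: one dataset per client, restricted to that
# client's approved users via IAM-style access entries. The entry shape
# loosely mirrors BigQuery's dataset "access" list; names are illustrative.
def client_dataset_access(client_id, approved_users):
    """Build a per-client dataset definition with read-only access entries."""
    return {
        "datasetId": f"client_{client_id}",
        "access": [
            {"role": "READER", "userByEmail": email}
            for email in approved_users
        ],
    }

acl = client_dataset_access("acme", ["analyst@acme.example"])
print(acl["datasetId"])          # client_acme
print(acl["access"][0]["role"])  # READER
```

Because each client has a separate dataset, granting a role at the dataset level gives that client's users query access without exposing any other client's data.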

Question 13

You want to process payment transactions in a point-of-sale application that will run on Google Cloud Platform. Your user base could grow exponentially, but you do not want to manage infrastructure scaling. Which Google database service should you use?

A. Cloud SQL
B. BigQuery
C. Cloud Bigtable
D. Cloud Datastore

Correct Answer: D. Cloud Datastore

Cloud Datastore (now Firestore in Datastore mode) is a fully managed, serverless NoSQL database that scales automatically — no infrastructure management needed. It suits transactional, high-growth applications. Cloud SQL (A) requires instance sizing and manual scaling. BigQuery (B) is an analytics warehouse, not a transactional database. Cloud Bigtable (C) is for massive analytical/time-series workloads and requires cluster management.

Question 14

You want to use a database of information about tissue samples to classify future tissue samples as either normal or mutated. You are evaluating an unsupervised anomaly detection method for classifying the tissue samples. Which two characteristics support this method? (Choose two.)

A. There are very few occurrences of mutations relative to normal samples.
B. There are roughly equal occurrences of both normal and mutated samples in the database.
C. You expect future mutations to have different features from the mutated samples in the database.
D. You expect future mutations to have similar features to the mutated samples in the database.
E. You already have labels for which samples are mutated and which are normal in the database.

Correct Answers: A. There are very few occurrences of mutations relative to normal samples.; C. You expect future mutations to have different features from the mutated samples in the database.

Unsupervised anomaly detection works best when mutations are rare (A) — the model learns "normal" and flags deviations. It also works when future mutations are different from known mutations (C), because the method doesn't rely on labeled examples — it detects novelty. Option B (equal occurrences) favors supervised learning. Option D (similar future mutations to known ones) favors supervised classification. Option E (labeled data exists) is the defining characteristic of supervised learning, not unsupervised.
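The idea can be made concrete with a minimal unsupervised detector: fit a threshold on the (mostly normal) data and flag anything far from the mean. This is a deliberately simple z-score-style sketch, not a recommendation for real tissue-sample classification; note it uses no labels and catches novel deviations, matching characteristics A and C:

```python
import statistics

# Minimal unsupervised anomaly detector: learn the distribution of the
# (mostly normal) samples, then flag points far from the mean. No labels
# are used, and novel anomalies are flagged simply because they deviate
# from "normal".
def fit_threshold(values, k=3.0):
    """Return (mean, cutoff): points beyond k standard deviations are anomalous."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    return mean, k * stdev

def is_anomaly(x, mean, cutoff):
    return abs(x - mean) > cutoff

normal = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]
mean, cutoff = fit_threshold(normal)
print(is_anomaly(10.05, mean, cutoff))  # False
print(is_anomaly(25.0, mean, cutoff))   # True
```

A supervised classifier, by contrast, would need the labeled examples of option E and would generalize best when future mutations resemble known ones (option D).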

Question 15

You need to store and analyze social media postings in Google BigQuery at a rate of 10,000 messages per minute in near real-time. You initially designed the application to use streaming inserts for individual postings, and the application performs data aggregations right after the streaming inserts. You discover that the queries after streaming inserts do not exhibit strong consistency, and reports from the queries might miss in-flight data. How can you adjust your application design?

A. Re-write the application to load accumulated data every 2 minutes.
B. Convert the streaming insert code to batch load for individual messages.
C. Load the original message to Google Cloud SQL, and export the table every hour to BigQuery via streaming inserts.
D. Estimate the average latency for data availability after streaming inserts, and always run queries after waiting twice as long.

Correct Answer: A. Re-write the application to load accumulated data every 2 minutes.

BigQuery streaming inserts have a known limitation: data is not immediately available for queries due to the streaming buffer. Aggregations right after inserts will miss in-flight rows. Switching to micro-batch loads every 2 minutes ensures data is fully committed before queries run, achieving near-real-time with strong consistency. Option B (batch per individual message) defeats the purpose of batching. Option C (Cloud SQL export) adds unnecessary complexity and latency. Option D (waiting twice the latency) is unreliable and not a design solution.
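The micro-batch pattern can be sketched as a buffer that accumulates messages and flushes them as one load every couple of minutes. Here `load_fn` is a hypothetical stand-in for a BigQuery batch load job; the class and its names are illustrative, not a real client-library API:

```python
import time

# Sketch of micro-batching: buffer messages and flush them as one load
# every `interval` seconds instead of streaming each row individually.
# `load_fn` stands in for a BigQuery batch load job (hypothetical).
class MicroBatcher:
    def __init__(self, load_fn, interval=120, clock=time.monotonic):
        self.load_fn = load_fn
        self.interval = interval
        self.clock = clock
        self.buffer = []
        self.last_flush = clock()

    def add(self, message):
        self.buffer.append(message)
        if self.clock() - self.last_flush >= self.interval:
            self.flush()

    def flush(self):
        if self.buffer:
            self.load_fn(self.buffer)  # one fully committed batch load
            self.buffer = []
        self.last_flush = self.clock()

# Demo with a fake clock so the 2-minute interval is easy to exercise.
loads = []
fake_time = [0.0]
b = MicroBatcher(loads.append, interval=120, clock=lambda: fake_time[0])
b.add("post-1")
fake_time[0] = 121.0
b.add("post-2")          # interval elapsed: both rows flushed together
print(loads)             # [['post-1', 'post-2']]
```

Injecting the clock keeps the flush logic testable; after each flush, everything in the batch is committed, so aggregation queries that run between flushes see a consistent snapshot.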
