

Google Professional Data Engineer Exam

Page 32/32
Viewing Questions 311-319 out of 319 Questions

Question 311
Your chemical company needs to manually check documentation for customer orders. You use a pull subscription in Pub/Sub so that sales agents get the details of each order. You must ensure that the same order is not processed twice by different sales agents and that you do not add more complexity to this workflow. What should you do?
A. Use a Deduplicate PTransform in Dataflow before sending the messages to the sales agents.
B. Create a transactional database that monitors the pending messages.
C. Use Pub/Sub exactly-once delivery in your pull subscription.
D. Create a new Pub/Sub push subscription to monitor the orders processed in the agent's system.
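
For context on option C: exactly-once delivery is a per-subscription Pub/Sub setting, and subscribers confirm processing with acknowledgements that return a status. A minimal Python sketch, assuming hypothetical project, topic, and subscription names:

    from google.cloud import pubsub_v1
    from google.cloud.pubsub_v1.subscriber import exceptions as sub_exceptions

    subscriber = pubsub_v1.SubscriberClient()
    topic_path = subscriber.topic_path("my-project", "customer-orders")
    subscription_path = subscriber.subscription_path("my-project", "orders-pull-sub")

    # Create the pull subscription with exactly-once delivery enabled.
    subscriber.create_subscription(
        request={
            "name": subscription_path,
            "topic": topic_path,
            "enable_exactly_once_delivery": True,
        }
    )

    def callback(message: pubsub_v1.subscriber.message.Message) -> None:
        # Hand the order details to a sales agent here, then acknowledge.
        try:
            message.ack_with_response().result()  # ack status is reported back
        except sub_exceptions.AcknowledgeError:
            pass  # ack failed; Pub/Sub may redeliver the message

    streaming_pull = subscriber.subscribe(subscription_path, callback=callback)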

Question 312
You are migrating your on-premises data warehouse to BigQuery. As part of the migration, you want to facilitate cross-team collaboration to get the most value out of the organization’s data. You need to design an architecture that would allow teams within the organization to securely publish, discover, and subscribe to read-only data in a self-service manner. You need to minimize costs while also maximizing data freshness. What should you do?
A. Use Analytics Hub to facilitate data sharing.
B. Create authorized datasets to publish shared data in the subscribing team's project.
C. Create a new dataset for sharing in each individual team’s project. Grant the subscribing team the bigquery.dataViewer role on the dataset.
D. Use BigQuery Data Transfer Service to copy datasets to a centralized BigQuery project for sharing.
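
For context on option A: subscribing to an Analytics Hub listing creates a linked, read-only dataset in the subscriber's project that always reflects the publisher's current data, with no copies to maintain. A minimal sketch of a subscriber querying such a linked dataset, assuming hypothetical project and dataset names:

    from google.cloud import bigquery

    client = bigquery.Client(project="subscriber-project")  # hypothetical project ID

    # "shared_sales" is a linked dataset created by subscribing to an
    # Analytics Hub listing; it is read-only and always current.
    query = """
        SELECT region, SUM(revenue) AS total_revenue
        FROM `subscriber-project.shared_sales.orders`
        GROUP BY region
    """
    for row in client.query(query).result():
        print(row.region, row.total_revenue)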

Question 313
You want to migrate an Apache Spark 3 batch job from on-premises to Google Cloud. You need to minimally change the job so that the job reads from Cloud Storage and writes the result to BigQuery. Your job is optimized for Spark, where each executor has 8 vCPU and 16 GB memory, and you want to be able to choose similar settings. You want to minimize installation and management effort to run your job. What should you do?
A. Execute the job as part of a deployment in a new Google Kubernetes Engine cluster.
B. Execute the job from a new Compute Engine VM.
C. Execute the job in a new Dataproc cluster.
D. Execute the job as a Dataproc Serverless batch workload.
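
For context on option D: Dataproc Serverless for Spark runs a batch workload without any cluster to create or manage, and Spark properties such as executor cores and memory can be set per batch. A minimal Python sketch, assuming hypothetical project, bucket, and file names:

    from google.cloud import dataproc_v1

    region = "us-central1"
    client = dataproc_v1.BatchControllerClient(
        client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
    )

    batch = dataproc_v1.Batch(
        pyspark_batch=dataproc_v1.PySparkBatch(
            main_python_file_uri="gs://my-bucket/jobs/etl_job.py",
            # Spark BigQuery connector (bundled in newer runtimes; path is illustrative).
            jar_file_uris=["gs://spark-lib/bigquery/spark-bigquery-latest_2.12.jar"],
        ),
        runtime_config=dataproc_v1.RuntimeConfig(
            properties={
                "spark.executor.cores": "8",    # mirror the on-premises executor shape
                "spark.executor.memory": "16g",
            }
        ),
    )

    operation = client.create_batch(
        parent=f"projects/my-project/locations/{region}",
        batch=batch,
        batch_id="spark-migration-batch",
    )
    operation.result()  # wait for the batch to finish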

Question 314
You are configuring networking for a Dataflow job. The data pipeline uses custom container images with the libraries that are required for the transformation logic preinstalled. The data pipeline reads the data from Cloud Storage and writes the data to BigQuery. You need to ensure cost-effective and secure communication between the pipeline and Google APIs and services. What should you do?
A. Disable external IP addresses from worker VMs and enable Private Google Access.
B. Leave external IP addresses assigned to worker VMs while enforcing firewall rules.
C. Disable external IP addresses and establish a Private Service Connect endpoint IP address.
D. Enable Cloud NAT to provide outbound internet connectivity while enforcing firewall rules.
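
For context on option A: when the worker subnetwork has Private Google Access enabled, Dataflow workers without external IP addresses can still reach Google APIs such as Cloud Storage and BigQuery, avoiding external IP charges. A minimal sketch of the matching pipeline options, assuming hypothetical project, subnetwork, and container image names:

    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions([
        "--runner=DataflowRunner",
        "--project=my-project",
        "--region=us-central1",
        # Subnetwork that has Private Google Access enabled.
        "--subnetwork=regions/us-central1/subnetworks/dataflow-subnet",
        # Workers get no external IP addresses.
        "--no_use_public_ips",
        # Custom container image with the transformation libraries preinstalled.
        "--sdk_container_image=us-central1-docker.pkg.dev/my-project/images/beam-custom:latest",
        "--temp_location=gs://my-bucket/tmp",
    ])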

Question 315
You are using Workflows to call an API that returns a 1KB JSON response, apply some complex business logic on this response, wait for the logic to complete, and then perform a load from a Cloud Storage file to BigQuery. The Workflows standard library does not have sufficient capabilities to perform your complex logic, and you want to use Python's standard library instead. You want to optimize your workflow for simplicity and speed of execution. What should you do?
A. Create a Cloud Composer environment and run the logic in Cloud Composer.
B. Create a Dataproc cluster, and use PySpark to apply the logic on your JSON file.
C. Invoke a Cloud Function instance that uses Python to apply the logic on your JSON file.
D. Invoke a subworkflow in Workflows to apply the logic on your JSON file.
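
For context on option C: a Cloud Function written in Python can hold the complex business logic using only the standard library, and Workflows can call it with an HTTP step and wait for the JSON response. A minimal sketch of the function side, with purely illustrative logic and names:

    import json
    import statistics  # Python standard library only

    import functions_framework

    @functions_framework.http
    def apply_business_logic(request):
        order = request.get_json(silent=True) or {}
        prices = [item.get("price", 0) for item in order.get("items", [])]
        result = {
            "order_id": order.get("id"),
            "median_price": statistics.median(prices) if prices else None,
        }
        return json.dumps(result), 200, {"Content-Type": "application/json"}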


Question 316
You are administering a BigQuery on-demand environment. Your business intelligence tool is submitting hundreds of queries each day that aggregate a large (50 TB) sales history fact table at the day and month levels. These queries have a slow response time and are exceeding cost expectations. You need to decrease response time, lower query costs, and minimize maintenance. What should you do?
A. Build authorized views on top of the sales table to aggregate data at the day and month level.
B. Enable BI Engine and add your sales table as a preferred table.
C. Build materialized views on top of the sales table to aggregate data at the day and month level.
D. Create a scheduled query to build sales day and sales month aggregate tables on an hourly basis.
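
For context on option C: a materialized view precomputes and automatically maintains the aggregates, so repeated BI queries scan far less data than the 50 TB fact table. A minimal sketch of the DDL submitted through the BigQuery Python client, with hypothetical table and column names:

    from google.cloud import bigquery

    client = bigquery.Client(project="my-project")  # hypothetical project ID

    ddl = """
    CREATE MATERIALIZED VIEW `my-project.sales.sales_by_day_mv` AS
    SELECT
      TIMESTAMP_TRUNC(order_timestamp, DAY) AS sale_day,
      SUM(sale_amount) AS total_amount,
      COUNT(*) AS order_count
    FROM `my-project.sales.sales_history`
    GROUP BY sale_day
    """
    client.query(ddl).result()  # BigQuery refreshes the view automatically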

Question 317
You have several different unstructured data sources, within your on-premises data center as well as in the cloud. The data is in various formats, such as Apache Parquet and CSV. You want to centralize this data in Cloud Storage. You need to set up an object sink for your data that allows you to use your own encryption keys. You want to use a GUI-based solution. What should you do?
A. Use BigQuery Data Transfer Service to move files into BigQuery.
B. Use Storage Transfer Service to move files into Cloud Storage.
C. Use Dataflow to move files into Cloud Storage.
D. Use Cloud Data Fusion to move files into Cloud Storage.
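
For context: whichever GUI-driven tool moves the files, the "use your own encryption keys" requirement on the Cloud Storage sink is typically met by setting a default Cloud KMS (CMEK) key on the destination bucket, so every object written to it is encrypted with that key. A minimal sketch, assuming hypothetical bucket and key names:

    from google.cloud import storage

    client = storage.Client(project="my-project")  # hypothetical project ID
    bucket = client.get_bucket("central-data-sink")

    # New objects written to the bucket (for example by Storage Transfer Service)
    # are encrypted with this customer-managed key by default.
    bucket.default_kms_key_name = (
        "projects/my-project/locations/us/keyRings/data-ring/cryptoKeys/sink-key"
    )
    bucket.patch()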

Question 318
You are using BigQuery with a regional dataset that includes a table with the daily sales volumes. This table is updated multiple times per day. You need to protect your sales table in case of regional failures with a recovery point objective (RPO) of less than 24 hours, while keeping costs to a minimum. What should you do?
A. Schedule a daily export of the table to a Cloud Storage dual-region or multi-region bucket.
B. Schedule a daily copy of the dataset to a backup region.
C. Schedule a daily BigQuery snapshot of the table.
D. Modify the ETL job to load the data into both the current region and a backup region.
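
For context on option A: a table export can be driven by a job that runs once a day and extracts the table to a dual-region or multi-region Cloud Storage bucket, which keeps the recovery point objective under 24 hours. A minimal sketch of the export step, assuming hypothetical table and bucket names:

    from google.cloud import bigquery

    client = bigquery.Client(project="my-project")  # hypothetical project ID

    extract_job = client.extract_table(
        "my-project.sales.daily_volumes",                       # source table
        "gs://sales-backup-dual-region/daily_volumes-*.avro",   # dual-region bucket
        job_config=bigquery.ExtractJobConfig(destination_format="AVRO"),
    )
    extract_job.result()  # run daily, e.g. from Cloud Scheduler or cron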

Question 319
You are preparing an organization-wide dataset. You need to preprocess customer data stored in a restricted bucket in Cloud Storage. The data will be used to create consumer analyses. You need to follow data privacy requirements, including protecting certain sensitive data elements, while also retaining all of the data for potential future use cases. What should you do?
A. Use the Cloud Data Loss Prevention API and Dataflow to detect and remove sensitive fields from the data in Cloud Storage. Write the filtered data in BigQuery.
B. Use customer-managed encryption keys (CMEK) to directly encrypt the data in Cloud Storage. Use federated queries from BigQuery. Share the encryption key by following the principle of least privilege.
C. Use Dataflow and the Cloud Data Loss Prevention API to mask sensitive data. Write the processed data in BigQuery.
D. Use Dataflow and Cloud KMS to encrypt sensitive fields and write the encrypted data in BigQuery. Share the encryption key by following the principle of least privilege.
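
For context on the de-identification options: the Cloud Data Loss Prevention API (now part of Sensitive Data Protection) can mask sensitive values in place rather than dropping them, so the rest of each record is retained. Inside a Dataflow pipeline, each record would pass through a call like the following minimal sketch, with a hypothetical project ID and sample value:

    from google.cloud import dlp_v2

    dlp = dlp_v2.DlpServiceClient()

    response = dlp.deidentify_content(
        request={
            "parent": "projects/my-project",  # hypothetical project ID
            "inspect_config": {"info_types": [{"name": "EMAIL_ADDRESS"}]},
            "deidentify_config": {
                "info_type_transformations": {
                    "transformations": [
                        {
                            "primitive_transformation": {
                                "character_mask_config": {"masking_character": "#"}
                            }
                        }
                    ]
                }
            },
            "item": {"value": "Contact the customer at jane.doe@example.com"},
        }
    )
    print(response.item.value)  # the email address is replaced with "#" characters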


