
Google Professional Data Engineer Exam

Page 29/32
Viewing Questions 281-290 out of 319 Questions

Question 281
You work for a large ecommerce company. You store your customers' order data in Bigtable. You have a garbage collection policy set to delete the data after 30 days, and the number of versions is set to 1. When the data analysts run a query to report total customer spending, the analysts sometimes see customer data that is older than 30 days. You need to ensure that the analysts do not see customer data older than 30 days while minimizing cost and overhead. What should you do?
A. Set the expiring values of the column families to 29 days and keep the number of versions to 1.
B. Use a timestamp range filter in the query to fetch the customer's data for a specific range.
C. Schedule a job daily to scan the data in the table and delete data older than 30 days.
D. Set the expiring values of the column families to 30 days and set the number of versions to 2.
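
Bigtable garbage collection is best-effort and can lag the configured policy, so already-expired cells may still be returned by reads; a read-time timestamp filter (the approach described in option B) is what guarantees analysts never see cells older than 30 days. A minimal sketch with the google-cloud-bigtable Python client, assuming hypothetical project, instance, and table names:

    from datetime import datetime, timedelta, timezone

    from google.cloud import bigtable
    from google.cloud.bigtable import row_filters

    client = bigtable.Client(project="my-project")      # placeholder project
    table = client.instance("orders-instance").table("customer-orders")

    # Only return cells written within the last 30 days, regardless of
    # whether garbage collection has already removed older cells.
    cutoff = (datetime.now(timezone.utc) - timedelta(days=30)).replace(microsecond=0)
    ts_filter = row_filters.TimestampRangeFilter(row_filters.TimestampRange(start=cutoff))

    for row in table.read_rows(filter_=ts_filter):
        print(row.row_key, row.cells)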

Question 282
You are using a Dataflow streaming job to read messages from a message bus that does not support exactly-once delivery. Your job then applies some transformations, and loads the result into BigQuery. You want to ensure that your data is being streamed into BigQuery with exactly-once delivery semantics. You expect your ingestion throughput into BigQuery to be about 1.5 GB per second. What should you do?
A. Use the BigQuery Storage Write API and ensure that your target BigQuery table is regional.
B. Use the BigQuery Storage Write API and ensure that your target BigQuery table is multiregional.
C. Use the BigQuery Streaming API and ensure that your target BigQuery table is regional.
D. Use the BigQuery Streaming API and ensure that your target BigQuery table is multiregional.
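
The Storage Write API is the BigQuery ingestion path that provides exactly-once streaming semantics (the legacy streaming insert API is at-least-once). In a Dataflow pipeline written with the Beam Python SDK it is selected through the write method, as in this sketch; the subscription, table, and schema are illustrative placeholders:

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    TABLE = "my-project:sales.orders"  # placeholder table

    with beam.Pipeline(options=PipelineOptions(streaming=True)) as p:
        (
            p
            | "Read" >> beam.io.ReadFromPubSub(
                subscription="projects/my-project/subscriptions/orders-sub")
            | "Parse" >> beam.Map(lambda msg: {"payload": msg.decode("utf-8")})
            | "Write" >> beam.io.WriteToBigQuery(
                TABLE,
                schema="payload:STRING",
                # Exactly-once streaming writes into BigQuery.
                method=beam.io.WriteToBigQuery.Method.STORAGE_WRITE_API,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )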

Question 283
You have created an external table for Apache Hive partitioned data that resides in a Cloud Storage bucket, which contains a large number of files. You notice that queries against this table are slow. You want to improve the performance of these queries. What should you do?
A. Change the storage class of the Hive partitioned data objects from Coldline to Standard.
B. Create an individual external table for each Hive partition by using a common table name prefix. Use wildcard table queries to reference the partitioned data.
C. Upgrade the external table to a BigLake table. Enable metadata caching for the table.
D. Migrate the Hive partitioned data objects to a multi-region Cloud Storage bucket.
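
BigLake tables can cache table metadata (the file listing and statistics), which removes the per-query overhead of enumerating a large number of Cloud Storage objects, the usual cause of slow queries over Hive-partitioned external data. A hedged sketch of the DDL for option C, issued through the BigQuery Python client; the project, dataset, connection, and bucket names are placeholders:

    from google.cloud import bigquery

    client = bigquery.Client(project="my-project")  # placeholder project

    # BigLake external table over Hive-partitioned Parquet files, with
    # metadata caching enabled (metadata_cache_mode requires max_staleness).
    ddl = """
    CREATE OR REPLACE EXTERNAL TABLE `my-project.analytics.orders_ext`
    WITH PARTITION COLUMNS
    WITH CONNECTION `my-project.us.biglake-conn`
    OPTIONS (
      format = 'PARQUET',
      uris = ['gs://my-bucket/orders/*'],
      hive_partition_uri_prefix = 'gs://my-bucket/orders',
      max_staleness = INTERVAL 1 HOUR,
      metadata_cache_mode = 'AUTOMATIC'
    )
    """
    client.query(ddl).result()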

Question 284
You have a network of 1000 sensors. The sensors generate time series data: one metric per sensor per second, along with a timestamp. You already have 1 TB of data, and expect the data to grow by 1 GB every day. You need to access this data in two ways. The first access pattern requires retrieving the metric from one specific sensor stored at a specific timestamp, with a median single-digit millisecond latency. The second access pattern requires running complex analytic queries on the data, including joins, once a day. How should you store this data?
A. Store your data in BigQuery. Concatenate the sensor ID and timestamp, and use it as the primary key.
B. Store your data in Bigtable. Concatenate the sensor ID and timestamp and use it as the row key. Perform an export to BigQuery every day.
C. Store your data in Bigtable. Concatenate the sensor ID and metric, and use it as the row key. Perform an export to BigQuery every day.
D. Store your data in BigQuery. Use the metric as a primary key.
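
Building the Bigtable row key from the sensor ID plus the timestamp turns the first access pattern into a single-row point read, and the daily export covers the analytic joins in BigQuery. A minimal sketch of the write and point-read paths; the instance, table, and column-family names are placeholders and assumed to exist:

    from google.cloud import bigtable

    client = bigtable.Client(project="my-project")  # placeholder project
    table = client.instance("sensors").table("metrics")

    # Row key: sensor ID concatenated with the epoch-second timestamp.
    row_key = b"sensor-0042#1735689600"

    # Write one metric value into column family "m".
    row = table.direct_row(row_key)
    row.set_cell("m", "value", b"21.7")
    row.commit()

    # Point read: one specific sensor at one specific timestamp.
    result = table.read_row(row_key)
    print(result.cells["m"][b"value"][0].value)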

Question 285
You have 100 GB of data stored in a BigQuery table. This data is outdated and will only be accessed one or two times a year for analytics with SQL. For backup purposes, you want this data to be stored immutably for 3 years. You want to minimize storage costs. What should you do?
A. 1. Create a BigQuery table clone.
2. Query the clone when you need to perform analytics.
B. 1. Create a BigQuery table snapshot.
2. Restore the snapshot when you need to perform analytics.
C. 1. Perform a BigQuery export to a Cloud Storage bucket with archive storage class.
2. Enable versioning on the bucket.
3. Create a BigQuery external table on the exported files.
D. 1. Perform a BigQuery export to a Cloud Storage bucket with archive storage class.
2. Set a locked retention policy on the bucket.
3. Create a BigQuery external table on the exported files.
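
Archive-class storage combined with a locked bucket retention policy gives immutability at the lowest storage price, while an external table keeps the exported files queryable with SQL. A hedged sketch of the steps in option D using the Cloud Storage and BigQuery Python clients; all bucket, dataset, and table names are placeholders:

    from google.cloud import bigquery, storage

    project = "my-project"  # placeholder project
    bq = bigquery.Client(project=project)
    gcs = storage.Client(project=project)

    # 1. Archive-class bucket with a locked 3-year retention policy.
    bucket = gcs.bucket("orders-archive-bucket")
    bucket.storage_class = "ARCHIVE"
    bucket = gcs.create_bucket(bucket, location="US")
    bucket.retention_period = 3 * 365 * 24 * 3600   # 3 years, in seconds
    bucket.patch()
    bucket.lock_retention_policy()   # irreversible: exported objects become immutable

    # 2. Export the outdated table to the bucket.
    bq.extract_table(
        f"{project}.sales.orders_old",
        "gs://orders-archive-bucket/orders/*.avro",
        job_config=bigquery.ExtractJobConfig(destination_format="AVRO"),
    ).result()

    # 3. External table over the exported files for the rare SQL reads.
    ext_config = bigquery.ExternalConfig("AVRO")
    ext_config.source_uris = ["gs://orders-archive-bucket/orders/*.avro"]
    ext = bigquery.Table(f"{project}.sales.orders_old_archive")
    ext.external_data_configuration = ext_config
    bq.create_table(ext)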


Question 286
You have thousands of Apache Spark jobs running in your on-premises Apache Hadoop cluster. You want to migrate the jobs to Google Cloud. You want to use managed services to run your jobs instead of maintaining a long-lived Hadoop cluster yourself. You have a tight timeline and want to keep code changes to a minimum. What should you do?
A. Move your data to BigQuery. Convert your Spark scripts to a SQL-based processing approach.
B. Rewrite your jobs in Apache Beam. Run your jobs in Dataflow.
C. Copy your data to Compute Engine disks. Manage and run your jobs directly on those instances.
D. Move your data to Cloud Storage. Run your jobs on Dataproc.
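
Dataproc can run existing Spark code largely unchanged once the input paths point at Cloud Storage (gs:// instead of hdfs://), which is what keeps the code changes and timeline small. A sketch that submits an existing PySpark script with the Dataproc Python client; the cluster name, region, and file URIs are placeholders:

    from google.cloud import dataproc_v1

    region = "us-central1"  # placeholder region
    client = dataproc_v1.JobControllerClient(
        client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
    )

    # Submit an existing PySpark script that now reads from Cloud Storage.
    job = {
        "placement": {"cluster_name": "migrated-spark-cluster"},
        "pyspark_job": {
            "main_python_file_uri": "gs://my-bucket/jobs/daily_aggregation.py",
            "args": ["--input", "gs://my-bucket/data/"],
        },
    }
    operation = client.submit_job_as_operation(
        request={"project_id": "my-project", "region": region, "job": job}
    )
    print(operation.result().status.state)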

Question 287
You are administering shared BigQuery datasets that contain views used by multiple teams in your organization. The marketing team is concerned about the variability of their monthly BigQuery analytics spend using the on-demand billing model. You need to help the marketing team establish a consistent BigQuery analytics spend each month. What should you do?
A. Create a BigQuery Enterprise reservation with a baseline of 250 slots and autoscaling set to 500 for the marketing team, and bill them back accordingly.
B. Establish a BigQuery quota for the marketing team, and limit the maximum number of bytes scanned each day.
C. Create a BigQuery reservation with a baseline of 500 slots with no autoscaling for the marketing team, and bill them back accordingly.
D. Create a BigQuery Standard pay-as-you-go reservation with a baseline of 0 slots and autoscaling set to 500 for the marketing team, and bill them back accordingly.
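
A reservation with a fixed slot baseline and no autoscaling produces a flat capacity charge each month, unlike on-demand billing, which varies with bytes scanned. A hedged sketch with the BigQuery Reservation API Python client; the admin project, location, and assignee project are placeholders, and the exact cost depends on the edition and any capacity commitment:

    from google.cloud import bigquery_reservation_v1 as reservation

    client = reservation.ReservationServiceClient()
    parent = "projects/admin-project/locations/US"  # placeholder admin project

    # Fixed 500-slot reservation, no autoscaling, for a predictable spend.
    res = client.create_reservation(
        parent=parent,
        reservation_id="marketing",
        reservation=reservation.Reservation(slot_capacity=500),
    )

    # Assign the marketing team's project to the reservation for chargeback.
    client.create_assignment(
        parent=res.name,
        assignment=reservation.Assignment(
            job_type=reservation.Assignment.JobType.QUERY,
            assignee="projects/marketing-project",
        ),
    )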

Question 288
You are part of a healthcare organization where data is organized and managed by respective data owners in various storage services. As a result of this decentralized ecosystem, discovering and managing data has become difficult. You need to quickly identify and implement a cost-optimized solution to assist your organization with the following:
• Data management and discovery
• Data lineage tracking
• Data quality validation
How should you build the solution?
A. Use BigLake to convert the current solution into a data lake architecture.
B. Build a new data discovery tool on Google Kubernetes Engine that helps with new source onboarding and data lineage tracking.
C. Use BigQuery to track data lineage, and use Dataprep to manage data and perform data quality validation.
D. Use Dataplex to manage data, track data lineage, and perform data quality validation.
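
Dataplex layers discovery, lineage tracking, and data quality checks over data that stays in place in Cloud Storage and BigQuery, which suits a decentralized, multi-owner environment. A minimal, heavily simplified sketch that only creates a Dataplex lake (zones, assets, and data quality scans would then be configured on top of it); all names are placeholders:

    from google.cloud import dataplex_v1

    client = dataplex_v1.DataplexServiceClient()
    parent = "projects/my-project/locations/us-central1"  # placeholder

    # Create a lake; existing Cloud Storage buckets and BigQuery datasets
    # are then attached as assets under zones of this lake, and lineage and
    # data quality scans are configured against those assets.
    operation = client.create_lake(
        parent=parent,
        lake_id="healthcare-lake",
        lake=dataplex_v1.Lake(display_name="Healthcare data"),
    )
    print(operation.result().name)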

Question 289
You have data located in BigQuery that is used to generate reports for your company. You have noticed that some fields in the weekly executive report do not conform to company formatting standards. For example, report errors include different telephone formats and different country code identifiers. This is a frequent issue, so you need to create a recurring job to normalize the data. You want a quick solution that requires no coding. What should you do?
A. Use Cloud Data Fusion and Wrangler to normalize the data, and set up a recurring job.
B. Use Dataflow SQL to create a job that normalizes the data, and after the first run of the job, schedule the pipeline to execute recurrently.
C. Create a Spark job and submit it to Dataproc Serverless.
D. Use BigQuery and GoogleSQL to normalize the data, and schedule recurring queries in BigQuery.
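
If the scheduled-query route in option D were taken instead of a no-code tool, the recurring normalization could be set up programmatically through the BigQuery Data Transfer Service. The sketch below is illustrative only; the project, dataset, schedule, and normalization SQL are placeholder assumptions:

    from google.cloud import bigquery_datatransfer_v1 as transfer

    client = transfer.DataTransferServiceClient()
    parent = "projects/my-project/locations/us"  # placeholder project/location

    # Weekly scheduled query that rewrites phone numbers into a single
    # company-standard format (the SQL shown is illustrative).
    config = transfer.TransferConfig(
        display_name="normalize-exec-report",
        data_source_id="scheduled_query",
        schedule="every monday 06:00",
        params={
            "query": """
                CREATE OR REPLACE TABLE reporting.exec_report_clean AS
                SELECT * REPLACE (REGEXP_REPLACE(phone, r'[^0-9+]', '') AS phone)
                FROM reporting.exec_report_raw
            """,
        },
    )
    client.create_transfer_config(parent=parent, transfer_config=config)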

Question 290
You are designing a messaging system by using Pub/Sub to process clickstream data with an event-driven consumer app that relies on a push subscription. You need to configure the messaging system to be reliable enough to handle temporary downtime of the consumer app. You also need the messaging system to store the input messages that cannot be consumed by the subscriber. The system needs to retry failed messages gradually to avoid overloading the consumer app, and store the failed messages in a topic after a maximum of 10 retries. How should you configure the Pub/Sub subscription?
A. Increase the acknowledgement deadline to 10 minutes.
B. Use immediate redelivery as the subscription retry policy, and configure dead lettering to a different topic with maximum delivery attempts set to 10.
C. Use exponential backoff as the subscription retry policy, and configure dead lettering to the same source topic with maximum delivery attempts set to 10.
D. Use exponential backoff as the subscription retry policy, and configure dead lettering to a different topic with maximum delivery attempts set to 10.
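
Both the gradual retry behavior and the overflow destination are properties of the subscription itself: an exponential backoff retry policy spaces redeliveries out, and a dead-letter policy forwards a message to a separate topic once the delivery attempts are exhausted. A sketch with the Pub/Sub Python client; the project, topics, and push endpoint are placeholders:

    from google.cloud import pubsub_v1
    from google.protobuf import duration_pb2

    subscriber = pubsub_v1.SubscriberClient()
    project = "my-project"  # placeholder project

    subscription = pubsub_v1.types.Subscription(
        name=f"projects/{project}/subscriptions/clickstream-push",
        topic=f"projects/{project}/topics/clickstream",
        push_config=pubsub_v1.types.PushConfig(
            push_endpoint="https://consumer.example.com/push"),
        # Exponential backoff spaces retries out so a recovering consumer
        # app is not overloaded with redeliveries.
        retry_policy=pubsub_v1.types.RetryPolicy(
            minimum_backoff=duration_pb2.Duration(seconds=10),
            maximum_backoff=duration_pb2.Duration(seconds=600),
        ),
        # After 10 failed delivery attempts the message is forwarded to a
        # separate dead-letter topic instead of being retried forever.
        dead_letter_policy=pubsub_v1.types.DeadLetterPolicy(
            dead_letter_topic=f"projects/{project}/topics/clickstream-dead-letter",
            max_delivery_attempts=10,
        ),
    )
    subscriber.create_subscription(request=subscription)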


