Question 71
You are migrating data from a legacy on-premises MySQL database to Google Cloud. The database contains various tables with different data types and sizes, including large tables with millions of rows and transactional data. You need to migrate this data while maintaining data integrity and minimizing downtime and cost. What should you do?
A. Set up a Cloud Composer environment to orchestrate a custom data pipeline. Use a Python script to extract data from the MySQL database and load it to MySQL on Compute Engine.
B. Export the MySQL database to CSV files, transfer the files to Cloud Storage by using Storage Transfer Service, and load the files into a Cloud SQL for MySQL instance.
C. Use Database Migration Service to replicate the MySQL database to a Cloud SQL for MySQL instance.
D. Use Cloud Data Fusion to migrate the MySQL database to MySQL on Compute Engine.
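Whichever migration path is chosen, a lightweight post-migration spot-check helps confirm the target matches the source. A minimal Python sketch, assuming network access to both databases and using placeholder hostnames, credentials, and table names (pymysql is one common client library):

import pymysql

def row_count(conn, table):
    # Count rows in a single table; a coarse but quick consistency signal.
    with conn.cursor() as cur:
        cur.execute(f"SELECT COUNT(*) FROM `{table}`")
        return cur.fetchone()[0]

# Placeholder connection details for the legacy source and the Cloud SQL target.
source = pymysql.connect(host="legacy-mysql.internal", user="migrator",
                         password="secret", database="sales")
target = pymysql.connect(host="10.20.0.5", user="migrator",
                         password="secret", database="sales")

for table in ["orders", "order_items", "customers"]:
    src, dst = row_count(source, table), row_count(target, table)
    print(f"{table}: source={src} target={dst} "
          f"{'OK' if src == dst else 'MISMATCH'}")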
Question 72
You created a customer support application that sends several forms of data to Google Cloud. Your application is sending:
1. Audio files from phone interactions with support agents that will be accessed during trainings.
2. CSV files of users’ personally identifiable information (PII) that will be analyzed with SQL.
3. A large volume of small document files that will power other applications.
You need to select the appropriate tool for each data type given the required use case, while following Google-recommended practices. Which should you choose?
A. 1. Cloud Storage
2. Cloud SQL for PostgreSQL
3. Bigtable
B. 1. Filestore
2. Cloud SQL for PostgreSQL
3. Datastore
C. 1. Cloud Storage
2. BigQuery
3. Firestore
D. 1. Filestore
2. Bigtable
3. BigQuery
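For reference, a minimal Python sketch of how each data type would be written under the Cloud Storage / BigQuery / Firestore combination listed above; the project, bucket, dataset, and collection names are assumptions:

from google.cloud import storage, bigquery, firestore

PROJECT = "my-project"  # assumed project ID

# 1. Audio files from support calls: object upload to Cloud Storage.
storage.Client(project=PROJECT).bucket("support-call-audio") \
    .blob("calls/2024-05-01/agent-42.wav").upload_from_filename("agent-42.wav")

# 2. CSV files of PII analyzed with SQL: load job into BigQuery.
bq = bigquery.Client(project=PROJECT)
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV, skip_leading_rows=1, autodetect=True)
bq.load_table_from_uri("gs://support-uploads/users.csv",
                       f"{PROJECT}.support.users", job_config=job_config).result()

# 3. Small document files that power other applications: Firestore documents.
firestore.Client(project=PROJECT).collection("tickets") \
    .document("ticket-123").set({"status": "open", "priority": "p2"})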
Question 73
You are working on a project that requires analyzing daily social media data. You have 100 GB of JSON-formatted data stored in Cloud Storage that keeps growing. You need to transform and load this data into BigQuery for analysis. You want to follow the Google-recommended approach. What should you do?
A. Use Cloud Data Fusion to transfer the data into BigQuery raw tables, and use SQL to transform it.
B. Use Dataflow to transform the data and write the transformed data to BigQuery.
C. Manually download the data from Cloud Storage. Use a Python script to transform and upload the data into BigQuery.
D. Use Cloud Run functions to transform and load the data into BigQuery.
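To illustrate the Dataflow option, a minimal Apache Beam (Python SDK) sketch that reads the JSON files from Cloud Storage, applies a transformation, and writes to BigQuery; the bucket, table, and field names are assumptions, and the real transform depends on the social media schema:

import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def to_bq_row(record):
    # Placeholder transformation; adapt to the actual JSON structure.
    return {"post_id": record.get("id"),
            "text": record.get("text"),
            "created_at": record.get("created_at")}

options = PipelineOptions(runner="DataflowRunner", project="my-project",
                          region="us-central1",
                          temp_location="gs://my-bucket/tmp")

with beam.Pipeline(options=options) as p:
    (p
     | "ReadJson" >> beam.io.ReadFromText("gs://my-bucket/social/*.json")
     | "Parse" >> beam.Map(json.loads)
     | "Transform" >> beam.Map(to_bq_row)
     | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
           "my-project:analytics.social_posts",
           write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
           create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED))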
Question 74
You have a Cloud SQL for PostgreSQL database that stores sensitive historical financial data. You need to ensure that the data is uncorrupted and recoverable in the event that the primary region is destroyed. The data is valuable, so you need to prioritize recovery point objective (RPO) over recovery time objective (RTO). You want to recommend a solution that minimizes latency for primary read and write operations. What should you do?
A. Configure the Cloud SQL for PostgreSQL instance for multi-region backup locations.
B. Configure the Cloud SQL for PostgreSQL instance for regional availability (HA) with synchronous replication to a secondary instance in a different zone.
C. Configure the Cloud SQL for PostgreSQL instance for regional availability (HA) with asynchronous replication to a secondary instance in a different region.
D. Configure the Cloud SQL for PostgreSQL instance for regional availability (HA). Back up the Cloud SQL for PostgreSQL database hourly to a Cloud Storage bucket in a different region.
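As a sketch of the cross-region replication variant mentioned in the options, one way to add an asynchronous read replica in another region to an existing primary is through the Cloud SQL Admin API; the instance names, regions, and machine tier are assumptions:

from googleapiclient import discovery

sqladmin = discovery.build("sqladmin", "v1beta4")

# Asynchronous cross-region read replica of an assumed HA primary
# named "finance-primary" (the replica lives in a different region).
replica_body = {
    "name": "finance-replica-dr",
    "masterInstanceName": "finance-primary",
    "region": "us-east1",
    "settings": {"tier": "db-custom-4-15360"},
}
sqladmin.instances().insert(project="my-project", body=replica_body).execute()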
Question 75
You work for a healthcare company. You have a daily ETL pipeline that extracts patient data from a legacy system, transforms it, and loads it into BigQuery for analysis. The pipeline currently runs manually using a shell script. You want to automate this process and add monitoring to ensure pipeline observability and troubleshooting insights. You want one centralized solution, using open-source tooling, without rewriting the ETL code. What should you do?
A. Create a Cloud Run function that runs the pipeline daily. Monitor the function's execution using Cloud Monitoring.
B. Configure Cloud Dataflow to implement the ETL pipeline, and use Cloud Scheduler to trigger the Dataflow pipeline daily. Monitor the pipeline's execution using the Dataflow job monitoring interface and Cloud Monitoring.
C. Use Cloud Scheduler to trigger a Dataproc job to execute the pipeline daily. Monitor the job's progress using the Dataproc job web interface and Cloud Monitoring.
D. Create a directed acyclic graph (DAG) in Cloud Composer to orchestrate the pipeline and trigger it daily. Monitor the pipeline's execution using the Apache Airflow web interface and Cloud Monitoring.
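For the Cloud Composer option, a minimal Airflow DAG sketch that wraps the existing shell script without rewriting the ETL code; the script path, DAG ID, and schedule are assumptions:

from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_patient_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",   # "schedule" in newer Airflow releases
    catchup=False,
) as dag:
    # The trailing space stops Airflow from treating the .sh path as a Jinja template.
    run_etl = BashOperator(
        task_id="run_legacy_etl_script",
        bash_command="bash /home/airflow/gcs/data/run_patient_etl.sh ",
    )

Task logs and run history then appear in the Airflow web interface, and Cloud Composer exports environment and DAG metrics to Cloud Monitoring.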
Question 76
Your company has an on-premises file server with 5 TB of data that needs to be migrated to Google Cloud. The network operations team has mandated that you can only use up to 250 Mbps of the total available bandwidth for the migration. You need to perform an online migration to Cloud Storage. What should you do?
A. Use the gcloud storage cp command to copy all files from on-premises to Cloud Storage using the --daisy-chain option.
B. Use Storage Transfer Service to configure an agent-based transfer. Set the appropriate bandwidth limit for the agent pool.
C. Request a Transfer Appliance, copy the data to the appliance, and ship it back to Google Cloud.
D. Use the gcloud storage cp command to copy all files from on-premises to Cloud Storage using the --no-clobber option.
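For context on the bandwidth constraint, a quick back-of-the-envelope estimate in Python (decimal units, ideal sustained throughput) of how long 5 TB takes at 250 Mbps:

data_tb = 5
bandwidth_mbps = 250

megabits = data_tb * 1_000_000 * 8        # 5 TB -> megabits (decimal units)
seconds = megabits / bandwidth_mbps       # assumes the full 250 Mbps is sustained
print(f"~{seconds / 3600:.1f} hours (~{seconds / 86400:.1f} days)")
# prints ~44.4 hours (~1.9 days), before protocol overhead or throttling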
Question 77
Following a recent company acquisition, you inherited an on-premises data infrastructure that needs to move to Google Cloud. The acquired system has 250 Apache Airflow directed acyclic graphs (DAGs) orchestrating data pipelines. You need to migrate the pipelines to a Google Cloud managed service with minimal effort. What should you do?
A. Create a Google Kubernetes Engine (GKE) standard cluster and deploy Airflow as a workload. Migrate all DAGs to the new Airflow environment.
B. Create a Cloud Data Fusion instance. For each DAG, create a Cloud Data Fusion pipeline.
C. Create a new Cloud Composer environment and copy DAGs to the Cloud Composer dags/ folder.
D. Convert each DAG to a Cloud Workflow and automate the execution with Cloud Scheduler.
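For the Cloud Composer option, a minimal sketch that copies the existing DAG files into the environment's dags/ folder, which is a prefix in the environment's Cloud Storage bucket; the bucket name and local path are assumptions:

import pathlib
from google.cloud import storage

# Assumed bucket created for the Cloud Composer environment.
COMPOSER_BUCKET = "us-central1-my-env-12345678-bucket"

bucket = storage.Client().bucket(COMPOSER_BUCKET)
for dag_file in pathlib.Path("legacy_airflow/dags").glob("*.py"):
    bucket.blob(f"dags/{dag_file.name}").upload_from_filename(str(dag_file))
    print(f"Uploaded {dag_file.name}")

Individual DAGs may still need small adjustments (for example, connection IDs or provider package versions), but the orchestration code itself carries over unchanged.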
Question 78
You created a curated dataset of market trends in BigQuery that you want to share with multiple external partners. You want to control the rows and columns that each partner has access to. You want to follow Google-recommended practices. What should you do?
A. Publish the dataset in Analytics Hub. Grant dataset-level access to each partner by using subscriptions.
B. Grant each partner read access to the BigQuery dataset by using IAM roles.
C. Create a separate Cloud Storage bucket for each partner. Export the dataset to each bucket and assign each partner to their respective bucket. Grant bucket-level access by using IAM roles.
D. Create a separate project for each partner and copy the dataset into each project. Publish each dataset in Analytics Hub. Grant dataset-level access to each partner by using subscriptions.
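Whichever sharing mechanism is used, per-partner row restrictions in BigQuery are typically expressed as row access policies (column restrictions use policy tags). A minimal sketch, with the project, table, grantee, and filter column assumed:

from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# Partner A only sees rows tagged with its own partner_id.
ddl = """
CREATE OR REPLACE ROW ACCESS POLICY partner_a_filter
ON `my-project.market.trends`
GRANT TO ("user:analyst@partner-a.example")
FILTER USING (partner_id = "partner_a")
"""
client.query(ddl).result()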
Question 79
Your company stores historical data in Cloud Storage. You need to ensure that all data is saved in a bucket for at least three years. What should you do?
A. Set temporary object holds.
B. Set a bucket retention policy.
C. Change the bucket storage class to Archive.
D. Enable Object Versioning.
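To illustrate the retention-policy option, a minimal sketch (bucket name assumed) that enforces a three-year minimum retention on every object in the bucket:

from google.cloud import storage

THREE_YEARS_SECONDS = 3 * 365 * 24 * 60 * 60  # retention period in seconds

bucket = storage.Client().get_bucket("historical-data-bucket")
bucket.retention_period = THREE_YEARS_SECONDS
bucket.patch()

# Optionally lock the policy so the retention period can no longer be
# reduced or removed:
# bucket.lock_retention_policy()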
Question 80
Your company wants to implement a data transformation (ETL) pipeline for their BigQuery data warehouse. You need to identify a managed transformation solution that allows users to develop with SQL and JavaScript, has version control, allows for modular code, and has data quality checks. What should you do?
A. Use Dataform to define the transformations in SQLX.
B. Use Dataproc to create an Apache Spark cluster and implement the transformations by using PySpark SQL.
C. Create a Cloud Composer environment, and orchestrate the transformations by using the BigQueryInsertJob operator.
D. Create BigQuery scheduled queries to define the transformations in SQL.