Data Warehousing with Microsoft Azure Synapse Analytics Coursera Quiz Answers
In this article, I am going to share the Data Warehousing with Microsoft Azure Synapse Analytics quiz answers for all weeks with you.
Enrol Link: Data Warehousing with Microsoft Azure Synapse Analytics
WEEK 1 QUIZ ANSWERS
Knowledge check
Question 1)
The process of building a modern data warehouse typically includes Data Ingestion and Preparation.
You have recently deployed Azure Synapse Analytics. You now have a requirement to ingest data code-free. Which of the following tools can be used to perform this task?
- Azure Data Factory
- Power BI
- Azure Databricks
Question 2)
Which of the following would be a valid reason for adding a staging area into the architecture of a modern data warehouse?
Select all options that apply.
- To join data from different source systems
- To make data analytics available directly from the staging area
- To enable the ingestion of source systems based on different schedules
- To reduce contention on source systems
Question 3)
When ingesting raw data in batch from new data sources, which of the following data formats are natively supported by Synapse Analytics?
Select all options that apply.
- JSON
- Parquet
- ORC
- Scala
- CSV
Question 4)
Processing data that arrives in real time or near real time is also referred to as streaming data processing. Azure offers purpose-built stream ingestion services such as Azure IoT Hub and Azure Event Hubs. To collect messages from these or similar services and process them, you can use which of the following features?
Select all options that apply.
- Azure Functions
- Azure Databricks
- Azure Stream Analytics
- Azure IoT Central
Question 5)
Which technology is typically used as a staging area in a modern data warehousing architecture?
- Azure Synapse SQL Pools
- Azure Data Lake
- Azure Synapse Spark Pools
Question 6)
Which of the following is a Big Data Solution that stores data in a relational table format with columnar storage?
- Azure Synapse Spark Pools
- Azure Synapse SQL Pools
Knowledge check
Question 1)
Which component enables you to perform code-free transformations in Azure Synapse Analytics?
- Synapse Studio
- Synapse Mapping data flow
- Synapse Copy activity
Question 2)
Which transformation in the Mapping Data Flow is used to route data rows to different streams based on matching conditions?
- Lookup
- Conditional Split
- GetMetadata activity
Question 3)
Which transformation is used to load data into a destination data store or compute resource?
- Source
- Sink
- Window
Question 4)
True or False
When data is stored in Data Lake Storage Gen2, the file size, number of files, and folder structure can have an impact on performance.
- False
- True
Question 5)
When working with Data Lake Storage Gen2, many small files can negatively affect performance. The recommended file size for Data Lake Storage Gen2 is between which of the following sizes?
- 256MB to 100GB
- 256MB to 1GB
- 10GB to 100GB
- 1GB to 10GB
Question 6)
When building data flows in Azure Synapse you can enable debug mode. When debug mode is enabled, Synapse automatically turns on which of the following?
- Serverless cluster
- Dedicated SQL Pool
- Spark cluster
Visit this link: Data Warehousing with Microsoft Azure Synapse Analytics Week 1 | Test prep Quiz Answer
WEEK 2 QUIZ ANSWERS
Knowledge check
Question 1)
A Star schema is a modeling approach widely adopted by relational data warehouses. It requires modelers to classify their model tables as either dimension or fact. Which of the following are features of dimension tables?
Select all options that apply.
- A dimension table describes business entities
- A dimension stores numeric measure columns
- A dimension table contains a key column (or columns)
Question 2)
Which of the following are true in respect of fact tables?
Select all options that apply.
- Fact tables store observations or events
- A fact table contains numeric measure columns
- A fact table contains dimension key columns that relate to dimension tables.
- A fact table describes business entities
Question 3)
Since Synapse Analytics is a massively parallel processing (MPP) system, you need to consider how data is distributed in your table design. What is the recommended distribution option for Fact tables?
- Replicate
- Clustered Columnstore Index
- Clustered Index
- Hash-distribution
Question 4)
Which of the following statements are true in respect of a Star schema?
Select all options that apply.
- In a Star schema Cube processing might be slow because of the complex join.
- A Star schema will have a fact table surrounded by dimension tables which are in turn surrounded by dimension tables.
- Star schema dimension tables are a denormalized data structure.
- Star schemas have a high level of Data redundancy.
- A Star schema contains a fact table surrounded by dimension tables.
Question 5)
Examine the following statement and select the missing word from the entries supplied below:
A time dimension table is one of the most consistently used dimension tables. This type of table enables consistent _____________ for temporal analysis and reporting.
- distribution
- indexing
- uniqueness
- granularity
Question 6)
What distribution option would be best for a sales fact table that will contain billions of records?
- DISTRIBUTION = HASH([SalesOrderNumber])
- DISTRIBUTION = HEAP
- DISTRIBUTION = REPLICATE
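For context, here is a minimal sketch of creating a hash-distributed fact table from Python with pyodbc. The connection details, table name, and columns are placeholders for illustration, not part of the course material.

```python
import pyodbc

# Hypothetical connection to a dedicated SQL pool -- replace the placeholders.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>.sql.azuresynapse.net;DATABASE=<sqlpool>;"
    "UID=<user>;PWD=<password>",
    autocommit=True,
)

# Hash-distribute the large fact table on its sales order key so rows are
# spread across the pool's 60 distributions; clustered columnstore is the
# default index type for dedicated SQL pool tables.
conn.cursor().execute("""
CREATE TABLE dbo.FactSales
(
    SalesOrderNumber INT            NOT NULL,
    DateKey          INT            NOT NULL,
    CustomerKey      INT            NOT NULL,
    SalesAmount      DECIMAL(18, 2) NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(SalesOrderNumber),
    CLUSTERED COLUMNSTORE INDEX
);
""")
conn.close()
```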
Visit this link: Data Warehousing with Microsoft Azure Synapse Analytics Week 2 | Test prep Quiz Answer
WEEK 3 QUIZ ANSWERS
Knowledge check
Question 1)
Which Workload Management capability manages minimum and maximum resource allocations during peak periods?
- Workload Isolation
- Workload Importance
- Workload Containment
Question 2)
Select from the options below to complete the missing text in the following statement.
A data warehouse that is built on a Massively Parallel Processing (MPP) system is built for processing and analyzing large datasets. As such, it performs well with ____________ that can be distributed across compute nodes and storage.
- Fewer and larger batch sizes
- Multiple small batch sizes
Question 3)
Resource classes are pre-determined resource limits in Synapse SQL pool that govern compute resources and concurrency for query execution. Resource classes can help you configure resources for your queries by setting limits on the number of queries that run concurrently and on the compute resources assigned to each query.
Which of the following statements are correct?
Select all options that apply.
- Larger resource classes increase concurrency, but reduce the maximum memory per query,
- Larger resource classes increase the maximum memory per query but reduce concurrency.
- Smaller resource classes reduce the maximum memory per query but increase concurrency.
- Smaller resource classes reduce the concurrency but increase maximum memory per query.
Question 4)
SQL Pools have the concept of concurrency slots, which manage the allocation of memory to connected users. To optimize the load execution operations, you should consider which of the following? Select all options that apply.
- Reducing or minimizing the number of simultaneous load jobs that are running.
- Assigning higher resource classes that reduce the number of active running tasks.
- Increase the number of simultaneous load jobs that are running.
- Assigning lower resource classes that reduce the number of active running tasks.
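As a rough illustration of giving a dedicated load user a larger resource class, the sketch below adds a hypothetical LoadUser to the largerc role with sp_addrolemember; the connection details and user name are placeholders.

```python
import pyodbc

# Hypothetical connection to a dedicated SQL pool -- replace the placeholders.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>.sql.azuresynapse.net;DATABASE=<sqlpool>;"
    "UID=<admin_user>;PWD=<password>",
    autocommit=True,
)

# Resource classes are implemented as database roles; adding a user to a
# larger class (here 'largerc') gives each of its load queries more memory,
# at the cost of fewer queries running concurrently.
conn.cursor().execute("EXEC sp_addrolemember 'largerc', 'LoadUser';")
conn.close()
```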
Question 5)
In Synapse SQL pools workload importance influences the order in which a request gets access to resources. There are five levels of importance. Which of the following are valid levels of importance? Select all options that apply.
- above_normal
- low
- below_normal
- normal
- very high
- high
Question 6)
How does splitting source files help maintain good performance when loading into Synapse Analytics?
- Compute node to storage segment alignment
- Optimized processing of smaller file sizes
- Reduced possibility of data corruptions
Knowledge check
Question 1)
Azure Synapse Analytics is a high-performing Massively Parallel Processing (MPP) engine that is built with loading and querying large datasets in mind.
You have received calls from users reporting that the data in the reports they are producing appears to be out of date. Which of the following is the most likely cause of out-of-date information being presented in user reports?
- Poor load performance
- Low concurrency
- Poor query performance
Question 2)
What are the three main table distributions available in Synapse Analytics SQL Pools called?
Select all options that apply.
- Hash distribution
- Replicated tables
- Block Distribution
- Round robin distribution
Question 3)
True or False
When a table is created, by default the data structure has no indexes and is called a heap.
- False
- True
Question 4)
Select the missing word from the following options to complete the sentence.
Dedicated SQL Pools create a _______________________ index when no index options are specified on a table.
- Non-clustered
- Clustered columnstore
- Clustered
Question 5)
Materialized views are pre-written queries with joins and filters whose definition is saved and whose results are persisted to a pool. To which of the following pools are the results of a materialized view persisted?
- Serverless SQL Pool
- Dedicated SQL Pool
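A minimal sketch of creating a materialized view in a dedicated SQL pool follows; the view, table, and column names are invented for illustration, and the connection details are placeholders.

```python
import pyodbc

# Hypothetical connection to a dedicated SQL pool -- replace the placeholders.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>.sql.azuresynapse.net;DATABASE=<sqlpool>;"
    "UID=<user>;PWD=<password>",
    autocommit=True,
)

# The aggregated results are computed at creation time, persisted in the
# dedicated SQL pool, and kept up to date as the base table changes.
conn.cursor().execute("""
CREATE MATERIALIZED VIEW dbo.mvSalesByDate
WITH (DISTRIBUTION = HASH(DateKey))
AS
SELECT DateKey,
       COUNT_BIG(*)                AS OrderCount,
       SUM(ISNULL(SalesAmount, 0)) AS TotalSales
FROM dbo.FactSales
GROUP BY DateKey;
""")
conn.close()
```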
Question 6)
In Azure Synapse SQL you should enable result-set caching when you expect results from queries to return the same values. This option stores a copy of the result set on the control node so that queries do not need to pull data from the storage subsystem or compute nodes.
By default, data within the result-set cache is expired and purged by the dedicated SQL pool after how many hours of not being accessed?
- 36 Hours
- 12 Hours
- 48 Hours
- 24 Hours
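The sketch below shows one way to switch result-set caching on for a dedicated SQL pool from Python; the statement is issued while connected to the master database, and the pool name and credentials are placeholders.

```python
import pyodbc

# Connect to the logical server's master database -- the ALTER DATABASE
# statement for result-set caching is run from there. Placeholders throughout.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>.sql.azuresynapse.net;DATABASE=master;"
    "UID=<admin_user>;PWD=<password>",
    autocommit=True,
)

# Enable result-set caching for the pool; cached result sets that are not
# accessed are evicted automatically (after 48 hours by default).
conn.cursor().execute("ALTER DATABASE [<sqlpool>] SET RESULT_SET_CACHING ON;")
conn.close()
```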
Visit this link: Data Warehousing with Microsoft Azure Synapse Analytics Week 3 | Test prep Quiz Answer
WEEK 4 QUIZ ANSWERS
Knowledge check
Question 1)
The interoperability between Apache Spark and SQL helps you to directly explore and analyze which of the following types of files? Select all options that apply.
- CSV
- JSON
- YAML
- TSV
- Parquet
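To give a feel for that interoperability, here is a small PySpark sketch that reads the natively supported file formats from a data lake and then queries one of them with Spark SQL. The storage account, container, and file paths are placeholders.

```python
from pyspark.sql import SparkSession

# In a Synapse notebook the `spark` session already exists; getOrCreate()
# simply reuses it (or builds one when the script runs elsewhere).
spark = SparkSession.builder.getOrCreate()

base = "abfss://<container>@<storageaccount>.dfs.core.windows.net/raw"

sales_csv = spark.read.option("header", "true").csv(f"{base}/sales.csv")
events_json = spark.read.json(f"{base}/events.json")
orders_parquet = spark.read.parquet(f"{base}/orders.parquet")

# Register a DataFrame as a temporary view so it can be queried with SQL.
orders_parquet.createOrReplaceTempView("orders")
spark.sql("SELECT COUNT(*) AS order_count FROM orders").show()
```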
Question 2)
The Azure Synapse Apache Spark pool to Synapse SQL connector is a data source implementation for Apache Spark. Which of the following is used to efficiently transfer data between the Spark cluster and the Synapse SQL instance?
- Azure Data Lake Storage Generation 2 and XML.
- Azure Data Lake Storage Generation 2 and JSON.
- Azure Data Lake Storage Generation 2 and PolyBase.
Question 3)
True or False
SQL and Apache Spark share the same underlying metadata store.
- True
- False
Question 4)
To write data to a dedicated SQL Pool, you use the Write API. The Write API creates a table in the dedicated SQL pool. Which of the following is used to load the data into the table that was created?
- ORC
- JSON
- Parquet
- Polybase
Question 5)
In what language can the Azure Synapse Apache Spark to Synapse SQL connector be used?
- .Net
- Scala
- SQL
- Python
Question 6)
When is it unnecessary to use import statements for transferring data between a dedicated SQL and Apache Spark pool?
- Use token-based authentication.
- When using the integrated notebook experience from Azure Synapse Studio.
- When using the PySpark connector.
Knowledge check
Question 1)
The Develop hub in Azure Synapse Studio is an interface you can use for developing a variety of solutions against an Azure Synapse Analytics instance. In this area, you can create which of the following objects? Select all options that apply.
- Notebooks
- Azure Synapse Pipelines
- Power BI datasets and reports
- Synapse Workspace
- SQL Scripts
Question 2)
Visual Studio 2019 SQL Server Data Tools (SSDT) has which of the following features? Select all options that apply.
- Create Database projects within Serverless SQL pools
- Create Database projects within dedicated SQL pools
- Integrate with source control systems
- Native integration with Azure DevOps
Question 3)
Azure Synapse Analytics supports querying both relational and non-relational data using Transact SQL. The Azure Synapse SQL query language supports different features based on the resource model being used. Which of the following T-SQL Statements are supported on both Dedicated and Serverless Pools? Select all options that apply.
- Cross database queries
- Transactions
- INSERT statement
- SELECT statement
Question 4)
Examine the following statement and select from the options below to complete the sentence.
Synapse dedicated SQL Pools support storing JSON format data using standard _________ table columns.
- VARCHAR
- NVARCHAR
- CHAR
- TEXT
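As a small illustration of that pattern, the sketch below stores raw JSON documents in an NVARCHAR(MAX) column and reads a property back out with the standard T-SQL JSON functions; the table, column, and connection details are invented placeholders.

```python
import pyodbc

# Hypothetical connection to a dedicated SQL pool -- replace the placeholders.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>.sql.azuresynapse.net;DATABASE=<sqlpool>;"
    "UID=<user>;PWD=<password>",
    autocommit=True,
)
cur = conn.cursor()

# JSON documents are kept as plain text in an NVARCHAR(MAX) column.
cur.execute("""
CREATE TABLE dbo.DeviceTelemetry
(
    DeviceId INT           NOT NULL,
    Payload  NVARCHAR(MAX) NOT NULL
)
WITH (DISTRIBUTION = ROUND_ROBIN, HEAP);
""")

# T-SQL JSON functions parse the stored documents at query time.
cur.execute("""
SELECT DeviceId,
       JSON_VALUE(Payload, '$.temperature') AS Temperature
FROM dbo.DeviceTelemetry
WHERE ISJSON(Payload) = 1;
""")
print(cur.fetchall())
conn.close()
```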
Question 5)
What Transact-SQL function is used to perform a HyperLogLog function?
- COUNT_DISTINCT_APPROX
- COUNT
- APPROX_COUNT_DISTINCT
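For reference, a minimal sketch of running the HyperLogLog-based aggregate against a hypothetical fact table; the connection details and names are placeholders.

```python
import pyodbc

# Hypothetical connection to a dedicated SQL pool -- replace the placeholders.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>.sql.azuresynapse.net;DATABASE=<sqlpool>;"
    "UID=<user>;PWD=<password>",
    autocommit=True,
)
cur = conn.cursor()

# APPROX_COUNT_DISTINCT trades a small amount of accuracy for lower latency
# compared with COUNT(DISTINCT ...) on very large tables.
cur.execute(
    "SELECT APPROX_COUNT_DISTINCT(CustomerKey) AS ApproxCustomers "
    "FROM dbo.FactSales;"
)
print(cur.fetchone())
conn.close()
```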
Question 6)
In Azure Synapse Studio Develop hub you can define Spark Job definitions. Which of the following languages can be used to define job definitions? Select all options that apply.
- Scala
- PySpark
- Transact-SQL
- .NET Spark
Practice Quiz
Question 1)
The Azure Synapse Apache Spark to Synapse SQL connector is designed to efficiently transfer data between which of the following?
- Serverless Apache Spark pools and Serverless SQL pools in Azure Synapse.
- Serverless Apache Spark pools and Dedicated SQL pools in Azure Synapse.
- Dedicated Apache Spark pools and Serverless SQL pools in Azure Synapse.
Question 2)
The Azure Synapse Studio experience provides an integrated notebook experience. Within this notebook experience, you can attach a SQL or Apache Spark pool, and develop and execute transformation pipelines using which of the following?
- JSON
- Python
- SparkSQL
- Scala
Question 3)
In Azure Synapse Analytics the authentication process between two systems can be seamless. However, there are some prerequisites. Which of the following role memberships are required to successfully authenticate? Select all options that apply.
- The account used needs to be a member of the Storage Blob Data Contributor role on the default storage account.
- The account used needs to be a member of the db_exporter role on the default storage account
- The account used needs to be a member of db_exporter role in the database or SQL pool from which you transfer data to or from.
- The account used needs to be a member of Storage Blob Data Contributor role in the database or SQL pool from which you transfer data to or from.
Question 4)
You have a requirement to transfer data to a dedicated SQL pool that is outside of the Azure Synapse Analytics workspace. Which form of authentication can be used to establish the connection and transfer the data?
- Azure AD and SQL Authentication
- Azure AD only
- SQL Authentication Only
- None of the above
Question 5)
When is it unnecessary to use import statements for transferring data between a dedicated SQL and Apache Spark pool?
- Use the PySpark connector
- When using the integrated notebook experience from Azure Synapse Studio.
- Use token-based authentication.
Question 6)
To write data to a dedicated SQL Pool, you use the Write API. The Write API creates a table in the dedicated SQL pool. Which of the following is used to load the data into the table that was created?
- ORC
- Parquet
- Polybase
- JSON
Question 7)
In Azure Synapse Studio Develop hub you can define Spark Job definitions. Which of the following languages can be used to define job definitions? Select all options that apply.
- Transact-SQL
- Scala
- .NET Spark
- PySpark
Question 8)
Azure Data Studio is a cross-platform tool for connecting to and querying on-premises and cloud data platforms on Windows, macOS, and Linux. Synapse Analytics supports using Azure Data Studio to connect to and query Synapse SQL on which of the following configurations?
- Only Serverless SQL Pool resources
- Both dedicated and Serverless SQL Pool resources
- Only dedicated SQL Pool resources
Question 9)
Azure Synapse Analytics supports approximate execution using HyperLogLog to reduce latency when executing queries over large datasets. Approximate execution speeds up query execution in exchange for a small reduction in accuracy. When using approximate execution, the result will on average be within what percentage of the true cardinality?
- 1%
- 2%
- 6%
- 4%
Question 10)
What Transact-SQL function verifies if a piece of text is valid JSON?
- JSON_QUERY
- JSON_VALUE
- ISJSON
WEEK 5 QUIZ ANSWERS
Knowledge check
Question 1)
In Azure Synapse Analytics you can scale a Synapse SQL pool through which of the following? Select all options that apply.
- PowerShell
- Azure Synapse Studio
- Transact-SQL
- Parquet
- JSON
- Azure portal
Question 2)
Apache Spark pools for Azure Synapse Analytics uses an Autoscale feature that automatically scales the number of nodes in a cluster instance up and down. Autoscale continuously monitors the Spark instance and collects metrics. Which of the following conditions will trigger Autoscale to scale up? Select all options that apply.
- Total pending CPU is greater than total free CPU for more than 1 minute.
- Total pending memory is less than total free memory for more than 2 minutes.
- Total pending memory is greater than total free memory for more than 1 minute
- Total pending CPU is less than total free CPU for more than 2 minutes
Question 3)
Dedicated SQL pool workload management in Azure Synapse consists of three high-level concepts that give you more control over how your workload utilizes system resources. Which of the following influences the order in which a request gets access to resources?
- Workload Classification
- Workload Isolation
- Workload Importance
Question 4)
Which ALTER DATABASE statement parameter allows a dedicated SQL pool to scale?
- CHANGE
- MODIFY
- SCALE
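A minimal sketch of scaling a dedicated SQL pool with T-SQL from Python follows; the pool name, target service objective, and connection details are placeholders.

```python
import pyodbc

# Scaling is issued from the logical server's master database. Placeholders
# throughout -- choose a service objective (DWU level) that suits the workload.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>.sql.azuresynapse.net;DATABASE=master;"
    "UID=<admin_user>;PWD=<password>",
    autocommit=True,
)

# MODIFY changes the pool's performance level (its service objective).
conn.cursor().execute(
    "ALTER DATABASE [<sqlpool>] MODIFY (SERVICE_OBJECTIVE = 'DW300c');"
)
conn.close()
```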
Question 5)
Which Dynamic Management View enables you to view the active connections against a dedicated SQL pool?
- DBCC PDW_SHOWEXECUTIONPLAN
- sys.dm_pdw_dms_workers
- sys.dm_pdw_exec_requests
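For monitoring context, the sketch below lists requests that are still in flight using the sys.dm_pdw_exec_requests DMV (the related sys.dm_pdw_exec_sessions view holds the corresponding connection and session details); connection details are placeholders.

```python
import pyodbc

# Hypothetical connection to a dedicated SQL pool -- replace the placeholders.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>.sql.azuresynapse.net;DATABASE=<sqlpool>;"
    "UID=<user>;PWD=<password>",
    autocommit=True,
)
cur = conn.cursor()

# List requests that have not yet finished, most recent first.
cur.execute("""
SELECT request_id, session_id, status, submit_time, total_elapsed_time, command
FROM sys.dm_pdw_exec_requests
WHERE status NOT IN ('Completed', 'Failed', 'Cancelled')
ORDER BY submit_time DESC;
""")
for row in cur.fetchall():
    print(row)
conn.close()
```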
Question 6)
Which workload management feature allows workload policies to be applied to requests through assigning resource classes and importance?
- Workload classification
- Workload isolation
- Workload importance
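Tying the three concepts together, the sketch below creates a workload classifier that maps a hypothetical load login to the built-in staticrc20 workload group and raises its importance; all names and connection details are placeholders.

```python
import pyodbc

# Hypothetical connection to a dedicated SQL pool -- replace the placeholders.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>.sql.azuresynapse.net;DATABASE=<sqlpool>;"
    "UID=<admin_user>;PWD=<password>",
    autocommit=True,
)

# Classification routes requests from LoadUser to a workload group (here one
# of the built-in resource-class groups) and assigns them HIGH importance.
conn.cursor().execute("""
CREATE WORKLOAD CLASSIFIER wcNightlyLoads
WITH (WORKLOAD_GROUP = 'staticrc20', MEMBERNAME = 'LoadUser', IMPORTANCE = HIGH);
""")
conn.close()
```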
Knowledge check
Question 1)
In a dedicated SQL pool in Azure Synapse Analytics a distributed table appears as a single table with rows spread across multiple distributions. Across how many distributions are rows stored?
- 60
- 80
- 20
- 40
Question 2)
Examine the following statement and select from the listed options to complete the sentence.
A columnstore index scans a table by scanning column segments of individual rowgroups. For the best query performance, the goal is to maximize the number of rows per rowgroup in a columnstore index.
Columnstore indexes achieve good performance when rowgroups have at least ____________ rows.
- 1,000,000 rows
- 10,000 rows
- 1,048,576 rows
- 100,000 rows
Question 3)
SQL pool in Azure Synapse supports standard and materialized views. Which of the following are features of Materialized views? Select all options that apply.
- Speed to retrieve view data from complex queries is Slow.
- View content is pre-processed and stored in SQL pool during view creation. The view is updated as data is added to the underlying tables.
- View content is generated each time the view is used.
- Speed to retrieve view data from complex queries is Fast.
Question 4)
What would be the best approach to investigate if the data at hand is unevenly allocated across all distributions?
- Grouping the data based on partitions and counting rows with a T-SQL query.
- Monitor query speeds by testing the same query for each partition.
- Using DBCC PDW_SHOWSPACEUSED to see the number of table rows that are stored in each of the 60 distributions.
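A quick sketch of checking for skew with DBCC PDW_SHOWSPACEUSED on a hypothetical fact table follows (connection details and table name are placeholders); each returned row reports the row count and space used for one of the 60 distributions.

```python
import pyodbc

# Hypothetical connection to a dedicated SQL pool -- replace the placeholders.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>.sql.azuresynapse.net;DATABASE=<sqlpool>;"
    "UID=<user>;PWD=<password>",
    autocommit=True,
)
cur = conn.cursor()

# One result row per distribution: large differences in row counts between
# distributions indicate data skew on the chosen hash column.
cur.execute('DBCC PDW_SHOWSPACEUSED("dbo.FactSales");')
for row in cur.fetchall():
    print(row)
conn.close()
```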
Question 5)
To achieve improved query performance, which of the following would be the best data type for storing data that contains less than 128 characters?
- VARCHAR(128)
- NVARCHAR(128)
- VARCHAR(MAX)
Question 6)
Which of the following statements is a benefit of materialized views?
- Reducing the execution time for complex queries with JOINs and aggregate functions.
- Increased high availability
- Increased resiliency benefits
Knowledge check
Question 1)
What features are provided when using a managed workspace virtual network? Select all options that apply.
- Your workspace is network isolated from other workspaces.
- You don’t have to configure inbound NSG rules on your own Virtual Networks to allow Azure Synapse management traffic to enter your Virtual Network.
- You will need to create a subnet for your Spark clusters based on peak load.
- Management of the virtual network is offloaded to Azure Synapse.
Question 2)
Azure Synapse Analytics enables you to connect to its various components through endpoints. You can set up managed private endpoints to access these components in a secure manner known as private links. Which of the following statements are true in respect of Private Endpoints? Select all options that apply.
- You must have an Azure Synapse workspace with a Managed workspace Virtual Network.
- When you use a private link, traffic between your Virtual Network and workspace traverses entirely over the Microsoft backbone network.
- When you use a private link, traffic between your Virtual Network and workspace traverses over the public Internet network.
- You can manage the private endpoints in the Azure Synapse Studio manage hub.
Question 3)
When can you choose to enable managed virtual networks?
- Only when you are creating a new Azure Synapse Workspace.
- When creating a new Workspace or modifying an existing Workspace.
- At any time for an existing Workspace.
Question 4)
Conditional access is a feature that enables you to define the conditions under which a user can connect to your Azure subscription and access services. Conditional access policies use signals as a basis to determine whether conditional access should first be applied. Which of the following are common signals? Select all options that apply.
- Multi Factor Authentication
- Microsoft Cloud App Security (MCAS)
- IP address information
- User or group membership names
- Device platforms or type
Question 5)
You work at a bank as a service representative in a call center. For compliance reasons, any caller must identify themselves by providing several digits of their credit card number, but the full credit card number should not be exposed to the service representative in the call center. To limit visibility so that a query returns only the last four digits of the credit card number, which of the following would you implement in Azure Synapse Analytics?
- Row Level Security (RLS)
- Dynamic Data Masking
- Column Level Security
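To illustrate the scenario, the sketch below applies a partial dynamic data mask to a hypothetical credit card column so that non-privileged users see only the last four digits; the table, column, and connection details are placeholders.

```python
import pyodbc

# Hypothetical connection to a dedicated SQL pool -- replace the placeholders.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>.sql.azuresynapse.net;DATABASE=<sqlpool>;"
    "UID=<admin_user>;PWD=<password>",
    autocommit=True,
)

# The partial() masking function keeps the last 4 characters and replaces
# the rest with a fixed padding string for users without UNMASK permission.
conn.cursor().execute("""
ALTER TABLE dbo.CustomerPayments
ALTER COLUMN CreditCardNumber
ADD MASKED WITH (FUNCTION = 'partial(0,"XXXX-XXXX-XXXX-",4)');
""")
conn.close()
```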
Question 6)
You want to configure a private endpoint. You open up Azure Synapse Studio, go to the manage hub, and see that the private endpoint is greyed out. Why is the option not available?
- A conditional access policy has to be defined first.
- Azure Synapse Studio does not support the creation of private endpoints.
- A managed virtual network has not been created.
Visit this link: Data Warehousing with Microsoft Azure Synapse Analytics Week 5 | Test prep Quiz Answer
WEEK 6 QUIZ ANSWERS
Visit this link: Data Warehousing with Microsoft Azure Synapse Analytics Week 6 | Test prep Quiz Answer