Microsoft Azure Databricks for Data Engineering Week 2 | Test prep Quiz Answers
In this article i am gone to share Coursera Course: Microsoft Azure Databricks for Data Engineering Week 2 | Test prep Quiz Answers with you..
Enrol Link: Microsoft Azure Databricks for Data Engineering
Microsoft Azure Databricks for Data Engineering Week 2 | Test prep Quiz Answers
Test prep Quiz Answers
Question 1)
How do you list files in DBFS within a notebook?
- %fs dir /my-file-path
- %fs ls /my-file-path
- ls /my-file-path
Question 2)
How do you infer the data types and column names when you read a JSON file?
- spark.read.option(“inferData”, “true”).json(jsonFile)
- spark.read.option(“inferSchema”, “true”).json(jsonFile)
- spark.read.inferSchema(“true”).json(jsonFile)
Question 3)
Which of the following SparkSession functions returns a DataFrameReader?
- emptyDataFrame(..)
- read(..)
- createDataFrame(..)
- readStream(..)
Question 4)
When using a notebook and a spark session. We can read a CSV file. Which of the following can be used to view the first couple thousand characters of a file?
- %fs head /mnt/training/wikipedia/pageviews/pageviews_by_second.tsv
- %fs dir /mnt/training/wikipedia/pageviews/
- %fs ls /mnt/training/wikipedia/pageviews/
Question 5)
You have created an Azure Databricks cluster, and you have access to a source file.
fileName = “dbfs:/mnt/training/wikipedia/clickstream/2015_02_clickstream.tsv”
You need to determine the structure of the file. Which of the following commands will assist with determining what the column and data types are?
- .option(“header”, “false”)
- .option(“inferSchema”, “false”)
- .option(“header”, “true”)
- .option(“inferSchema”, “true”)
Question 6)
In an Azure Databricks workspace you run the following command:
%fs head /mnt/training/wikipedia/pageviews/pageviews_by_second.tsv
The partial output from this command is as follows:
[Truncated to first 65536 bytes]
“timestamp” “site” “requests”
“2015-03-16T00:09:55” “mobile” 1595
“2015-03-16T00:10:39” “mobile” 1544
“2015-03-16T00:19:39” “desktop” 2460
“2015-03-16T00:38:11” “desktop” 2237
“2015-03-16T00:42:40” “mobile” 1656
“2015-03-16T00:52:24” “desktop” 2452
Which of the following pieces of information can be inferred from the command and the output? Select all that apply.
- the file is a comma separated or CSV file
- The column is Tab separated
- Two columns are strings, and one column is a number
- The file has a header
- The file has no header
- All columns are strings
Question 7)
In an Azure Databricks you wish to create a temporary view that will be accessible to multiple notebooks. Which of the following commands will provide this feature?
- createOrReplaceGlobalTempView(..)
- createOrReplaceTempView(set_scope “Global”)
- createOrReplaceTempView(..)
Question 8)
Which of the following is true in respect of Parquet Files? Select all that apply.
- Designed for performance on small data sets
- Open Source
- Efficient data compression
- D: Is a splittable “file format”.
- Is a Row-Oriented data store
- E: Is a Column-Oriented data store