Microsoft Azure Databricks for Data Engineering Week 4 | Test prep Quiz Answers
In this article, I am going to share the Coursera course Microsoft Azure Databricks for Data Engineering Week 4 | Test prep quiz answers with you.
Enrol Link: Microsoft Azure Databricks for Data Engineering
Test prep Quiz Answers
Question 1)
When importing files in an Azure Databricks notebook, which of the following formats are supported? Select all that apply.
- .html
- .Yaml
- .Zip
- .scala
- .dbc
- .ORC
Question 2)
Examine the following code. From the options below, select the correct syntax to complete line 3 so that it returns an instance of a DataFrame in a Spark notebook in Azure Databricks.
1: pagecountsEnAllDF = (spark
2: .read
3: ____________________________ # Returns an instance of DataFrame
4: .cache()
5: )
6: print(pagecountsEnAllDF)
- .DataFrame(parquetFile)
- .read(parquetFile)
- .parquet(parquetFile)
- .cache(parquetFile)
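For context, here is a minimal sketch of the completed snippet; the parquetFile path shown is a hypothetical placeholder, not from the course:

parquetFile = "/mnt/training/pagecounts/"  # hypothetical path to Parquet data
pagecountsEnAllDF = (spark
  .read
  .parquet(parquetFile)  # returns a DataFrame
  .cache()
)
print(pagecountsEnAllDF)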
Question 3)
Examine the following piece of code taken from a notebook in Azure Databricks.
Complete line 4 so that 15 rows of data will be displayed and the columns will not be truncated.
1: sortedDF = (pagecountsEnAllDF
2: .orderBy("requests")
3: )
4: sortedDF. __________
- sortedDF.print(15, False)
- sortedDF.print(15)
- sortedDF.show(15)
- sortedDF.show(15, False)
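For reference, a minimal sketch of the completed snippet, assuming pagecountsEnAllDF already exists as in Question 2:

sortedDF = (pagecountsEnAllDF
  .orderBy("requests")
)
sortedDF.show(15, False)  # display 15 rows without truncating column values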
Question 4)
Which command will order by a column in descending order?
- df.orderBy("requests desc")
- df.orderBy(col("requests").desc())
- df.orderBy("requests").desc()
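As a quick illustration, a hedged sketch assuming a DataFrame named df with a requests column (both names are placeholders):

from pyspark.sql.functions import col

df.orderBy(col("requests").desc()).show()  # sort by requests, highest values first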
Question 5)
Which command specifies a column value in a DataFrame’s filter? Specifically, filter by a productType column where the value is equal to book?
- df.col("productType").filter("book")
- df.filter("productType = 'book'")
- df.filter(col("productType") == "book")
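A minimal sketch of filtering on a column value, assuming a DataFrame df with a productType column:

from pyspark.sql.functions import col

booksDF = df.filter(col("productType") == "book")  # keep only rows where productType is "book"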
Question 6)
When using the Column class, which command filters based on the end of a column value? For example, a column named verb filtered by words ending with "ing"?
- df.filter().col("verb").like("%ing")
- df.filter(col("verb").endswith("ing"))
- df.filter("verb like '%ing'")
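A short sketch using the Column class, assuming a DataFrame df with a verb column:

from pyspark.sql.functions import col

ingDF = df.filter(col("verb").endswith("ing"))  # keep rows whose verb ends with "ing"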
Question 7)
Which of the listed methods for renaming a DataFrame’s column are correct? Select two.
- df.alias("timestamp", "dateCaptured")
- df.toDF("dateCaptured")
- df.select(col("timestamp").alias("dateCaptured"))
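Both renaming styles are sketched below, assuming a DataFrame df whose only column is timestamp (hypothetical example):

from pyspark.sql.functions import col

renamedDF = df.select(col("timestamp").alias("dateCaptured"))  # rename via alias inside a select
renamedDF = df.toDF("dateCaptured")  # toDF works here because df has exactly one column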
Question 8)
You need to find the average of sales transactions by storefront. Which of the following aggregates would you use?
- df.groupBy(col("storefront")).avg("completedTransactions")
- df.groupBy(col("storefront")).avg(col("completedTransactions"))
- df.select(col("storefront")).avg("completedTransactions")
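A minimal aggregation sketch, assuming a DataFrame df with storefront and completedTransactions columns:

from pyspark.sql.functions import col

avgDF = df.groupBy(col("storefront")).avg("completedTransactions")  # average transactions per storefront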
Question 9)
In Azure Databricks you are about to do some ETL on a file you have received from a customer. The file contains data about people, including:
- first, middle, and last names
- gender
- birth date
- Social Security number
- salary
You discover that the file contains some duplicate records, and you have been instructed to remove any duplicates. The dropDuplicates() command will more than likely create a shuffle. To help reduce the number of post-shuffle partitions, which of the following commands should you run?
- spark.conf.set("spark.sql.shuffle.partitions", 8)
- spark.sql.conf.set("spark.shuffle.partitions", 8)
- spark.conf.set("spark.sql.partitions", 8)
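A hedged sketch of the de-duplication step, assuming the customer file has already been loaded into a DataFrame named peopleDF (a hypothetical name):

spark.conf.set("spark.sql.shuffle.partitions", 8)  # cap the number of post-shuffle partitions at 8
dedupedPeopleDF = peopleDF.dropDuplicates()  # triggers a shuffle, now limited by the setting above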
Question 10)
You need to change a column name from "dob" to "DateOfBirth" on a Spark DataFrame. Which of the following is valid syntax?
- .RenameColumn("dob", "DateOfBirth")
- .ColumnRename("dob", "DateOfBirth")
- .withColumnRenamed("dob", "DateOfBirth")
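A one-line sketch, assuming a DataFrame df with a dob column:

renamedDF = df.withColumnRenamed("dob", "DateOfBirth")  # rename the dob column to DateOfBirth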