Question
Questions
Q1
The Spark catalog contains databases, local and external tables, functions, and so on. Show the database you are currently connected to in Spark.
Q2
List all tables under the current database.
Q3
Show all the databases available in Spark.
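One possible way to approach Q1 to Q3 is through the `spark.catalog` interface. The sketch below assumes PySpark and an active `SparkSession` that is passed in as `spark`; the helper name `describe_catalog` is hypothetical and only groups the three calls together.

```python
def describe_catalog(spark):
    """Return (current database, table names, database names).

    `describe_catalog` is a hypothetical helper; `spark` is assumed to be an
    active pyspark.sql.SparkSession.
    """
    # Q1: name of the database the session is currently using
    current_db = spark.catalog.currentDatabase()
    # Q2: tables (and temporary views) visible from the current database
    tables = [t.name for t in spark.catalog.listTables()]
    # Q3: every database known to the catalog
    databases = [d.name for d in spark.catalog.listDatabases()]
    return current_db, tables, databases
```

In the `pyspark` shell, where `spark` already exists, `spark.sql("SHOW TABLES").show()` and `spark.sql("SHOW DATABASES").show()` should give the same information in tabular form.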
Q4
Using the sales table that you discovered in Q2, find the total Quantity and the sum of SaleAmount for each combination of State and City.
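A minimal sketch of the aggregation for Q4, assuming the table has already been loaded as a DataFrame (for example with `spark.table("sales")`). The function name `total_by_state_city` is hypothetical. The dictionary form of `agg` is used so that no extra imports are needed; with it, Spark names the result columns `sum(Quantity)` and `sum(SaleAmount)`.

```python
def total_by_state_city(sales_df):
    """Sum Quantity and SaleAmount for every (State, City) pair.

    `sales_df` is assumed to be a PySpark DataFrame with the columns
    State, City, Quantity and SaleAmount.
    """
    # groupBy builds one group per distinct (State, City) combination;
    # the dict maps each column to the aggregate function applied to it.
    return sales_df.groupBy("State", "City").agg(
        {"Quantity": "sum", "SaleAmount": "sum"}
    )
```

A call such as `total_by_state_city(spark.table("sales")).show()` would then print one row per State and City pair.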
Q5
Below are some of the data types of the sales DataFrame that you have been working with. Your colleague found out that some of the data types are not set properly.
>>> df.printSchema()
root
 |-- RowID: string (nullable = true)
 |-- OrderID: string (nullable = true)
 |-- OrderDate: string (nullable = true)
 ...
 |-- SaleAmount: float (nullable = true)
 |-- CustomerName: string (nullable = true)
 ...
 |-- WageMargin: string (nullable = true)
Convert OrderDate to DateType, SaleAmount to DecimalType, and WageMargin to FloatType.
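The conversion in Q5 could be sketched with `Column.cast`, which accepts a Spark SQL type string. Several things here are assumptions rather than part of the question: the helper name `fix_types`, the decimal precision and scale `(10, 2)`, and the idea that the OrderDate strings are in a format a plain cast to `date` can parse (such as `yyyy-MM-dd`).

```python
# Target types as Spark SQL type strings. The decimal precision and scale
# (10, 2) are an assumption; the question does not specify them.
TARGET_TYPES = {
    "OrderDate": "date",
    "SaleAmount": "decimal(10,2)",
    "WageMargin": "float",
}


def fix_types(df, target_types=TARGET_TYPES):
    """Return a new DataFrame with each listed column cast to its target type."""
    for name, type_name in target_types.items():
        # df[name] is a Column; withColumn with an existing name replaces it
        df = df.withColumn(name, df[name].cast(type_name))
    return df
```

If the dates are stored in another format, `pyspark.sql.functions.to_date` with an explicit format string is the usual alternative to the plain cast. Calling `fix_types(df).printSchema()` afterwards is a quick way to check the result.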
Q6
We want to add another column to this dataset: the initials of the customer's name and surname. However, there is no built-in function for this, so you have to write one yourself. Register this function as a UDF. This serializes the function and sends it to the executors so that they can transform DataFrame records. Then, using this UDF, create a new column named CustomerNameInitials. For example, if the name is John Smith, the new value should be JS.
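A minimal sketch for Q6, assuming an active `SparkSession` named `spark` and a DataFrame `df` with a CustomerName column. The names `initials` and `add_initials` are hypothetical, as is the choice to pass nulls through unchanged. The plain Python function is kept separate from the registration step so it can be checked without a cluster.

```python
def initials(full_name):
    """Return the upper-cased first letter of each word, e.g. 'John Smith' -> 'JS'."""
    if full_name is None:
        return None  # keep null names as null
    # split() with no argument also handles repeated or surrounding spaces
    return "".join(word[0].upper() for word in full_name.split())


def add_initials(spark, df):
    """Register `initials` as a UDF and add the CustomerNameInitials column."""
    # register() makes the function callable from SQL under the given name
    # and returns a UDF object that can be applied to DataFrame columns.
    initials_udf = spark.udf.register("initials", initials, "string")
    return df.withColumn("CustomerNameInitials", initials_udf(df["CustomerName"]))
```

Because the UDF is registered by name, it should also be usable from SQL, for example `spark.sql("SELECT initials(CustomerName) FROM sales")`.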