Question
Notice the use of the WITH keyword. This feature is known as Common Table Expressions . The easiest way to think of this is that
Notice the use of the WITH keyword. This feature is known as Common Table Expressions. The easiest way to think of this is that it creates a named temporary table which gets used later as a nested query. This is important is it is crucial to our issue. MySQL docs for this feature: https://dev.mysql.com/doc/refman/8.0/en/with.html
Also notice that there are several operators in this query. They are specific to the Amazon Redshift database this query was written for an will not work in MySQL. These are the :: for casting types and || for concatenation. There are a couple of other features specific to Redshift such as ILIKE as a case insensitive version of LIKE and the syntax for INTERVAL being slightly different. A version of this query modified to work with MySQL will be provided below.
One more feature to examine is the CASE statement. This is not relevant to this optimization task but a highly useful feature which functions similar to SWITCH statements in programming languages. https://www.w3schools.com/SQL/func_mysql_case.asp
Now that you've looked through the original query, and realize it is a series of smaller queries nested, let's try the MySQL version.
First, run the creates.sql
initialization query.
Second, load the converted query: OptimizeQueryMySQL.sql
The query should run no problem, despite no data entered in the table. We don't need data, we need the query plan. Try running EXPLAIN on this query. A query this complicated gives a messy result which makes optimization difficult. If this was your starting point, the the best approach would be to pull out each temporary table and explain in isolation, step by step.
But since this is originally from Redshift, let's look at Redshift's EXPLAIN results: QueryPlan.txt
What you should draw your attention to are the COST values. While these are arbitrary, they are not meaningless. Find the section of the query where this grows the largest, that is the slowest part of the query in need of optimization. (Save yourself some time as search for other_publisher_views) As this query had been previously reviewed for proper indexing, all queries should be indexed so we need to find another optimization.
Since this is already indexed, draw your attention to the ROWS returned. If you look at the next query which makes use of these results, you'll notice the ROWS is much smaller. So our problem is we are looking at way more data than we need to. We are pulling over 100,000 rows when we are trying to find 4 of them.
Now let's extract the critical section to work with, along with simplified versions of its dependencies:
WITH listing_info_raw AS ( SELECT '' AS property_id_sha, '' AS mc_export_property_id_sha, '(Rental)' AS pretty_address, ' (Rental)' AS display_address ), periods AS ( SELECT true AS is_current_period, CAST('2020-11-09' AS date) AS start_date, CAST('2020-11-15' AS date) AS end_date ), other_publisher_views AS ( SELECT clv.date, clv.streeteasy_page_views, clv.trulia_page_views, clv.zillow_page_views, clv.realtor_page_views FROM agent_insights_lt.agent_insights__combined_listing_views AS clv JOIN listing_info_raw ON clv.property_id_sha = listing_info_raw.property_id_sha ), other_publisher_views_by_period AS ( SELECT is_current_period, start_date, end_date, coalesce(sum(streeteasy_page_views), 0) AS streeteasy_page_views, coalesce(sum(trulia_page_views), 0) AS trulia_page_views, coalesce(sum(zillow_page_views), 0) AS zillow_page_views, coalesce(sum(realtor_page_views), 0) AS realtor_page_views FROM periods LEFT JOIN other_publisher_views ON other_publisher_views.date BETWEEN periods.start_date AND periods.end_date ) SELECT * FROM other_publisher_views_by_period;
Please optimize this section of the query, which includes other_publisher_views & other_publisher_views_by_period, thankyou
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started