Question
How can the following SQL query (run in psql) be improved so that its F1 score rises from 0.87 to above 0.93 on the restaurants database, whose problems are described below the query? The string operations that PostgreSQL provides may be used: http://www.postgresql.org/docs/9.3/static/functions-string.html. The record linkage functions levenshtein(string1, string2) and similarity(string1, string2) (Jaccard similarity) may also be used.
-- This is the default naive match query.
-- REPLACE IT WITH BETTER QUERY
SELECT A.id, B.id
FROM restaurants A, restaurants B
WHERE A.name = B.name AND A.id < B.id;
The restaurants(id, name, addr, city, type) table in the restaurants database has been merged improperly! It contains data from two different restaurant listing services, and there are duplicates among the rows, e.g.:
6,aqua,252 california st.,san francisco,seafood
54,aqua,252 california st.,san francisco,american (new)

For reference, below is the query used to create the restaurants table:
CREATE TABLE restaurants (
    id   INT          NOT NULL PRIMARY KEY,
    name VARCHAR(200) NOT NULL,
    addr VARCHAR(200) NOT NULL,
    city VARCHAR(200) NOT NULL,
    type VARCHAR(100)
);
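
One possible direction, offered as a rough sketch rather than a verified answer: match on fuzzy name similarity and require the address to agree as well, so that small spelling differences are tolerated while same-named restaurants at different addresses are not paired. The sketch assumes the fuzzystrmatch extension (for levenshtein) and the pg_trgm extension (for similarity) can be installed in the database. Every threshold below is an untuned guess; it would need to be adjusted against the labeled pairs, and nothing here guarantees an F1 above 0.93.

```sql
-- Assumption: both extensions are available and you may install them.
CREATE EXTENSION IF NOT EXISTS fuzzystrmatch;  -- provides levenshtein()
CREATE EXTENSION IF NOT EXISTS pg_trgm;        -- provides similarity()

SELECT A.id, B.id
FROM restaurants A, restaurants B
WHERE A.id < B.id
  AND (
        -- Case 1: names are close and the addresses roughly agree.
        -- The 0.6 and 0.4 cutoffs are illustrative, not tuned.
        (similarity(lower(A.name), lower(B.name)) > 0.6
         AND similarity(lower(A.addr), lower(B.addr)) > 0.4)
        -- Case 2: names differ by at most two edits and the first
        -- token of the address (usually the street number) is identical.
     OR (levenshtein(lower(A.name), lower(B.name)) <= 2
         AND split_part(A.addr, ' ', 1) = split_part(B.addr, ' ', 1))
      );
```

Unlike the naive query, this sketch never matches on the name alone, which should remove pairs such as chain restaurants at different addresses. It ignores the type column on purpose, since the two aqua rows above show that the two listing services can categorize the same restaurant differently.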