
Question


Note: Use Python to solve the question.

====PLEASE ANSWER Q2 ONLY====

-Data and Text mining-

Doc1: When someone is infected with the Coronavirus, the symptoms that appear can range from fever to loss of the ability to smell and taste. However, there are also those who do not experience symptoms of COVID-19 at all which is called asymptomatic.

Doc2: The loss of the ability to smell or anosmia has been a symptom of COVID-19 for a long time. Recently, a new symptom appeared in the form of parosmia, which is a condition in which patients detect bad odors through their sense of smell.

Doc3: The new variant of Corona found in the UK is said to be more contagious and is spreading rapidly to various countries. To date, 22 countries have detected a new variant of Corona in their region.

Doc4: Indonesia is one of the largest archipelagic countries in the world. Consisting of more than 17,000 islands stretching from Sabang to Merauke, hold priceless assets of wealth. Thousands of islands are lined up to form an elongated coastline with a very attractive stretch of clean white sand. Rolling waves ranging from small waves to large waves suitable for surfing sports lovers are all available in Indonesia.

Doc5: Developing the potential for cultural and historical tourism can indeed be done by renovating buildings or historical sites and supporting facilities and infrastructure for these attractions. However, this effort cannot run optimally and sustainably if the community especially the local community does not participate and care.

Based on the data above:

Q1) Calculate the TF-IDF of each of the following terms in each document:

  • Symptoms

  • Corona

  • Virus

  • Covid-19

  • Country

  • Public / society

Q2) If someone runs a query with the keywords "Symptoms of Covid-19", only Doc1 and Doc2 are relevant. If the query is "Corona Variant", only Doc1 and Doc3 are relevant. Prove this using TF-IDF and cosine similarity.
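
One way to approach Q2 is sketched below using only the Python standard library. It is a minimal sketch under stated assumptions, not a definitive solution: the weighting is raw term count times log10(N/df), which is one common TF-IDF variant; the stopword list is hand-picked for this example; and the token "coronavirus" is split into "corona" and "virus" so that Doc1 can match the term "Corona". The names `tokenize`, `tfidf`, `cosine`, and `rank` are illustrative helpers, not part of any library.

```python
import math
import re
from collections import Counter

DOCS = {
    "Doc1": "When someone is infected with the Coronavirus, the symptoms that appear can range from fever to loss of the ability to smell and taste. However, there are also those who do not experience symptoms of COVID-19 at all which is called asymptomatic.",
    "Doc2": "The loss of the ability to smell or anosmia has been a symptom of COVID-19 for a long time. Recently, a new symptom appeared in the form of parosmia, which is a condition in which patients detect bad odors through their sense of smell.",
    "Doc3": "The new variant of Corona found in the UK is said to be more contagious and is spreading rapidly to various countries. To date, 22 countries have detected a new variant of Corona in their region.",
    "Doc4": "Indonesia is one of the largest archipelagic countries in the world. Consisting of more than 17,000 islands stretching from Sabang to Merauke, hold priceless assets of wealth. Thousands of islands are lined up to form an elongated coastline with a very attractive stretch of clean white sand. Rolling waves ranging from small waves to large waves suitable for surfing sports lovers are all available in Indonesia.",
    "Doc5": "Developing the potential for cultural and historical tourism can indeed be done by renovating buildings or historical sites and supporting facilities and infrastructure for these attractions. However, this effort cannot run optimally and sustainably if the community especially the local community does not participate and care.",
}

# Assumption: a small hand-picked stopword list, enough to drop words like "of".
STOPWORDS = {
    "a", "an", "and", "are", "at", "be", "been", "by", "can", "do", "does",
    "for", "from", "has", "have", "if", "in", "is", "not", "of", "or", "that",
    "the", "their", "there", "these", "this", "those", "to", "which", "who",
    "with", "when", "however", "also", "all", "more", "than", "one",
}

def tokenize(text):
    """Lowercase, keep hyphenated tokens such as 'covid-19', drop stopwords."""
    tokens = []
    for tok in re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower()):
        if tok == "coronavirus":
            # Assumption: treat "Coronavirus" as the two terms "corona" + "virus".
            tokens.extend(["corona", "virus"])
        elif tok not in STOPWORDS:
            tokens.append(tok)
    return tokens

doc_tokens = {name: tokenize(text) for name, text in DOCS.items()}
N = len(DOCS)
# Document frequency: in how many documents each term occurs.
df = Counter(term for toks in doc_tokens.values() for term in set(toks))

def tfidf(tokens):
    """Raw term frequency times log10(N / df); terms unseen in the corpus are skipped."""
    tf = Counter(tokens)
    return {t: c * math.log10(N / df[t]) for t, c in tf.items() if df[t] > 0}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

doc_vectors = {name: tfidf(toks) for name, toks in doc_tokens.items()}

def rank(query):
    """Cosine similarity of the query against every document."""
    q = tfidf(tokenize(query))
    return {name: cosine(q, vec) for name, vec in doc_vectors.items()}

for query in ("Symptoms of Covid-19", "Corona Variant"):
    scores = rank(query)
    print(query, {name: round(s, 4) for name, s in scores.items()})
```

A document counts as relevant here when its cosine similarity with the query is greater than zero. Under these assumptions, "Symptoms of Covid-19" gives non-zero scores only for Doc1 and Doc2, and "Corona Variant" only for Doc1 and Doc3. Note that the second result depends on the "coronavirus" split: with exact token matching alone, Doc1 contains no token "corona" and would score zero for that query.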

