
Question



Shell scripting.
Please solve this as quickly as possible; I will give you an upvote.
Shell Scripting Project - Text Summarization using Sentence Centrality

Extractive summarization works by choosing a subset of sentences from the original document that captures its main content. Several techniques have been presented in the literature for extractive text summarization, and the centrality concept is one of the most widely used. In this approach the document is tokenized into sentences based on the (. ! ?) punctuation marks, then the similarity of each pair of sentences is computed. Finally, the top-scoring sentences are selected based on the summary ratio.

Formally, let D denote a document consisting of a sequence of sentences $(s_1, s_2, \ldots)$, and let $\mathrm{Sim}_{ij}$ be the similarity score for each pair $(s_i, s_j)$. The degree centrality of sentence $s_i$ can be defined as:

$$\mathrm{Centrality}(s_i) = \sum_{s_j \in D} \mathrm{Sim}_{ij}$$

After obtaining the centrality score for each sentence, the sentences are sorted in descending order of score and the top-ranked ones are included in the summary.

Procedure:

First: the program asks the user to enter the name of the text file and the summary ratio.

Second: the text file goes through a sequence of text preprocessing steps:
- Sentence tokenization based on the (. ! ?) punctuation marks.
- Convert to lowercase letters.
- Remove stop words from both sentences. Stop words are words that do not carry important information. Use the following list of these words: [I, a, an, as, at, the, by, in, for, of, on, that]
- Remove duplicate words from both sentences. In other words, each word appears once per sentence.

Third: compute the similarity between each pair of sentences. The similarity is calculated as the size of the intersection of the words of the two sentences divided by the size of their union, where $S_1$ and $S_2$ are the word sets of the two sentences:

$$\mathrm{Sim} = \frac{|S_1 \cap S_2|}{|S_1 \cup S_2|}$$

A value of 0 means the two sentences are completely dissimilar, a value of 1 means they are identical, and values between 0 and 1 represent a degree of similarity.

Fourth: compute the centrality of each sentence.

Finally: the top-ranked sentences are selected based on the summary ratio and then written to a file named summary.txt. Note that the selected sentences are sorted based on their centrality score.
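The procedure above can be sketched in bash with awk doing the set arithmetic. This is a minimal sketch under stated assumptions, not a reference solution: the function names (`split_sentences`, `score_sentences`, `summarize`, `main`) and the `--prompt` flag are hypothetical, a sentence's similarity with itself is left out of its centrality sum, the number of kept sentences is the ratio times the sentence count rounded to the nearest integer, and words are treated as runs of ASCII letters and digits. The question does not specify any of these details.

```shell
#!/usr/bin/env bash
# Sketch: extractive summarization by degree centrality (Jaccard similarity).
# Stop words from the assignment, lowercased ("i" is assumed for the first entry).
STOP_WORDS="i a an as at the by in for of on that"

# Print one sentence per line, splitting after runs of . ! or ?
split_sentences() {
  tr '\n' ' ' < "$1" | awk '{
    rest = $0
    while (match(rest, /[.!?]+/)) {
      s = substr(rest, 1, RSTART + RLENGTH - 1)
      sub(/^[ \t]+/, "", s)
      if (s != "") print s
      rest = substr(rest, RSTART + RLENGTH)
    }
    gsub(/^[ \t]+|[ \t]+$/, "", rest)
    if (rest != "") print rest      # trailing text without end punctuation
  }'
}

# Read sentences on stdin; print "centrality<TAB>index<TAB>sentence".
score_sentences() {
  awk -v stop="$STOP_WORDS" '
    BEGIN { n = split(stop, w, " "); for (i = 1; i <= n; i++) skip[w[i]] = 1 }
    {
      orig[NR] = $0
      line = tolower($0)
      gsub(/[^a-z0-9]+/, " ", line)          # assumption: ASCII words only
      m = split(line, tok, " ")
      size[NR] = 0
      for (i = 1; i <= m; i++) {
        t = tok[i]
        # Drop stop words and words already seen in this sentence.
        if (t == "" || (t in skip) || ((NR, t) in has)) continue
        has[NR, t] = 1
        words[NR, ++size[NR]] = t
      }
    }
    END {
      for (a = 1; a <= NR; a++) {
        c = 0
        for (b = 1; b <= NR; b++) {
          if (a == b) continue               # assumption: skip self-similarity
          inter = 0
          for (k = 1; k <= size[a]; k++) if ((b, words[a, k]) in has) inter++
          uni = size[a] + size[b] - inter    # |A u B| = |A| + |B| - |A n B|
          if (uni > 0) c += inter / uni
        }
        printf "%.4f\t%d\t%s\n", c, a, orig[a]
      }
    }'
}

# summarize FILE RATIO [OUTFILE]: write the top-ranked sentences to OUTFILE.
summarize() {
  local file=$1 ratio=$2 out=${3:-summary.txt} total keep
  total=$(split_sentences "$file" | wc -l)
  # Assumption: keep round(ratio * total) sentences.
  keep=$(awk -v n="$total" -v r="$ratio" 'BEGIN { print int(n * r + 0.5) }')
  split_sentences "$file" | score_sentences |
    LC_ALL=C sort -t $'\t' -k1,1nr -k2,2n |  # highest score first, ties by position
    awk -v k="$keep" 'NR <= k' | cut -f3- > "$out"
}

# Interactive entry point, as the first step of the procedure describes.
main() {
  local file ratio
  read -r -p "Text file name: " file
  read -r -p "Summary ratio (0 to 1): " ratio
  [[ -r $file ]] || { echo "Cannot read '$file'" >&2; return 1; }
  summarize "$file" "$ratio"
  echo "Summary written to summary.txt"
}

if [[ "${1:-}" == "--prompt" ]]; then main; fi
```

Saved as, say, `summarize.sh`, running `bash summarize.sh --prompt` asks for the file name and ratio, while sourcing the file and calling `summarize input.txt 0.5` skips the prompts. Keeping the scoring in a single awk pass avoids nested shell loops over sentence pairs, which become slow quickly because the number of pairs grows quadratically.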

