Question

Define build_ds(file) in Python that reads the file and builds the data structure needed for further processing. The returned data structure should be a 2D matrix (list of lists), where each sentence of the text is a row of the matrix and each word of the line is an element of the row. Any token that is not composed of letters only (e.g. numbers, punctuation marks) should be excluded; use a function built with the re module that takes a string and returns a boolean indicating whether it consists of letters only (both lower and upper case). The only sentence border is the newline.
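The letters-only check described above could be sketched as follows. The name is_letters_only is hypothetical (the question does not name the helper), and the sketch assumes "letters" means ASCII a-z and A-Z.

```python
import re


def is_letters_only(token):
    """Return True if token is non-empty and made up of ASCII letters only."""
    # fullmatch requires the whole string to match, so "start," or "a2" fail.
    return re.fullmatch(r"[A-Za-z]+", token) is not None


tokens = "usage exceptions , variations in 2 language-driven".split()
print([t for t in tokens if is_letters_only(t)])
# prints ['usage', 'exceptions', 'variations', 'in']
```

Using fullmatch rather than match avoids accepting tokens that merely start with letters, such as "useful.".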

sample.txt:

human language is filled with ambiguities .

usage exceptions , variations in sentence structurethese just a few of the irregularities of human language that take humans years to learn .

but that programmers must teach natural language-driven applications to recognize and understand accurately from the start, if those applications are going to be useful.

The output should look like this:

ds = build_ds("sample.txt")

ds[:2]

Out[1]: [['human', 'language', 'is', 'filled', 'with', 'ambiguities'], ['usage', 'exceptions', 'variations', 'in', 'sentence', 'just', 'a', ..., 'to', 'learn']]

#Task

def build_ds(file):
    """
    Read a file. Build a datastructure - 2D matrix (list of lists).
    Rows should be sentences, cells should contain the words.

    :param file: a txt file where the text is all lowercase and
        sentence border punctuation marks have spaces around them
    :rtype: 2D list
    """

Step by Step Solution
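One possible way to approach the task is sketched below. It is not the page's hidden answer, only a minimal sketch under stated assumptions: words are separated by whitespace, "letters" means ASCII a-z and A-Z, the file is UTF-8, and lines that contain no valid word (for example blank lines) are skipped rather than stored as empty rows. The pattern name LETTERS_ONLY is hypothetical.

```python
import re

# Assumption: a "word" is a whitespace-separated token of ASCII letters only.
LETTERS_ONLY = re.compile(r"[A-Za-z]+")


def build_ds(file):
    """Read a text file and return a 2D list: one row per line, one cell per word."""
    matrix = []
    with open(file, encoding="utf-8") as handle:
        for line in handle:  # the newline is the only sentence border
            # Keep tokens that match the letters-only pattern in full.
            words = [tok for tok in line.split() if LETTERS_ONLY.fullmatch(tok)]
            if words:  # assumption: skip lines that yield no words
                matrix.append(words)
    return matrix
```

Calling build_ds("sample.txt") and slicing with ds[:2] would then return the rows for the first two non-empty lines. If empty rows should be preserved for blank lines, drop the `if words:` guard.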
