Question
# BEGIN - DO NOT CHANGE THESE IMPORTS/CONSTANTS OR IMPORT ADDITIONAL PACKAGES.
from torch.utils.data import Dataset
# END - DO NOT CHANGE THESE IMPORTS/CONSTANTS OR IMPORT ADDITIONAL PACKAGES.
# HeadlineDataset
# This class takes a Pandas DataFrame and wraps it in a Torch Dataset.
# Read more about Torch Datasets here:
# https://pytorch.org/tutorials/beginner/basics/data_tutorial.html
#
class HeadlineDataset(Dataset):
# initialize this class with appropriate instance variables
def __init__(self, vocab, df, max_length=50):
# For this method: We would *strongly* recommend storing the dataframe
# itself as an instance variable, and keeping this method
# very simple. Leave processing to __getitem__.
#
# Sometimes, however, it does make sense to preprocess in
# __init__. If you are curious as to why, read the aside at the
# bottom of this cell.
#
## YOUR CODE STARTS HERE (~3 lines of code) ##
return
## YOUR CODE ENDS HERE ##
# return the length of the dataframe instance variable
def __len__(self):
df_len = None
## YOUR CODE STARTS HERE (1 line of code) ##
## YOUR CODE ENDS HERE ##
return df_len
# __getitem__
#
# Converts a dataframe row (row["tokenized"]) to an encoded torch LongTensor,
# using our vocab map created using generate_vocab_map. Restricts the encoded
# headline length to max_length.
#
# The purpose of this method is to convert the row - a list of words - into
# a corresponding list of numbers.
#
# e.g., using the map {"hi": 2, "hello": 3, "UNK": 0},
# the list ["hi", "hello", "NOT_IN_DICT"] turns into [2, 3, 0]
#
# returns:
# tokenized_word_tensor - torch.LongTensor
# A 1D tensor of type Long, that has each
# token in the dataframe mapped to a number.
# These numbers are retrieved from the vocab_map
# we created in generate_vocab_map.
#
# **IMPORTANT**: if we filtered out the word
# because it's infrequent (and it doesn't exist
# in the vocab) we need to replace it with
# the UNK token
#
# curr_label - int
# Binary 0/1 label retrieved from the DataFrame.
#
def __getitem__(self, index: int):
tokenized_word_tensor = None
curr_label = None
## YOUR CODE STARTS HERE (~3-7 lines of code) ##
## YOUR CODE ENDS HERE ##
return tokenized_word_tensor, curr_label
#
# Completely optional aside on preprocessing in __init__.
#
# Sometimes the compute bottleneck actually ends up being in __getitem__.
# In this case, you'd loop over your dataset in __init__, passing data
# to __getitem__ and storing it in another instance variable. Then,
# you can simply return the preprocessed data in __getitem__ instead of
# doing the preprocessing.
#
# There is a tradeoff though: can you think of one?
#
Step by Step Solution
The solution fills in the three method bodies of HeadlineDataset, one per step.
Step 1: __init__
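Inside HeadlineDataset, it is enough to store the constructor arguments as instance variables and defer all processing to __getitem__, as the skeleton's comments recommend. A minimal sketch:

def __init__(self, vocab, df, max_length=50):
    # Keep __init__ simple: just hold on to the inputs.
    self.vocab = vocab
    self.df = df
    self.max_length = max_length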
Step 2: __len__
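len() on a Pandas DataFrame returns its number of rows, so __len__ can delegate directly:

def __len__(self):
    df_len = len(self.df)  # number of rows in the stored DataFrame
    return df_len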
Step 3: __getitem__
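A sketch of __getitem__, assuming vocab is a dict with an "UNK" entry (as in the question's example) and that the binary label is stored in a DataFrame column named "label". The question never names the label column, so treat that name as a placeholder and match it to your data. torch is assumed to be imported earlier in the notebook, since the question forbids adding imports.

def __getitem__(self, index: int):
    # .iloc makes this robust to DataFrames whose index is not
    # a clean 0..n-1 range (e.g. after a train/validation split).
    row = self.df.iloc[index]
    # Truncate the tokenized headline to at most max_length tokens.
    tokens = row["tokenized"][: self.max_length]
    # Map each token to its id, falling back to UNK for words
    # that were filtered out of the vocab as infrequent.
    unk_id = self.vocab["UNK"]
    tokenized_word_tensor = torch.LongTensor(
        [self.vocab.get(tok, unk_id) for tok in tokens]
    )
    curr_label = int(row["label"])  # "label" column name is an assumption
    return tokenized_word_tensor, curr_label

As for the optional aside's question: the usual tradeoff of preprocessing everything in __init__ is memory, since the entire encoded dataset must sit in RAM at once, plus a longer startup delay before the first batch is available.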