Question
State-of-the-art machine learning models use millions, sometimes billions, of parameters. When training our models, the number of distinct examples we need is proportional to the
State-of-the-art machine learning models use millions, sometimes billions, of parameters. When training our models, the number of distinct examples we need is proportional to the number of parameters our model has.
Invariance is a trait of a neural network model (for instance, an image classification model) that can robustly classify objects even when the objects are placed in different orientations, have different sizes, and illumination differences. Many of these potential differences in our dataset can be created artificially, instead of collecting more images. Methods could be to alter brightness or contrast of the image, stretch or skew operations, or a variety of translation methods.
But what about textual data for NLP projects? Possibilities include synonym replacement, which replaces words with words that mean the same thing, random inserts, swaps, and deletions, among others.
For your assignment, submit a Python script that will take any text dataset and augment it in some way to expand the dataset. Submission must include a script that will augment any text dataset within its folder. Please include a sample un-augmented dataset with your submission.
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started