Question
Using Linux commands, I am trying to build a histogram of all three-word sequences (trigrams) in a given database. The histogram must be sorted in decreasing order of occurrence, be case-insensitive, and ignore punctuation. It should have one column that counts the number of times each trigram occurred, one column that gives the percentage of all trigram occurrences that count represents, and one column that keeps a running sum of the percentage column.
The output of your command line should be:
Trigram | Frequency: No. | Frequency: Percentage | Frequency: Cumulative |
see jane run | 3 | 37.5000% | 37.5000% |
jane run see | 2 | 25.0000% | 62.5000% |
run see john | 1 | 12.5000% | 75.0000% |
see john run | 1 | 12.5000% | 87.5000% |
run see jane | 1 | 12.5000% | 100.0000% |
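One possible way to get output in this shape is a pipeline of `tr`, `sort`, `uniq`, and `awk`. This is a minimal sketch, not a definitive answer. The file name `sample.txt`, its contents, and the function name `trigram_histogram` are assumptions made up for illustration. The sketch also assumes that trigrams do not span line breaks, which the question does not state. Each percentage is the trigram's count divided by the total number of trigram occurrences, so a count of 3 out of 8 gives 37.5000%.

```shell
#!/bin/sh
# Hypothetical sample input; replace it with the real database file.
cat > sample.txt <<'EOF'
See Jane run.
See Jane run, see Jane run! See John run.
EOF

# trigram_histogram FILE  (illustrative name, not from the question)
# Prints: trigram | count | percentage | cumulative percentage |
trigram_histogram() {
    tr '[:upper:]' '[:lower:]' < "$1" |   # case-insensitive: fold to lower case
    tr -d '[:punct:]' |                   # ignore punctuation by deleting it
    # Emit every run of three consecutive words on a line (assumption:
    # trigrams never cross a line break).
    awk '{ for (i = 1; i + 2 <= NF; i++) print $i, $(i+1), $(i+2) }' |
    sort | uniq -c |                      # count identical trigrams
    sort -k1,1nr |                        # most frequent first
    awk '{ count[NR] = $1; tri[NR] = $2 " " $3 " " $4; total += $1 }
         END {
             for (i = 1; i <= NR; i++) {
                 running += count[i]      # running count avoids rounding drift
                 printf "%s | %d | %.4f%% | %.4f%% |\n",
                        tri[i], count[i],
                        100 * count[i] / total, 100 * running / total
             }
         }'
}

trigram_histogram sample.txt
```

Trigrams with the same count may come out in a different order than in the table above, because the question does not say how ties are broken. Deleting punctuation also merges contractions, so "don't" becomes "dont". If that is not wanted, punctuation could be replaced with spaces instead.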