
Question


Assignment Background

Life is full of wonders and uncertainties that motivate us to find answers. However, not everybody is qualified to answer such questions. We need a medium where like-minded individuals with experience in certain topics can share and contribute their acquired knowledge via research articles. One well-known medium for publishing these articles is research journals. These journals are centered on certain topics or categories (Biology, Computer Science, etc.), where some categories are more popular than others. This popularity is measured by their citation rate or impact factor.

For this project, the program reads files with the citation data of each category and the impact factor of various journals, prints the average citation rate of the top 20 categories, and plots the impact factor of the top 10 journals. The data used for this project was extracted from InCites Journal Citation Reports (https://jcr.incites.thomsonreuters.com/JCRLandingPageAction.action).

Project Specifications

1. You must implement the following functions:

a) open_file() prompts the user for a filename to read data from. An error message should be shown if the file cannot be opened. This function loops until it receives proper input and successfully opens the file. It returns a file pointer.

b) read_journal_file(fp) reads the file object containing journal names and their impact factor. We are only interested in the journal name (str), the total number of citations (int) (a.k.a. "cites"), and the impact factor (float). It returns a list of tuples, sorted by the impact factor in descending order. (Hint: place your sort outside your loop, otherwise it will take too long and your Mimir test will time out and fail.) Only read the first 30 characters of the journal name. Make sure to remove all commas from all strings with numeric values. See Notes and Hints. Make sure to check that the entries for citations and impact factor are of type int and float respectively. If an entry is not valid, don't include it in the list.

c) read_category_file(fp) reads the file object containing the citation data from over 200 categories. We are only interested in the category (str), the number of journals (int), and the total number of citations (int). Only read the first 30 characters of the category. For each category, you need to calculate the average citation per journal (float). It returns a list of tuples containing the category, the number of journals, the total number of citations, and the average citation per journal, sorted alphabetically by category in ascending order. (Hint: place your sort outside your loop, otherwise it will take too long and your Mimir test will time out and fail.) Make sure to remove all commas from all strings with numeric values. See Notes and Hints. If an entry is not valid, don't include it in the list.

d) display_table(data) receives the list of tuples with the citation data per category. It prints the data and an extra row with the totals of each column, separated from the table by a row of 85 '-' characters. It returns nothing. Each row has a total of 85 characters. The formatting is:
   Rows: "{:30s} {:>10,d}{:>18,d} {:>25,.3f}"
   Title ("Citation Data of the Top 20 Categories"): "{:85s}"
   Headings ('Category', 'Journals', 'Total Citations', 'Citation per Journal'): "{:30s}{:>12s}{:>18s}{:>25s}"

e) sort_data(data, column) receives the list of tuples with the citation data per category and the column index to sort by. The column index must start at 0. Use the itemgetter function to sort by the indicated column. It returns the list sorted by the selected column. If sorted by category, the table should be sorted alphabetically in ascending order; otherwise it should be sorted in descending order.

f) prepare_plot_data(data) takes the top 10 journals with the highest impact factor and separates them into two lists (names and impact factor). The impact factor needs to be of type float. It returns a tuple of two lists: (names_list, impact_factor_list).

g) plot_data(name, data) plots a bar chart of the top 10 journals with the highest impact factor. This function is already provided for this project.

h) main() calls all other functions in this project. First it opens and reads both files (first the category file and then the journal file). Then it prompts the user for a column index and re-prompts until a valid value has been entered (all digits and between 1 and 4).

Function Tests

read_journal_file function test:

    instructor_data = [('REVIEWS OF MODERN PHYSICS', 41133, 33.177), ('NEW ASTRONOMY REVIEWS', 922, 6.154), ('Nonlinear Analysis-Hybrid Syst', 812, 3.192), ('Microfluidics and Nanofluidics', 4089, 2.537), ('EXTREMOPHILES', 2718, 2.346), ('JOURNAL OF COMPUTING IN CIVIL ', 1541, 1.855), ('Journal of Neurosurgery-Pediat', 2567, 1.757), ('WORLD BANK RESEARCH OBSERVER', 784, 1.667), ('Archaeological and Anthropolog', 301, 1.636), ('SKELETAL RADIOLOGY', 4318, 1.527), ('ARCHAEOMETRY', 2112, 1.364), ('Molecular & Cellular Toxicolog', 299, 1.24), ('Applied Geophysics', 417, 0.804), ('Fixed Point Theory', 304, 0.581), ('China Communications', 275, 0.424)]
    student_data = read_journal_file(fp)
    assert instructor_data == student_data

read_category_file function test:

    fp = open('category-impact_small.csv', 'r')
    student_data = read_category_file(fp)
    instructor_data = [('ACOUSTICS', 32, 138295, 4321.71875), ('AGRICULTURE, MULTIDISCIPLINARY', 57, 170336, 2988.3508771929824), ('BIOCHEMISTRY & MOLECULAR BIOLO', 289, 3273965, 11328.598615916955), ('CHEMISTRY, MULTIDISCIPLINARY', 163, 2825242, 17332.77300613497), ('ENGINEERING, ENVIRONMENTAL', 50, 510092, 10201.84), ('ENGINEERING, GEOLOGICAL', 35, 76977, 2199.342857142857), ('ENVIRONMENTAL SCIENCES', 225, 1412031, 6275.693333333334), ('ETHNIC STUDIES', 15, 11308, 753.8666666666667), ('GEOGRAPHY, PHYSICAL', 49, 191491, 3907.9795918367345), ('GEOLOGY', 47, 102891, 2189.1702127659573), ('GERIATRICS & GERONTOLOGY', 49, 171259, 3495.081632653061), ('HEALTH CARE SCIENCES & SERVICE', 88, 272255, 3093.806818181818), ('MATERIALS SCIENCE, COATINGS & ', 18, 209367, 11631.5), ('MATERIALS SCIENCE, TEXTILES', 23, 35426, 1540.2608695652175), ('PSYCHOLOGY, APPLIED', 79, 173846, 2200.5822784810125)]
    assert instructor_data == student_data

sort_data function test:

    student_data = [('ONCOLOGY', 213, 1634966, 7675.896713615023), ('CHEMISTRY, MEDICINAL', 59, 425363, 7209.542372881356), ('BIOTECHNOLOGY & APPLIED MICROBIOLOGY', 161, 1103236, 6852.39751552795), ('DEVELOPMENTAL BIOLOGY', 41, 273038, 6659.463414634146), ('BEHAVIORAL SCIENCES', 51, 305160, 5983.529411764706), ('MEDICINE, RESEARCH & EXPERIMENTAL', 124, 694043, 5597.120967741936), ('PHYSICS, MATHEMATICAL', 53, 283825, 5355.188679245283), ('METALLURGY & METALLURGICAL ENGINEERING', 73, 360924, 4944.164383561644), ('NUCLEAR SCIENCE & TECHNOLOGY', 32, 149291, 4665.34375), ('MARINE & FRESHWATER BIOLOGY', 104, 399530, 3841.6346153846152), ('AGRONOMY', 83, 237099, 2856.614457831325), ('VETERINARY SCIENCES', 138, 277519, 2011.0072463768115), ('SOCIOLOGY', 142, 178756, 1258.8450704225352), ('HISTORY & PHILOSOPHY OF SCIENCE', 44, 22128, 502.90909090909093)]

Test Case 1:

Please enter a valid filename: category_impact_2017.csv
Please enter a valid filename: journal_impact_2017.csv
Column number to sort data (1-category, 2-journals, 3-citations, 4-average citations): 1

Citation Data of the Top 20 Categories
Category                        Journals  Total Citations  Citation per Journal
ACOUSTICS                             31          174,802             5,638.774
AGRICULTURAL ECONOMICS & POLIC        17           24,021             1,413.000
AGRICULTURAL ENGINEERING              14          166,334            11,881.000
AGRICULTURE, DAIRY & ANIMAL SC        60          192,794             3,213.233
AGRICULTURE, MULTIDISCIPLINARY        57          210,711             3,696.684
AGRONOMY                              87          287,102             3,300.023
ALLERGY                               27          127,991             4,740.407
ANATOMY & MORPHOLOGY                  21           65,760             3,131.429
ANDROLOGY                              6            8,410             1,401.667
ANESTHESIOLOGY                        31          201,325             6,494.355
ANTHROPOLOGY                          85          126,983             1,493.918
AREA STUDIES                          68           38,910               572.206
ASTRONOMY & ASTROPHYSICS              66        1,071,345            16,232.500
AUDIOLOGY & SPEECH-LANGUAGE PA        25          100,231             4,009.240
AUTOMATION & CONTROL SYSTEMS          61          350,086             5,739.115
BEHAVIORAL SCIENCES                   51          356,259             6,985.471
BIOCHEMICAL RESEARCH METHODS          79          797,638            10,096.684
BIOCHEMISTRY & MOLECULAR BIOLO       293        3,625,819            12,374.809
BIODIVERSITY CONSERVATION             57          207,782             3,645.298
BIOLOGY                               85          491,775             5,785.588
TOTAL                              1,221        8,626,078           111,845.400

Do you want to plot the journal data (yes/no)? no
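The cleaning and validation rules for read_journal_file and read_category_file (strip commas from numeric strings, skip invalid entries, keep only the first 30 characters of the name, sort once after the loop) could be sketched as below. This is only an illustration, not the course's reference solution: the helper names parse_int, parse_float, make_journal_row, make_category_row and sort_journal_rows are hypothetical, and because the column layout of the CSV files is not given here, the helpers take already-extracted field strings instead of reading a file.

```python
from operator import itemgetter


def parse_int(text):
    """Return text as an int with commas removed, or None if it is not a valid int."""
    try:
        return int(text.replace(",", "").strip())
    except ValueError:
        return None


def parse_float(text):
    """Return text as a float with commas removed, or None if it is not a valid float."""
    try:
        return float(text.replace(",", "").strip())
    except ValueError:
        return None


def make_journal_row(name, cites, impact):
    """Build a (name, cites, impact factor) tuple, or None for an invalid entry."""
    cites_value = parse_int(cites)
    impact_value = parse_float(impact)
    if cites_value is None or impact_value is None:
        return None
    return (name[:30], cites_value, impact_value)


def make_category_row(category, journals, cites):
    """Build a (category, journals, cites, average) tuple, or None for an invalid entry."""
    journals_value = parse_int(journals)
    cites_value = parse_int(cites)
    # Assumption: a category with zero journals is treated as invalid,
    # since the average citation per journal would be undefined.
    if journals_value is None or cites_value is None or journals_value == 0:
        return None
    return (category[:30], journals_value, cites_value, cites_value / journals_value)


def sort_journal_rows(rows):
    """Sort once, after all rows are collected, by impact factor in descending order."""
    return sorted(rows, key=itemgetter(2), reverse=True)


if __name__ == "__main__":
    raw = [("REVIEWS OF MODERN PHYSICS", "41,133", "33.177"),
           ("Hypothetical Journal", "Not Available", "1.0"),
           ("China Communications", "275", "0.424")]
    rows = [row for row in (make_journal_row(*fields) for fields in raw) if row is not None]
    print(sort_journal_rows(rows))
```

Inside the real functions, each line of the file would be split into fields first, and the row builders would be called once per line, with the sort applied a single time after the loop finishes.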

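The sorting, table formatting and plot preparation steps described for sort_data, display_table and prepare_plot_data might look like the following sketch. Some assumptions are made loudly here: the row and title format strings are reconstructed from a damaged copy of the specification (the row format is chosen so that each line is 85 characters wide), the totals row is taken to be the sum of each column as Test Case 1 suggests, and table_lines is a hypothetical helper that returns the lines instead of printing them so that they can be inspected.

```python
from operator import itemgetter

# Assumed format strings (reconstructed; the exact spacing is an assumption).
TITLE_FORMAT = "{:85s}"
HEADING_FORMAT = "{:30s}{:>12s}{:>18s}{:>25s}"
ROW_FORMAT = "{:30s} {:>10,d}{:>18,d} {:>25,.3f}"


def sort_data(data, column):
    """Sort by a 0-based column: category ascending, every other column descending."""
    return sorted(data, key=itemgetter(column), reverse=(column != 0))


def table_lines(data):
    """Return the table as a list of 85-character lines, ending with a totals row."""
    lines = [
        TITLE_FORMAT.format("Citation Data of the Top 20 Categories"),
        HEADING_FORMAT.format("Category", "Journals", "Total Citations",
                              "Citation per Journal"),
    ]
    for row in data:
        lines.append(ROW_FORMAT.format(*row))
    lines.append("-" * 85)
    # Totals are plain column sums (an assumption based on the sample output).
    lines.append(ROW_FORMAT.format("TOTAL",
                                   sum(row[1] for row in data),
                                   sum(row[2] for row in data),
                                   sum(row[3] for row in data)))
    return lines


def display_table(data):
    """Print the table; returns nothing."""
    for line in table_lines(data):
        print(line)


def prepare_plot_data(data):
    """Split the first 10 journal tuples into (names_list, impact_factor_list).

    Assumes data is already sorted by impact factor in descending order.
    """
    top = data[:10]
    names = [row[0] for row in top]
    impacts = [float(row[2]) for row in top]
    return names, impacts


if __name__ == "__main__":
    categories = [("ANDROLOGY", 6, 8410, 1401.667),
                  ("ACOUSTICS", 31, 174802, 5638.774)]
    # A user entry of 1 to 4 maps to a 0-based column index by subtracting 1.
    display_table(sort_data(categories, 1 - 1))
```

Because the user enters a column number from 1 to 4 while sort_data expects an index starting at 0, main would presumably subtract 1 before calling it, and pass only the first 20 sorted rows to display_table.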
