Question

Posted on Sep 24, 2024

JAVA: I am trying to use BufferedReader and BufferedWriter to find invalid tags and values in a data file.

First, if Ris encounters a duplicate DO value while reading the data file, it must output an error message that reports the duplicate and the title of the publication where it was found, then move on to reading, processing, and outputting the next publication in the data file. The exact format of the error must be: Duplicate DO: r for Title: n.

Second, if Ris encounters any invalid tag or TY type, it must mark that RIS entry (publication) as invalid. The exact error must be: Invalid entry at DO or TY: r.

If the DO is not a duplicate and the entry is valid, Ris should calculate an extra tag field, SM. SM must summarize the reference as SM UniqueCount: , NumOfAuthors: , LongestTagValue: , MostKW: (). UniqueCount is the number of tag fields that have values. NumOfAuthors is the number of authors for this reference. LongestTagValue is the tag that has the longest value; AB should not be counted in this calculation. MostKW is the most repeated keyword value.
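As a rough sketch of the SM rules just described, the summary for one entry could be computed along these lines. Several points are assumptions, because the requirements leave them open: the class and method names (`SmSummary`, `summarize`) are hypothetical, UniqueCount is read as the number of distinct tags with a non-empty value, MostKW counts individual words across all KW values (whole KW values in the sample data never repeat), the number inside the parentheses is taken to be that word's count, and ties go to the first candidate seen.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SmSummary {

    /** Builds the SM line for one entry given as {tag, value} pairs. */
    static String summarize(List<String[]> fields) {
        Set<String> tagsWithValues = new LinkedHashSet<>();
        Map<String, Integer> kwCount = new LinkedHashMap<>(); // keeps first-seen order for ties
        int authors = 0;
        String longestTag = "";
        int longestLength = -1;

        for (String[] field : fields) {
            String tag = field[0];
            String value = field[1];
            if (value.isEmpty()) {
                continue; // tags without a value are not counted
            }
            tagsWithValues.add(tag);
            if (tag.equals("AU")) {
                authors++;
            }
            // AB is excluded from the longest-value comparison.
            if (!tag.equals("AB") && value.length() > longestLength) {
                longestLength = value.length();
                longestTag = tag;
            }
            if (tag.equals("KW")) {
                // Assumption: count each word of the keyword separately.
                for (String word : value.split("\\s+")) {
                    kwCount.merge(word, 1, Integer::sum);
                }
            }
        }

        String mostKw = "";
        int max = 0;
        for (Map.Entry<String, Integer> e : kwCount.entrySet()) {
            if (e.getValue() > max) {
                max = e.getValue();
                mostKw = e.getKey();
            }
        }
        return "SM - UniqueCount: " + tagsWithValues.size()
                + ", NumOfAuthors: " + authors
                + ", LongestTagValue: " + longestTag
                + ", MostKW: " + mostKw + " (" + max + ")";
    }

    public static void main(String[] args) {
        List<String[]> entry = List.of(
                new String[] {"TY", "CONF"},
                new String[] {"AU", "L. Valerio"},
                new String[] {"AU", "M. Conti"},
                new String[] {"KW", "Data mining"},
                new String[] {"KW", "Data models"},
                new String[] {"IS", ""});
        System.out.println(summarize(entry));
    }
}
```

If the assignment expects whole keyword values to be counted instead of single words, only the inner `for` loop over `value.split(...)` would need to change.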

Here is what I have so far:

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class Example {

    public static void main(String[] args) {
        // try-with-resources closes the reader and the writer automatically.
        try (BufferedReader in = new BufferedReader(new FileReader("references.ris"));
             BufferedWriter out = new BufferedWriter(new FileWriter("output.txt"))) {

            Set<String> doSet = new HashSet<>();             // DO values seen so far
            List<String> validTags = new ArrayList<>();      // allowed tags (not filled in yet)
            List<String> validTYValues = new ArrayList<>();  // allowed TY types (not filled in yet)
            String line;

            while ((line = in.readLine()) != null) {
                // The length check keeps substring(0, 2) from throwing on short or blank lines.
                if (line.length() < 2 || !line.substring(0, 2).equalsIgnoreCase("TY")) {
                    continue;
                }

                // Per-entry state, reset for every publication.
                String doValue = "", tiValue = "", longestTagValue = "", mostKW = "";
                int uniqueCount = 0, numOfAuthors = 0, maxKWCount = 0;
                boolean validEntry = true;
                Map<String, Integer> kwCount = new HashMap<>();
                String[] tags = new String[100];
                String[] values = new String[100];
                int index = 0;

                // Read "TAG - value" lines until a blank line or the end of the file.
                while (line != null && !line.isBlank() && index < tags.length) {
                    String[] parts = line.split(" -", 2);
                    tags[index] = parts[0].trim();
                    values[index] = parts.length == 2 ? parts[1].trim() : "";
                    index++;
                    line = in.readLine();
                }

                for (int i = 0; i < index; i++) {
                    // TODO: validate tags[i], detect a duplicate DO, and update the SM counters.
                }

                if (validEntry) {
                    out.write("SM - UniqueCount: " + uniqueCount + ", NumOfAuthors: " + numOfAuthors
                            + ", LongestTagValue: " + longestTagValue + ", MostKW: " + mostKW);
                    out.newLine();
                }
            }
        } catch (IOException e) {
            System.err.println("An IO exception occurred: " + e.getMessage());
        }
    }
}
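One possible way to fill in the empty loop is to move the two checks into a small helper that returns the error line for an entry. This is only a sketch under stated assumptions: `RisChecks` and `check` are hypothetical names, the allowed tag set is simply the tags that appear in the data example, the allowed TY set is a placeholder (the assignment's real lists are not shown), the duplicate check runs before the validity check, and "r" in the invalid-entry message is read as the DO value when there is one, otherwise the TY value.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RisChecks {

    // Assumption: taken from the data example, not from the assignment.
    static final Set<String> VALID_TAGS = Set.of(
            "TY", "TI", "T2", "SP", "EP", "AU", "PY", "KW", "DO",
            "JO", "IS", "SN", "VO", "VL", "JA", "Y1", "AB", "ER");

    // Assumption: placeholder list of accepted TY types.
    static final Set<String> VALID_TY = Set.of("CONF", "JOUR");

    /**
     * Checks one entry given as {tag, value} pairs.
     * Returns the error line, or null if the entry is valid and its DO is new.
     */
    static String check(List<String[]> fields, Set<String> seenDo) {
        String ty = "";
        String doValue = "";
        String title = "";
        boolean valid = true;

        for (String[] field : fields) {
            String tag = field[0];
            String value = field[1];
            if (!VALID_TAGS.contains(tag)) {
                valid = false;
            }
            if (tag.equals("TY")) {
                ty = value;
                valid &= VALID_TY.contains(value);
            } else if (tag.equals("DO")) {
                doValue = value;
            } else if (tag.equals("TI")) {
                title = value;
            }
        }

        // Set.add returns false when the value is already in the set.
        if (!doValue.isEmpty() && !seenDo.add(doValue)) {
            return "Duplicate DO: " + doValue + " for Title: " + title;
        }
        if (!valid) {
            return "Invalid entry at DO or TY: " + (doValue.isEmpty() ? ty : doValue);
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> seen = new HashSet<>();
        List<String[]> entry = List.of(
                new String[] {"TY", "CONF"},
                new String[] {"TI", "Optimal trade-off between accuracy and network cost"},
                new String[] {"DO", "10.1109/WoWMoM.2017.7974310"});
        System.out.println(check(entry, seen)); // first time: no error
        System.out.println(check(entry, seen)); // second time: duplicate DO message
    }
}
```

Keeping the `seenDo` set outside the helper lets the same set be shared across every entry in the file, which is what the duplicate check depends on.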

*********************************************************************************************

Data Example

TY - CONF
TI - Optimal trade-off between accuracy and network cost of distributed learning in Mobile Edge Computing: An analytical approach
T2 - 2017 IEEE 18th International Symposium on A World of Wireless, Mobile and Multimedia Networks (WoWMoM)
SP - 1
EP - 9
AU - L. Valerio
AU - A. Passarella
AU - M. Conti
PY - 2017
KW - Cloud computing
KW - Distributed databases
KW - Data analysis
KW - Analytical models
KW - Mobile communication
KW - Data mining
KW - Data models
DO - 10.1109/WoWMoM.2017.7974310
JO - 2017 IEEE 18th International Symposium on A World of Wireless, Mobile and Multimedia Networks (WoWMoM)
IS -
SN -
VO -
VL -
JA - 2017 IEEE 18th International Symposium on A World of Wireless, Mobile and Multimedia Networks (WoWMoM)
Y1 - 12-15 June 2017
AB - The most widely adopted approach for knowledge extraction from raw data generated at the edges of the Internet (e.g., by IoT or personal mobile devices) is through global cloud platforms, where data is collected from devices, and analysed. However, with the increasing number of devices spread in the physical environment, this approach rises several concerns. The data gravity concept, one of the basis of Fog and Mobile Edge Computing, points towards a decentralisation of computation for data analysis, whereby the latter is performed closer to where data is generated, for both scalability and privacy reasons. Hence, data produced by devices might be processed according to one of the following approaches: (i) directly on devices that collected it (ii) in the cloud, or (iii) through fog/mobile edge computing techniques, i.e., at intermediate nodes in the network, running distributed analytics after collecting subsets of the data. Clearly, (i) and (ii) are the two extreme cases of (iii). It is worth noting that the same analytics task executed at different collection points in the network, comes at different costs in terms of traffic generated over the network. Precisely, these costs refer to the traffic generated to move data towards the collection point selected (e.g. the Edge or the Cloud) and the one induced by the distributed analytics process. Until now, deciding if to use intermediate collection points, and which one they should be in order to both obtain a target accuracy and minimise the network traffic, is an open question. In this paper, we propose an analytical framework able to cope with this problem. Precisely, we consider learning tasks, and define a model linking the accuracy of the learning task performed with a certain set of collection points, with the corresponding network traffic. The model can be used to identify, given the specification of the learning problem (e.g. binary classification, regression, etc.), and its target accuracy, what is the optimal level for collecting data in order to minimise the total network cost. We validate our model through simulations in order to show that setting, in simulation, the level of intermediate collection indicated by our model, leads to the minimum cost for the target accuracy.
ER -
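Entries in a file like this end with an ER line, and a file may also contain stray blank lines or a long value that wraps onto the next line. A reader that groups lines by ER instead of by blank line could look roughly like the sketch below. The names `RisRecordReader`, `readEntry`, and `parseAll` are hypothetical, and treating any line without a two-character tag followed by " -" as a continuation of the previous value is an assumption, not part of the assignment.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class RisRecordReader {

    /** Reads one entry as {tag, value} pairs; returns null when the input is exhausted. */
    static List<String[]> readEntry(BufferedReader in) throws IOException {
        List<String[]> fields = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null) {
            if (line.isBlank()) {
                continue; // ignore blank lines inside or between entries
            }
            String[] parts = line.split(" -", 2);
            String tag = parts[0].trim();
            if (parts.length < 2 || tag.length() != 2) {
                // Assumption: not a "TAG - value" line, so append it to the previous value.
                if (!fields.isEmpty()) {
                    String[] last = fields.get(fields.size() - 1);
                    last[1] = (last[1] + " " + line.trim()).trim();
                }
                continue;
            }
            if (tag.equals("ER")) {
                return fields; // end of this entry
            }
            fields.add(new String[] {tag, parts[1].trim()});
        }
        return fields.isEmpty() ? null : fields;
    }

    /** Convenience wrapper that parses every entry in a string. */
    static List<List<String[]>> parseAll(String text) {
        List<List<String[]>> entries = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            List<String[]> entry;
            while ((entry = readEntry(in)) != null) {
                entries.add(entry);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }

    public static void main(String[] args) {
        String sample = "TY - CONF\nSP - 1\nIS -\nER -\n";
        for (List<String[]> entry : parseAll(sample)) {
            for (String[] field : entry) {
                System.out.println(field[0] + " = \"" + field[1] + "\"");
            }
        }
    }
}
```

With a file instead of a string, `readEntry` can be called directly on the `BufferedReader` opened over the `.ris` file.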