Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Can you please provide the solution for all parts of the below project??? Please read entire instruction section: all the information is below. instructions: Determining

Can you please provide the solution for all parts of the below project??? Please read entire instruction section: all the information is below.

instructions:

Determining Optimal Character Memorization Order

Summary: In this assignment, you will implement a topological sort algorithm as a way to reorder kanji (; Japanese logographic characters) into an order optimal for memorization.

1 Background

In this assignment, you will practice applying your knowledge of graph data structures and algorithms to determine an optimal order to learn logographic characters.

Consider the following scenario: as a native speaker of the English language, you wish to learn a language like Chinese () or Japanese (), which uses a special character set. Unlike the English language, which uses the Roman alphabet to encode phonetics, many east Asian languages use Hanzi derived characters, which are symbols with semantics bound to them (to use the technical term: logographic character). Often times, English speakers learning a language using logographic characters find themselves stumped by the apparently insurmountable problem of memorizing thousands of unique characters. They all look so different and yet so similar - can we hope to tell them apart or write them? Wouldn't it be nice if we could use our knowledge of algorithms and data structures to somehow address this problem so that English speakers would have a better shot at learning or ...?

One interesting dependency graph that can be constructed is a "component" graph for Hanzi characters, or Hanzi derived characters (e.g., Kanji). You see, complex characters are often built from simpler characters as components. This sub-structure is very useful! Components not only define a common appearance and stroke order but can indicate phonetics and/or semantics. Furthermore, there are considerably fewer components (hundreds) than actual characters (thousands). (Please note that we use the term component very generally here - it does not map to the traditional notion of a radical.) This sub-structure is particularly useful for people memorizing characters instead of looking at each character as a monolithic block, one can memorize the individual components, and then reuse that knowledge for more complex characters. The following graph is an example of this for the character ("method"):

Figure 1: Dependency graph for .

Reading this graph, we can see that is made from ("gone") and ("water") ( is written as here, a standard transformation). Recursively, is made from ("soil") and . Based on these dependencies, we would want to learn these characters in the order: . If we do that, then instead of learning each stroke in , we just have to remember "method=water+gone", which tells us to write and . (For more information on these ideas, see Remembering the Kanji by James Heisig.) That said, in order to make use of this recursive structure, we would have to learn characters in an order such that we also see simpler characters before we see the more complex characters that are built out of them. To solve this problem, we can produce a topological sort of a graph. A topological sort is an ordered list of vertices in graph such that all dependencies are listed before their dependent.

This document is separated into four sections: Background, Requirements, Testing, and Submission. You have almost finished reading the Background section already. In Requirements, we will discuss what is expected of you in this homework. In Testing, we give some sample output for the program. Lastly, Submission discusses how your source code should be submitted on BlackBoard.

2 Requirements [60 points, 8 extra credit]

In this assignment, you will implement an editable graph data structure and a new algorithm for finding the topological sort of a graph. Our goal will be to load two data files, and print out a topological order for the characters that they list. Attached to this post are a base file and two interfaces. Further down the BlackBoard page, you'll find a PDF describing the Java implementation of hashtables (you may find it useful), and a visualization of the dependencies in the data files. The data is formatted as follows:

data-kanji.txt (UTF-8 formatted) stores nodes:

Tab separated.

# prefixes comment lines.

Lines look like , e.g., "120 ", which indicates that character1 can be represented as the number characterID1. IDs are just integers.

data-components.txt (ASCII formatted) stores edges:

Tab separated.

# prefixes comment lines

Lines look like , e.g., "92 73", which indicates that character1 is a component of character2. In terms of programming, you will need to:

Create a new class called BetterDiGraph that implements the EditableDiGraph interface. See the interface file for details. [22 points]

Create a new class called IntuitiveTopological that implements the TopologicalSort interface. Use BetterDiGraph to store the graph. [20 points]

Instead of using DFS to find a topological sort, implement the following algorithm: "IntuitiveTopological". This algorithm works as follows: look at your graph, pick out a node with in-degree zero, add it to the topological ordering, and remove it from the graph. This process repeats until the graph is

file:///C:/Users/Ruben/AppData/Local/Temp/lyx_tmpdir.ZFhZdDPWUcot/lyx_tmpbuf0/ser222_04_02_hw02.xhtml 1/2 9/20/2018 LyX Document

empty.

Make sure to check for cycles before trying to generate a topological sort of the graph!

Complete the main method in LastNameMain.java. It should: [20 points]

Load data-kanji.txt, use it to populate a hashtable that maps IDs to characters, and add the IDs as nodes in the graph. Load data-components.txt, and use it to add edges to the graph being built.

Create an IntuitiveTopological object, and use it to sort the graph.

Display the characters in the ordering. Note that topological sort will produce a list of a IDs - you'll need to take the IDs and uses them to look up the correct character in the hashtable you populated earlier.

Extra Credit: add support for visualizing the graph that you generate. Most likely this will take the form of using a graph library such as GraphViz to render an image for the data you load. An example might look like the image below. [8 points]

If you find yourself adding import packages other than java.util.LinkedList, java.util.HashMap, java.util.NoSuchElementException, or java.io.*, please double check with your instructor that they may be used. !!

resources:

/**

* Program for generating kanji component dependency order via topological sort.

*

* @author (your name), Acuna

* @version (version)

*/

public class BaseMain {

/**

* Entry point for testing.

* @param args the command line arguments

*/

public static void main(String[] args) {

//TODO: implement this

//Freebie: this is one way to load the UTF8 formated character data.

//BufferedReader indexReader = new BufferedReader(new InputStreamReader(new FileInputStream(new File("data-kanji.txt")), "UTF8"));

}

}

import java.util.NoSuchElementException;

/**

* Implements an editable graph with sparse vertex support.

*

* @author Acuna

*/

public interface EditableDiGraph {

/**

* Adds an edge between two vertices, v and w. If vertices do not exist,

* adds them first.

*

* @param v source vertex

* @param w destination vertex

*/

void addEdge(int v, int w);

/**

* Adds a vertex to the graph. Does not allow duplicate vertices.

*

* @param v vertex number

*/

void addVertex(int v);

/**

* Returns the direct successors of a vertex v.

*

* @param v vertex

* @return successors of v

*/

Iterable getAdj(int v);

/**

* Number of edges.

*

* @return edge count

*/

int getEdgeCount();

/**

* Returns the in-degree of a vertex.

* @param v vertex

* @return in-degree.

* @throws NoSuchElementException exception thrown if vertex does not exist.

*/

int getIndegree(int v) throws NoSuchElementException;

/**

* Returns number of vertices.

* @return vertex count

*/

int getVertexCount();

/**

* Removes edge from graph. If vertices do not exist, does not remove edge.

*

* @param v source vertex

* @param w destination vertex

*/

void removeEdge(int v, int w);

/**

* Removes vertex from graph. If vertex does not exist, does not try to

* remove it.

*

* @param v vertex

*/

void removeVertex(int v);

/**

* Returns iterable object containing all vertices in graph.

*

* @return iterable object of vertices

*/

Iterable vertices();

/**

* Returns true if the graph contains at least one vertex.

*

* @return boolean

*/

boolean isEmpty();

/**

* Returns true if the graph contains a specific vertex.

*

* @param v vertex

* @return boolean

*/

boolean containsVertex(int v);

}

/**

* Interface for classes providing a topological sort of a digraph.

*

* @author Sedgewick and Wayne, Acuna

*/

public interface TopologicalSort

{

/**

* Returns an iterable object containing a topological sort.

* @return a topological sort.

*/

public Iterable order();

/**

* Returns true if the graph being sorted is a DAG, false otherwise.

* @return is graph a DAG

*/

public boolean isDAG();

}

txt file:data-kanji.txt

#heisignum kanji

120

119

118

117

116

115

114

113

112

111

110

109

108

#107 (boring node with in+out degree = 0)

106

#105

#104

#

#

101

100

99

98

97

96

95

94

93

92

91

90

89

88

87

86

85

84

83

82

#81

80

79

78

77

76

75

#74

73

72

71

#70

#63

48

30

#33

19

5

68

26

20

51

65

#6

15

17

#34

750

335

#50

#62

31

21

#49

40

59

18

57

37

# - algorithm design

1

2

3

4

text file end

text file: data-components.txt

#src dst

92 73

76 73

5 17

76 17

78 18

15 18

106 19

78 20

106 20

76 21

18 21

78 96

76 95

76 97

92 26

78 26

78 109

115 109

92 77

108 94

106 94

78 30

99 30

106 31

30 31

99 71

30 71

78 88

15 116

78 37

99 84

37 84

76 119

92 40

76 98

40 98

112 83

78 83

115 118

92 89

92 113

108 113

89 113

92 100

113 100

76 48

78 51

92 51

92 110

78 110

106 110

76 335

335 4

92 4

335 3

101 750

86 2

750 2

15 1

90 1

76 79

15 114

114 87

76 85

114 85

15 57

88 117

91 59

99 111

114 111

59 120

111 120

78 75

76 65

106 80

78 72

37 82

68 82

116 93

99 93

text file end

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Database Systems Design Implementation And Management

Authors: Peter Robb,Carlos Coronel

5th Edition

061906269X, 9780619062699

More Books

Students also viewed these Databases questions

Question

5. Identify three characteristics of the dialectical approach.

Answered: 1 week ago