Question

Posted on Sep 26, 2024

JAVA.

Your task is to write a class called HTMLChecker. As you may be aware, many resources of the World Wide Web are written in a markup language called HTML (Hypertext Markup Language). You do not need to know the details of HTML to complete this assignment. All you need to know is the following:

1. HTML files consist of tags and text. A tag is enclosed in < ... >. Tags determine how the text is formatted when a Web page is viewed by someone using a Web browser.

2. There are 2 types of tags: start tags and end tags. An end tag has a forward slash (/) immediately after the < symbol.

3. Here are some examples of tags:

<ol>: Display the text that follows as an ordered list (i.e., a list whose items are numbered). The corresponding end tag is </ol>, which indicates the end of the ordered list.

<ul>: Display the text that follows as an unordered list. Instead of numbering the items in the list, bullet points are used. The corresponding end tag is </ul>.

<li>: The text that follows is a list item, either in an ordered list or an unordered list. Although a corresponding end tag </li> exists, the use of this end tag is optional, as discussed below.

<p>: Insert a paragraph break. It has an optional end tag </p>.

<br>: Place the text that follows on a new line. There is no corresponding end tag.

<b>: Display text in boldface. </b> is the end tag.

<h1>, <h2>, and <h3>: Display text in different font sizes. <h1> is the largest font, followed by <h2> and <h3>. All are larger than the normal font that a particular Web page uses. The end tags are as you would expect (</h1>, etc.).

4. There are 3 types of start tags, which I will refer to as types a, b, and c. For start tags of type a, it is a requirement of HTML that a corresponding end tag must follow at some point in the HTML file. For example, an ordered list begins with the <ol> start tag and must end with the </ol> end tag.

5. For type b start tags, the corresponding end tag is optional. The <p> and the <li> tags are examples of tags for which the corresponding end tag is optional; <p> may or may not be followed by </p> (and likewise, <li> may or may not be followed by </li>).

6. For type c start tags, there is no corresponding end tag. <br> is an example of such a tag; there is no </br> tag in HTML.
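The three start-tag types can be captured in a small lookup table. The sketch below is only an illustration: the class name TagTypes, the enum names, and the exact classification of each tag are assumptions drawn from the examples in this description, not from the starter code, and unknown tags are assumed to be type a.

```java
import java.util.Map;

public class TagTypes {
    /** The three kinds of start tag: type a, type b, and type c. */
    enum TagType { END_REQUIRED, END_OPTIONAL, NO_END_TAG }

    // Assumed classification, based only on the example tags listed in the assignment text.
    static final Map<String, TagType> TYPES = Map.ofEntries(
        Map.entry("ol", TagType.END_REQUIRED),
        Map.entry("ul", TagType.END_REQUIRED),
        Map.entry("b",  TagType.END_REQUIRED),
        Map.entry("h1", TagType.END_REQUIRED),
        Map.entry("h2", TagType.END_REQUIRED),
        Map.entry("h3", TagType.END_REQUIRED),
        Map.entry("p",  TagType.END_OPTIONAL),
        Map.entry("li", TagType.END_OPTIONAL),
        Map.entry("br", TagType.NO_END_TAG)
    );

    /** Looks up a tag name (without the angle brackets). */
    static TagType typeOf(String tagName) {
        // Assumption: any tag not listed above is treated as type a (end tag required).
        return TYPES.getOrDefault(tagName.toLowerCase(), TagType.END_REQUIRED);
    }

    public static void main(String[] args) {
        System.out.println(typeOf("ol")); // END_REQUIRED
        System.out.println(typeOf("li")); // END_OPTIONAL
        System.out.println(typeOf("br")); // NO_END_TAG
    }
}
```

Keeping the classification in one table means the stack logic only has to ask which type a tag is, rather than comparing against individual tag names.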

On the following page, you will find an example of valid HTML, and the way it is rendered by a browser. Note the following:

a. Spacing is determined strictly by tags, and by the size of the browser window. For example, in the first list item there is a line break between "implementation in" and "the Java language" because of the browser window size. On the other hand, even though the words "Java" and "language" are on separate lines in the .html file, no line break appears at that point in the IE rendering of the page. Finally, the Web browser has placed a paragraph break between "at the following times:" and "Section 201 students" because of the <p> tag, and not due to any formatting of the text in the .html file.

b. HTML elements (which consist of a start tag, text, and in most cases an end tag) may be embedded within other elements. In the above HTML, there are several instances of <b> ... </b> appearing within other elements, as in

<li>This course is about data structures and their implementation in the <b>Java language</b>.

CSC 300 Section 201, 210 Summer I 2017

This course is about data structures and their implementation in the Java language.

We will discuss Bags, Stacks, Queues, Lists, and Priority Queues

In addition, we will discuss the running times of various operations on different data structures.

The class meets at the following times:

Sections 201 students: Tuesdays and Thursdays from 5:45-9:00. Section 210 students: Online students view the lectures on D2L.

HTML is not valid if start tags and end tags are mixed in the incorrect order. For example:

This course is about data structures and their implementation in the Java

language.

In this case, the </li> end tag precedes the </b> tag, but since the boldface element "Java language" is inside one of the list items, </b> should precede </li>.
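The ordering rule can be illustrated with a deliberately simplified check. The sketch below assumes every start tag requires an end tag (it ignores the optional and missing end tags discussed earlier) and takes a pre-split list of tag names; the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class NestingDemo {
    /**
     * Simplified nesting check: every start tag must be closed, and an end tag
     * must match the most recently opened tag that is still open.
     */
    static boolean properlyNested(List<String> tags) {
        Deque<String> open = new ArrayDeque<>();
        for (String tag : tags) {
            if (!tag.startsWith("/")) {
                open.push(tag); // start tag: remember it
            } else if (open.isEmpty() || !open.pop().equals(tag.substring(1))) {
                return false;   // end tag with no match on top of the stack
            }
        }
        return open.isEmpty();
    }

    public static void main(String[] args) {
        // <li> ... <b> ... </b> ... </li> : the inner element closes first
        System.out.println(properlyNested(List.of("li", "b", "/b", "/li"))); // true
        // <li> ... <b> ... </li> ... </b> : end tags in the wrong order
        System.out.println(properlyNested(List.of("li", "b", "/li", "/b"))); // false
    }
}
```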

Your task

Your job is to complete the validPage method of the HTMLChecker class. Note that this method is passed one parameter, which is a String. The String contains HTML. The method returns true or false, depending on whether the String consists of valid HTML. Your program must do the following:

1. Identify tags. You may assume that whenever the < symbol appears, it begins a tag (of course, end tags begin with </). You may assume that the symbols < and > will not appear in the String unless they start or end a tag.

2. When your program encounters a type a or type b start tag, it should push the name of the tag onto a stack. For example, when it encounters the <html> tag in the text above, it pushes the string html onto the stack (since only start tags will be placed on the stack, I suggest that you strip the < > from the tag). However, for type c start tags such as <br> (which have no corresponding end tag), the tag name should not be pushed onto the stack.

3. When your program encounters an end tag, there are several possibilities:

a. If the stack is empty, then the HTML is invalid, and the validPage method should return false immediately.

b. Otherwise, pop the stack. If the (popped) start tag matches the end tag, then the program should continue processing the rest of the String.

c. If the start tag and end tag do not match, then it depends on whether the start tag is type a or type b. In the case of type a tags, which require a corresponding end tag, your method should return false. For type b tags, the stack should continue to be popped until (i) it is empty (in which case your method should return false); or (ii) a matching start tag is found (in which case your method should continue processing the String).

4. If your program reaches the end of the HTML String, then it should return true if the stack is empty or if every remaining start tag on the stack is type b. Otherwise, it should return false.
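Steps 1 through 4 can be sketched as follows. This is one possible reading of the rules, not the official solution: the class name HTMLCheckerSketch, the helper closes, and the two tag sets are assumptions (the real tag lists come from the starter code), and it uses ArrayDeque instead of java.util.Stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;

public class HTMLCheckerSketch {
    // Assumed classification; every tag not listed here is treated as type a.
    private static final Set<String> OPTIONAL_END = Set.of("p", "li"); // type b
    private static final Set<String> NO_END = Set.of("br");            // type c

    public static boolean validPage(String html) {
        Deque<String> startTags = new ArrayDeque<>();
        int open = html.indexOf('<');
        while (open != -1) {
            int close = html.indexOf('>', open + 1);
            if (close == -1) return false;               // tag is never closed with '>'
            String tag = html.substring(open + 1, close); // step 1: tag without < >
            if (tag.startsWith("/")) {
                // step 3: end tag
                if (!closes(startTags, tag.substring(1))) return false;
            } else if (!NO_END.contains(tag)) {
                startTags.push(tag);                      // step 2: type a or b start tag
            }
            open = html.indexOf('<', close + 1);
        }
        // step 4: valid only if everything left on the stack is type b
        return OPTIONAL_END.containsAll(startTags);
    }

    /** Pops start tags until one matches endTag; false if that cannot be done. */
    private static boolean closes(Deque<String> startTags, String endTag) {
        while (!startTags.isEmpty()) {
            String top = startTags.pop();
            if (top.equals(endTag)) return true;           // 3b, or 3c case (ii)
            if (!OPTIONAL_END.contains(top)) return false; // 3c: unmatched type a tag
        }
        return false;                                      // 3a, or 3c case (i)
    }

    public static void main(String[] args) {
        System.out.println(validPage("<ol><li>one<li>two</ol>"));     // true
        System.out.println(validPage("<li>a <b>bold</li> text</b>")); // false
        System.out.println(validPage("<p>text<br>more"));             // true
    }
}
```

In the first call, the </ol> end tag pops both li entries (type b) before reaching the matching ol, which is the multi-pop case described in step 3c.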

Here is an example of how your validPage should work on a small subset of the example HTML above. Assume the String passed as a parameter to validPage is all on a single line.

HTMLChecker.validPage(

CSC 300

This course is about data structures and their implementation in the Java language.
We will discuss Bags, Stacks, Queues, Lists, and Priority Queues

In addition, we will discuss the running times of various operations on different data structures.)

Tag	Stack	Tag	Stack
	[center]		[p ol li]
	[center h3]		[p ol li b]
	[center h3 b]		[ol li]
	[center h3]		[ol li p]
	[center]		[ol li p li]
	[]	*	[ol]
	[p]		[]
	[p ol]		[p]

Note that when the end tag marked by * is encountered, several items are popped from the stack, since they are all type b start tags, which do not require end tags. If any of them were type a tags, at this point the validPage method would return false.

For this html String, in the end your validPage method should return true. This is valid HTML, because the only tag left on the stack (<p>) is a type b tag, for which a </p> is optional.

Code I am providing

If you download the hw3.jar file, you will see a partially completed version of the validPage method which you must complete. Here is a portion of the code I am providing. The intent of the starter code is to relieve you of the burden of writing the string-manipulation part of the validPage method (mainly, identifying and extracting tags, and determining if they are start or end tags), so that you can concentrate on the correct manipulation of the stack.

// Requires: import java.util.Stack;
public static boolean validPage(String html) {

    // the stack that we'll use to save start tags for
    // which we are (possibly) expecting a corresponding
    // end tag. Whether the end tag is required or
    // optional depends on the type of the start tag ("type a"
    // or "type b")
    Stack<String> startTags = new Stack<String>();

    // find the index of the first start tag in html (the parameter
    // String)
    int tagStartPosition = html.indexOf('<');
    boolean isStartTag;

    while (tagStartPosition != -1) {

        // attempt to find the index of the end of the tag
        int tagEndPosition = html.indexOf('>', tagStartPosition+1);
        if (tagEndPosition == -1) return false;

        // extract the tag (without the < and >)
        String tag = html.substring(tagStartPosition+1, tagEndPosition);

        // determine if the tag is a start tag or an end tag
        // (startsWith also copes with an empty tag such as "<>")
        if (tag.startsWith("/")) {
            tag = tag.substring(1);
            isStartTag = false;
        } else
            isStartTag = true;

        // fill in the rest of the code here

        tagStartPosition = html.indexOf('<', tagEndPosition+1);
    }

    // now we've reached the end of the string. Determine if
    // the String contained valid HTML or not

    return false; // replace
}

In the JAR file, you will also find a class called HW3Tester. It calls validPage, passing it various examples of valid and invalid HTML. You can expect the html only to contain tags that I have specified in the starter code.

When your program is properly completed, the output of the main method should be

true false false true

You should test your code on other examples, since when the assignment is graded the grader may use additional examples as well.