James Madison University, Spring 2017 Semester
PA5: SEO Checker (Numbers and Strings)
Image Credit: Screen capture from seositecheckup.com
Part A (10 pts): Due Fri, Mar 31 at 11:59 PM
Readiness quiz on Canvas.
Late work will not be accepted. If you do not submit by Fri, Mar 31st at 11:59 PM, you will receive a zero for Part A.
Part B (30 pts): Due Monday, Apr 3 at 11:59 PM
Submit only your JUnit tests (SEOCheckerTest.java) via Web-CAT.
Late work will not be accepted. If you do not submit by Mon, Apr 3rd at 11:59 PM, you will receive a zero for Part B.
Part C (60 pts): Due Friday, Apr 7 at 11:59 PM
Submit SEOChecker.java and SEOCheckerTest.java via Web-CAT.
- -20% on Sat, April 8 by 11:59 PM
- -40% on Sun, April 9 by 11:59 PM
- Not accepted afterwards
You may (and should) revise your test cases in your final submission.
Objectives
- Use methods in the String class to solve problems.
- Build new strings from characters and substrings.
- Calculate statistics or properties of a given string.
- Convert strings to and from primitive data types.
Honor Code
This assignment should be viewed as a take-home exam and must be completed individually. Your work must conform to the JMU Honor Code. Authorized help is limited to general discussion on Piazza, the lab assistants assigned to CS 149/159, and the instructor. Copying work from another student or the Internet is an Honor Code violation and will be grounds for a reduced or failing grade in the course.
Background
This assignment will make use of what you have learned so far about strings and arrays to write a simple Search Engine Optimization (SEO) Checker app. There are many SEO tools out on the Internet, for example: https://seositecheckup.com/seo-audit/www.jmu.edu. These checklist-like apps take a set of criteria and scan your website to determine how optimized your site is for search engines, returning a numerical score for your site.
For PA5, you will write several methods to implement a rudimentary SEO checker. Most of your methods will take an array of strings that represents the lines of HTML code downloaded from the Internet. Just for fun, we created SEOCheckerDriver.java to read in any file you want. You can right-click any website, select "Save As...", put the file in your PA5 folder, and run it through your checker.
Clarifications
- HTML input files will be well balanced (i.e., every opening tag will have a closing tag on the same line of the file, with the exception of html and body tags).
Requirements
Create two files from scratch: SEOChecker.java and SEOCheckerTest.java. Then add the following methods in the SEOChecker class. You may find it useful to implement additional methods to simplify your code, but doing so is not required.
- countPairs – given a string array and a tag name string, this method counts the number of occurrences of a pair of tags in the string array. See HTML Tags on W3Schools for some example tags. For this assignment, a tag is any sequence of characters starting with '<' and ending with '>', containing only the tag name, and having an optional slash (to indicate the end of an element).
  Example: countPairs({"<title>Test title</title>", "<p>Hello!</p>"}, "title") returns 1, since there is one title element with an opening and closing tag pair.
- locateNextTag – given a string array, a string tag name, and an index value, search the string array for the next occurrence of the tag, and return the index value for that line in the file.
  Example: locateNextTag({"<h1>Test</h1>", "<h2>SubTest</h2>", "<h2>Another Subtest</h2>", "<h3>Tertiary test</h3>"}, "h2", 0) returns 1, since the h2 tag is found at that index of the array.
- scoreDocType – given a string array, this method returns a double value score based on which DOCTYPE the HTML page is using. See HTML <!DOCTYPE> Declaration on W3Schools for the exact format of these strings. Return ten points for HTML 5, eight points for HTML 4.01 Strict, seven points for HTML 4.01 Transitional, and five points for HTML 4.01 Frameset. (You can ignore the other kinds of DOCTYPE tags.)
  Example: scoreDocType({"<!DOCTYPE html>"}) returns 10.0, since that tag indicates HTML 5.
- scoreHeadingTags – given a string array, this method scores five points for having just a single h1 tag, three points for having h2 tags, and two points if all three types of tags (h1, h2, h3) are present.
  Example: scoreHeadingTags({"<h1>Test</h1>", "<h2>SubTest</h2>", "<h2>Another Subtest</h2>", "<h3>Tertiary test</h3>"}) returns 10.0, the sum of all three scores.
- scoreImageTags – given a string array, this method scores ten points if all image tags have "alt" and "title" attributes with nonempty content inside them. Otherwise, the score is the sum of two parts: 5 points times the ratio of image tags with alt attributes to all image tags, plus 5 points times the ratio of image tags with title attributes to all image tags. NOTE: if there are no images in the file, you should also return 10.0, since no image tag is handled incorrectly.
  Example: scoreImageTags({"<img src=\"picture.jpg\" alt=\"A test photo\" title=\"Test image\">"}) returns 10.0, since it has both attributes.
- scoreMetaTags – given a string array, this method scores three points for having a meta "description" tag, adds another three points for having a "content" attribute within that tag, and adds four points for having a nonempty string inside the content attribute.
  Example: scoreMetaTags({"<meta name=\"description\" content=\"Test\">"}) returns 10.0, since it satisfies all three of the requirements.
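To make the tag-matching idea concrete, here is a minimal sketch of how the first two methods and the heading score might be built from String.indexOf and String.contains. It is an illustration under stated assumptions, not the reference solution: the class name TagSketch is hypothetical, the methods are assumed to be static, tag names are assumed to be lowercase, locateNextTag is assumed to search from the given index inclusive and to return -1 when nothing is found, and "just a single h1 tag" is read as exactly one h1 pair.

```java
/** Hypothetical helper class; names and signatures are illustrative assumptions. */
final class TagSketch {

    /** Counts opening/closing pairs of the given tag, one line at a time. */
    static int countPairs(String[] lines, String tagName) {
        String open = "<" + tagName + ">";
        String close = "</" + tagName + ">";
        int pairs = 0;
        for (String line : lines) {
            int from = 0;
            while (true) {
                int start = line.indexOf(open, from);
                if (start < 0) {
                    break; // no further opening tag on this line
                }
                int end = line.indexOf(close, start + open.length());
                if (end < 0) {
                    break; // opening tag without a closing tag on this line
                }
                pairs++;
                from = end + close.length();
            }
        }
        return pairs;
    }

    /**
     * Returns the index of the first line at or after start that contains
     * the opening tag, or -1 if there is none (the -1 is an assumption).
     */
    static int locateNextTag(String[] lines, String tagName, int start) {
        String open = "<" + tagName + ">";
        for (int i = Math.max(start, 0); i < lines.length; i++) {
            if (lines[i].contains(open)) {
                return i;
            }
        }
        return -1;
    }

    /** Scores headings: 5 for exactly one h1, 3 for any h2, 2 for h1 + h2 + h3. */
    static double scoreHeadingTags(String[] lines) {
        int h1 = countPairs(lines, "h1");
        int h2 = countPairs(lines, "h2");
        int h3 = countPairs(lines, "h3");
        double score = 0.0;
        if (h1 == 1) {
            score += 5.0;
        }
        if (h2 > 0) {
            score += 3.0;
        }
        if (h1 > 0 && h2 > 0 && h3 > 0) {
            score += 2.0;
        }
        return score;
    }
}
```

With the inputs from the examples above, this sketch gives 1 for the title pair count, 1 for the next h2 line, and 10.0 for the heading score. Because input files are well balanced line by line, each line can be scanned independently, which is why no state is carried from one line to the next.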
Before implementing these methods, you are strongly encouraged to create stubs first, submit them to Web-CAT, and make sure your code compiles on the server. That way, you won't waste any time implementing code for incorrect method signatures.
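A set of stubs might look like the sketch below. The signatures here are assumptions inferred from the examples (static methods, an int result for countPairs and locateNextTag, a double result for the score methods, and the parameter names); check them against what Web-CAT actually expects before relying on them.

```java
// Stub versions only: every method compiles and returns a placeholder value.
// All signatures are assumptions inferred from the examples in this handout.
// Declared without "public" so the snippet compiles from any file name; in
// SEOChecker.java you would normally write "public class SEOChecker".
class SEOChecker {

    static int countPairs(String[] lines, String tagName) {
        return 0; // placeholder
    }

    static int locateNextTag(String[] lines, String tagName, int start) {
        return -1; // placeholder
    }

    static double scoreDocType(String[] lines) {
        return 0.0; // placeholder
    }

    static double scoreHeadingTags(String[] lines) {
        return 0.0; // placeholder
    }

    static double scoreImageTags(String[] lines) {
        return 0.0; // placeholder
    }

    static double scoreMetaTags(String[] lines) {
        return 0.0; // placeholder
    }
}
```

Once the stubs compile on the server, each placeholder can be replaced one method at a time while the tests in SEOCheckerTest.java keep compiling against the same signatures.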