Programming Exercise I: A Focused Web Search Engine

(Industry-Level, Second-to-None Comprehensive Specifications)



Absolutely no copying others’ work
According to a study, students in computer courses learn much more by building large-scale exercises than by writing many small-scale test programs, which give fragmented knowledge rather than a solid understanding of the system.
Development Requirements
When you start developing the exercise, follow the three requirements below:

Due Date and Submission Methods
On or before Thursday, February 15, 2024. Send an email to the instructor at wenchen@cs.und.edu to remind him that the exercise is ready for grading.

Note that you are allowed to use any language and tools for this exercise, but the exams will focus on PHP and MySQL unless otherwise specified.



Background and Objectives
A World Wide Web search engine includes the following three major components:
  • Crawlers, which visit and read every page on web sites, using hypertext links on each page to discover and read a site’s other pages,

  • Web page indexes, which are created from the pages that have been read, and

  • Search and ranking software, which receives a user’s search request, compares it to the entries in the index, and returns results to the user.

This exercise lets students learn and practice the phases of the data life cycle (data collection, indexing, storage, search, and ranking) by implementing a focused web search engine.
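The three components listed above can be illustrated with a minimal sketch. It is only an illustration under stated assumptions, not the required design: the pages live in a hypothetical in-memory dictionary named PAGES instead of being fetched over HTTP, the index is a plain in-memory inverted index rather than a MySQL table, and ranking is a simple term-frequency sum. All function and variable names here are made up for the example.

```python
from collections import Counter, defaultdict
from html.parser import HTMLParser
import re

# Hypothetical in-memory "web" (URL -> HTML). A real crawler would fetch
# pages over HTTP; this dictionary is an assumption made for the sketch.
PAGES = {
    "a.html": '<a href="b.html">B</a> PHP and MySQL tutorial',
    "b.html": '<a href="a.html">A</a> MySQL indexing and MySQL search',
    "c.html": "a page about PHP that no other page links to",
}


class LinkAndTextParser(HTMLParser):
    """Collects hyperlink targets and text content from one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.text.append(data)


def crawl(seed):
    """Crawler: breadth-first visit from seed, following links on each page."""
    seen, queue, docs = {seed}, [seed], {}
    while queue:
        url = queue.pop(0)
        parser = LinkAndTextParser()
        parser.feed(PAGES[url])
        parser.close()  # flush any buffered trailing text
        docs[url] = " ".join(parser.text)
        for link in parser.links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return docs


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Index: map each word to the pages containing it, with counts."""
    index = defaultdict(Counter)
    for url, text in docs.items():
        for word in tokenize(text):
            index[word][url] += 1
    return index


def search(index, query):
    """Search and ranking: score each page by summed term frequency."""
    scores = Counter()
    for word in tokenize(query):
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    index = build_index(crawl("a.html"))
    print(search(index, "mysql"))  # [('b.html', 2), ('a.html', 1)]
```

Note that c.html never enters the index because no crawled page links to it, which shows how the crawler's reach limits what the search component can return.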


The Requirements
The search engine performs three tasks: (i) crawling and data collection, (ii) indexing, and (iii) search and ranking, with the following requirements:

An Example of System Interfaces
Note that the interfaces are just an example and may not meet the exercise requirements. You should design your own interfaces. A version of the example that does not use an iframe can be found here.


Evaluations
The following features will be considered when grading: