Here I'll lay out the milestones needed to complete this project.
- Create a simple WSGI interface for handling reviewer logins.
- Create a data scraper for getting a few sample articles from the XML dump for analysis purposes.
- Modify the WikiWho code to extract the revision date of every word in an article.
- Use the MediaWiki API to find contributions from Wiki student editors. This is mainly because these edits may need to be assigned a higher score for review.
- Develop a ranking algorithm that assigns each word a score based on its age, its context and its contributor (in the case of a student). Rank the words by this score and extract the top words/paragraphs that meet a minimum threshold for review. Store these in database1.
- Develop an algorithm for tagging unclear/long passages. Store this in database2.
- Present the contents of both databases to reviewers. Implement a voting scheme for deciding whether the proposed changes are finally applied.
- Test the codebase rigorously, covering as many test cases as is practical.
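
To make the ranking milestone above more concrete, here is a minimal sketch in Python of how a per-word score and threshold cut might look. Nothing here is the final design: the `Token` class, the `score` and `select_for_review` functions, the one-year half-life, the student boost of 2.0 and the threshold of 0.5 are all hypothetical placeholders. The sketch also assumes that newer words should score higher, and it leaves out the context component entirely.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Token:
    """One word of an article (hypothetical structure for illustration)."""
    text: str
    added: datetime    # date of the revision in which the word first appeared
    by_student: bool   # True if the contributor is a student editor


def score(token, now, half_life_days=365.0, student_boost=2.0):
    """Assumed scoring rule: recency decays by half every `half_life_days`,
    and words added by student editors are multiplied by `student_boost`."""
    age_days = max((now - token.added).total_seconds() / 86400.0, 0.0)
    recency = 0.5 ** (age_days / half_life_days)
    return recency * (student_boost if token.by_student else 1.0)


def select_for_review(tokens, now, threshold=0.5):
    """Return (score, token) pairs at or above `threshold`, highest first."""
    scored = [(score(t, now), t) for t in tokens]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept


if __name__ == "__main__":
    now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    tokens = [
        Token("old", datetime(2014, 1, 1, tzinfo=timezone.utc), False),
        Token("fresh", now, False),
        Token("student", now, True),
    ]
    for value, token in select_for_review(tokens, now):
        print(f"{token.text}: {value:.2f}")
```

In the real project, the rows returned by something like `select_for_review` would be what gets written to database1, and the weights would need tuning against the sample articles.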