Personal Web Revisitation by Context and Content Keywords with Relevance Feedback
Getting back to previously viewed web pages is a common yet difficult task for users because of the large volume of personally accessed information on the web. This paper leverages the natural human recall process of using episodic and semantic memory cues, and presents a personal web revisitation technique called WebPagePrev that works through context and content keywords. Underlying techniques for the acquisition, storage, decay, and utilization of context and content memories for page re-finding are discussed. A relevance feedback mechanism is also incorporated to adapt to an individual’s memory strength and revisitation habits. Our 6-month user study shows that: (1) Compared with the existing web revisitation tool Memento, the History List Searching method, and the Search Engine method, the proposed WebPagePrev delivers the best re-finding quality in finding rate (92.10%), average F1-measure (0.4318), and average rank error (0.3145). (2) Our dynamic management of context and content memories, including the decay and reinforcement strategy, can mimic users’ retrieval and recall mechanism. With relevance feedback, the finding rate of WebPagePrev increases by 9.82%, the average F1-measure increases by 47.09%, and the average rank error decreases by 19.44% compared with a stable memory management strategy. Among the time, location, and activity context factors in WebPagePrev, activity is the best recall cue, and context+content based re-finding delivers the best performance compared with context based re-finding and content based re-finding.
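To make the reported measures concrete, the following is a minimal Java sketch of how a finding rate and an F1-measure could be computed for a re-finding request. It uses the standard information-retrieval definitions of precision and recall; the class name `RefindingMetricsSketch` and its methods are hypothetical, and the paper's exact evaluation protocol (including how rank error is defined) is not reproduced here.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper; standard IR definitions, not the paper's exact protocol. */
final class RefindingMetricsSketch {

    /** F1 = 2PR / (P + R) over the distinct returned pages and the relevant pages. */
    static double f1(List<String> returned, Set<String> relevant) {
        Set<String> distinct = new HashSet<>(returned);
        if (distinct.isEmpty() || relevant.isEmpty()) {
            return 0.0;
        }
        Set<String> hits = new HashSet<>(distinct);
        hits.retainAll(relevant); // pages that are both returned and relevant
        if (hits.isEmpty()) {
            return 0.0;
        }
        double precision = (double) hits.size() / distinct.size();
        double recall = (double) hits.size() / relevant.size();
        return 2 * precision * recall / (precision + recall);
    }

    /** Fraction of re-finding requests in which the target page was found. */
    static double findingRate(int foundRequests, int totalRequests) {
        if (totalRequests <= 0) {
            throw new IllegalArgumentException("totalRequests must be positive");
        }
        return (double) foundRequests / totalRequests;
    }

    public static void main(String[] args) {
        double f1 = f1(List.of("a", "b", "c", "d"), Set.of("a", "e"));
        System.out.println("F1 = " + f1);
        System.out.println("Finding rate = " + findingRate(9, 10));
    }
}
```

With four returned pages of which one is relevant, and two relevant pages overall, precision is 0.25 and recall is 0.5, so the F1-measure is one third.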
EXISTING SYSTEM:
- In the literature, a number of techniques and tools like bookmarks, history tools, search engines, metadata annotation and exploitation, and contextual recall systems have been developed to support personal web revisitation.
- The work most closely related to this study is the Memento system, which unifies context and content to aid web revisitation. It defines the context of a web page as the other pages in the browsing session that immediately precede or follow the current page, and then extracts topic-phrases from these browsed pages based on the Wikipedia topic list.
- Other closely related work enables users to search for contextually related activities (e.g., time, location, concurrent activities, meetings, music playing, an interrupting phone call, or even other files or web sites that were open at the same time), and to find a target piece of information (often not semantically related) that was accessed while that context was active. This body of research emphasizes episodic context cues in page recall.
DISADVANTAGES OF EXISTING SYSTEM:
- Re-finding previously viewed pages is a difficult task for users
- The large volume of personally accessed data makes re-finding more complex
- Poor finding rate
- Low F1-measure
PROPOSED SYSTEM:
- Preparation for web revisitation. When a user accesses a web page that is likely to be revisited later (i.e., the page access time exceeds a threshold), the context acquisition and management module captures the current access context (i.e., time, location, and activities inferred from the currently running computer programs) into a probabilistic context tree. Meanwhile, the content extraction and management module performs unigram-based extraction from the displayed page segments and obtains a list of probabilistic content terms.
- The probabilities of the acquired context instances and extracted content terms reflect how likely the user is to refer to them as memory cues to get back to the previously focused page.
- Web revisitation. Later, when a user requests to get back to a previously focused page through context and/or content keywords, the re-access by context keywords module and re-access by content keywords module search the probabilistic context tree repository and probabilistic term list repository, respectively.
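The content side of the two steps above can be sketched in Java as follows. This is only an illustration under stated assumptions: the class `ContentTermSketch` is hypothetical, a term's probability is approximated by its relative frequency in the page text, and a page's score is the sum of the probabilities of the matching query keywords. The source does not specify how WebPagePrev actually assigns term probabilities or ranks pages, and the context tree is not modeled here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of unigram extraction and keyword-based re-finding. */
final class ContentTermSketch {

    /** Splits text into lowercase unigrams; value = relative frequency (assumed weighting). */
    static Map<String, Double> extractTerms(String pageText) {
        Map<String, Double> terms = new LinkedHashMap<>();
        int total = 0;
        for (String token : pageText.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue; // split can yield an empty leading token
            }
            terms.merge(token, 1.0, Double::sum);
            total++;
        }
        final int n = total;
        terms.replaceAll((term, count) -> count / n); // no-op when the map is empty
        return terms;
    }

    /** Sums the stored probabilities of the query keywords found in one page. */
    static double score(Map<String, Double> terms, List<String> keywords) {
        double sum = 0.0;
        for (String keyword : keywords) {
            sum += terms.getOrDefault(keyword.toLowerCase(Locale.ROOT), 0.0);
        }
        return sum;
    }

    /** Returns URLs of pages with a positive score, best match first. */
    static List<String> rank(Map<String, Map<String, Double>> repository, List<String> keywords) {
        List<String> urls = new ArrayList<>();
        for (Map.Entry<String, Map<String, Double>> page : repository.entrySet()) {
            if (score(page.getValue(), keywords) > 0) {
                urls.add(page.getKey());
            }
        }
        urls.sort((a, b) -> Double.compare(
                score(repository.get(b), keywords), score(repository.get(a), keywords)));
        return urls;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Double>> repository = new LinkedHashMap<>();
        repository.put("page-1", extractTerms("Memory decay and memory reinforcement"));
        repository.put("page-2", extractTerms("Hotel booking in Beijing"));
        System.out.println(rank(repository, List.of("memory", "decay")));
    }
}
```

A fuller version would combine this content score with a score from the context tree (time, location, activity), since the source reports that context+content based re-finding performs best.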
ADVANTAGES OF PROPOSED SYSTEM:
- This paper explores how to leverage our natural recall process of using episodic and semantic memory cues to facilitate personal web revisitation. Because users differ in how well they memorize previous access context and page content cues, a relevance feedback mechanism is incorporated to enhance personal web revisitation performance.
- We present a personal web revisitation technique, called WebPagePrev, that allows users to get back to their previously focused pages through access context and page content keywords. Underlying techniques for the acquisition, storage, and utilization of context and content memories for web page recall are discussed.
- Dynamic tuning strategies that adapt to an individual’s memorization strength and recall habits based on relevance feedback (e.g., weight preference calculation, decay rate adjustment, etc.) are developed for performance improvement.
- We evaluate the effectiveness of the proposed technique WebPagePrev, and report the findings (e.g., the importance of context and content factors) in web revisitation.
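The decay and reinforcement idea behind the dynamic tuning strategies can be illustrated with a small Java sketch. The exponential decay form, the per-day time unit, and the names `MemoryDecaySketch`, `decay`, and `reinforce` are all assumptions made for illustration; the source only states that memory strength decays over time and that the decay rate is adjusted through relevance feedback, not which functions are used.

```java
/** Hypothetical model of one memory cue whose strength decays and can be reinforced. */
final class MemoryDecaySketch {
    private double strength;  // probability-like weight in [0, 1]
    private double decayRate; // assumed unit: per day

    MemoryDecaySketch(double initialStrength, double decayRate) {
        if (initialStrength < 0 || initialStrength > 1 || decayRate < 0) {
            throw new IllegalArgumentException("strength must be in [0, 1], rate >= 0");
        }
        this.strength = initialStrength;
        this.decayRate = decayRate;
    }

    /** Applies assumed exponential decay: strength *= exp(-rate * days). */
    void decay(double elapsedDays) {
        strength *= Math.exp(-decayRate * elapsedDays);
    }

    /**
     * Relevance feedback: the user successfully used this cue, so raise its
     * strength (capped at 1) and scale the decay rate by rateFactor (< 1 slows decay).
     */
    void reinforce(double boost, double rateFactor) {
        strength = Math.min(1.0, strength + boost);
        decayRate *= rateFactor;
    }

    double strength() {
        return strength;
    }

    double decayRate() {
        return decayRate;
    }

    public static void main(String[] args) {
        MemoryDecaySketch cue = new MemoryDecaySketch(1.0, 0.1);
        cue.decay(10);           // ten days without use
        cue.reinforce(0.5, 0.5); // cue confirmed useful by feedback
        System.out.println(cue.strength() + " at rate " + cue.decayRate());
    }
}
```

Keeping the decay rate per cue, rather than global, is one way such a scheme could adapt to an individual user's recall habits.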
HARDWARE REQUIREMENTS:
- System : Pentium Dual Core
- Hard Disk : 120 GB
- Monitor : 15" LED
- Input Devices : Keyboard, Mouse
- RAM : 1 GB
SOFTWARE REQUIREMENTS:
- Operating System : Windows 7
- Coding Language : Java/J2EE
- Tool : NetBeans 7.2.1
- Database : MySQL
REFERENCE:
Li Jin, Gangli Liu, Chaokun Wang, and Ling Feng, Senior Member, IEEE, “Personal Web Revisitation by Context and Content Keywords with Relevance Feedback”, IEEE Transactions on Knowledge and Data Engineering, 2017.