สุวัฒน์ สถาพรพิริยะเดช. The Effect of Term Specificity on Performance Using Standard and Extended Boolean Information Retrieval System. Master's Degree (Information Technology), King Mongkut's University of Technology Thonburi. King Mongkut's University of Technology Thonburi, 1997.
The Effect of Term Specificity on Performance Using Standard and Extended Boolean Information Retrieval System
Abstract:
This thesis presents a comprehensive and systematic study of Thai textbase
retrieval. About three thousand Thai news articles were collected. The author then
generated ten requests and asked thirty-two volunteers to vote on whether each
request was broad or narrow. The four requests with the highest agreement (two
broad and two narrow) were selected for the experiment. All articles relevant to
each request were then identified by an exhaustive search of the textbase. Sixteen
searchers took part in the experiment: eight from the computer field and eight
from general fields. The experiment employed two search models, the standard and
the extended Boolean models. In each group of searchers, four used the standard
Boolean model and the other four used the extended Boolean model to perform an
identical task. The task was to search for articles relevant to the two broad and
two narrow requests.
The study emphasizes the effect of the three factors mentioned above (standard
vs. extended Boolean models, broad vs. narrow requests, and computer-field vs.
general-field searchers) on retrieval performance in Thai textbase retrieval. In
particular, it focuses on the standard Boolean model with specific terms (narrow
requests) and the extended Boolean model with general terms (broad requests).
In addition, the searchers' education levels and work backgrounds are
examined to test their effects on performance. In this study, performance is
measured in time (seconds), precision, recall, and their variations. A few other
evaluations are also made.
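As an illustration of the two set-based measures named above, the following minimal sketch computes precision and recall for a single search. It uses the standard textbook definitions; the function name and the article identifiers are hypothetical and are not taken from the thesis, which does not describe its exact computation or its variations of these measures.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search result.

    retrieved: article IDs returned by the searcher's query
    relevant:  article IDs judged relevant to the request
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant articles that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 relevant articles, 3 retrieved, 2 of them relevant.
relevant = {"a1", "a2", "a3", "a4"}
retrieved = ["a1", "a2", "a9"]
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.3f} recall={r:.3f}")
```

Precision is the share of retrieved articles that are relevant, and recall is the share of relevant articles that were retrieved. Computing recall requires the full relevant set, which is why the relevant articles for each request were identified exhaustively beforehand.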
The results indicate that computer-field searchers performed better than
general-field searchers (2-tailed significance = 0.001 for average time and
precision). The study also finds that, for the three thousand articles employed
in the experiment, there was no optimal similarity threshold for the extended
Boolean model. As a result, there was no performance difference between the two
models. The study also shows that search experience and the characteristics of
the request affected performance.
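To show what a similarity threshold means in an extended Boolean model, the sketch below uses the p-norm formulation of Salton, Fox, and Wu. This is an assumption: the abstract does not say which extended Boolean formulation, term weighting, or threshold values the thesis used. The function names, term weights, and document IDs are all hypothetical.

```python
def pnorm_or(weights, p=2.0):
    """p-norm similarity of a document to an OR query over the given term weights."""
    return (sum(w ** p for w in weights) / len(weights)) ** (1.0 / p)


def pnorm_and(weights, p=2.0):
    """p-norm similarity of a document to an AND query over the given term weights."""
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / len(weights)) ** (1.0 / p)


def retrieve(doc_weights, query_terms, combine, threshold, p=2.0):
    """Return (doc_id, score) pairs whose similarity reaches the threshold, best first."""
    results = []
    for doc_id, weights in doc_weights.items():
        # A term absent from a document is given weight 0.0.
        score = combine([weights.get(t, 0.0) for t in query_terms], p)
        if score >= threshold:
            results.append((doc_id, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Hypothetical term weights in [0, 1] for three documents.
docs = {
    "d1": {"flood": 1.0, "bangkok": 1.0},
    "d2": {"flood": 1.0},
    "d3": {},
}
terms = ["flood", "bangkok"]

strict = retrieve(docs, terms, pnorm_and, threshold=0.5)  # only d1 qualifies
loose = retrieve(docs, terms, pnorm_and, threshold=0.2)   # d2 also qualifies
print([d for d, _ in strict], [d for d, _ in loose])
```

Unlike a standard Boolean AND, which would accept only `d1`, the p-norm score ranks the partial match `d2` above `d3`, and the threshold decides how many partial matches are returned. Raising the threshold tends to favour precision and lowering it tends to favour recall, which is the trade-off behind the search for an optimal threshold.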
Overall, the results of this thesis demonstrate that retrieval is a complex
process and that several factors affect search performance. They suggest that
future research needs a larger textbase to establish the performance difference
between the Boolean models, and that new features should be added to the search
tool to increase retrieval performance. In addition, it would be interesting to
repeat the experiment on other Thai textbases.