New Re-ranking Approach in Merging Search Results
DOI:
https://doi.org/10.31449/inf.v43i2.2132
Abstract
When merging query results from various information sources or from different search engines, popular methods rely on the available document scores or on the rank order in the returned lists. They ensure a fast response, but the results are often inconsistent. Another approach is to download the contents of the top documents for re-indexing and re-ranking to create the final ranked result list. This method guarantees better quality but is resource-consuming. In this paper, we compare two methods of merging search results: a) applying formulas that re-evaluate documents based on different combinations of returned rank order, document titles, and snippets; b) the Top-Down Re-ranking algorithm (TDR), which gradually downloads top documents from each source, calculates their scores, and adds them to the final list. We also propose a new way to re-rank search results based on genetic programming and re-ranking learning. Experimental results show that the proposed method is better than traditional methods in terms of both quality and time.
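The two merging strategies compared in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `Hit` record, the `term_overlap` measure, the weights in `formula_score`, and the caller-supplied `fetch` function that stands in for downloading a document. `merge_by_formula` shows one possible formula over rank, title, and snippet; `top_down_merge` is one plausible reading of the one-sentence description of TDR, and the actual algorithm may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One entry of a source's result list (hypothetical structure)."""
    url: str
    title: str
    snippet: str
    rank: int  # 1-based position in the source's own list


def term_overlap(query: str, text: str) -> float:
    """Fraction of distinct query terms that occur in the text."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def formula_score(query, hit, w_rank=0.4, w_title=0.3, w_snippet=0.3):
    """Illustrative weighted sum; the weights are arbitrary assumptions."""
    return (w_rank * (1.0 / hit.rank)
            + w_title * term_overlap(query, hit.title)
            + w_snippet * term_overlap(query, hit.snippet))


def merge_by_formula(query, result_lists):
    """Method (a): score every hit without downloading anything,
    keep the best-scored copy of each URL, and sort by score."""
    best = {}
    for hits in result_lists:
        for hit in hits:
            score = formula_score(query, hit)
            if hit.url not in best or score > best[hit.url][0]:
                best[hit.url] = (score, hit)
    ordered = sorted(best.values(), key=lambda pair: pair[0], reverse=True)
    return [hit for _, hit in ordered]


def top_down_merge(query, result_lists, fetch, limit=10):
    """Method (b), sketched: download and score only the current head of
    each source, append the best head, then advance that source."""
    positions = [0] * len(result_lists)
    cache = {}  # url -> score of the downloaded content
    merged, seen = [], set()
    while len(merged) < limit:
        candidates = []
        for i, hits in enumerate(result_lists):
            # Skip documents that another source already contributed.
            while positions[i] < len(hits) and hits[positions[i]].url in seen:
                positions[i] += 1
            if positions[i] < len(hits):
                hit = hits[positions[i]]
                if hit.url not in cache:
                    cache[hit.url] = term_overlap(query, fetch(hit.url))
                candidates.append((cache[hit.url], i, hit))
        if not candidates:
            break  # every source is exhausted
        _, i, hit = max(candidates, key=lambda c: c[0])
        merged.append(hit)
        seen.add(hit.url)
        positions[i] += 1
    return merged
```

In this sketch the first function never touches document contents, which is why formula-based merging is fast, while the second downloads documents only as they reach the head of a source list, which limits the cost of content-based scoring to the documents actually considered for the final list.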
Copyright © Slovenian Society Informatika