Abstract
We present a filtered retrieval technique for structural information in internet-scale knowledge sources, such as large-scale XML data on the Web. The technique evaluates standard XML queries over heterogeneous XML documents using an information retrieval technique built on relational tables in a relational database management system. XPath queries, the standard XML queries, are in their general form partial match queries, which are particularly useful for searching documents with heterogeneous schemas. Our technique is therefore geared toward partial match queries expressed in XPath. It indexes the elements of label paths (sequences of node labels) in the same way that keywords in text are indexed, and finds the label paths that match a given query.
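The idea of indexing label-path elements like keywords can be illustrated with a small sketch. It is not the authors' implementation: the abstract describes relational tables in an RDBMS, whereas this sketch assumes an in-memory inverted index, a query language restricted to `/` and `//` steps without predicates, and hypothetical names (`build_index`, `query_to_regex`, `match_label_paths`) and sample paths chosen only for illustration.

```python
import re
from collections import defaultdict


def build_index(label_paths):
    """Inverted index: map each node label to the ids of the label paths containing it."""
    index = defaultdict(set)
    for path_id, path in enumerate(label_paths):
        for label in path.strip("/").split("/"):
            index[label].add(path_id)
    return index


def query_to_regex(query):
    """Translate a simple XPath query (only / and // steps, an assumption) into a regex."""
    pattern = ""
    for token in re.split(r"(//|/)", query):
        if token == "//":
            # Descendant step: zero or more intermediate labels.
            pattern += r"(?:/[^/]+)*/"
        elif token == "/":
            pattern += "/"
        elif token:
            pattern += re.escape(token)
    return re.compile("^" + pattern + "$")


def match_label_paths(query, label_paths, index):
    """Filter candidates through the index, then verify label order against the query."""
    labels = [t for t in re.split(r"/+", query) if t]
    # Filtering step: a matching path must contain every label in the query.
    candidates = set(range(len(label_paths)))
    for label in labels:
        candidates &= index.get(label, set())
    # Verification step: the index ignores order, so check the full pattern.
    regex = query_to_regex(query)
    return sorted(label_paths[i] for i in candidates if regex.match(label_paths[i]))


# Hypothetical label paths drawn from documents with different schemas.
paths = [
    "/library/book/title",
    "/library/book/author/name",
    "/store/item/title",
    "/library/journal/article/title",
]
index = build_index(paths)
print(match_label_paths("//library//title", paths, index))
```

The index lookup narrows the search to paths that contain every query label, and the regular expression then rejects candidates whose labels appear in the wrong order. A query such as `//title/book` therefore passes the filter for the first path but fails verification.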
This research was supported by the Sookmyung Women’s University Research Grants 2006.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Park, YH., Yoon, YI., Lee, JW. (2007). A Filtered Retrieval Technique for Structural Information. In: Szczuka, M.S., et al. Advances in Hybrid Information Technology. ICHIT 2006. Lecture Notes in Computer Science, vol 4413. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77368-9_22
DOI: https://doi.org/10.1007/978-3-540-77368-9_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77367-2
Online ISBN: 978-3-540-77368-9
eBook Packages: Computer Science (R0)