
Retrieving Information from Compressed XML Documents According to Vague Queries

AlHamadani, Baydaa (2011) Retrieving Information from Compressed XML Documents According to Vague Queries. Doctoral thesis, University of Huddersfield.

PDF - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.


    Abstract

    XML has become the standard way of representing and transforming data over the World Wide Web. The problem with XML documents is that they have a very high ratio of redundancy, so they demand large storage capacity and high network bandwidth for transmission. Because of their extensive use, XML documents may be queried vaguely by naive users with little background in writing XPath queries. The aim of this thesis is to present the design of a system named “XML Compressing and Vague Querying (XCVQ)”, which can compress an XML document and, in response to vague queries, retrieve the required information from the compressed version with minimal decompression.

    XCVQ first compresses the XML document by separating its data into containers and then compressing these containers using the GZip compressor. When a vague query is submitted, information can be retrieved from the compressed file without decompressing the whole file. To process a vague query, XCVQ decomposes it according to the relevant documents, and then a second decomposition stage is made according to the relevant containers. Only the required information is decompressed and returned to the user.
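
    The container idea described above can be illustrated with a minimal sketch. It is not the XCVQ implementation: grouping values by element path, the approximate tag matching via `difflib`, and the names `compress_to_containers` and `vague_query` are all assumptions made for illustration. It handles a single document only and omits the document-level decomposition stage.

```python
import difflib
import gzip
import xml.etree.ElementTree as ET

SEP = "\x00"  # NUL cannot occur in XML 1.0 text, so it is a safe value separator


def compress_to_containers(xml_text):
    """Group text values by element path, then gzip each group separately.

    Hypothetical helper: one container per distinct path is an assumption,
    not necessarily how XCVQ partitions its data.
    """
    root = ET.fromstring(xml_text)
    containers = {}

    def walk(node, path):
        current = f"{path}/{node.tag}"
        text = (node.text or "").strip()
        if text:
            containers.setdefault(current, []).append(text)
        for child in node:
            walk(child, current)

    walk(root, "")
    return {
        path: gzip.compress(SEP.join(values).encode("utf-8"))
        for path, values in containers.items()
    }


def vague_query(compressed, term, cutoff=0.6):
    """Decompress only the containers whose tag name resembles `term`.

    Approximate string matching stands in for vague-query processing here;
    the thesis may use a different notion of vagueness.
    """
    tags = sorted({path.rsplit("/", 1)[-1] for path in compressed})
    close = set(difflib.get_close_matches(term, tags, n=3, cutoff=cutoff))
    results = {}
    for path, blob in compressed.items():
        if path.rsplit("/", 1)[-1] in close:
            # Only matching containers are ever decompressed.
            results[path] = gzip.decompress(blob).decode("utf-8").split(SEP)
    return results
```

    In this sketch, a misspelled query such as `vague_query(store, "authr")` would decompress only the container holding `author` values and leave every other container compressed.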

    To the best of our knowledge, XCVQ is the first XML compressor that can process vague queries. The average compression ratio of the designed compressor is around 78%, which may be considered competitive with other queryable XML compressors. In several experiments, the query processor was able to answer different kinds of vague queries, ranging from simple exact-match queries to complex ones that require retrieving information from several compressed XML documents.

    Item Type: Thesis (Doctoral)
    Subjects: Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4050 Electronic information resources
    Schools: School of Computing and Engineering
    Depositing User: Carol Doyle
    Date Deposited: 11 Aug 2011 09:55
    Last Modified: 11 Aug 2011 09:55
    URI: http://eprints.hud.ac.uk/id/eprint/11179
