Search-agent-based information retrieval
Abstract
Semantically linking all online information in a given domain,
that is, creating a semantic web, may make it possible to achieve
high precision and high recall simultaneously. The storage method
is semantic hypertext, in which
conventional hypertext links are enriched with semantic
information. We use emerging standards and tools such as the
Extensible Markup Language (XML) and the Resource Description
Framework (RDF). To realize the potential speed increase from
scaling up the search, algorithms for high-performance parallel
and grid computing are being explored.
Description
Information retrieval lies at the heart of all discovery and
research. The ability to semantically link all online information
in a given domain, that is, to create a semantic web, may lead to
the holy grail of information retrieval: the simultaneous
achievement of high precision and high recall. High speed may
follow as well, with the deployment of large numbers of software
scouts capable of cooperatively traversing the semantic web in
parallel. During this year in Padova, I will extend some
previous research into methods that link chunks of information
within and among documents based on semantic relationships and use
those connections to efficiently retrieve all the information that
closely matches the user's request. The storage method is semantic
hypertext, in which conventional hypertext links are enriched with
semantic information that includes the strength and type of the
relationship between the chunks of information being linked. A set
of cooperating software agents, called scouts, then traverses the
connections simultaneously, searching for the requested
information. By communicating with each other and with a central
controller to coordinate the search, the scouts can achieve high
recall and high precision while performing efficiently.
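The storage and traversal scheme described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation from the dissertation: the names SemanticLink, Controller, scout, and search, the link-type labels, and the min_strength threshold are all hypothetical. Scouts here coordinate only through the central controller, and direct scout-to-scout communication is left out.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from threading import Lock


@dataclass(frozen=True)
class SemanticLink:
    """A hypertext link enriched with semantic information."""
    target: str      # id of the linked chunk
    kind: str        # relationship type (hypothetical labels)
    strength: float  # relationship strength, assumed to lie in [0, 1]


class Controller:
    """Central coordinator that records which chunks have been claimed."""

    def __init__(self):
        self._claimed = set()
        self._lock = Lock()

    def claim(self, chunk_id):
        """Return True only for the first scout that asks for chunk_id."""
        with self._lock:
            if chunk_id in self._claimed:
                return False
            self._claimed.add(chunk_id)
            return True


def scout(start, links, controller, min_strength):
    """Follow sufficiently strong links from start; return chunks claimed."""
    found, frontier = [], deque([start])
    while frontier:
        chunk = frontier.popleft()
        if not controller.claim(chunk):
            continue  # another scout already covers this chunk
        found.append(chunk)
        for link in links.get(chunk, []):
            if link.strength >= min_strength:
                frontier.append(link.target)
    return found


def search(starts, links, min_strength=0.5, workers=4):
    """Run one scout per start chunk in parallel and merge their results."""
    controller = Controller()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(
            lambda s: scout(s, links, controller, min_strength), starts
        )
        return sorted(chunk for part in parts for chunk in part)
```

Because every chunk is claimed through the controller before it is expanded, no two scouts explore the same chunk, whichever scout reaches it first. A real deployment would also have to rank results against the user's request, which this sketch omits.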
The quality of the delivered information depends entirely on the
quality of the semantic links. Because linking a
large number of documents manually would be prohibitively
laborious, the construction of these links must be automatic. It
would be helpful if, during the creation of electronic
documents, semantic information that described the various
segments of the documents was inserted. This information would
not be displayed to someone reading the document, but would be
available to assist in the construction of enriched hypertext
links with other similarly enhanced documents.
Techniques for specifying the semantic relationships that go
beyond simple link types and weights will be investigated. The
goal here is to give information retrieval agents as much
meta-information as possible to enhance their information
gathering ability. Emerging standards and tools such as the
Extensible Markup Language (XML) and the Resource Description
Framework (RDF) are potentially useful for describing the content
and content relationships within documents, which intelligent
software agents can then use to share and exchange knowledge and
to create semantic hypertext links.
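As an illustration of how RDF might carry such meta-information, the following sketch writes one semantic link as RDF/XML using Python's standard xml.etree.ElementTree module and reads it back. The rdf namespace URI is the standard one. The ex vocabulary (source, target, kind, strength), the example URIs, and the function names are assumptions made for this example and do not come from the research described here.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/semlink#"  # hypothetical vocabulary, illustration only

ET.register_namespace("rdf", RDF)
ET.register_namespace("ex", EX)


def describe_link(link_uri, source, target, kind, strength):
    """Return an rdf:RDF element describing one semantic link as a resource."""
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": link_uri}
    )
    # The two linked chunks are referenced by URI.
    ET.SubElement(desc, f"{{{EX}}}source", {f"{{{RDF}}}resource": source})
    ET.SubElement(desc, f"{{{EX}}}target", {f"{{{RDF}}}resource": target})
    # Type and strength of the relationship are stored as literals.
    ET.SubElement(desc, f"{{{EX}}}kind").text = kind
    ET.SubElement(desc, f"{{{EX}}}strength").text = str(strength)
    return root


def read_link(root):
    """Recover (source, target, kind, strength) from a describe_link element."""
    desc = root.find(f"{{{RDF}}}Description")
    resource = f"{{{RDF}}}resource"
    return (
        desc.find(f"{{{EX}}}source").get(resource),
        desc.find(f"{{{EX}}}target").get(resource),
        desc.find(f"{{{EX}}}kind").text,
        float(desc.find(f"{{{EX}}}strength").text),
    )
```

Modelling the link as a resource of its own lets the type and strength attach to the link rather than to either chunk. Other encodings are possible, and which one suits automatic link construction is left open here.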
Another goal of the current research is to further develop the
capability of the link-traversing scouts. As the amount of
accessible information reaches staggering proportions, it may be
necessary to deploy large numbers of scouts for some
applications. High network traffic rates cause communications
bottlenecks that can degrade the performance of large numbers of
scouts. To realize the potential speed increase from scaling up
the number of scouts, algorithms for high-performance parallel
and grid computing are being explored.
Essential Bibliography
- Rehder, John J. Semantic Software Scouts for Information
Retrieval. Ph.D. dissertation, The College of William and Mary,
May 2000.
- Rehder, John J. Cooperating Scouts for Information
Retrieval. In 1st Pacific Rim International Workshop on
Intelligent Information Agents, at the 6th Pacific Rim
International Conference on Artificial Intelligence (PRICAI 2000),
Melbourne, Australia, August 28 - September 1, 2000.
Joe Rehder
Last modified: Thu Oct 17 10:46:17 CEST 2002