With the goal of transforming documents into “meaningful spaces”, the main focus has to be semantics. Semantics is everywhere, hidden in very different types of documents (e.g. text, images, videos, programs and audio) and at different levels (e.g. document content, document structure). Because most of the “semantics” currently accessible in documents lies in text, we concentrate on the semantic content analysis of the textual parts of documents. This textual part also includes document structure (for instance, information already encoded in tags and user profiles). Our goal is not to investigate the fundamental nature of meaning, so we concentrate on linguistic meaning.
A unifying theme in the ongoing research in the ParSem area is an emphasis
on the role of context in determining meaning. We are particularly
interested in theoretical models of communication, language, dialogue,
computation, and inference which take into account the context in which
these activities are occurring. Our current research themes include:
• We build tools that discover that two concepts are related by noticing that expressions denoting those concepts are frequently linked syntactically in a corpus. We explore the idea that the range of syntactic constructions that can link two concepts may provide information about the nature of the relationship(s) that can exist between those concepts. This information could subsequently be used to enrich the representation of a document's content with entities and relations that are implied but not explicitly stated.
• Word sense disambiguation (WSD): the determination of all the different senses of every word relevant at least to the text or discourse under consideration. The precise definition of a sense is a matter of debate, but many recent approaches rely on predefined senses, such as the list of senses given in a dictionary, associated words, or entries in a transfer dictionary. The assignment of words to senses uses two sources of information: the linguistic context of the word to be disambiguated (and possibly some extra-linguistic knowledge about the situation), and external knowledge sources (lexical, encyclopedic, etc.). All disambiguation processes involve matching the context of an instance of the word to be disambiguated with either information from an external knowledge source (knowledge-driven WSD) or information about the contexts of previously disambiguated instances of the word derived from corpora (data-driven or corpus-based WSD).
• Domain-specific normalization
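
To make the knowledge-driven matching described above more concrete, here is a minimal Python sketch in the spirit of the simplified Lesk algorithm. This is a well-known textbook technique swapped in for illustration, not necessarily the method used in ParSem. The sense inventory `SENSES`, the sense labels, the stopword list, and the function names are all hypothetical stand-ins for a real external knowledge source such as a dictionary.

```python
import re

# Hypothetical toy sense inventory standing in for an external knowledge
# source (e.g. dictionary glosses). Labels and glosses are illustrative only.
SENSES = {
    "bank": {
        "bank/finance": "an institution that accepts deposits and lends money to customers",
        "bank/river": "the sloping land alongside a river or stream",
    }
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "to", "and", "or", "that",
             "in", "on", "at", "by", "is", "was"}


def tokenize(text):
    """Return the set of lowercase content words in `text`."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def disambiguate(word, sentence, inventory=SENSES):
    """Pick the sense whose gloss shares the most words with the context.

    Returns (sense_label, overlap_size). Ties go to the first sense listed.
    """
    context = tokenize(sentence) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in inventory[word].items():
        overlap = len(context & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense, best_overlap


if __name__ == "__main__":
    print(disambiguate("bank", "She sat on the bank of the river and watched the stream"))
    print(disambiguate("bank", "The bank lends money to small businesses"))
```

A data-driven variant would keep the same matching loop but replace the glosses with words collected from the contexts of previously disambiguated instances in a corpus.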