Terminology Extractor: g-EXT
Principle
The Terminology Extractor automatically locates the original and important vocabulary, both single words and expressions, in any flow of textual information.
Complementary to g-MIL, this technology uses the frequency with which terms appear to determine how interesting they are and the level at which they are taken into account.
Developing a system that learns single words is fairly straightforward, but automating the learning of expressions, whether fully automatic or semi-automatic, is a far more difficult proposition. g-EXT does this.
In addition, the located expressions are categorised under the following headings:
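As an illustration of the frequency principle only, the sketch below counts word sequences of one to three words and keeps those that recur. It is not g-EXT's actual algorithm: the function name candidate_terms, the stop-word list, the maximum length and the threshold are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much fuller one.
STOPWORDS = {"the", "of", "a", "an", "and", "in", "to", "is", "on", "for"}

def candidate_terms(text, max_len=3, min_count=2):
    """Return word n-grams (1..max_len words) seen at least min_count times."""
    counts = Counter()
    # Split into sentences so that n-grams never cross a sentence boundary.
    for sentence in re.split(r"[.!?]+", text.lower()):
        tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", sentence)
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                # Skip n-grams that start or end with a stop word.
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    # Keep only the terms that recur often enough to be of interest.
    return {term: c for term, c in counts.items() if c >= min_count}

sample = (
    "The Security Council adopted a resolution on human rights. "
    "Member States discussed human rights with the Security Council."
)
terms = candidate_terms(sample)
```

On this sample, expressions such as "human rights" and "security council" are retained because they occur twice, while "member states" occurs once and is dropped. Raising min_count makes the selection stricter.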
· Names of people
· Locations
· Organisations
· General concepts
Beyond the applications described below, the results of this learning capability can be used to enrich the KDB, as described in the g-MIL, Requests Generator, section.
Applications
Enrichment of the Knowledgebase
From a flow of textual information (together with manual search results, stored documents, etc.), the learning feature allows vocabulary specific to the Organisation to be selectively inserted into the KDB using g-REQ, the Request Generator.
Posting of Linked themes
When a search completes, the expressions learned from the list of pages returned are necessarily linked to both the results retrieved and the question submitted. They can therefore be made available to the user under a "linked" or "related" topics heading.
Learning of Acronyms
Statistically, an expression or phrase and the acronym that stands for it often appear together. On first use, both the expression and the acronym are given; thereafter only the acronym need be used, and it can be expanded automatically where it later appears. The same applies to noun phrases. This co-occurrence of the terms makes it possible to detect, automatically or semi-automatically, the presence of acronyms in a text (see example below).
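The co-occurrence idea can be sketched as follows. This is a simplified illustration, not g-EXT's method: it assumes the acronym is written in capitals inside parentheses directly after the expression, and that its letters are exactly the initials of the preceding words. The function name detect_acronyms is hypothetical.

```python
import re

def detect_acronyms(text):
    """Map each parenthesised acronym to the expression that precedes it."""
    found = {}
    for match in re.finditer(r"\(([A-Z]{2,})\)", text):
        acronym = match.group(1)
        # Words that appear before the opening parenthesis.
        words = re.findall(r"[A-Za-z]+", text[:match.start()])
        # Take as many preceding words as the acronym has letters.
        candidate = words[-len(acronym):]
        initials = "".join(w[0].upper() for w in candidate)
        # Accept only when the initials spell the acronym exactly.
        if initials == acronym:
            found[acronym] = " ".join(candidate)
    return found

sample = (
    "The United Nations Security Council (UNSC) met today. "
    "The European Union (EU) welcomed the UNSC decision."
)
acronyms = detect_acronyms(sample)
```

Here acronyms maps "UNSC" to "United Nations Security Council" and "EU" to "European Union". A semi-automatic workflow could present such pairs to an editor for confirmation before they are stored.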
Automatic extraction of Glossary
Creating a glossary from a large set of documents can be a very complex and time-consuming task. g-EXT can automate the process completely or partially.
Request example
Security Council of the United Nations
Expressions and vocabulary
- General concepts
  - human rights
  - Member States
  - official site
  - rights of man
  - secretary-general
  - charter of the United Nations
  - international terrorism
  - program of the United Nations
  - Security Council of UNO
  - international organizations
  - members of the Security Council
  - Security Council Resolution of the United Nations
- Names of people
- Places
- Organizations
  - the United Nations
  - united nations
  - European union
  - council of Europe
Detected acronyms
- unac=United Nations article charter
- unci=United Nations centre of information
- unsc=United Nations security council
- uno=United Nations office
- et=Eastern timor
- eu=European Union