Comparison of the document categorization process using keywords and citations in a restricted knowledge domain

Magali Rezende Gouvêa MEIRELES; Beatriz Valadares CENDÓN; Paulo Eduardo Maciel de ALMEIDA

Abstract

The categorization process requires extracting representative elements from a document so that its essence can be used to identify similarities among documents and generate categories. This study analyzes the difficulties and results of two document categorization processes in a restricted knowledge domain: the first represents documents by keywords, and the second represents them by citations. Two experiments were conducted to illustrate the use of these attributes. The first applied a categorization algorithm based on keywords. The second used Artificial Neural Networks to generate categories from the citations of the articles. In a restricted knowledge domain such as the one studied, it was difficult to form groups using keywords as categorization attributes because the keywords chosen by the authors were highly similar. As the second experiment showed, citations can be an alternative and more efficient attribute for categorizing these documents. The formation of a set of articles with significant bibliographic coupling and a strong semantic relationship validated the proposed method. The article details the methodology used in the experiments and shows the importance of a careful pre-processing phase for the reliability of the databases. This study may contribute to research on the representation of documents in categorization processes and information retrieval.
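The contrast between the two document representations can be illustrated with a minimal sketch. It is not the authors' method (the study used a keyword-based categorization algorithm and Artificial Neural Networks); it only shows, on hypothetical toy data, why keyword overlap may fail to separate documents in a narrow domain while bibliographic coupling (the number of references two documents share) still can. The document names, keywords, reference identifiers, and function names are all assumptions made for illustration.

```python
from itertools import combinations


def keyword_similarity(kw_a, kw_b):
    """Jaccard similarity between two keyword sets (0.0 when both are empty)."""
    union = kw_a | kw_b
    return len(kw_a & kw_b) / len(union) if union else 0.0


def coupling_strength(refs_a, refs_b):
    """Bibliographic coupling strength: count of cited references shared by two documents."""
    return len(refs_a & refs_b)


# Hypothetical toy corpus: in a restricted domain, authors tend to pick the same keywords.
docs = {
    "A": {"keywords": {"categorization", "information retrieval"}, "refs": {"r1", "r2", "r3"}},
    "B": {"keywords": {"categorization", "information retrieval"}, "refs": {"r2", "r3", "r4"}},
    "C": {"keywords": {"categorization", "information retrieval"}, "refs": {"r7", "r8"}},
}

# Compare every pair of documents under both representations.
pairs = {}
for x, y in combinations(sorted(docs), 2):
    pairs[(x, y)] = (
        keyword_similarity(docs[x]["keywords"], docs[y]["keywords"]),
        coupling_strength(docs[x]["refs"], docs[y]["refs"]),
    )

for (x, y), (kw_sim, coupling) in pairs.items():
    print(f"{x}-{y}: keyword similarity={kw_sim:.2f}, shared references={coupling}")
```

In this toy corpus every pair has identical keywords, so keyword similarity cannot distinguish them, whereas the shared-reference counts differ between pairs and would group A with B.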
