
Traitement automatique pour la Migration de Documents Numériques vers XML

Jérôme Fuselier, Boris Chidlovskii
More and more companies are migrating their legacy document management systems to XML, the industry standard for data exchange. To reduce the migration cost, we propose an approach that automates the conversion of layout-oriented documents into semantically oriented annotations. The conversion module uses a supervised machine learning technique to learn a conversion model for a document collection. The conversion is achieved by semantically annotating the document content and then structuring the annotations according to an XML schema that specifies the class of target documents.
To appear in Document Numérique


fuselier_chidlovskii.pdf (355.86 kB)