Data Formats of Linguistic Resources

TUSNELDA-XML: XML format for the representation of treebanks, collections of sentences, and lexicons. Developed in project C1. See the TUSNELDA documentation.

NEGRA Export: Plain-text, column-based format for representing treebanks. Originally developed in the NEGRA project of the Collaborative Research Center 378 (Saarland University).

Export-XML: XML version of NEGRA Export, developed in project A1. Additional information and a toolset for converting NEGRA Export into Export-XML and back are also available. An extension of Export-XML, Anaphora-XML, supports the representation of referential relations between the nodes in a treebank.

DEREKO-XML: XML representation originally developed at Tübingen University as part of the DEREKO project. The format is similar to Export-XML, but it is designed to minimize storage overhead and is thus especially suitable for very large corpora. Furthermore, it supports the ambiguous annotation of POS tags and morphological analyses.

Last update 03/11/2009