Project A1:
Representation and Automatic Acquisition
of Linguistic Data

Head of the project

Prof. Dr. Erhard Hinrichs
Seminar für Sprachwissenschaft
Universität Tübingen
Wilhelmstr. 19
72074 Tübingen
Phone: +49/7071/29-75446
Fax: +49/7071/29-5214


Project staff:

Kathrin Beck
Office: Wilhelmstr. 19 / 3.25
Phone: +49/7071/29-74156

Yannick Versley
Office: Wilhelmstr. 19 / 3.21
Phone: +49/7071/29-77352

Holger Wunsch
Office: Wilhelmstr. 19 / 3.23
Phone: +49/7071/29-73972


Dr. Heike Telljohann

Former staff:

  Eva Klett

  Sandra Kübler

  Heike Zinsmeister

  Karin Naumann

  Tylman Ule

  Julia Trushkina

  Frank Henrik Müller

  Beata Kouchnir

  Martina Liepert

  Laura Kallmeyer

  Manfred Sailer

  Ilona Steiner


Project A1 investigated the relationship between empiricism and theory in linguistics, a central point of interest in SFB 441, with regard to electronically available text corpora. Large electronically accessible corpora open up a new source of data, which is an important means for developing linguistic theories. The usefulness of data pooled in corpora depends mainly on how the data are prepared. For this reason, the aim of the project was twofold: first, to define theory-neutral modes of representing the data; second, to investigate the accessibility of the data. These two aims were pursued in the three sectors of the project, which cluster around:
  • (a) providing complex automatic annotations
  • (b) defining a query language for corpora with complex annotation schemes
  • (c) providing theory-dependent representations for corpora with complex annotation schemes.

Project proposal 1999-2001 in German (Postscript)    Final Report 1999-2001 in German (Postscript)

Project proposal 2002-2004 in German (Postscript)    Final Report 2002-2004 in German (PDF)

Project proposal 2005-2008 in German (PDF)   



Cooperation within SFB 441

  • A2:
    • foundations of systems of representation
  • A3:
    • theory-independent versus theory-specific annotation
    • annotation of suboptimal structures
  • A5:
    • queries for distributional idiosyncrasies in the TüPP-D/Z
  • B3, B13, B15:
    • representation of German data
    • Participation in the workshops Coordination and Complex Clauses - Linguistic, Psycholinguistic, and Computational Perspectives
  • B11:
    • language-independent research on anaphoric relations and linguistic relations above sentence level
  • C1:

Other cooperations