Project B11:
Semantic roles, case relations, and cross-clausal reference in Tibetan

SFB 441
SFB subprojects
  B11 project page
  Old Tibetan Chronicle I
  Translation of OTC I
  Classical Tibetan corpus: TVP
  Ladakhi corpus: LLV Kesar I
  Documentation
  Metadata
  Searchable tree
  CLaRK trees (overview)
  Statistics
(in progress)
  Fieldwork
  B11 publications

Warning:
due to technical problems with the host server, the dynamic and searchable cocoon representation of our annotations (text plus translation plus various information) as developed by Frank Müller-Witte is currently out of work. We, nevertheless, keep the description of what has been possible in the past and what could perhaps become functional again some time in the future.


The non searchable and static
tree views (png) can still be visited.
They are provided with an interlinear version, instead of a translation. These representations had to be segmented and prepared separately for a) the reference and frame structure and b) the clause structure with full element descriptions (see introductiory note).

example tree

BZ 20.12.2009.

Visible corpora

screenOTCAs one can see from our two annotation examples, one for Old Tibetan and one for Ladakhi, the mere xml-structure is hardly informative and of little use for a person not acquainted with computer linguistics. Even when the tags and their content can be differentiated (e.g. by colours), the annotated texts gets completely lost in the tag forest. As the available linguistic representation tools could not handle our complex text structures, we experimented with some alternative representations. One of the main problems we faced is that most solutions, especially those based on java tools, are far too slow for the (little) amount of text that we have so far produced.

The main representation, developed by Frank Müller-Witte with the help of Fabian Kliebhahn, runs under a cocoon application on an auxiliary server without exposing the xml-data to the viewer. It shows a field for the Tibetan text (unstructured or with brackets for four consecutive embedded structures), a field for the translation, additional information on clause structures, or a representation of the tree structure.

screenOTC01The translation becomes visible when clicking on the red verb number. All verbs in the translation are linked to their Tibetan counterparts in the original text, which makes it easy to navigate. Green brackets indicate ntNodes (argument NPs or adverbial phrases, AvPs), clauses are indicated by blue brackets. Information concerning the clause structure becomes visible when clicking on the blue opening bracket. The clause type is indicated directly after the closing blue bracket, and clicking on this category will open up the tree display for this clause. The page is can be searched through in the normal search mode, an x-path-search is, for the time being, not possible. For the Tibetan text you may need the following special signs:  ·, ŋ, ñ, ž, š, , in Names: Ŋ, Ñ, Ž, Š, , for words of Indian origin, additionally , , , ś, , Ś.


screenOTC02Two additional minor fields on the right side should ideally provide lexical information from the accompanying text- specific dictionaries, when clicking on a word in the Tibetan text. Notes to the annotation or the context are indicated by an asterisk, which can be clicked upon to access the information in the form of a system message (see also the small graphic above). Time constraints did not allow optimising the representation technically and aesthe- tically, and we thus apologise for any instance, where these fields do not function properly, due to either a minor fault in programming (FMW) or in the annotation (BZ).

As the application can no longer be hosted on the SFB server; the set up on the new host server is under preparation. The contact person for this application will be Frank Müller-Witte (email: frank.mueller-witte[  ]uni-tuebingen.de). For details of the annotation you may contact Bettina Zeisler (email: zeis[  ]uni-tuebingen.de). Note that the representation has been adapted only to Firefox and may not function properly in other browsers.

Four small corpora are presently available:

Old Tibetan:
OTC: The Old Tibetan Chronicle Chapter I
RAMA: Fragments of the Tibetan Rāmāyaa (preliminary annotation)

Classical Tibetan:
TVP: Die tibetische Version des Papageienbuchs

Contemporary Ladakhi:
LLV: A Lower Ladakhi version of the Kesar epic

see also the metadata below.


top
project page
field work
publications
documentation of the annotation scheme 
presentation
note on translations
metadata
tree views
CLaRK tree representations (overview)



A note on the translations


All corpora are supplied with translations for those readers who are not well acquainted with Tibetan. It was not our aim to provide new translations, nor did we have enough time for this task. For this reason we provide the original translations, which, in two cases, happen to be in German. While the translation of the LLV by Anna Theodora Francke needed only small changes to fit into the annotation scheme, the translation of the TVP by Silke Herrmann turned out to be quite problematic and we had to interfere more often than we wished. We have nevertheless kept as much of her translation as possible, respecting the freedom of the author, without, however, underwriting all her solutions. Our changes are marked by square brackets.

Similarly, we had planned to use Bacot et al.'s French translation of the OTC. Nathan W. Hill, however, whose main task was the annotation of OTC, was eager to provide a new translation, and since the OTC constitutes a particularly difficult text, this was accepted on the condition that the translation reflects the annotation (or vice versa) so that the translation could be a useful tool in the process of annotating. Unfortunately his translation (published 2006 in the Revue d'Etudes Tibétaines 10: 89-101) does not reflect the annotation. According to the intentions of the author (cf. p. 89, note 2), it also does not, with only one exception, reflect any of the discussions in the project. Since the earlier, pioneering translations, were, at crucial points, also not much better, we eventually decided to provide yet another translation, a translation, however, which does not strive for literary elegance and originality, but is as faithful to the structure of the original as possible. Thus no attempt was made to smoothen out the long chains of intertwining non-finite clauses. We think, however, that this representation has at least the benefit to immediately show the different strategies of representation, such as the mere enumeration of (possibly historical) facts in short simple sentences in § 6, which stands in sharp contrast to the more condensed and complex mythological narration in § 5, which consists of only few sentences, but a lot of embedded structures. Like in literary German, complex sentences may be helpful to represent complex situations, but they may also be used to veil facts and reasons (or the absence of these). And they may be prone to linguistic accidents.

Despite, or perhaps because of, sticking slavishly to the text and the grammatical rules, we came in several cases to quite different interpretations than the previous translators. Since our linguistic as well as historical insights might be of some interest to the students of Tibetan history, we shall provide here in advance a version of the annotated translation as pdf.


top
project page
field work
publications
documentation of the annotation scheme 
presentation
note on translations
metadata
tree views
CLaRK tree representations (overview)




Metadata
(If diacritics are not properly displayed, select the UTF-8 charset on your browser.)

Language

Siglum

Text

Annotators

Old Tibetan ca. mid 7th – mid 11th century

OTC

Fonds Pelliot Ms. 250 / Pelliot tibétain 1287,
commonly known as the Old Tibetan Chronicle, ca. mid 9th century, found in Dunhuang (Central Asia), the original is kept in the Bibliothèque Nationale de France.

Responsible for the annotation: Bettina Zeisler (BZ),
pre-annotation until v155: Nathan W. Hill (NWH),
revision and final section: Bettina Zeisler,
translation: Bettina Zeisler, on the base of Bacot & al. 1940: 123-128, Haarh 1969: 402-406, and Hill: 2006: 89-99.

Sources

Annotated text

First publication: Bacot & al. 1940: 97-122.
Faxsimile edition: Spanien & Imaeda 1979: pl. 557-577.
Digital text:
http://otdo.aa.tufs.ac.jp/archives.cgi?p=Pt_1287.
Printed version of the digital edition: Imaeda & al. 2007: 200-229.
Further editions consulted:
Wang [Dbaŋrgyal] & Bsodnams Skyid. 1992: 34-66, Gñagoŋ Dkonmchog Tshebrtan. 1995: 16-23.

Chapter I (l. 1-62; Bacot & al. pp. 97-100; Macdonald & Imaeda, pl. 557-559, Imaeda & al.: 200-202)
    6 divisions (paragraphs)
101 sentences
232 clauses (225 verbs)
733 tokens ('words')

RAMA

Text

Annotators

Short version of the Indic epos Rāmāyaa several fragments of basically two recensions (Bibliothèque Nationale de France and British Library), probably late 9th or 10th century

Preliminary annotation: Bettina Zeisler with input from Nicola Westermann

 

Sources

Annotated text

Text and translation:
de Jong, Jan Willem. 1989.

92 clauses of fragment E

Language

Siglum

Text

Annotators

Classical Tibetan ca. 11th – 19th century

TVP

Die tibetische Version des Papageienbuches, a 15th century adaptation of the Indian narrations of the parrot, styled as stories about previous reincarnations of Atisha and his disciples.

Responsible for the annotation: Bettina Zeisler (BZ),
pre-annotation: Kristin Meyer (KM)
revision of annotation and published translation: Bettina Zeisler with input from Frank Müller-Witte (FMW)

Sources

Annotated text

Text and translation:
Herrmann, Silke. 1983. (HE)

ca. 40% (fol. 261v1 - 268v5, pp. 43-48)
    11 divisions plus 13 subdivisions
  415 sentences
  903 clauses (849 verbs)
2669 tokens ('words')

Language

Siglum

Text

Annotators

contemporary Ladakhi, West Tibetan

LLV

Gšamyulna bšadpai Kesargyi sgruŋs bžugs. A Lower Ladakhi version of the Kesar saga, collected around 1900.

Digitalisation (typing): Namgyal Nyima Dagkar (Bonn)
Responsible for the annotation: Bettina Zeisler

Sources

Annotated text

Text:
Francke, August Hermann. 1905-41.

Translation:
Francke, Anna Theodora. 1992.

Chapter I (pp. 1-16)
    12 divisions
  589 clauses (585 verbs)
  295 sentences
1926 tokens ('words')


References


top
project page
field work
publications
documentation of the annotation scheme 
presentation
note on translations
metadata
tree views
CLaRK tree representations (overview)




Tree views

We include some tree graphics (png) generated with the help of the CLaRK tool, which Bettina Zeisler (BZ) used for the annotation. Here again, we faced the problem that CLaRK is not able to open up larger trees, thus we had to divide the text according to the division structure, and sometimes even into smaller parts. Links that go beyond these sections cannot be represented. Further more, the graphics are no longer searchable or dynamic; the dynamic representations in CLaRK itself (very useful for smaller structures) are usually minimised to absolute illegibility. On the other hand, CKaRK allows to redefine the trees for special purposes and to highlight individual properties. BZ has thus designed two colourful sets of trees, one showing the basic information (full structure of sentence, clause with clause categories, ntNode with ntNode categories, case, token, text plus interlinear version, part of speech), the other showing a somewhat reduced structure (sentence, clause, ntNode, token, text and interlinear version) plus the argument structure and the reference-relation of empty or ordinary anaphoric elements to their antecedent. The following colours are currently used for the reference links:

red: empty (obligatory) arguments
orange: omitted (non-obligatory) arguments
blue: demonstrative pronouns
dark green: personal pronouns
purple: emphatic pronouns
yellowish green: pronominal use of adjectives
dark purple: empty argument referring to an implied antecedens
dark green: omitted argument referring to an implied antecedens
black: invalid reference empty argument (the antecedens cannot be decided upon)
grey: NP-internal reference
example tree



top
project page
field work
publications
documentation of the annotation scheme 
presentation
note on translations
metadata
tree views
CLaRK tree representations (overview)





CLaRK tree representations (overview)

OTC argument structure

OTC clause structure

TVP argument structure

TVP clause structure

div 1.01           v0001-v0027
div 1.02           v0028-v0073
div 1.03           v0074-v0086
div 1.04           v0087-v0119
div 1.05           v0120-v0182
div 1.06           v0183-v0225
div 1.01           v0001-v0027
div 1.02           v0028-v0073
div 1.03           v0074-v0086
div 1.04           v0087-v0119
div 1.05           v0120-v0182
div 1.06           v0183-v0225

Narrative structure index:
green:
outer narrative frame
yellow: inner narrative frame
uncoloured: narrations

div 01 part 1    v0001-v0015
div 01 part 2    v0016-v0069
div 01 part 3    v0070-v0135

div 02 part 1    v0136-v0184
div 02 part 2    v0185-v0221
div 02 part 3    v0222-v0263
div 02 part 4    v0264-v0331

div 03              v0332-v0347

div 04              v0348-v0406

div 05 part 1    v0407-v0454
div 05 part 2    v0455-v0489
div 05 part 3    v0490-v0549

div 06              v0550-v0594

div 07 part 1    v0595-v0626
div 07 part 2    v0627-v0681
div 07 part 3    v0682-v0737

div 08              v0738-v0746

div 09              v0747

div 10              v0748-v0750

div 11 part 1    v0751-v0781
div 11 part 2    v0782-v0843
div 11 part 3    v0844-v0849
(incomplete)

Narrative structure index:
green: outer narrative frame
yellow: inner narrative frame
uncoloured: narrations

div 01 part 1    v0001-v0015
div 01 part 2    v0016-v0069
div 01 part 3    v0070-v0135

div 02 part 1    v0136-v0184
div 02 part 2    v0185-v0221
div 02 part 3    v0222-v0263
div 02 part 4    v0264-v0331

div 03              v0332-v0347

div 04              v0348-v0406

div 05 part 1    v0407-v0454
div 05 part 2    v0455-v0489
div 05 part 3    v0490-v0549

div 06              v0550-v0594

div 07 part 1    v0595-v0626
div 07 part 2    v0627-v0681
div 07 part 3    v0682-v0737

div 08              v0738-v0746

div 09              v0747

div 10              v0748-v0750

div 11 part 1    v0751-v0781
div 11 part 2    v0782-v0843
div 11 part 3    v0844-v0849
(incomplete)

RAMA argument structure

RAMA clause structure

div 01 v0001 (title)
div 02  v0002-v0010
div 03  v0011-v0062
div 04  v0063-v0092
(incomplete)

div 01 v0001 (title)
div 02  v0002-v0010
div 03  v0011-v0062
div 04  v0063-v0092
(incomplete)

LLV argument structure

LLV clause structure

div 0.01           v0001 (ch. title)
div 1.01           v0002-v0039
div 1.02           v0040-v0071
div 1.03           v0072-v0138
div 1.04           v0139-v0162
div 1.05 part 1 v0163-v0190
div 1.05 part 2 v0191-v0267
div 1.06           v0268-v0336
div 1.07 part 1 v0337-v0377
div 1.07 part 2 v0378-v0447
div 1.08           v0448-v0462
div 1.09           v0463-v0467
div 1.10           v0468-v0492
div 1.11           v0493-v0575
div 1.11           v0576-v0585

div 0.01           v0001 (ch. title)
div 1.01           v0002-v0039
div 1.02           v0040-v0071
div 1.03           v0072-v0138
div 1.04           v0139-v0162
div 1.05 part 1 v0163-v0190
div 1.05 part 2 v0191-v0267
div 1.06           v0268-v0336
div 1.07 part 1 v0337-v0377
div 1.07 part 2 v0378-v0447
div 1.08           v0448-v0462
div 1.09           v0463-v0467
div 1.10           v0468-v0492
div 1.11           v0493-v0575
div 1.11           v0576-v0585



top
project page
field work
publications
documentation of the annotation scheme 
presentation
note on translations
metadata
tree views
CLaRK tree representations (overview)



Layout: Christoph Singer. Responsible for the content: B. Zeisler. Last modified: 20.12.2009