CCAT MorphGNT
This file is derived from the morphologically parsed Greek New Testament (GNT) provided by the University of Pennsylvania's CCAT. James Tauber reformatted it for easier text processing, converted it to UTF-8, and corrected many errors he found over ten years of linguistic analysis.
This file is made available under a Creative Commons BY-NC-SA license. For attribution purposes, please credit CCAT and James Tauber.
Downloads
Each download is about a megabyte.
- version 5.07
- version 5.06
- version 5.05
- version 5.04
- version 5.03
- version 5.02
- version 5.01
- version 5.00
Explanation of Format
First column is the book/chapter/verse reference. (Note that the shorter ending of Mark appears after the longer one.)
Second column is the part of speech:
- A- adjective
- C- conjunction
- D- adverb
- I- interjection
- N- noun
- P- preposition
- RA article
- RD demonstrative
- RI interrogative/indefinite pronoun
- RP personal/possessive pronoun
- RR relative pronoun
- V- verb
- X- particle
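The part-of-speech codes listed above map naturally onto a lookup table. The following is a minimal sketch in Python; the names `POS_CODES` and `describe_pos` are hypothetical and are not part of the MorphGNT distribution.

```python
# Part-of-speech codes exactly as listed in the format description.
# POS_CODES and describe_pos are illustrative names, not part of MorphGNT.
POS_CODES = {
    "A-": "adjective",
    "C-": "conjunction",
    "D-": "adverb",
    "I-": "interjection",
    "N-": "noun",
    "P-": "preposition",
    "RA": "article",
    "RD": "demonstrative",
    "RI": "interrogative/indefinite pronoun",
    "RP": "personal/possessive pronoun",
    "RR": "relative pronoun",
    "V-": "verb",
    "X-": "particle",
}


def describe_pos(code):
    """Return the part-of-speech name for a two-character code."""
    try:
        return POS_CODES[code]
    except KeyError:
        raise ValueError(f"unknown part-of-speech code: {code!r}")
```

For example, `describe_pos("RA")` returns `"article"`, and an unlisted code raises `ValueError` rather than being silently accepted.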
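Third column has eight slots for parse codes, one per category below. Each value is coded by its first letter unless a different letter is shown (for example, X for perfect):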
- Person: 1, 2, 3
- Tense: Aorist, Future, Imperfect, Present, X-perfect, Y-pluperfect
- Voice: Active, Middle, Passive
- Mood: D-imperative, Indicative, N-infinitive, O-optative, P-participle, S-subjunctive
- Case: Accusative, Dative, Genitive, Nominative, Vocative
- Number: Plural, Singular
- Gender: Feminine, Masculine, Neuter
- Degree: Comparative, Superlative
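The eight slots can be decoded position by position. The sketch below assumes that each slot is a single character and that a hyphen marks a slot that does not apply to the word; that convention is an assumption here, so check it against the file before relying on it. The names `SLOTS` and `decode_parse` are hypothetical.

```python
# One (category, code table) pair per slot, in the order given above.
# Assumption: each slot is one character and "-" means "not applicable".
SLOTS = [
    ("person", {"1": "first", "2": "second", "3": "third"}),
    ("tense", {"A": "aorist", "F": "future", "I": "imperfect",
               "P": "present", "X": "perfect", "Y": "pluperfect"}),
    ("voice", {"A": "active", "M": "middle", "P": "passive"}),
    ("mood", {"D": "imperative", "I": "indicative", "N": "infinitive",
              "O": "optative", "P": "participle", "S": "subjunctive"}),
    ("case", {"A": "accusative", "D": "dative", "G": "genitive",
              "N": "nominative", "V": "vocative"}),
    ("number", {"P": "plural", "S": "singular"}),
    ("gender", {"F": "feminine", "M": "masculine", "N": "neuter"}),
    ("degree", {"C": "comparative", "S": "superlative"}),
]


def decode_parse(code):
    """Decode an eight-character parse code into a dict of set categories."""
    if len(code) != len(SLOTS):
        raise ValueError(f"expected {len(SLOTS)} characters, got {len(code)}")
    result = {}
    for char, (name, table) in zip(code, SLOTS):
        if char == "-":
            continue  # assumed marker for an unused slot
        if char not in table:
            raise ValueError(f"unknown {name} code: {char!r}")
        result[name] = table[char]
    return result
```

Under those assumptions, `decode_parse("3AAI-S--")` yields third person, aorist, active, indicative, singular, with the case, gender, and degree slots left out of the result.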
Fourth column is the form that appears in the UBS3/NA26 text.
Fifth column is the lemma or dictionary form.
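Putting the five columns together, a line can be split into a small record. This is a sketch only: it assumes the columns are separated by whitespace and that no column contains internal spaces, neither of which is stated above. The `Token` type, the `parse_line` function, and the sample line (including the form of its reference column) are illustrative and not taken from the file.

```python
from typing import NamedTuple


class Token(NamedTuple):
    """One line of the file, split into its five columns."""
    ref: str     # book/chapter/verse
    pos: str     # part-of-speech code
    parse: str   # eight-slot parse code
    form: str    # form as it appears in the text
    lemma: str   # dictionary form


def parse_line(line):
    """Split one line into a Token (assumes whitespace-separated columns)."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 columns, got {len(fields)}")
    return Token(*fields)


# Illustrative line constructed for this sketch, not copied from the file.
sample = "010101 N- ----NSF- Βίβλος βίβλος"
token = parse_line(sample)
```

Keeping the reference as an opaque string avoids guessing how book, chapter, and verse are packed into the first column.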